Happy birthday, Donald Knuth! and lamp-trolls
Today (January 10th) is Donald Knuth’s 85th birthday. Happy birthday, Dr. Knuth!
A few months ago I was reading Knuth’s Selected Papers on Computer Languages, which contains an interesting paper titled “Efficient Coroutine Generation of Constrained Gray Sequences” written on the occasion of Ole-Johan Dahl’s seventieth birthday (2001). The paper describes an algorithm for generating Gray sequences — that is, sequences of bit-strings such that each bit-string differs from the previous bit-string in only one position. For example, we can run through all three-bit bit-strings in a Gray sequence as follows:
000
, 001
, 011
, 010
, 110
, 111
, 101
, 100
(Notice that this is also a Gray cycle, because the head and tail of the sequence differ by only
one bit. So you can loop around to the beginning — or, prefix another bit, toggle that bit, count
back down, and ta-da, you have a four-bit Gray cycle! If this recursive construction reminds you of
the Towers of Hanoi, that’s not a coincidence.
The above Gray sequence also represents a Hamiltonian path
among the eight corners of a unit cube, starting at \((0,0,0)\) = 000
.)
To generate the above Gray sequence of bit-strings, Knuth uses a “family of coroutines,” which he personifies as “an array of friendly [Norwegian] trolls.”
Each troll carries a lamp that is either off or on; he also can be either awake or asleep. Initially all the trolls are awake, and all their lamps are off.
A troll may be “poked.” (Pokes always come from one’s right neighbor.)
When an awake troll is poked, he toggles his lamp and then goes to sleep.
When a sleeping troll is poked, he wakes up and then pokes his left neighbor.
In pseudo-Algol — including Algol’s 1-based indexing and parens-less function-call syntax,
but writing yield
where Knuth wrote return
—
Boolean coroutine poke[k];
while true do begin
// the troll is now awake
a[k] := 1 − a[k]; // toggle the lamp
yield true; // and we're done being poked
// the troll is now asleep
if k > 1 then
yield poke[k−1] // poke neighbor, and we're done
else
yield false; // we're done
end.
We’ll have a “driver” program that calls poke[n]
in a loop, until it returns false
:
that signifies that the leftmost troll would have poked its left neighbor if it had one,
but it doesn’t; which means we’ve finished iterating this particular Gray sequence.
I thought this would be a neat way to experiment with C++20 coroutines. Notice that Knuth’s trolls
don’t appear quite as well-structured as “generators” (Python yield
, C++23 std::generator<T>
),
but also aren’t as symmetric as the coroutines used in Knuth’s elevator simulator
(TAOCP volume 1, section 2.2.5). A troll can’t go poke any arbitrary
troll in line; each troll only pokes its neighbor, and the pokee’s poke
routine will eventually
“return” control to the poker (not to anybody else). In other words, we still have a stack discipline.
Each troll in the stack does maintain its own state, i.e., its lamp; but that’s not unusual either —
that’s just saying that the trolls are objects with member data. The unusual thing about these
trolls, compared to your average C++ object, is that the troll’s “wakefulness” state resembles control flow
more than data.
Translated into C++14 objects
Suppose our driver function looks like this:
class Troll;
void driver(int n) {
auto lamps = std::deque<bool>(n, false);
auto trolls = std::vector<Troll>(n);
for (int i=0; i < n; ++i) {
trolls[i] = Troll::make(i > 0 ? &trolls[i-1] : nullptr, &lamps[i]);
}
while (trolls[n-1].poke()) {
printf("Lamps are: ");
for (bool b : lamps) {
printf("%c", (b ? '1' : '0'));
}
printf("\n");
}
}
Then we could implement class Troll
in C++14 as follows:
struct Troll {
Troll *left_neighbor = nullptr;
bool *lamp = nullptr;
bool is_asleep = false;
bool poke() {
if (!is_asleep) {
is_asleep = !is_asleep;
*lamp = !*lamp;
return true;
} else {
is_asleep = !is_asleep;
return (left_neighbor ? left_neighbor->poke() : false);
}
}
static Troll make(Troll *t, bool *lamp) {
return Troll{t, lamp, false};
}
};
But notice that we have to manage the is_asleep
flag manually. Knuth’s coroutine
specification manages that state automatically, simply by “remembering” where it left off
after each yield
. The troll’s “wakefulness” state seems to correspond more to a position in code
than to a piece of bits-and-bytes data. We might refer to it as a “suspend point,” rather
than as a piece of “state” per se.
Also, please note that while this particular species of troll has only two suspend points
(so we can get away with a bool is_asleep
), Knuth’s actual paper immediately proceeds to describe
two more species of troll with six and eight suspend points, respectively; and then starts combining
and crossbreeding them in clever ways. Representing suspend points as data doesn’t
scale very well.
Translated into (idiosyncratic) C++20 coroutines
So let’s simplify our Troll
using C++20 coroutine syntax.
struct Troll : TrollBase<Troll> {
static Troll make(Troll *left_neighbor, bool *lamp) {
while (true) {
*lamp = !*lamp;
co_yield true; // and be asleep
co_yield (next_troll ? next_troll->poke() : false); // and be awake
}
}
};
This C++20 code matches Knuth’s pseudo-Algol, line for line. Of course, we cheated a little bit:
we hid some really icky stuff in that CRTP
base class TrollBase
! Here’s the icky stuff:
template<class Derived>
struct TrollBase {
struct promise_type;
using handle_t = std::coroutine_handle<promise_type>;
struct promise_type {
Derived get_return_object() { Derived d; d.coro_ = handle_t::from_promise(*this); return d; }
auto initial_suspend() { return std::suspend_always(); }
auto final_suspend() noexcept { return std::suspend_never(); }
auto yield_value(bool b) noexcept {
value_ = b;
return std::suspend_always();
}
void unhandled_exception() {}
bool value_ = false;
};
explicit TrollBase() = default;
TrollBase(TrollBase&& rhs) noexcept : coro_(std::exchange(rhs.coro_, nullptr)) {}
void operator=(TrollBase rhs) noexcept { std::swap(coro_, rhs.coro_); }
~TrollBase() { if (coro_) coro_.destroy(); }
bool poke() {
coro_.resume(); // should modify value_
return coro_.promise().value_;
}
private:
handle_t coro_ = nullptr;
};
Notice that the only “data member” in this whole contraption is handle_t coro_
.
The C++14 version’s data member is_asleep
became the coroutine’s suspend point.
The data members left_neighbor
and lamp
became local variables
of the coroutine Troll::make
, which means they live (not on the stack, not inside the
Troll
object, but) inside the Troll
’s coroutine frame, which is heap-allocated and
managed by the coro_
handle. We don’t think of this Troll
as “having” that “state”
anymore; instead, the Troll
is just a piece of code, a coroutine, which manages its
local variables in a natural way.
By the way, I don’t claim that this implementation of TrollBase
is great code. I hacked it
into shape by cannibalizing one of the generator
implementations from Quuxplusone/coro.
To see the entire C++20 program in action (and some of the more complicated species of troll), check out Quuxplusone/KnuthElevator on GitHub.
Translated into Python
Peter Brady did a very clean Python version of this code. His version of the simple troll and its driver boils down to just this:
def poke(k, lights):
coro = poke(k-1, lights) if k > 1 else None
while True:
lights[k-1] = 1 - lights[k-1]
yield True
yield next(coro) if coro else False
def driver(n):
lights = [0] * n
coro = poke(n, lights)
while next(coro):
print(*lights, sep='')
This version benefits from the structure of the troll array: each troll in this version
ends up privately owning its left neighbor, as opposed to the C++ version where we have an array of
Troll
objects each of whom holds a non-owning pointer to its left neighbor. Either way is fine,
since we only ever poke the rightmost troll.
Future directions
The trolls above produce Gray sequences that visit all \(2^n\) \(n\)-bit bit-strings. But Knuth’s goal
is to produce constrained Gray sequences: sequences that visit only the bit-strings in a certain set,
where the set of bit-strings to be toured is defined in terms of the relations between the bits. For example, we
might say “You should visit any string \(b_0 b_1 b_2 b_3\) where \(b_0\leq b_2\),” that is, you must visit 0101
and 0010
, but you must never visit 1000
or 1101
. That’s one example of a constraint. Knuth proves that any
satisfiable set of constraints can always be represented as a totally acyclic directed graph (or “acyclic poset”)
on \(n\) nodes. Knuth gives several special-case examples (besides the special “no constraint” case we explored here)
before describing a divide-and-conquer solution to the entire problem.
Both Peter and I only went as far as Knuth’s “fence digraph” special case (the species of troll he names nudge
),
discussed on page 10 of Knuth’s paper. But the paper really takes off
around page 14, where Knuth starts describing how to compose multiple coroutines together. It would be very
interesting to see what that would look like in C++.
It would also be interesting to see a C++ solution modeled on Peter’s Python solution, using C++23’s std::generator
.
Would that be more, or less, comprehensible to the average programmer than the array-of-Troll
version?
Is there some way to preserve the array of Troll
objects, but still implement Troll
in terms of std::generator
?