In support of P1485 “Better keywords for coroutines”

Antony Polukhin’s P1485 “Better keywords for coroutines” is an almost-slam-dunk paper in my opinion. It names a glaringly obvious problem, proposes a clean solution, and incidentally solves some other minor problems along the way. I suspect it is going to need all the help it can get, just to get discussed before C++2a is released. (And after C++2a is released, of course it will be too late to change anything.)

To take some examples from Yakk’s StackOverflow explanation of C++2a coroutines… here’s the “before” syntax (current Working Draft syntax):

generator<int> get_integers(int start, int step) {
    for (int current = start; true; current += step) {
        co_yield current;
    }
}

And here’s the “after” (P1485’s proposed syntax):

generator<int> get_integers(int start, int step) async {
    for (int current = start; true; current += step) {
        yield current;
    }
}

Yakk points out that

A function becomes a coroutine by having [a keyword such as co_await, co_yield, or co_return] in its body. So [without close inspection of every line of the body] they are indistinguishable from functions.

With Antony’s P1485 syntax, this (minor?) problem goes away. With P1485, coroutines are distinguished from non-coroutines at a fixed place: to the right of the ) in their function header. You don’t have to spend O(n) time scanning the function body before deciding what color function you’re dealing with. This has benefits for humans, of course; but it might also be a benefit to compilers, who don’t like to use O(n)-lookahead algorithms where O(1) will do.

But compilers do tend to have very clever ways of avoiding O(n)-lookahead algorithms. The situation in the absence of async is exactly analogous to the situation with C++14’s auto return types, which compilers already handle just fine. (By the way, auto foo() { co_return 42; } isn’t a valid function definition, because the compiler can’t interpret the meaning of co_return without a concrete return type.) So the O(n) versus O(1) business is really about the human factor.

Here’s another “before” from Yakk’s article:

std::future<std::expected<std::string>> load_data(std::string resource_name)
{
    auto handle = co_await open_resource(resource_name);
    while (auto line = co_await read_line(handle)) {
        if (std::optional<std::string> r = parse_data_from_line(line))
            co_return *r;
    }
    co_return std::unexpected(resource_lacks_data(resource_name));
}

And here’s the P1485 “after”:

std::future<std::expected<std::string>> load_data(std::string resource_name) async
{
    auto handle = await open_resource(resource_name);
    while (auto line = await read_line(handle)) {
        if (std::optional<std::string> r = parse_data_from_line(line))
            return *r;
    }
    return std::unexpected(resource_lacks_data(resource_name));
}

This bit of code is a great example of the “human factors” concerned with coroutine programming. Seeing the “before” code, without async in the signature to remind you, you might be forgiven for recalling your C++ 101 training:

For user-defined types such as std::string, pass-by-const-reference is more efficient than pass-by-value.

So you’d rewrite the code to use const std::string& resource_name, and boom! — dangling pointer dereference and undefined behavior! (Notice that resource_name is used again on the last line of the function body. That’s the reference that will be dangling, not the first one.)

Having the keyword async in close proximity to the parameter list could be enough of a reminder that it would prevent dangling-pointer bugs at least some of the time.

P1485’s async-on-forward-declaration business is not so hot

P1485 proposes additionally that coroutines may be distinguished at the declaration site from non-coroutines (but they don’t have to be, and there’s no penalty for incorrectly marking a non-coroutine as a coroutine or vice versa). Antony’s example is:

// Is it a coroutine?
template<class T>
T function(std::string s);

// This is a coroutine!
template<class T>
T function(std::string s) async;

This strikes me as problematic; we end up with the same situation as with override, where we have a way to mark a signature as yes coroutine (or yes override), but no way to mark it unambiguously not coroutine (or not override). And unlike override, there’s no equivalent of “class scope” around this declaration; the compiler can’t detect places you accidentally forgot the async, because plenty of people writing declarations in that same scope — file scope — are deliberately not going to bother writing the async.

Philosophically, whether a future-returning function is implemented as a coroutine (using co_return and friends) or a non-coroutine (using return std::make_ready_future<T>(...) and friends) is an implementation detail; it doesn’t need to be committed to in the public API (at the declaration site), merely in the implementation (at the definition site).

So at best we’re marking the above function declaration as “Never mind if I’m a coroutine; still, I promise that I’ll return something awaitable.” But C++2a has much better ways to mark that!

// Is it a coroutine? Probably not.
template<class T>
T function(std::string s);

// This is quite definitely a "coroutine", as far as the caller cares!
template<class Future> requires std::Awaitable<Future>
Future async_function(std::string s);

The release of C++2a should be delayed past 2020

Not just because one of the flagship features (Coroutines) is still undergoing major revisions; but also because the other flagship features (Concepts, operator<=>) are undergoing major revisions; other major features (Modules and constexpr new) are completely unimplemented; nobody can even agree on what one major feature (Contracts) means…

Setting realistic deadlines is maybe an art, maybe a science, but regardless, the timetable must match the amount of work to be done. C++2a has a very big amount of work to be done, and work items of the form “implement and get user feedback on _____” aren’t necessarily parallelizable.

When is the best time to catch bugs: while the product is in development, or after it has been released to customers?

Posted 2019-06-26