In support of P1485 “Better keywords for coroutines”
Antony Polukhin’s P1485 “Better keywords for coroutines” is an almost-slam-dunk paper in my opinion. It names a glaringly obvious problem, proposes a clean solution, and incidentally solves some other minor problems along the way. I suspect it is going to need all the help it can get, just to get discussed before C++2a is released. (And after C++2a is released, of course it will be too late to change anything.)
To take some examples from Yakk’s StackOverflow explanation of C++2a coroutines… here’s the “before” syntax (current Working Draft syntax):
generator<int> get_integers(int start, int step) {
    for (int current = start; true; current += step) {
        co_yield current;
    }
}
And here’s the “after” (P1485’s proposed syntax):
generator<int> get_integers(int start, int step) async {
    for (int current = start; true; current += step) {
        yield current;
    }
}
Yakk points out that
A function becomes a coroutine by having [a keyword such as co_await, co_yield, or co_return] in its body. So [without close inspection of every line of the body] they are indistinguishable from functions.
With Antony’s P1485 syntax, this (minor?) problem goes away.
With P1485, coroutines are distinguished from non-coroutines at a fixed place: to the right of the ) in their function header. You don’t have to spend O(n) time scanning the function body before deciding what color function you’re dealing with. This has benefits for humans, of course; but it might also be a benefit to compilers, which don’t like to use O(n)-lookahead algorithms where O(1) will do.
But compilers do tend to have very clever ways of avoiding O(n)-lookahead algorithms. The situation in the absence of async is exactly analogous to the situation with C++14’s auto return types, which compilers already handle just fine. (By the way, auto foo() { co_return 42; } isn’t a valid function definition, because the compiler can’t interpret the meaning of co_return without a concrete return type.) So the O(n) versus O(1) business is really about the human factor.
Here’s another “before” from Yakk’s article:
std::future<std::expected<std::string>> load_data(std::string resource_name)
{
    auto handle = co_await open_resource(resource_name);
    while (auto line = co_await read_line(handle)) {
        if (std::optional<std::string> r = parse_data_from_line(line))
            co_return *r;
    }
    co_return std::unexpected(resource_lacks_data(resource_name));
}
And here’s the P1485 “after”:
std::future<std::expected<std::string>> load_data(std::string resource_name) async
{
    auto handle = await open_resource(resource_name);
    while (auto line = await read_line(handle)) {
        if (std::optional<std::string> r = parse_data_from_line(line))
            return *r;
    }
    return std::unexpected(resource_lacks_data(resource_name));
}
This bit of code is a great example of the “human factors” concerned with coroutine programming. Seeing the “before” code, without async in the signature to remind you, you might be forgiven for recalling your C++ 101 training:
For user-defined types such as std::string, pass-by-const-reference is more efficient than pass-by-value.
So you’d rewrite the code to use const std::string& resource_name, and boom! Dangling reference and undefined behavior! (Notice that resource_name is used again on the last line of the function body. That’s the use that will dangle, not the first one: the last line runs after the coroutine has suspended and resumed, by which time the caller’s string may be gone.)
Having the keyword async in close proximity to the parameter list could be enough of a reminder to prevent dangling-reference bugs at least some of the time.
P1485’s async-on-forward-declaration business is not so hot
P1485 proposes additionally that coroutines may be distinguished at the declaration site from non-coroutines (but they don’t have to be, and there’s no penalty for incorrectly marking a non-coroutine as a coroutine or vice versa). Antony’s example is:
// Is it a coroutine?
template<class T>
T function(std::string s);
// This is a coroutine!
template<class T>
T function(std::string s) async;
This strikes me as problematic; we end up with the same situation as with override, where we have a way to mark a signature as yes coroutine (or yes override), but no way to mark it unambiguously not coroutine (or not override). And unlike override, there’s no equivalent of “class scope” around this declaration; the compiler can’t detect places you accidentally forgot the async, because plenty of people writing declarations in that same scope (file scope) are deliberately not going to bother writing the async.
Philosophically, whether a future-returning function is implemented as a coroutine (using co_return and friends) or a non-coroutine (using return std::make_ready_future<T>(...) and friends) is an implementation detail; it doesn’t need to be committed to in the public API (at the declaration site), merely in the implementation (at the definition site).
So at best we’re marking Antony’s async declaration as “Never mind if I’m a coroutine; still, I promise that I’ll return something awaitable.” But C++2a has much better ways to mark that!
// Is it a coroutine? Probably not.
template<class T>
T function(std::string s);
// This is quite definitely a "coroutine", as far as the caller cares!
template<class Future> requires std::Awaitable<Future>
Future async_function(std::string s);
The release of C++2a should be delayed past 2020
Not just because one of the flagship features (Coroutines) is still undergoing major revisions; but also because the other flagship features (Concepts, operator<=>) are undergoing major revisions; other major features (Modules and constexpr new) are completely unimplemented; nobody can even agree on what one major feature (Contracts) means…
Setting realistic deadlines is maybe an art, maybe a science, but regardless, the timetable must match the amount of work to be done. C++2a has a very large amount of work still to be done, and work items of the form “implement and get user feedback on _____” aren’t necessarily parallelizable.
When is the best time to catch bugs: while the product is in development, or after it has been released to customers?