Hot take: P0947R0 “Another take on Modules”

I just read (quickly) through Richard Smith’s paper P0947R0 “Another take on Modules”, and here’s some random thoughts on the syntax.

Quick: Which of the following lines of code are valid C++? (The answers are the same in every version between C++03 and C++17, as far as I know.)

/*1*/ template<class T> void foo(T*);
/*2*/ template<class T> void foo(int);
/*3*/ template<> void foo(int*);
/*4*/ template void foo(int*);
/*5*/ extern template<class T> void foo(T*);
/*6*/ extern template<class T> void foo(int);
/*7*/ extern template<> void foo(int*);
/*8*/ extern template void foo(int*);

The answer: Numbers 1, 2, 3, 4, and 8 are valid C++. And they all mean different things. Number 1 declares (but does not define) a function template foo<T> that deduces T from its first argument. Number 2 similarly declares a different function template foo<T> in the same overload set, whose T is not deducible. Number 3 declares (but does not define) an explicit specialization of the template whose primary declaration was Number 1. Number 4, on the other hand, is an explicit instantiation definition — it does not declare any new entity but rather instructs the compiler to generate a definition in this translation unit for foo<int>(int*) (for those keeping track, that’s a specialization of Number 1).

And then Number 8 is an explicit instantiation declaration, which is similar to Number 4 but merely instructs the compiler that foo<int>(int*) exists somewhere else and that it needn’t bother to generate a definition for it in this translation unit, not even if it would otherwise have been implicitly instantiated.

For all the details (okay, maybe half of the details) of C++ template syntax, check out the videos of my two-hour CppCon presentation from 2016, “Template Normal Programming” (Part 1, Part 2).

So why do I bring this up?

Let’s look at this handy image somebody (not I) made of the examples from Richard’s modules paper:

All the examples from P0947R0 in one image

The existence of export module foo;, import foo;, and export import foo;, all doing different things, seems unfortunate to me. It’s uncomfortably reminiscent of how today’s C++ has eight different possible mashups of templatey keywords, five of which compile, and all of which mean subtly different things. I would rather have very clear and human-readable keyword-sequences for operations such as

import module foo;
export module foo;  // export everything we just imported
export class Widget;
export function Widget make();

Now, the counterargument is that Richard did pick reasonably readable syntax for many operations — notably, export class Widget; is proposed by P0947R0, and does exactly what it says on the tin. It’s just that there are so dang many different operations that are desired by so many different stakeholders in Modules right now!

Here are all the different syntax mashups I found in a quick skim through P0947. Again, these are all actually being proposed, together, in P0947, as far as I know.

// Module implementations use "module" without "export". Their names can be dotted.
module widget;
module widget.nest1;

// Module interfaces use "export module". They can have internal partitions.
export module widget;
export module widget:part1;
export module widget.nest1;

// "import" works as you'd expect, except that partitions can't be seen outside the current module.
import widget;
import widget.nest1;
import :part1;

// Imports can be atomically re-exported.
export import widget;
export import widget.nest1;
export import :part1;

// "public import" is special syntax to support wrappers like <cstdlib>.
public import widget;
public import widget.nest1;
public import :part1;

// The keyword "export" (only) can be prefixed to just about any declaration.
export class Widget;
export using Handle = Impl *;
export Widget make();
export int n;
export template<class T> void foo() {}
export namespace C { int n; }
export namespace C {}
export extern "C++" { int n; }

// Legacy headers have a new custom syntax involving angle brackets but also semicolons.
export module <header.h>;
import <header.h>;
export import <header.h>;
public import <header.h>;

// Macros are exported via a completely different syntax, styled to look like a preprocessor directive.
// It does wildcard globbing with '*' (only).
#export EXIT_FAILURE
#export *_MIN
#export *

Notice that both export namespace C { int n; } and export namespace C {} are used in the paper. The latter specifically does not make the name C::n visible to importers of the current module; when you export a namespace definition, you are exporting only the names declared inside that particular set of curly braces. The paper does not clearly say whether export namespace C; would be legal syntax, and if so, what it would do.

My complaint about this mishmash of syntax (which, again, is arguably being forced upon us by the huge number of different things people want to do with modules — particularly see public import above) is that, as with the template examples at the top of this post, it’s too hard to match up the programmer’s intent with the appropriate syntax.

One way to improve the readability and “intent-ful-ness” of this proposal would be to do away with the idea that you can just slap export in front of any declaration (probably repeating the declaration in the process); instead, we could just say that our intent is always to export either the definition of a name (such as a class definition or the definition of an inline function), or the declaration of that name (such as a variable or function or class declaration), or perhaps the contents of a namespace as a special case. And then we use contextual keywords to make that intent explicit in the source code…

export declaration Widget;
export definition Handle;
export declaration make;
export declaration n;
export definition foo;
export contents C;
export extern "C++" { int n; }  // okay, legacy headers still get their special cases

(Notice that this eliminates our ability to have an overload set named make that contains more members inside the module than outside it. Is this a good idea? You be the judge!)

Bonus rant!

My choice of syntax above has another very nice feature, besides being expressive: it is extensible. Consider P0947R0’s proposed syntax of

export Widget make();

Notice that what we have here is the (contextual) keyword export, immediately followed by a declaration, which means that we have to parse arbitrarily many tokens before we learn what is actually being exported — and all of those tokens can be user-defined identifiers. So, if in the future we wanted to support an extension such as

export autonamespaced Widget make();

it might be tricky to shoehorn in, because export autonamespaced ... already has a meaning in P0947R0: it means that autonamespaced is a user-defined type-name of some sort, and we’re looking for a declarator to follow it. Whereas with the “extensible” syntax, we have a natural place to hang new functionality — we simply introduce a brand-new contextual keyword!

export declaration make;
export autonamespaced declaration make;

We should explicitly leave some “breathing room” in the syntax, so that we don’t end up with Perl, where everything is a valid program. In my example syntax, the “breathing room” — the thing you can type to get an invalid program every time — is:

export anything-except-'declaration'-or-'definition'-or-'contents' ...

I would have thought someone would have learned that lesson from the using [typename] T = U; problem. In C++11, a new syntax for introducing “type aliases” was introduced:

using T = int;

As soon as C++11 was out the door, people started asking for the ability to create “variable aliases”, “member function aliases”, and so on:

class Base {
protected:
    int value_;
};

class Derived : private Base {
public:
    using exposed_value = Base::value_;
};

Unfortunately, we can never get this syntax, because using exposed_value = ... already has a meaning in C++11! It means that exposed_value is a newly introduced typename, and the thing on the right-hand side of the = must be a type-name of some sort.

If only someone had proposed

using typename T = int;

back in C++11!

All names, syntaxes, and incidents portrayed in this rant are long past and have no bearing on future language designs. Any resemblance to the current state of “Concepts” in C++2a (P0791, P0807, P0873) is purely coincidental. /s

Posted 2018-03-14