Hello World with C++2a modules
Here’s how to build a “Hello world” program using Clang’s implementation of C++2a Modules, as it currently stands as of November 2019. The compiler driver interface described here is practically guaranteed to change over the next couple of years, but this seems to be how it works right now in trunk.
For instructions on how to build Clang trunk in the first place, see “How to build LLVM from source” (2019-11-09).
Thanks to Steven Cook and Steve Downey for helping me puzzle this out today on Slack!
Step 1: Create main.cpp
Create your main.cpp
with these contents:
import helloworld;
int main() { hello(); }
Step 2: Create helloworld.cpp
The naming convention for module interface units isn’t established yet.
Nathan Sidwell (2019) uses the convention “module name plus -m.cpp
.”
Boris Kolpackov (September 2017) suggests “module name plus .mpp
,”
but that seems confusing to me in light of Objective-C++’s established .mm
extension.
For this example I see no reason not to just use “module name plus .cpp
.”
The naming convention for modules themselves also isn’t established yet. I’m choosing to name my
module module helloworld
, all lowercase. Notice that a module is not the same thing as a namespace!
This example uses only the global namespace.
So, create your helloworld.cpp
with these contents:
module;
#include <stdio.h>
export module helloworld;
export void hello();
module :private;
void hello() { puts("Hello world!"); }
Notice that #include <stdio.h>
causes the compiler to see a bunch of declarations for things like printf
that are not supposed to be part of module helloworld
, so it’s important to place that #include
directive
outside of module helloworld
, in what’s called the “global module fragment” or “GMF.”
We want module helloworld
’s interface to contain the declaration — but not the definition — of
void hello()
. So we place hello
’s definition down at the bottom, below the incantation
module :private
, in what’s called the “private module fragment” or “PMF.”
This mechanism of dividing a source file up into regions based on the placement of top-level declarations, rather than curly-braced lexical scopes, feels extremely “unlike C++” to me. I don’t know what the original rationale was for doing it this way.
Also, why is the latter incantation
module :private
instead ofmodule private
? I don’t know.
UPDATE: Further discussion has revealed that in this case we don’t need to
use the PMF. We can place the body of hello()
into the exported interface —
Clang will produce a slightly fatter .pcm
file, but that’s probably no big deal
and/or a simple bug in the compiler. So we can write helloworld.cpp
like this:
module;
#include <stdio.h>
export module helloworld;
export void hello() { puts("Hello world!"); }
Furthermore, once vendors support the C++2a notion of “header units,” we’ll be able to write
import <some-header.h>;
in the module’s own purview, instead of writing #include <some-header.h>
in the GMF. Clang supports <cstdio>
(but not <stdio.h>
) as a “header unit” if you
pass some extra flags — which means we can also forgo the GMF, like this:
export module helloworld;
import <cstdio>;
export void hello() { puts("Hello world!"); }
END UPDATE.
Step 3: Precompile helloworld.cpp
into helloworld.pcm
Now compile helloworld.cpp
into a BMI.
Clang’s default expectation is that each BMI file will have a basename equal to the module name,
and the extension .pcm
(for “precompiled module”).
To create a BMI, you need to pass -Xclang -emit-module-interface
to the compiler driver:
clang++ -std=c++2a -c helloworld.cpp -Xclang -emit-module-interface -o helloworld.pcm
Remember, a BMI is not an object file! Functionally, it is the equivalent of a PCH file or a plain old
header file. It contains the high-level information necessary to use our void hello()
;
it does not contain the actual x86-64 instructions that implement void hello()
.
UPDATE: If you wrote import <cstdio>
in your .cpp file, you’ll have to add two extra switches
to your compiler command line, so that the compiler knows (A) that <cstdio>
corresponds to a header unit,
and (B) how to find or create the corresponding module.
clang++ -std=c++2a -fimplicit-modules -fimplicit-module-maps -c helloworld.cpp -Xclang -emit-module-interface -o helloworld.pcm
Step 4: Compile *.cpp
into a.out
Our final step is to compile all our .cpp
files and link them together.
clang++ -std=c++2a -fprebuilt-module-path=. main.cpp helloworld.cpp
We must pass -fprebuilt-module-path=.
so that, when the compiler sees import helloworld
, it’ll know
where to look for helloworld.pcm
. This is exactly analogous to how we would have had to pass -I.
so that when the compiler saw #include "helloworld.h"
it would know where to look for helloworld.h
.
The convention for the name of this compiler driver option is not yet standardized; I predict that
eventually Clang will pick something shorter and more -I
-like.
(Unfortunately, -M
is taken.)
Again, if you wrote import <cstdio>
in your helloworld.cpp, you’ll have to add
-fimplicit-modules -fimplicit-module-maps
to this command line.
And now we have an executable that we can run!
./a.out
Hello world!
Notice that we collapsed a few steps into one, in that last command line there. We could have written it all out longhand like this:
clang++ -std=c++2a -fimplicit-module-maps -fprebuilt-module-path=. main.cpp -c
clang++ -std=c++2a -fimplicit-modules -fimplicit-module-maps helloworld.cpp -c
clang++ -std=c++2a main.o helloworld.o
We didn’t need to pass -fprebuilt-module-path
when compiling helloworld.cpp
, because helloworld.cpp
doesn’t import
any BMIs of ours and therefore doesn’t need to know the search path for them.
We didn’t need to pass -fimplicit-modules
when compiling main.cpp
, because main.cpp
doesn’t use the import <...>
or import "..."
syntax to import any header units. On the other
hand, we still have to pass -fimplicit-module-maps
because main.cpp
imports helloworld
which transitively depends on a header unit, so we still have to tell the compiler how to find
that transitively-included module. I suspect I don’t fully understand the details here.
It is mildly surprising to me that we need to build
main.cpp
with different command-line arguments depending on the “implementation details” ofhelloworld.pcm
. However, notice that this is not terribly different from how transitive dependencies with#include
work right now. If yourfoo.cpp
includes<SDL.h>
which transitively includes<iconv.h>
, then you’d better compilefoo.cpp
with-I
options that help the compiler find both of those headers, whether you know about the transitive dependency or not.
Arguably there ought to be some way to combine Step 3 (precompiling helloworld.cpp
into helloworld.pcm
)
with Step 4 (compiling helloworld.cpp
into helloworld.o
) in just a single compiler invocation with
two different output files.
I’m not aware that Clang supports any such feature at the moment, but it might be coming soon
and/or it might already exist and I just don’t know about it.