What are X-macros?

“X-macros” is a neat C and C++ technique that doesn’t get enough advertisement. Here’s the basic idea:

Suppose we have a table of similar records, each with the same schema. In games, this might be our collection of monster types (each with a display name, a representative icon, a dungeon level, a bitmask of special attack types, etc.). In networking, this might be our collection of error codes (each with an integer value, a message string, etc.).

We could encode that information into a data structure that we traverse at runtime to produce interesting effects — for example, an array of structs that we index into or loop over to answer a question like “What is the error string for this enumerator?” or “How many monsters have dungeon level 3?”.

But the “X-macros” technique is to encode that information in source code, which can be manipulated at compile time. We encode the information generically, without worrying about how it might be “stored” at runtime, because we’re not going to store it — it’s just source code! We encode it something like this:

// in file "errorcodes.h"
X(EPERM,  1, "Operation not permitted")
X(ENOENT, 2, "No such file or directory")
X(ESRCH,  3, "No such process")
X(EINTR,  4, "Interrupted system call")

// in file "monsters.h"
X(dwarf,     'h', 2, ATK_HIT,  0)
X(kobold,    'k', 2, ATK_HIT,  IMM_POISON)
X(elf,       '@', 3, ATK_HIT,  0)
X(centipede, 'c', 3, ATK_BITE, 0)
X(orc,       'o', 4, ATK_HIT,  IMM_POISON)

Now we’ve got a header file that encodes in source code all the data you might want about your error codes, or monsters, or whatever. If you need an enumeration type for your monsters, that’s easy to whip up:

enum Monster {
#define X(name,b,c,d,e) MON_##name,
#include "monsters.h"
#undef X
};

Monster example = MON_centipede;

Instead of array indexing, X-macros push you toward switch as your fundamental building block:

bool is_immune_to_poison(Monster m) {
    switch (m) {
#define X(name,b,c,d,imm) case MON_##name: return (imm == IMM_POISON);
#include "monsters.h"
#undef X
    }
}

Instead of looping over the whole collection (say, from MON_first to MON_last), X-macros push you toward writing straight-line code that unrolls the loop:

int count_monsters_of_level(int target_level) {
    int sum = 0;
#define X(a,b,level,d,e) sum += (level == target_level);
#include "monsters.h"
#undef X
    return sum;
}

Or even this:

int number_of_monster_types() {
    return 0
#define X(a,b,c,d,e) +1
#include "monsters.h"
#undef X
    ;
}

Variations, upsides, downsides

The name “X-macros” comes from the stereotypical name of the macro in question; but of course the macro doesn’t have to be named X. In fact, at least two of the examples below use multiple macros (for different kinds of data records) intermixed in the same file.

A few commenters on this post have shown a variation on this technique, which is also reproduced (more or less) on the relatively low-quality Wikipedia page on X-macros:

// in file "monsters.h"
#pragma once
#define LIST_OF_MONSTERS(X) \
    X(dwarf, 'h', 2, ATK_HIT, 0) \
    X(kobold, 'k', 2, ATK_HIT, IMM_POISON) \
    X(elf, '@', 3, ATK_HIT, 0) \
    X(centipede, 'c', 3, ATK_BITE, 0) \
    X(orc, 'o', 4, ATK_HIT, IMM_POISON)

// in the caller's code
#include "monsters.h"
#define X_IMM_POISON(name,b,c,d,imm) case MON_##name: return (imm == IMM_POISON);
bool is_immune_to_poison(Monster m) {
    switch (m) {
        LIST_OF_MONSTERS(X_IMM_POISON)
    }
}

I suppose one advantage of this variation is that it makes “monsters.h” idempotent. This lets you put the #include directive up at the top of the caller’s file, next to #include <stdio.h> and so on. A big disadvantage (or so I would think) is that by putting the whole list into the macro expansion of LIST_OF_MONSTERS, you’re probably harming the quality of error messages and debug info you’ll get, and may even overwhelm your compiler’s internal limit on the size of a macro expansion. You also have to come up with a name for the macro LIST_OF_MONSTERS, and make sure it never collides with anything in the entire rest of your codebase. (In the original, there are no global names: the name X never leaks outside the immediate context of “monsters.h”). You also have to remember to type all those backslashes in “monsters.h”. Personally, I would avoid this variation.

It occurs to me that this variation is to my preferred variation more or less as a named function is to a lambda-expression.
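For comparison, here's a single-file sketch of the list-macro variation. Each use gets its own globally visible macro name (X_ENUM and X_COUNT_LEVEL are made up for this sketch, alongside X_IMM_POISON), which is where the named-function analogy comes from. The flag values are assumptions too.

```cpp
// Hypothetical flag values, assumed only for this sketch.
enum { ATK_HIT = 1, ATK_BITE = 2 };
enum { IMM_POISON = 1 };

// What "monsters.h" would contain in this variation.
#define LIST_OF_MONSTERS(X) \
    X(dwarf, 'h', 2, ATK_HIT, 0) \
    X(kobold, 'k', 2, ATK_HIT, IMM_POISON) \
    X(elf, '@', 3, ATK_HIT, 0) \
    X(centipede, 'c', 3, ATK_BITE, 0) \
    X(orc, 'o', 4, ATK_HIT, IMM_POISON)

// Each use is a separately named macro that stays defined after its use.
#define X_ENUM(name, b, c, d, e) MON_##name,
#define X_IMM_POISON(name, b, c, d, imm) case MON_##name: return (imm == IMM_POISON);
#define X_COUNT_LEVEL(a, b, level, d, e) sum += (level == target_level);

enum Monster { LIST_OF_MONSTERS(X_ENUM) };

bool is_immune_to_poison(Monster m) {
    switch (m) {
        LIST_OF_MONSTERS(X_IMM_POISON)
    }
    return false;  // not reached for valid enumerators
}

int count_monsters_of_level(int target_level) {
    int sum = 0;
    LIST_OF_MONSTERS(X_COUNT_LEVEL)
    return sum;
}
```

Note that all three helper macros, plus LIST_OF_MONSTERS itself, remain defined for the rest of the translation unit unless someone remembers to #undef them.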

A downside of the technique (either variation) is that each user of “monsters.h” must know the arity of X. If we decide that each monster also needs a boolean flag for intelligence, then not only do we have to change “monsters.h” to set that flag for each monster—

// in file "monsters.h"
X(dwarf,     'h', 2, ATK_HIT,  0,          true)
X(kobold,    'k', 2, ATK_HIT,  IMM_POISON, true)
X(elf,       '@', 3, ATK_HIT,  0,          true)
X(centipede, 'c', 3, ATK_BITE, 0,          false)
X(orc,       'o', 4, ATK_HIT,  IMM_POISON, true)

—but we also have to change every single call-site to add a sixth parameter to the definition of X, even if it’s irrelevant to most callers:

bool is_immune_to_poison(Monster m) {
    switch (m) {
#define X(name,b,c,d,imm,f) case MON_##name: return (imm == IMM_POISON);
#include "monsters.h"
#undef X
    }
}

If this had used a lookup in a runtime data structure, like return (monsters[m].imm == IMM_POISON), then we could have added an “intelligence” field to the monster schema without needing a source-code change here.
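A rough sketch of that runtime alternative, for contrast. The X-macro table is used only to build an enum and an array of structs; the struct MonsterInfo, its field names, the MONSTER_TABLE stand-in for "monsters.h", and the flag values are all assumptions made for illustration.

```cpp
// Hypothetical flag values, assumed only for this sketch.
enum { ATK_HIT = 1, ATK_BITE = 2 };
enum { IMM_POISON = 1 };

// Single-file stand-in for the six-field version of "monsters.h".
#define MONSTER_TABLE(X) \
    X(dwarf,     'h', 2, ATK_HIT,  0,          true) \
    X(kobold,    'k', 2, ATK_HIT,  IMM_POISON, true) \
    X(elf,       '@', 3, ATK_HIT,  0,          true) \
    X(centipede, 'c', 3, ATK_BITE, 0,          false) \
    X(orc,       'o', 4, ATK_HIT,  IMM_POISON, true)

enum Monster {
#define X(name, icon, level, atk, imm, smart) MON_##name,
    MONSTER_TABLE(X)
#undef X
};

struct MonsterInfo {
    const char *name;
    char icon;
    int level;
    int attack;
    int imm;
    bool intelligent;
};

// Only the X definitions that build the enum and this array track the
// number of fields; code that reads the array does not.
static const MonsterInfo monsters[] = {
#define X(name, icon, level, atk, imm, smart) { #name, icon, level, atk, imm, smart },
    MONSTER_TABLE(X)
#undef X
};

// Unchanged when a new field is added to the schema.
bool is_immune_to_poison(Monster m) {
    return monsters[m].imm == IMM_POISON;
}
```

The tradeoff is that the data now lives in an array at runtime, which is exactly what the pure X-macro style avoids.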

Another downside is that if you have a lot — say, thousands — of data records, then X-macros will lead you to write a lot of “unrolled loops” consisting of thousands of C++ statements. The compiler might struggle to deal with these. See for example “The surprisingly high cost of static-lifetime constructors” (2018-06-26).

Examples of X-macros in real code

My (incomplete) port of Luckett & Pike’s Adventure II uses X-macros in “locs.h”, included three times from “adv440.c”. This was a hack to get the game to fit into the Z-machine’s memory, which has very little space for native C data such as arrays of char, but essentially infinite space for text if all you’re doing is printing it out. So I used X-macros here to rewrite a few trivial but space-hogging functions of the form


into tedious-looking, but extremely space-efficient, switch tables of the form

switch (loc) {
    case R_ROAD: puts("You're at the end of the road again."); break;
    case R_HILL: puts("You're at the hill in road."); break;
    case R_HOUSE: puts("You're inside the building."); break;
}
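A switch table of that shape could be generated along these lines. This is only a sketch: the macro LOCATION_TABLE, the two-field record layout, and the function name print_short_description are assumptions for illustration, and the real “locs.h” may well be organized differently.

```cpp
#include <cstdio>

// Hypothetical stand-in for a header such as "locs.h".
#define LOCATION_TABLE(X) \
    X(R_ROAD,  "You're at the end of the road again.") \
    X(R_HILL,  "You're at the hill in road.") \
    X(R_HOUSE, "You're inside the building.")

enum Location {
#define X(name, short_desc) name,
    LOCATION_TABLE(X)
#undef X
};

// Each string appears only as a literal argument to puts, never in a
// named array that the program indexes into.
void print_short_description(Location loc) {
    switch (loc) {
#define X(name, short_desc) case name: std::puts(short_desc); break;
        LOCATION_TABLE(X)
#undef X
    }
}
```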

NetHack uses a variation on X-macros in “artilist.h”; the variation is that “artilist.h” itself checks to see where it’s being included from and will define A appropriately for that includer, instead of making the includer define A themselves.

HyperRogue uses X-macros for its monsters, items, and terrain, in “content.cpp”; you can see some of the ways it’s included from “classes.cpp” and “landlock.cpp”.

This StackOverflow question from 2011 gives some more examples of X-macro usage in the real world.

Posted 2021-02-01