Prefer core-language features over library facilities
There are many places in C++ where the working programmer has a choice between doing something via a standard-library facility, or via a similar (often newer) core-language feature. I keep expanding my mental list of these, so let’s write them all down in one place.
First, the caveats: This post’s title is expressed as advice to working programmers,
not language designers. One of the coolest things about C++ is that “C++ doesn’t have
a string type”; things that other languages must do in the core language, C and C++
can and often will do in the library instead. (Another classic example is C’s printf.)
If you’re designing a language, I think it’s a fantastic idea to make a small core language
that’s powerful enough to implement most things in the library instead. The Committee should
indeed prefer to extend the library where possible, rather than keep adding to the core language.
But as a working programmer, you don’t need to worry about what the core language should provide;
you just need to know when a standard-library facility has been obsoleted by expansions to
the core language. Here are some examples.
These examples deliberately include some that will make you say “Duh, everyone knows that one,” and some that will make some people say “Whoa, I disagree there!” However, these are all cases that fall into the same category for me personally: places where the core-language feature should be preferred.
Prefer alignof over std::alignment_of
static_assert(std::alignment_of_v<Widget> == 8); // worse
static_assert(alignof(Widget) == 8); // better
Training-course students regularly ask me why <type_traits> provides std::alignment_of at all, if alignof exists. The answer (as far as I know) is that the C++11 cycle took eight years. So they brought in std::alignment_of from Boost, and then at some later point — maybe years later — they invented the alignof keyword. This made std::alignment_of redundant; but it’s a lot harder to remove something than to add it originally, especially since people were already using std::alignment_of in real “C++0x” code.
The same situation, even worse, applies to…
Prefer alignas over std::aligned_storage_t
std::aligned_storage<sizeof(Widget)> data; // utterly wrong
std::aligned_storage_t<sizeof(Widget)> data; // still kind of wrong
std::aligned_storage_t<sizeof(Widget), alignof(Widget)> data; // correct but bad
alignas(Widget) char data[sizeof(Widget)]; // correct and better
std::aligned_storage was also brought in from Boost during the C++11 cycle, before they invented the alignas keyword. It has many problems: The actual class type aligned_storage<N> doesn’t provide storage; instead, you must be careful to use aligned_storage<N>::type, a.k.a. aligned_storage_t<N>. The template takes a second parameter that’s defaulted and thus easy to forget; if you omit that parameter, you might get storage that’s not aligned. Or worse, overaligned and thus bigger than expected!
The core-language alignas(T)
attribute has none of these problems. It Just Works.
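For instance, the classic use-case for aligned_storage — a data member that is merely storage for a T to be constructed later — translates directly into alignas. Here’s a minimal sketch of that pattern (DelayedInit is a toy name of my own, not a real library type):
#include <new>
#include <utility>

template<class T>
class DelayedInit {
    alignas(T) char storage_[sizeof(T)];  // correctly sized AND aligned for T
    T *ptr_ = nullptr;                    // non-null once an object has been constructed
public:
    DelayedInit() = default;
    DelayedInit(const DelayedInit&) = delete;
    template<class... Args>
    T& emplace(Args&&... args) {          // precondition: no object constructed yet
        ptr_ = ::new ((void*)storage_) T(std::forward<Args>(args)...);
        return *ptr_;
    }
    T& operator*() { return *ptr_; }
    ~DelayedInit() { if (ptr_) ptr_->~T(); }
};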
Prefer char over std::byte
This is a controversial one, but I stick to it dogmatically. std::byte is a library facility that arrived in C++17 specifically as an enumeration type for talking about octets. To use it, you must include <cstddef>. It’s not an integral type (good?), but on the other hand it still supports all the integral bitwise operations such as &, |, <<, which (because it is an enum type instead of an integral type) it must provide via overloaded operators in namespace std. It’s just so heavyweight for what it is… which, again, is a verbose reinvention of the built-in primitive type char.
std::byte data[100]; // worse
char data[100]; // better
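To see the extra ceremony in action, here’s a sketch of my own doing the same bit-twiddling both ways (the char side is shown with unsigned char, just to keep the arithmetic unsigned):
#include <cstddef>

constexpr unsigned byte_version() {
    std::byte b{0x0F};
    b = (b << 4) | std::byte{0x01};       // OK, via the operator overloads in namespace std
    return std::to_integer<unsigned>(b);  // explicit conversion needed to get a number back out
}
constexpr unsigned char_version() {
    unsigned char b = 0x0F;
    b = (b << 4) | 0x01;                  // built-in operators; no header, no overloads
    return b;                             // converts implicitly
}
static_assert(byte_version() == 0xF1 && char_version() == 0xF1);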
Similarly…
Prefer void* over std::byte*
I believe this example is less controversial than the preceding one. Suppose you’re writing a function
to copy some dumb bytes from point A to point B. How should you write its parameter types?
I’ve heard at least one person say “I’ll pass a std::byte*, because my function deals in bytes.” But that requires casting at every call-site! Instead, we should use void*, which is the C++ vocabulary type for “pointer to untyped memory” (i.e., just a bunch of dumb bytes).
int data[10];
// worse
void mycopy(std::byte *dst, const std::byte *src, size_t n);
mycopy((std::byte*)data, (std::byte*)(data+5), 5 * sizeof(int));
// better
void mycopy(void *dst, const void *src, size_t n);
mycopy(data, data+5, 5 * sizeof(int));
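Inside mycopy itself, the void* interface costs nothing: you pick whatever byte type you like for the actual work, or just forward to memcpy. A sketch of how the definition might look:
#include <cstddef>   // std::size_t

void mycopy(void *dst, const void *src, std::size_t n) {
    auto *d = static_cast<unsigned char*>(dst);
    const auto *s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];   // or just forward the whole thing to std::memcpy(dst, src, n)
    }
}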
C++ inherited from (post-K&R) C the notion of void* as a vocabulary type. K&R C didn’t use void; it just passed around char* all over the place (e.g., in K&R1’s char *alloc()). This was rightly seen as a mistake and the C standardization committee quickly invented void* to replace char*. As a bonus, C++ permits any object pointer to implicitly convert to void*, whereas it’s rightly more difficult to create a char* or int* or std::byte*.
See also: “Pointer to raw memory? T*” (2018-06-08).
Prefer lambdas over std::bind
Yet another example of C++11 first taking a C++98-flavored thing (std::bind) from Boost into the library, and later taking another thing into the core language in a way that essentially obsoleted the library facility. However, std::bind didn’t become truly obsolete until C++14 gave us init-capture syntax.
auto f = std::bind(&Widget::serve, this, std::placeholders::_1, 10); // worse
auto f = [this](int timeoutMs) { this->serve(timeoutMs, 10); }; // better
The lambda version is better in so many ways. It doesn’t require including <functional>. It gives us a place to document both the type of timeoutMs and its purpose (via naming). It constrains f to accept only one parameter, whereas the std::bind closure object will happily accept any number of arguments and silently ignore the excess. It gives us a place to specify the return type of f, in case we want it to be different from the return type of serve (or, as in this case, to be void).
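If the “silently ignore the excess” part sounds implausible, here’s a sketch (triple is a made-up function) showing the difference:
#include <functional>

int triple(int x) { return 3 * x; }

void demo() {
    auto b = std::bind(triple, std::placeholders::_1);
    auto l = [](int x) { return triple(x); };
    b(2, "surprise", 4.5);    // compiles; the extra arguments are quietly dropped
    l(2);                     // fine
    // l(2, "surprise", 4.5); // refuses to compile: the lambda's signature is enforced
}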
And std::bind has unintuitive behavior when one closure is nested as an argument to another, whereas lambdas compose straightforwardly.
auto h = std::bind(f, std::bind(g, std::placeholders::_1)); // worse
auto h = [](const auto& x) { return f(g(x)); }; // better
auto j = std::bind(std::plus<>(), std::bind(f, std::placeholders::_1), 1); // worse
auto j = [](const auto& x) { return f(x) + 1; }; // better
Prefer ranged for over std::for_each
std::for_each(v.begin(), v.end(), [](int& x) { ++x; }); // worse
for (int& x : v) { ++x; } // better
But the std::for_each algorithm can actually be the most natural way to express your intent if you’re writing an STL-style algorithm where you actually started with an iterator-pair [first, last). In that case, you don’t have to use the library to break down v.begin(), v.end(); you’d actually have to use the library to build first, last back up into a for-friendly range!
for (auto&& x : std::ranges::subrange(first, last)) { myCallable(x); } // worse
std::for_each(first, last, myCallable); // better
A particularly outlandish example of library misuse is the std::accumulate in Conor Hoekstra’s “Structure and Interpretation of Computer Programs” (CppCon 2020), where he’s basically reinventing a for loop but having to track all of his mutable bits through the elements of a std::pair accumulator he manually created, instead of getting the compiler to do it automatically via local variables on a stack frame.
// worse
auto [biggest, secondbiggest] = std::accumulate(
    v.begin(), v.end(),
    std::make_pair(0, 0),
    [](std::pair<int, int> acc, int x) {
        auto [biggest, secondbiggest] = acc;
        if (x > biggest) return std::make_pair(x, biggest);
        if (x > secondbiggest) return std::make_pair(biggest, x);
        return acc;
    }
);
// better
int biggest = 0;
int secondbiggest = 0;
for (int x : v) {
    if (x > biggest) {
        secondbiggest = std::exchange(biggest, x);
    } else if (x > secondbiggest) {
        secondbiggest = x;
    }
}
See also: “The STL is more than std::accumulate” (2020-12-14).
Prefer struct over std::tuple
There are good reasons to prefer std::variant over union, because std::variant adds type-safety (it throws if you access the wrong alternative at runtime, instead of just having UB). But there is no such reason to prefer std::tuple over struct.
// worse
std::tuple<std::string, IPAddress, double> getHostIP();
auto info = getHostIP();
std::cout << std::get<0>(info);
auto [hostname, addr, ttl] = info;
// better
struct HostInfo {
    std::string hostname;
    IPAddress address;
    double ttlMs; // "time to live"
};
HostInfo getHostIP();
auto info = getHostIP();
std::cout << info.hostname;
auto [hostname, addr, ttl] = info; // still works!
The struct version has many advantages over the std::tuple version: It gives us a strong type, HostInfo, instead of a vague tuple of data. It gives us a single declaration for each member, so we have a place to document its purpose (via naming and/or code comments). It gives us names for each member: info.hostname is less error-prone and more readable than std::get<0>(info).
std::tuple also generates really long mangled names, so switching to named class types might lighten the load on your linker. Compare:
void f(std::tuple<std::string, IPAddress, double>);
// _Z1fSt5tupleIJNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE9IPAddressdEE
void f(HostInfo);
// _Z1f8HostInfo
So why does std::tuple exist at all, you ask? Mainly, variadic templates. You’re allowed to create a std::tuple<Ts...> with a variadic number of elements, but (as of 2022) not allowed to create a struct S { Ts... ts; }.
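For example, here’s the kind of code where std::tuple genuinely earns its keep — the number and types of the stored elements depend on a template parameter pack, so there is no struct you could have written by hand. (delay_call is a sketch of my own, not a standard facility.)
#include <tuple>
#include <utility>

template<class F, class... Args>
auto delay_call(F f, Args... args) {
    return [f, tup = std::make_tuple(std::move(args)...)]() {
        return std::apply(f, tup);   // unpack the pack again at call time
    };
}

int add(int a, int b, int c) { return a + b + c; }
auto later = delay_call(add, 1, 2, 3);   // later() returns 6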
Prefer core-language arrays over std::array
std::array<int, 3> a = {1, 2, 3}; // worse
std::array<int, 3> a = {{1, 2, 3}}; // even worse
int a[] = {1, 2, 3}; // better
See “The array size constant antipattern” (2020-08-06).
The trick with the “even worse” line above is that std::array is technically an aggregate, and the paper standard doesn’t specify what its members actually are. So initializing an array with {1, 2, 3} is guaranteed to work because of brace elision, but initializing it with {{1, 2, 3}} could conceivably (on a hypothetical perverse implementation) fail to compile.
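The usual objection is that std::array “knows its own size”; but a built-in array’s size is just as available, via range-for and std::size. A quick sketch of my own:
#include <iterator>   // std::size

constexpr int a[] = {1, 2, 3};
static_assert(std::size(a) == 3);   // the size is still right there when you need it

int sum() {
    int total = 0;
    for (int x : a) total += x;     // range-for works fine on built-in arrays
    return total;
}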
Prefer C++20 co_yield over manual frame management
In the std::accumulate example above, we were using a temporary “accumulator” to hold data in a library data structure, when it really should have been in local variables on a stack frame (so that those variables could be seen and understood by the compiler’s optimizer; by the debugger; by the IDE; and so on, not to mention by our own coworkers). Another place you sometimes see people managing their own “stack frame” data structures is in coroutines.
For example, here’s a hand-coded generator of Pythagorean triples; see “Four versions of Eric’s Famous Pythagorean Triples Code” (2019-03-06).
struct Triple { int x, y, z; };  // (a simple aggregate, assumed by both versions below)

struct TriplesGenerator {
    int x_ = 1;
    int y_ = 1;
    int z_ = 1;
    int n_;
    explicit TriplesGenerator(int n) : n_(n) {}
    std::optional<Triple> next() {
        if (n_-- == 0) {
            return std::nullopt;
        }
        for (; true; ++z_) {
            for (; x_ < z_; ++x_) {
                for (; y_ < z_; ++y_) {
                    if (x_*x_ + y_*y_ == z_*z_) {
                        return Triple{ x_, y_++, z_ };
                    }
                }
                y_ = x_ + 1;
            }
            x_ = 1;
        }
    }
};

auto tg = TriplesGenerator(5);
while (auto opt = tg.next()) {
    auto [x,y,z] = *opt;
    printf("%d %d %d\n", x, y, z);
}
The programmer of TriplesGenerator is having to manage the lifetimes of x_, y_, z_, and n_ by hand; and even worse, having to reason about the state of those variables each time the next() function is exited and re-entered from the top.
Contrast the above manual code with this C++20 Coroutines code, which uses a std::generator<T> class similar to the one currently slated for C++23.
“But wait, I thought you said library facilities were bad!”
Not at all. I said that some library facilities can be replaced with core-language features. Other library facilities complement core-language features in a good way. The core-language feature we’re going to use in the following code is co_yield, so that we can stop using data members of TriplesGenerator and start using plain old local variables on a stack frame (well, a coroutine frame) that the compiler will manage for us. Godbolt:
auto triples(int n) -> generator<Triple> {
    for (int z = 1; true; ++z) {
        for (int x = 1; x < z; ++x) {
            for (int y = x; y < z; ++y) {
                if (x*x + y*y == z*z) {
                    co_yield { x, y, z };
                    if (--n == 0) co_return;
                }
            }
        }
    }
}

for (auto [x,y,z] : triples(5)) {
    printf("%d %d %d\n", x, y, z);
}
Notice on Godbolt that GCC can optimize the hand-coded TriplesGenerator version down to almost nothing, whereas the co_yield-based version has a lot more machine instructions. On Clang the two versions are more coequal; that’s because Clang does a much worse job optimizing TriplesGenerator, not because it does any better at optimizing co_yield.
If I had to maintain one of these two snippets in a real codebase, long-term (and I didn’t have any qualms about the long-term stability of Coroutines support), I’d definitely prefer to maintain the shorter, simpler co_yield-based version. Let the compiler deal with allocating variables on frames — computers are really good at that! Save the programmer’s brain cells for other things (like dangling-reference bugs).
Prefer ::new (p) T(args...) over std::construct_at
This is another controversial one. There’s no observable difference between a call to the library function std::construct_at and a manual call to ::new (p) Widget, so why should we prefer one over the other? For me, it’s partly (unintuitively?) about readability: I find the line involving Widget(x, y) looks more like a constructor call than the line involving (Widget*)data, x, y. And partly it’s consistent application of the mantra of this blog post: When we can accomplish something either via a shorter syntax known to the compiler or via a longer syntax that requires library support, as a general rule we should prefer the core-language syntax.
alignas(Widget) char data[sizeof(Widget)];
Widget *pw = std::construct_at((Widget*)data, x, y); // worse
Widget *pw = ::new ((void*)data) Widget(x, y); // better
Ditto for std::destroy_at versus p->~T():
std::destroy_at(pw); // worse
pw->~Widget(); // better
(Observe that std::destroy_at can also destroy arrays, starting in C++20; whereas a direct call to ~T() cannot. But if you’re in a situation where T might be an array type, and you actually want to support that, I’m okay with forcing you to break this guideline.)
(C++20 also begins a weird window of historical time where std::construct_at and std::allocator<T>::allocate are required to be constexpr-friendly, while placement-new and a direct call to operator new are not; so constexpr code may need to use the library facilities for now, in the same way that C++0x code might have needed to use std::alignment_of or std::bind before the core language caught up and obsoleted them again. I predict this current window will close again by C++26 at the latest.)
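For the record, here’s a sketch of the kind of constexpr code that, as of C++20, is forced to use the library spellings:
#include <memory>

constexpr int roundtrip() {
    std::allocator<int> a;
    int *p = a.allocate(1);     // constexpr-friendly; a direct call to operator new is not
    std::construct_at(p, 42);   // constexpr-friendly; placement-new is not
    int result = *p;
    std::destroy_at(p);
    a.deallocate(p, 1);
    return result;
}
static_assert(roundtrip() == 42);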
Prefer C++20 requires over std::enable_if
I don’t use C++20 Concepts syntax in my day-to-day programming yet, but if you’re writing
C++20 code, then you should almost certainly prefer requires
-clauses over the more
old-school (and backward-compatible) SFINAE techniques.
template<class T, std::enable_if_t<std::is_polymorphic_v<T>, int> = 0>
void foo(T t); // worse
template<class T>
requires std::is_polymorphic_v<T>
void foo(T t); // better
It’s more readable, and might even compile faster.
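Requires-clauses also compose with overload resolution in a way enable_if never did: the compiler can rank constrained overloads for you. Here’s a sketch of my own (not from any particular codebase):
template<class T>
concept Fooable = requires(T t) { t.foo(); };

template<class T>
void process(T t) { /* generic fallback */ }

template<class T> requires Fooable<T>
void process(T t) { t.foo(); }   // chosen automatically whenever T has a .foo()
With enable_if, you’d have had to make the two overloads mutually exclusive by hand.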
Prefer assignment over .reset
auto p = std::make_unique<Widget>();
p.reset(); // worse
p = nullptr; // better
There are valid use-cases for .reset; for example, when you’re using unique_ptr<T, Deleter> as an implementation detail.
std::unique_ptr<FILE, FileCloser> fp = nullptr;
if (should_open) {
fp.reset(fopen("input.txt", "r"));
}
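(FileCloser here stands for whatever tiny deleter type you’ve written for the job; a sketch might look like this.)
#include <cstdio>

struct FileCloser {
    void operator()(std::FILE *fp) const {
        if (fp != nullptr) std::fclose(fp);
    }
};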
But in general, .reset (and also .get, and even the rarely-used .release) is a code smell that’s worth investigating closely.
Prefer '\n' over std::endl
std::cout << "Hello world" << std::endl; // worse
std::cout << "Hello world\n"; // better
On modern platforms, the standard output is almost always at least line-buffered, so flushing the stream immediately after a newline (as endl does) is redundant.
std::endl (“newline and flush”) can always be reduced to '\n' (“newline and let the OS do its thing, just like you would in any other language”).
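And when you genuinely do want a flush — say, a progress message printed right before a long computation — my suggestion is to say so explicitly rather than smuggle it in via endl:
#include <iostream>

void report_progress() {
    std::cout << "working...\n" << std::flush;   // the flush is now visible and intentional
}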
See this StackOverflow question for various more or less interesting differences between endl and '\n'. One interesting tidbit is that std::endl is a function template! Thus decltype(std::endl) is ill-formed, and std::cout << std::endl works only thanks to the long-despised rule [temp.deduct.funcaddr], the same rule that permits things like
template<class T> T f(T);
int (*p1)(int) = f; // OK
std::ostream& (*p2)(std::ostream&) = f; // OK
If it weren’t for iomanipulators like endl and flush, maybe we wouldn’t need that rule in the core language?
Prefer std::thread(lambda) over std::thread(f, args...)
This is a slight variation on “Prefer lambdas over std::bind.” C++11 std::thread supports a variadic number of arguments; but you should never pass any arguments. Just bundle them into a plain old core-language lambda, so that the compiler can see what you’re doing; this will be cheaper and (more to the point) permit much better error messages.
void worker(int, int&);
auto t = std::thread(
    worker, x, std::ref(y) // worse
);
auto t = std::thread([&, x]() {
    worker(x, y); // better
});
The lambda version avoids using (and instantiating) std::ref for arguments you intended to capture by reference. But the main reason you should always use the lambda version is that if you ever get the argument types wrong, you trade a library error spew like this (Godbolt):
In file included from /usr/include/c++/15.0.0/thread:49:
/usr/include/c++/15.0.0/bits/std_thread.h:156:17: error: static assertion failed
due to requirement '__is_invocable<void (*)(int, int &), int, std::reference_wrapper<float>>::value':
std::thread arguments must be invocable after conversion to rvalues
156 | static_assert( __is_invocable<typename decay<_Callable>::type,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
157 | typename decay<_Args>::type...>::value,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: in instantiation of function template specialization 'std::thread::thread<void (&)(int, int &),
int &, std::reference_wrapper<float>, void>' requested here
11 | auto t = std::thread(worker, x, std::ref(y));
| ^
In file included from /usr/include/c++/15.0.0/thread:49:
/usr/include/c++/15.0.0/bits/std_thread.h:290:31: error: no type named 'type'
in 'std::thread::_Invoker<std::tuple<void (*)(int, int &), int, std::reference_wrapper<float>>>::__result<std::tuple<void (*)(int, int &), int, std::reference_wrapper<float>>>'
290 | typename __result<_Tuple>::type
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
/usr/include/c++/15.0.0/bits/std_thread.h:236:13: note: in instantiation of template class
'std::thread::_Invoker<std::tuple<void (*)(int, int &), int, std::reference_wrapper<float>>>' requested here
236 | _Callable _M_func;
| ^
[...24 more lines...]
for a nice simple compiler error:
In lambda function:
error: cannot bind non-const lvalue reference of type 'int&' to a value of type 'float'
12 | worker(x, y);
| ^
note: initializing argument 2 of 'void worker(int, int&)'
4 | void worker(int x, int& y) {
| ~~~~~^
Finally, some near misses
Regular readers of this blog will know that I advise strongly against CTAD, which means that I strongly prefer the library syntax std::make_pair(x, y) over std::pair(x, y), std::make_reverse_iterator(it) over std::reverse_iterator(it), and so on. Of course if std::pair(x, y) and std::make_pair(x, y) did the same thing, it wouldn’t matter which you used, and you might well prefer the shorter, “corer” syntax. But they don’t do the same thing, and in fact CTAD is so pitfally that I just prefer to ban it outright.
Either CTAD is the exception that proves the rule, or the explanation is that both syntaxes involve standard library facilities: my preferred syntax involves a standard library function, and my disfavored syntax involves a set of deduction guides (some implicitly generated) associated with a standard library type. So we have a choice between two different library facilities, and we should choose the one that’s less likely to lead to a bug.
Also note that in contexts where you get to choose between std::make_pair(x, y) and a simple braced initializer {x, y}, I’d often recommend the latter.
Regular readers of this blog will also know that I tend to prefer T(x) over static_cast<T>(x), despite some caveats. Both of those are core-language syntaxes. But it’s probably no coincidence that I prefer the shorter, less angle-brackety of the two.