std::format from scratch, part 1
This is the first post in a three-part series showing how to
make a simple class type formattable with C++20 std::format
(and, incidentally, the C++98 iostreams way as well).
These posts assume that you’re already vaguely familiar with the
basic way to use std::format — e.g. that std::format("{:.1f}{}x", 3.14, "abc")
returns a std::string with value "3.1abcx".
The series consists of the following posts:
Here’s the type that we’d like to print out. It’s just a thin wrapper around a vector of strings.
class Widget {
public:
    explicit Widget(std::vector<std::string> n) :
        names_(std::move(n)) { }
private:
    std::vector<std::string> names_;
};
C++98 iostreams version
To make this type C++98–streamable, we’d do this (Godbolt):
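#include <iomanip>   // std::quoted
#include <iostream>
#include <string>
#include <utility>   // std::exchange, std::move
#include <vector>

class Widget {
public:
    explicit Widget(std::vector<std::string> n) :
        names_(std::move(n)) { }
    void print(std::ostream& os) const {
        // Print the opener unconditionally, so an empty Widget still
        // comes out as "Widget({})" rather than just "})".
        os << "Widget({";
        const char *delim = "";   // empty before the first name, ", " afterward
        for (const auto& name : names_) {
            os << std::exchange(delim, ", ") << std::quoted(name);
        }
        os << "})";
    }
    friend std::ostream& operator<<(std::ostream& os, const Widget& w) {
        w.print(os);
        return os;
    }
private:
    std::vector<std::string> names_;
};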
We first wrote a member function Widget::print that does all the heavy lifting.
Then we wrote a hidden-friend operator<< to act as the “glue,” the interface,
between our implementation in Widget::print and callers like this in main:
int main() {
    Widget w({"Håvard", "Howard", "Harold"});
    std::cout << w << '\n';
}
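Because operator<< accepts any std::ostream, the same glue works with a std::ostringstream, which is a convenient way to check the exact output. Here’s a minimal sketch; to_string_via_stream is a hypothetical helper of mine, not part of Widget’s API, and the class is repeated (with the opener printed before the loop) only so that the snippet is self-contained.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

class Widget {
public:
    explicit Widget(std::vector<std::string> n) :
        names_(std::move(n)) { }
    void print(std::ostream& os) const {
        os << "Widget({";
        const char *delim = "";
        for (const auto& name : names_) {
            os << std::exchange(delim, ", ") << std::quoted(name);
        }
        os << "})";
    }
    friend std::ostream& operator<<(std::ostream& os, const Widget& w) {
        w.print(os);
        return os;
    }
private:
    std::vector<std::string> names_;
};

// Hypothetical helper: render a Widget through operator<< into a string.
std::string to_string_via_stream(const Widget& w) {
    std::ostringstream oss;
    oss << w;
    return oss.str();
}
```

Note that std::quoted backslash-escapes any double-quote or backslash inside a name, so a name like say "hi" is printed as "say \"hi\"".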
There’s nothing terribly wrong with omitting Widget::print and just putting
the implementation straight into operator<<. (It’s a friend, so it gets access
to Widget’s private members already.) But I like to make the whole API consist
of member functions where possible, with a minimum of non-member “glue” as needed.
If you’ve taken my training courses, you know we do the same thing with member .swap
(called from hidden-friend swap) and member .hash (called from std::hash<T>).
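For readers who haven’t seen that pattern, here’s roughly what it might look like for Widget. This is only a sketch under my own assumptions: the hash-combining formula is an arbitrary illustrative choice, and size() is a hypothetical accessor added purely so that the effect of swap is observable.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Widget {
public:
    explicit Widget(std::vector<std::string> n) :
        names_(std::move(n)) { }

    // The member function does the real work...
    void swap(Widget& rhs) noexcept {
        names_.swap(rhs.names_);
    }
    // ...and the hidden friend is one line of glue, found by ADL.
    friend void swap(Widget& a, Widget& b) noexcept {
        a.swap(b);
    }

    // Same idea for hashing. The combining step is arbitrary (illustration only).
    std::size_t hash() const {
        std::size_t h = 0;
        for (const auto& name : names_) {
            h = h * 31 + std::hash<std::string>()(name);
        }
        return h;
    }

    // Hypothetical accessor, here only so the example can be checked.
    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};

// The std::hash specialization is pure glue around Widget::hash.
template<>
struct std::hash<Widget> {
    std::size_t operator()(const Widget& w) const {
        return w.hash();
    }
};
```

In both cases the non-member piece contains no logic of its own; it just forwards to the member.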
C++20 std::format version
To make Widget C++20–formattable, we’ll do this (Godbolt):
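#include <format>
#include <string>
#include <utility>   // std::exchange, std::move
#include <vector>

class Widget {
public:
    explicit Widget(std::vector<std::string> n) :
        names_(std::move(n)) { }
    template<class It>
    It format_to(It out) const {
        // Write the opener unconditionally, so an empty Widget still
        // comes out as "Widget({})" rather than just "})".
        out = std::format_to(out, "Widget({{");   // "{{" is an escaped "{"
        const char *delim = "";   // empty before the first name, ", " afterward
        for (const auto& name : names_) {
            out = std::format_to(out, "{}\"{}\"", std::exchange(delim, ", "), name);
        }
        return std::format_to(out, "}})"); // an escaped "})"
    }
private:
    std::vector<std::string> names_;
};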
Again we’ve implemented the heavy lifting in a member function, this time named
format_to. It takes an arbitrary output iterator,[1]
writes characters into it, and returns the same iterator. We’re writing those characters
with std::format_to, but we could just as well have pushed characters into the
output iterator manually, like this:
    *out++ = '}';
    *out++ = ')';
    return out;
So we can write a Widget directly to std::cout like this:
    w.format_to(std::ostream_iterator<char>(std::cout));
    std::cout << '\n';
But we still need some glue code between Widget::format_to and the rest of the world.
In C++20, that glue code is a specialization of the std::formatter template.
When we std::format anything, the library will construct one std::formatter object
per format specifier. Each std::formatter will be asked to .parse its corresponding
format specifier, and then .format the matching argument. So we need to implement
both of those methods.
The .parse method doesn’t really need to do anything, yet, because the only specifier
we’ll ever ask it to parse is "{}". It just needs to scan and consume characters until
it hits the format-specifier-terminating } character or decides to report a parsing error.
The conventional way to report an error would be to throw std::format_error (as we’ll
do in Part 2), but here I just used assert.
The .format method simply needs to call Widget::format_to on the output iterator
passed to it. The output iterator is actually bundled up with some other things inside
a std::format_context,
and we have to call ctx.out() to get at it. What’s more, we can’t just accept
std::format_context& ctx; in order to work with std::format_to, we need to accept
ctx arguments of all different types. So std::formatter<Widget>::format must be a
member function template.
(.parse can be templated on the type of its ctx too — if you need to handle
wchar_t format strings, for example — but that’s not typically needed, as far as I know.)
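#include <cassert>
#include <format>

template<>
struct std::formatter<Widget> {
    // Accept only the empty specifier: ctx.begin() must already be at the
    // closing '}' (or at the end of the format string).
    constexpr auto parse(const std::format_parse_context& ctx) {
        auto it = ctx.begin();
        assert(it == ctx.end() || *it == '}');
        return it;
    }
    // Templated on the context type; forwards to the member function.
    template<class FormatContext>
    auto format(const Widget& rhs, FormatContext& ctx) const {
        return rhs.format_to(ctx.out());
    }
};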
Finally, we can test Widget’s newfound std::format-ability with the following main:
int main() {
    Widget w({"Håvard", "Howard", "Harold"});
    std::cout << std::format("{}\n", w);
}
Note: I wrote template<class It> for the template-head of
Widget::format_to. I certainly could have written template<std::output_iterator<char> It>
instead; but that would just be more typing. Also, if I did that, I’d probably feel like
I ought to account for the fact that C++20 output iterators can be move-only, which means
I should really have written
    template<std::output_iterator<char> It>
    It format_to(It out) const {
        // Opener written unconditionally, so an empty Widget prints "Widget({})".
        out = std::format_to(std::move(out), "Widget({{");
        const char *delim = "";
        for (const auto& name : names_) {
            out = std::format_to(std::move(out),
                "{}\"{}\"", std::exchange(delim, ", "), name);
        }
        return std::format_to(std::move(out), "}})");
    }
None of that complication buys me anything in ordinary code. The std::moves will help
only if I ever try to format a Widget into a move-only output iterator; I can’t
immediately think of a situation where that would come up, so in ordinary code I wouldn’t
bother (although in library code I might).
Mark de Wever points out that in C++23, std::format will support
formatting ranges
right out of the box. It’ll even print strings double-quoted (and escaped in a
Python-style way) by default.
So when __cpp_lib_format_ranges >= 202207L, we can just write (Godbolt):
    template<class It>
    It format_to(It out) const {
        // Write to the iterator parameter; no format context exists here.
        return std::format_to(out, "Widget({{{:n}}})", names_);
    }
The doubled {{ and }} represent literal curly braces, in the same way that
a doubled %% represents a literal % in a printf format-specifier.
The :n specifier tells the underlying std::range_formatter<std::string>
to omit the square brackets it would usually print around a comma-separated
list of strings: we don’t want those square brackets because we’re printing
our own curly braces instead.
Speaking of non-default format specifiers: Our .parse method was pretty trivial
so far; but that’s exactly where we’d add code to allow the user-programmer to customize
the formatting of their Widget.
In Part 2, we’ll learn how to
customize Widget’s formatting logic with a non-trivial format specifier.
