Hot takes on flat_map, resize_default_init, and ranges::to
The pre-Kona WG21 mailing has been out for a little while. Tonight I looked at a few papers; here are my hot takes.
P0429R6 “A Standard flat_map”:
I honestly don’t think flat_map is well-baked enough to put into the Standard Library; but at least it’s been in Boost for a while (if I understand correctly) (EDIT: I do not), and it fits into the STL’s scheme of “container adaptors” pretty naturally. The order of the template parameters is awkward but probably correct.
Some minor typos, such as Allocator for Alloc in some of the text.
P0429 does innovate in one area: it has a bunch of constructors taking initializer_list<value_type>&&. This doesn’t do what it sounds like. Consider:
FlatSet(std::initializer_list<T>&& il) {
    for (auto&& elt : il) {
        keys_.emplace_back(std::move(elt));
    }
}
You might think that this “moves-out-of” the elements of the initializer list. That would be efficient, but also horribly incorrect if it actually happened!
for (int i=0; i < 10; ++i) {
    FlatSet<std::string> fs { "abc", "def", "ghi" };
    // ...
}
If “moving-out-of” initializer_list elements were permitted, then on the second time through this loop, all the string elements of the initializer_list would be in an already-moved-out-of state, and bad stuff would happen. (Yes, initializer_lists are more or less static. The actual semantics are a huge mess. See Jason Turner’s excellent C++Now 2018 talk “initializer_lists Are Broken, Let’s Fix Them” for the horrendously gory details.)
Fortunately, initializer_list prevents the naïve programmer from moving-out-of its elements, by forcing its nested reference type to be const T&, not T&.
So in our loop in the constructor above,
keys_.emplace_back(std::move(elt));
elt has type const std::string&, which means std::move(elt) has type const std::string&&, which means that emplace_back will end up perfectly forwarding to string’s copy constructor, not its move constructor. Result: taking an initializer_list by non-const rvalue reference is functionally equivalent to taking initializer_list by const lvalue reference, or indeed taking it by value. (“By value” is the single appropriate way to take initializer_list.)
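To make the copy-not-move claim concrete, here is a minimal sketch, assuming C++17. Tracer and FlatSetSketch are hypothetical names invented for this illustration (they are not from P0429); Tracer just counts which constructor gets called.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: counts copy- and move-constructions.
struct Tracer {
    static inline int copies = 0;
    static inline int moves = 0;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};

// Hypothetical stand-in for the FlatSet constructor discussed above.
struct FlatSetSketch {
    std::vector<Tracer> keys_;

    FlatSetSketch(std::initializer_list<Tracer>&& il) {
        keys_.reserve(il.size());  // avoid reallocation, which would add moves
        for (auto&& elt : il) {
            // The elements are const no matter how the list itself was passed.
            static_assert(std::is_same_v<decltype(elt), const Tracer&>);
            static_assert(std::is_same_v<decltype(std::move(elt)), const Tracer&&>);
            keys_.emplace_back(std::move(elt));  // selects the copy constructor
        }
    }
};

// Returns {copies, moves} observed while building a three-element set.
inline std::pair<int, int> run_tracer_demo() {
    Tracer::copies = 0;
    Tracer::moves = 0;
    FlatSetSketch fs{Tracer{}, Tracer{}, Tracer{}};
    (void)fs;
    return {Tracer::copies, Tracer::moves};
}
```

Under those assumptions, run_tracer_demo() should report three copies and zero moves: the std::move in the loop changes nothing.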
The only thing you accomplish by taking initializer_list by rvalue reference is that you break everyone who thinks they can “perfect-forward” initializer_list by value as a special case. For example, std::make_optional.
auto opt = std::make_optional<FM>(
    // this finds the initializer_list<U> overload of make_optional
    {
        FM::value_type{"1", "abc"},
        FM::value_type{"2", "abc"},
    }
);
With FM(initializer_list<value_type>), no problem. But with FM(initializer_list<value_type>&&) you get a cryptic error from deep in the bowels of make_optional.
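A hedged sketch of why this breaks, assuming C++17: the initializer_list overload of make_optional hands its (named, therefore lvalue) list on to the optional's constructor, which needs T to be constructible from initializer_list<U>&. ByValue and ByRvalueRef below are hypothetical stand-ins for the two FM designs, not types from the paper.

```cpp
#include <cassert>
#include <initializer_list>
#include <optional>
#include <type_traits>

// Hypothetical stand-ins for the two constructor signatures.
struct ByValue {
    ByValue(std::initializer_list<int>) {}
};
struct ByRvalueRef {
    ByRvalueRef(std::initializer_list<int>&&) {}
};

// An lvalue initializer_list can initialize a by-value parameter...
static_assert(std::is_constructible_v<ByValue, std::initializer_list<int>&>);
// ...but cannot bind to a non-const rvalue reference.
static_assert(!std::is_constructible_v<ByRvalueRef, std::initializer_list<int>&>);

// The by-value design works with make_optional's initializer_list overload.
inline bool by_value_works() {
    auto opt = std::make_optional<ByValue>({1, 2, 3});
    return opt.has_value();
}
```

The analogous call with ByRvalueRef is deliberately left out, since the point is that it does not compile.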
Always pass initializer_list by value.
P1072R3 “basic_string::resize_default_init”:
Ship it! I’ve watched this one for a while, and I think this is as good as it’s going to get.
The library folks have already approved its sister proposal P1020 for make_shared_default_init<int[]>(100) and make_unique_default_init<int[]>(100). Giving the same powers to std::string just makes sense.
P1072 specifically does not propose to add the resize_default_init method to vector (or deque or list or forward_list, all of which have resize methods; did you know?), because it’s unclear whether a general facility would be useful, whereas anyone who works with parsing of character data (e.g. receiving packets from a network socket) knows the use-case for string::resize_default_init.
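For readers who haven't hit that use-case, here is a minimal sketch of the status-quo pattern, assuming C++17 (for the non-const string::data()). The names fake_recv and read_packet are hypothetical; fake_recv merely stands in for a real socket read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a socket read: writes up to `cap` bytes into `dst`
// and returns how many bytes it wrote.
inline std::size_t fake_recv(char* dst, std::size_t cap) {
    static const char payload[] = "GET / HTTP/1.1";
    std::size_t n = sizeof(payload) - 1;
    if (n > cap) n = cap;
    std::memcpy(dst, payload, n);
    return n;
}

inline std::string read_packet(std::size_t cap) {
    std::string buf;
    buf.resize(cap);                             // zero-fills `cap` bytes...
    std::size_t n = fake_recv(buf.data(), cap);  // ...which are overwritten right away
    buf.resize(n);                               // shrink to the bytes actually received
    return buf;
}
```

The first resize is the wasted work: every byte is written once with zeros and then again with real data. As I read the paper, resize_default_init is meant to replace that first call and skip the zero-fill.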
It would be even awesomer to be able to malloc a buffer, put some data in it, and then hand that buffer over to be managed by a std::string (or even a basic_string with a custom allocator). However, that rabbit hole is very deep and twisty, and I’m glad the authors of P1072 backed away from it. But read on!
P1206R1 “ranges::to: A function to convert any range to a container”:
to is such a great userspace identifier! Please don’t drop it into the black hole of ADL!
The one interesting (and perhaps redeeming?) quality of this proposal is that it is proposing to give a standard meaning to
std::string str = "hello world";
std::vector<char> vec = std::move(str) | ranges::to<std::vector>();
From a human point of view, we can see that a quality implementation should implement that as a couple of pointer swaps, and ta-da, we have the “awesomer” buffer-ownership-transfer primitive described above! At least for transferring ownership between containers that are both provided by the same library (and possibly it has to be the same library that provides the implementation of ranges::to, I’m not sure).
However, Section 10 of P1206R1 implies that this facility will always copy — it is proposed
to take its input range always by const lvalue reference — so, no move semantics for you!
Also, the people who are most interested in speedily converting vectors to strings or vice versa are probably the same people who are least interested in pulling in all of <ranges> (with its 8-second compile time) just to get the facility.
For people who like their overload resolution faster than their ping time, note that P1206 proposes to be usable without operator| as well:
std::vector<char> vec = ranges::to<std::vector>(std::move(str));
Also, notice that the <std::vector> in the angle brackets is (A) not a typo and (B) nothing to do with CTAD. The idea is that the library implementor would write two different overloaded templates, both named to:
namespace nonstd {
template<class Range>
using range_value_t = iter_value_t<iterator_t<Range>>;
} // namespace nonstd
template<class ContainerType, class Range, class... Args>
ContainerType
to(Range, Args&&...);
template<template<class...> class ContainerTemplate, class Range, class... Args>
ContainerTemplate<nonstd::range_value_t<Range>>
to(Range, Args&&...);
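To see the two-overload trick actually select the right template, here is a stripped-down sketch, assuming C++17. Everything in namespace sketch is hypothetical and far simpler than what P1206 specifies: no extra constructor arguments, no operator|, and it always copies.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace, not the paper's

// Element type of a range, found via its const begin() iterator.
template<class Range>
using range_value_t = typename std::iterator_traits<
    decltype(std::begin(std::declval<const Range&>()))>::value_type;

// Overload 1: the caller names a complete type, e.g. to<std::list<int>>(r).
template<class ContainerType, class Range>
ContainerType to(const Range& r) {
    return ContainerType(std::begin(r), std::end(r));
}

// Overload 2: the caller names only a template, e.g. to<std::vector>(r);
// the element type is taken from the range.
template<template<class...> class ContainerTemplate, class Range>
ContainerTemplate<range_value_t<Range>> to(const Range& r) {
    return ContainerTemplate<range_value_t<Range>>(std::begin(r), std::end(r));
}

}  // namespace sketch

inline std::vector<char> to_demo_template() {
    std::string str = "hello world";
    auto vec = sketch::to<std::vector>(str);  // picks overload 2
    static_assert(std::is_same_v<decltype(vec), std::vector<char>>);
    return vec;
}

inline std::list<int> to_demo_type() {
    std::string str = "hi";
    return sketch::to<std::list<int>>(str);   // picks overload 1
}
```

Each call site is viable for exactly one overload, because an explicit template argument that is a type cannot match a template template parameter, and vice versa. No CTAD is involved.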
C++ continues to not-have (and likely will forever not-have) a way to encode “any arbitrary template with any arbitrary parameters.” So if you were hoping for
auto foo(std::span<int, 5> span)
{
    return span | ranges::to<std::array>;
}
then nope, that’s still Right Out.
Overall, I think P1206 probably should be accepted as part of Ranges. I like the paper’s “view materialization idiom.” I don’t like Ranges, or stomping on the name to, or clever metaprogramming shenanigans designed merely to further confuse the CTAD flock; but I think if we’re going to get Ranges then we should also get “view materialization” something like P1206.