Field-testing P2266 “Simpler Implicit Move”
A few months ago, Matheus Izvekov implemented my WG21 proposal
P2266 “Simpler Implicit Move”
in Clang. As of Clang 13, it’s enabled by default under -std=c++2b
.
We’ve received some feedback from the field already (leading up to the Clang 13 release). Here are the punch lines as far as I can recall. These examples will all appear in P2266R2, whenever I get around to submitting it.
Many thanks to Matheus Izvekov for the implementation, and to Stephan Bergmann for reporting these issues!
Microsoft’s std::getline
Symptom: Code fails to compile.
Fix: return static_cast<istream&>(x);
Since C++11, the STL has had two overloads of std::getline
.
With some cruft simplified, they are:
std::istream& getline(std::istream&, std::string&, char);
std::istream& getline(std::istream&&, std::string&, char);
I suppose the point of the second overload is to allow the programmer to write e.g.
std::string line;
std::getline(std::ifstream("MobyDick.txt"), line, '.');
assert(line == "Call me Ishmael");
The Microsoft STL until recently implemented these overloads as follows:
std::istream& getline(std::istream&& in, std::string& line, char delim) {
// ...
return in; // X
}
std::istream& getline(std::istream& in, std::string& line, char delim) {
return std::getline(std::move(in), line, delim);
}
That is, they made the rvalue version the primary implementation, and made the lvalue version simply delegate to the rvalue version! (This code reminds me of the old joke about the mathematician and the room that is not on fire.)
Prior to P2266 (that is, in standard C++20), line X
works fine, because although
in
names an implicitly movable entity
([class.copy.elision]/3),
there is no copy-initialization context here.
in
(being a named variable)
is an lvalue, and happily binds to the lvalue std::istream&
being returned.
After P2266, line X
fails to compile, because in
(being the move-eligible
id-expression operand of a return
statement) is an xvalue, and refuses to bind to an lvalue std::istream&
.
The simplest fix IMHO would have been to swap the lvalue and rvalue versions:
std::istream& getline(std::istream& in, std::string& line, char delim) {
// ...
return in; // now OK
}
std::istream& getline(std::istream&& in, std::string& line, char delim) {
return std::getline(in, line, delim);
}
The alternative fix actually adopted by Microsoft is to insert a cast, so that the returned expression is no longer an id-expression at all:
std::istream& getline(std::istream&& in, std::string& line, char delim) {
// ...
return static_cast<std::istream&>(in);
}
Thanks to the STL team (and specifically the other STL) for already landing the fix.
LibreOffice OString
Symptom: Code fails to compile.
Fix: return OString(x);
Stephan Bergmann reported this breakage
in LibreOffice’s OString
class.
Their OString
has a very weird (and IMHO very ill-advised)
constructor overload set… but it also exposed a surprising aspect of P2266 I hadn’t
thought about before! Essentially, LibreOffice OString
tries to avoid taking the strlen
of
string literals:
char nc[10];
strcpy(nc, "foobar");
const char cc[] = "foobar";
OString nco = nc; // construct from char[10]
OString cco = cc; // construct from const char[7]
OString slo = "foobar"; // construct from const char[7]
LibreOffice wants cco
and slo
to use a different constructor from nco
. The constructor
taking const char[7]
can assume that the length of the new string being allocated is 6
;
the one taking char[10]
must do a strlen
to find out the actual length. (We are willing to
pretend that literals like "foo\0bar"
don’t exist; but we cannot pretend that strcpy
doesn’t
exist.) So the overload set looks basically like this (Godbolt):
template<class T>
concept IsCharPtr = std::same_as<T, char*> || std::same_as<T, const char*>;
template<class T> constexpr bool IsNonConstCharArray = false;
template<int N> constexpr bool IsNonConstCharArray<char[N]> = true;
template<> constexpr bool IsNonConstCharArray<char[]> = true;
template<class T> constexpr bool IsConstCharArray = false;
template<int N> constexpr bool IsConstCharArray<const char[N]> = true;
template<> constexpr bool IsConstCharArray<const char[]> = true;
class OString {
std::string s_;
public:
template<class T> requires IsCharPtr<T>
OString(const T& ptr) : s_(ptr, strlen(ptr)) {}
template<class T> requires IsNonConstCharArray<T>
OString(T& arr) : s_(arr, strlen(arr)) {}
template<class T> requires IsConstCharArray<T>
OString(T& arr) : s_(arr, sizeof(arr) - 1) {}
};
Here’s the problem (Godbolt):
OString problem()
{
char nc[10];
strcpy(nc, "foobar");
return nc; // OK in C++20, ERROR in C++2b
}
Here nc
is an implicitly movable entity, and the operand of the
return statement is a simple id-expression, so implicit move kicks in.
In C++20, we do two overload resolutions: first treating nc
as an xvalue
(which finds no viable candidates) and second treating nc
as an lvalue
(which successfully finds the IsNonConstCharArray
candidate).
In C++2b with P2266, we see that nc
is a move-eligible id-expression
and treat it simply as an xvalue. P2266’s single overload resolution finds
no way of converting an xvalue of type char (&&)[10]
into an OString
,
so the code fails to compile.
Before now, I’d never really thought about rvalue arrays. We don’t see
array xvalues in everyday life, at least not in C++20. But combining this
C++2b proposal with OString
’s very weird overload set makes them suddenly
appear and demand our attention.
The fix adopted by LibreOffice
was simply to insert explicit conversions to OString
everywhere that the
compiler complained about:
OString no_more_problem()
{
char nc[10];
strcpy(nc, "foobar");
return OString(nc); // OK in C++20 and C++2b
}
This code has the same behavior in all versions of C++. It’s arguably clearer for the reader, as well.
LibreOffice o3tl::temporary()
Symptom: Code fails to compile.
Fix: return static_cast<T&>(x);
The function o3tl::temporary
is documented to “cast an rvalue to an lvalue” — it’s basically the opposite of
std::move
. It creates a “temporary lvalue” for the purposes of passing to a function
taking out-parameters by lvalue reference and/or pointer, where the caller doesn’t
actually care about the results. For example, std::modf
:
double fractional_part(double x) {
return std::modf(x, &o3tl::temporary(0.0));
}
In C++11 through C++20, the implementation is simplicity itself:
template<class T> T& temporary(T&& x) { return x; }
Remember, x
is a variable with a name, so it’s an lvalue. C++20’s implicit move doesn’t kick in
here, because we’re returning a reference type. So an lvalue T&
happily binds to x
— we can “launder”
our rvalue into an lvalue without any special effort on our part.
In EWG discussion of P2266R1, the effortlessness of “laundering” rvalue into lvalues was generally
regarded as a bad thing. P2266 makes it a bit more effortful. The above code won’t compile
after P2266, because x
, being a move-eligible id-expression, will be treated as an xvalue, and
that lvalue T&
won’t bind to it.
The fix adopted by LibreOffice is simply to insert the explicit cast:
template<class T> T& temporary(T&& x) { return static_cast<T&>(x); }
Again, this makes the code clearer, and is backward-compatible across all versions of C++.
These are all the breakages that have been reported due to the new implicit-move rules
in Clang 13’s -std=c++2b
mode. If you know of other breakages in real-world codebases,
please tell me about them! (For example, send me an email.)
Note that it’s easy to contrive examples of code that works in C++20 but fails in -std=c++2b
mode (or vice versa). You can even find such examples ready-made inside C++ compiler test suites.
I’m interested only in real live examples from production code.