Who uses P2786 and P1144 for trivial relocation?

On Friday, June 28th, there will be a discussion at the C++ Committee meeting about the state of “trivial relocation” in C++. (The meeting is physically in St Louis, but also streamed remotely via Zoom and open to visiting experts per WG21’s Meetings and Participation Policy.) If you’re an expert (read: a library maintainer or developer) with opinions or experience on “trivial relocation” — especially if you have implemented it in your own codebase — I encourage you to read the participation policy and try (virtually) to show up for the discussion!

There are two active proposals for trivial relocation in C++: my own P1144 (based on pre-existing practice in Folly, BSL, and Qt) and Bloomberg’s P2786 (a reaction to P1144). The latter — unfortunately in my view — has raced past P1144 in the committee and shows a real danger of being included in C++26.

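Why “danger”? Because P2786 is not fit for any present-day library’s needs. See the section “What present-day libraries use P2786 and/or P1144” below.

And how “raced”? See the section “History of P1144 and P2786 in WG21” below.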

For a good (non-Arthur-written) summary of “what is trivial relocation anyway,” see Giuseppe D’Angelo’s ongoing series of blog posts for KDAB, collectively titled “Qt and Trivial Relocation.”

What present-day libraries use P2786 and/or P1144

There are several aspects to “trivial relocation” as a proposed C++ feature:

  • The simple user-facing type-trait is_trivially_relocatable_v<T>. P2786 and P1144 both propose this, but with significantly different meanings. P2786 treats “relocatable” as a verb analogous to “move-constructible,” whereas P1144 treats “trivially relocatable” as a holistic property analogous to “trivially copyable.” A P2786-friendly codebase would say that tuple<int&> is trivially relocatable (because it’s trivially move-constructible and trivially destructible), whereas a P1144-friendly codebase would say that tuple<int&> is not trivially relocatable (because it permits a sequence of value-semantic operations preserving the number of live objects of the type, which cannot be emulated by operating solely on the participating objects’ bit-patterns).

  • New library algorithms. P1144 proposes the gamut you’d expect by treating the verb relocate analogously to construct: uninitialized_relocate, uninitialized_relocate_n, uninitialized_relocate_backward, relocate_at, and a prvalue-producing relocate. P2786 proposes only std::trivially_relocate. These, again, are significantly different approaches.

  • Invisible-to-the-user library optimizations that are compatible with either P1144 or P2786; e.g. vector reallocation, type-erased function move-construction and move-assignment, inplace_vector move-construction.

  • Invisible-to-the-user library optimizations that are compatible only with P1144; e.g. vector insert and erase, swap, rotate, inplace_vector move-assignment.

  • “Warrant” syntax. P1144 proposes a [[trivially_relocatable]] attribute; P2786 proposes a contextual keyword trivially_relocatable. Most libraries can be expected not to use these, because their effect would be limited to compilers that support the attribute (in the former case) or to compilers speculatively supporting a merely proposed C++26 keyword (in the latter case).

For any codebase that concerns itself at all with “trivial relocation,” we can look at how it deals with each of these four dimensions: Does it define a type-trait matching P1144’s holistic semantics or P2786’s single-operation semantics (or neither)? Does it provide some or all of P1144’s proposed library algorithms, or P2786’s single algorithm (or neither)? Does it implement optimizations compatible only with P1144, optimizations compatible with both proposals, or no optimizations at all?
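The tuple<int&> disagreement above can be made concrete with a minimal sketch. RefHolder is a hypothetical stand-in for tuple<int&> (it is not taken from either paper or from any library surveyed here): its move-construction and destruction are trivial, but its assignment writes through the reference, so assigning is not the same as copying bytes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for tuple<int&>: assignment writes *through* the
// reference instead of rebinding it.
struct RefHolder {
    int& ref;
    explicit RefHolder(int& r) : ref(r) {}
    RefHolder(const RefHolder&) = default;
    RefHolder& operator=(const RefHolder& other) {
        ref = other.ref;  // assigns to the referred-to int
        return *this;
    }
};

// "Move-construct, then destroy" is trivial for this type...
static_assert(std::is_trivially_move_constructible_v<RefHolder>);
static_assert(std::is_trivially_destructible_v<RefHolder>);
// ...but assignment is not, so the type as a whole is not trivially copyable.
static_assert(!std::is_trivially_move_assignable_v<RefHolder>);
static_assert(!std::is_trivially_copyable_v<RefHolder>);

// Returns true if assignment changed the referent but left the binding alone,
// which is exactly what a bytewise copy of the object would NOT do.
bool assignment_writes_through() {
    int a = 1, b = 2;
    RefHolder x(a), y(b);
    x = y;
    return a == 2 && &x.ref == &a;
}
```

A trait that looks only at the first two static_asserts would call this type trivially relocatable; a trait with the holistic meaning would not, because an optimization that replaces assignment with memcpy would change the program's behavior.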

Further, it’s only fair to look at four other relevant properties of a library:

  • Has it ever taken relevant input from Arthur O’Dwyer; from Mungo and Alisdair; both; or neither? (And if so: Was the input in the form of code-review feedback, or actual code?)

  • Does the library rely on P1144’s proposed feature-test macros __cpp_impl_trivially_relocatable and/or __cpp_lib_trivially_relocatable, i.e., can you compile it with Arthur’s fork of Clang/libc++ and get the optimizations today? Alternatively, does it rely on P2786’s proposed feature-test macro __cpp_trivial_relocatability?

  • Did any of the library’s maintainers sign P3236R1 “Please reject P2786 and adopt P1144” (May 2024)?

  • Did any of the library’s maintainers comment (pro or con) on Clang #84621, which asks Clang to implement P1144 semantics for __is_trivially_relocatable(T)?

I’ll try to update this list of libraries as I find out about new ones.

Abseil

Google Abseil implements absl::is_trivially_relocatable with P1144 semantics: “its object representation doesn’t depend on its address, and also none of its special member functions do anything strange.” Abseil goes out of its way to avoid using Clang’s __is_trivially_relocatable builtin, because of Clang #69394, but also because:

// Clang on all platforms fails to detect that a type with a user-provided
// move-assignment operator is not trivially relocatable. So in fact we
// opt out of Clang altogether, for now.
//
// TODO(b/325479096): Remove the opt-out once Clang's behavior is fixed.

Abseil does use the builtin if P1144’s feature-test macro __cpp_impl_trivially_relocatable is defined. This means that we can use Godbolt to see Abseil getting better codegen on a compiler that supports P1144: this Godbolt shows inlined_vector::erase generating a loop over the assignment operator with Clang trunk, but a straight-line memmove with P1144 Clang.

Abseil’s Swiss tables (e.g. absl::flat_hash_set) optimize rehashing (compatible with both P2786 and P1144); this Godbolt shows the benefit. absl::InlinedVector optimizes erase and swap (compatible only with P1144), but not insert or reserve. The erase and swap optimizations were implemented by Arthur O’Dwyer (#1625, #1618, #1632).
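As a rough illustration of what an erase optimization of this kind looks like, here is a minimal sketch. It is not Abseil's code: sketch_erase_at and sketch_is_trivially_relocatable_v are hypothetical names, and the trait is conservatively defined as is_trivially_copyable so that the memmove is well-defined on any compiler today.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in trait: without compiler support, trivially copyable
// types are the only ones this sketch can safely treat as relocatable.
template <class T>
inline constexpr bool sketch_is_trivially_relocatable_v =
    std::is_trivially_copyable_v<T>;

// Erase data[pos] from an array holding n live objects; returns the new size.
template <class T>
std::size_t sketch_erase_at(T* data, std::size_t n, std::size_t pos) {
    if constexpr (sketch_is_trivially_relocatable_v<T>) {
        // Destroy the victim, then slide the tail down as raw bytes.
        std::destroy_at(data + pos);
        std::memmove(static_cast<void*>(data + pos), data + pos + 1,
                     (n - pos - 1) * sizeof(T));
    } else {
        // Element-by-element move-assignment, then destroy the moved-from tail.
        for (std::size_t i = pos; i + 1 < n; ++i) {
            data[i] = std::move(data[i + 1]);
        }
        std::destroy_at(data + n - 1);
    }
    return n - 1;
}

// Usage on a plain int array (takes the memmove path).
bool demo_erase() {
    int v[4] = {10, 20, 30, 40};
    std::size_t n = sketch_erase_at(v, 4, 1);
    return n == 3 && v[0] == 10 && v[1] == 30 && v[2] == 40;
}
```

The fast path replaces a chain of move-assignments with a byte copy, which is why it is only valid when the trait promises something about assignment and not just about move-construction plus destruction.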

One Abseil maintainer thumbs-upped Clang #84621, but did not leave a comment.

AMC

Amadeus AMC implements amc::is_trivially_relocatable with P1144 semantics. It implements the P1144 library algorithms uninitialized_relocate_n and relocate_at (but not the other three, and not P2786’s trivially_relocate).

amc::SmallVector and amc::Vector both optimize reserve (compatible with both P2786 and P1144), as well as swap (delegating some of the work to std::swap_ranges so as to remain compatible with both P2786 and P1144). They both optimize insert and erase (compatible only with P1144).

AMC’s sole contributor, Stéphane Janel, signed P3236R1. He also commented in favor on Clang #84621.

binutils-gdb

GNU binutils-gdb uses the adjective IsRelocatable, but uses it specifically to mean the same thing as is_trivially_copyable, and only in order to =delete memcpy and memmove for non-trivially-copyable types. I don’t know why they didn’t name their trait IsTriviallyCopyable. binutils-gdb isn’t evidence for any particular semantics.

Blender

Blender implements P1144’s uninitialized_relocate_n, but without any optimization for, or attempt to identify, trivially relocatable types.
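For readers who haven't seen the algorithm, an unoptimized relocate of this kind boils down to "move-construct into the destination, then destroy the source." The sketch below is my own simplification, not Blender's code and not P1144's exact signature: sketch_uninitialized_relocate_n is a hypothetical name, it works on raw pointers only, and it ignores exception safety.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Relocate n objects from [first, first + n) into uninitialized storage at
// dest. Afterwards the source objects are destroyed and must not be touched.
template <class T>
T* sketch_uninitialized_relocate_n(T* first, std::size_t n, T* dest) {
    for (std::size_t i = 0; i < n; ++i) {
        ::new (static_cast<void*>(dest + i)) T(std::move(first[i]));
        std::destroy_at(first + i);
    }
    return dest + n;
}

// Usage: relocate two strings between two raw allocations.
bool demo_relocate() {
    std::allocator<std::string> alloc;
    std::string* src = alloc.allocate(2);
    std::string* dst = alloc.allocate(2);
    ::new (static_cast<void*>(src)) std::string("hello");
    ::new (static_cast<void*>(src + 1)) std::string("world");

    std::string* end = sketch_uninitialized_relocate_n(src, 2, dst);
    bool ok = (end == dst + 2) && dst[0] == "hello" && dst[1] == "world";

    std::destroy_n(dst, 2);  // only the destination objects are alive now
    alloc.deallocate(src, 2);
    alloc.deallocate(dst, 2);
    return ok;
}
```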

Blender has never taken commits or code review from Arthur, Mungo, or Alisdair.

Two Blender maintainers signed P3236R1.

BSL

Bloomberg BSL’s implementation is very old — older than any standards proposal in this area.

BSL implements bslmf::IsBitwiseMoveable with P1144 semantics, i.e. as a synonym for the holistic property is_trivially_copyable, albeit with a (non-P1144-compatible, non-P2786-compatible) special case that assumes all one-byte types must be empty and therefore trivial. Notably, BSL does not set IsBitwiseMoveable for trivially move-constructible, trivially destructible types (such as tuple<int&>).

BSL provides the generic algorithms bslma::ConstructionUtil::destructiveMove(destp, alloc, srcp) (analogous but certainly not identical to P1144’s generic relocate_at(srcp, destp)) and ArrayPrimitives::destructiveMove(d_first, first, last, alloc) (ditto, P1144’s generic uninitialized_relocate(first, last, d_first)).

bsl::vector optimizes reserve (compatible with both P2786 and P1144) by delegating to the generic ArrayPrimitives::destructiveMove. It optimizes insert and erase (compatible only with P1144) likewise.

bsl::deque optimizes insert and erase (compatible only with P1144).

bsl::vector::insert(pos, first, last), when first is an input iterator (so that last - first cannot be computed), will append elements to the end of the vector and then delegate to the generic ArrayPrimitives::rotate, which optimizes (compatible only with P1144). BSL does not provide its own implementations of bsl::rotate, bsl::swap_ranges, etc.; those algorithms are just using’ed from namespace std.

BSL has taken commits from Alisdair Meredith and Mungo Gill, but if any of them were in this particular area, it’s not immediately obvious. It’s never taken commits from Arthur O’Dwyer.

fast_io

The cppfastio/fast_io library implements fast_io::freestanding::is_trivially_relocatable with P1144 semantics.

In eleven places, it tests __has_cpp_attribute(clang::trivially_relocatable) and uses the attribute to warrant types as trivially relocatable if the attribute is supported. This attribute is supported only in Arthur’s P1144 reference implementation. It appears to me that none of these eleven instances strongly depend on P1144’s “sharp-knife” semantics; I think they are equally compatible with P2786’s “dull-knife” semantics.

fast_io::containers::vector<T> optimizes reallocation (compatible with both P2786 and P1144) and erase (compatible only with P1144). However, as of this writing, erase is certainly buggy for most T, regardless of the relocation optimization; so this library is fundamentally not an example of usage experience.

fast_io has taken code review from Arthur O’Dwyer.

Folly

Facebook Folly’s implementation is very old — older than any standards proposal in this area.

Folly implements folly::IsRelocatable with P1144 semantics. As of #2216 (June 2024), Folly will use P1144’s std::is_trivially_relocatable<T> when the feature-test macro __cpp_lib_trivially_relocatable is set. (This code was contributed by Arthur O’Dwyer.)
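The general shape of "use the standard trait when the feature-test macro is set, otherwise fall back to a library-local definition" can be sketched as follows. This is not Folly's code: sketch_is_relocatable and Handle are hypothetical names, the fallback to is_trivially_copyable is my own conservative choice, and the manual opt-in by specialization is just one possible escape hatch.

```cpp
#include <cassert>
#include <type_traits>

#if defined(__cpp_lib_trivially_relocatable)
// The library advertises the trait: defer to it.
template <class T>
struct sketch_is_relocatable : std::is_trivially_relocatable<T> {};
#else
// Portable fallback: only trivially copyable types are known to be safe.
template <class T>
struct sketch_is_relocatable : std::is_trivially_copyable<T> {};
#endif

// Hypothetical owning type whose author knows it can be relocated bytewise
// (it holds no pointers into itself), even though the compiler cannot tell.
struct Handle {
    int* p = nullptr;
    Handle() = default;
    Handle(Handle&& other) noexcept : p(other.p) { other.p = nullptr; }
    ~Handle() { delete p; }
};

// Manual opt-in for this sketch's trait.
template <>
struct sketch_is_relocatable<Handle> : std::true_type {};

static_assert(sketch_is_relocatable<int>::value);
static_assert(sketch_is_relocatable<Handle>::value);
```

On an ordinary toolchain the #else branch is taken, and types like Handle are only recognized through the manual opt-in.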

It does not implement either proposal’s library algorithms.

folly::fbvector optimizes reserve (compatible with both P2786 and P1144) and also insert and erase (compatible only with P1144).

folly::small_vector optimizes move-construction (compatible with both P2786 and P1144; contributed in #1934 by Arthur O’Dwyer) and move-assignment (compatible only with P1144; contributed separately by Giuseppe Ottaviano). small_vector does not yet optimize insert or erase.

Folly maintainer Giuseppe Ottaviano signed P3236R1.

HPX

Stellar HPX asked for an implementation of P1144 and/or P2786 as part of Google Summer of Code 2023. P1144 was implemented; Arthur was asked to code-review, and did so. The feature was shipped in HPX 1.10 (May 2024). IIUC, it’s all in the namespace hpx::experimental — although a lot of the documentation currently implies it’s in namespace hpx?

HPX 1.10 implements is_trivially_relocatable_v with P1144 semantics. It implements the P1144 library algorithms uninitialized_relocate{,_n,_backward} and relocate_at; it comments on, but doesn’t implement P1144 relocate; and it doesn’t implement P2786 trivially_relocate.

It does not implement any library optimizations.

It does not use anyone’s feature-test macros; it uses a config macro HPX_HAVE_P1144_RELOCATE_AT instead. If that macro is set, it will fall back to P1144’s std::is_trivially_relocatable, std::uninitialized_relocate, etc.

Three HPX contributors signed P3236R1. One (the primary implementor of HPX’s P1144 implementation) commented in favor on Clang #84621.

libc++

libc++ introduced std::__libcpp_is_trivially_relocatable<T> in February 2024. Unusually for this survey, libc++ uses Clang’s builtin __is_trivially_relocatable even in the absence of any feature-test macro. This means that it suffers from Clang #69394; but it’s also evidence that libc++ is satisfied for now to look only at “move construct + destroy” (compatible with P2786). For example, __libcpp_is_trivially_relocatable<tuple<int&>> is true (compatible only with P2786).

libc++ optimizes vector reallocation (compatible with both P2786 and P1144), and nothing else.
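A reallocation optimization of this kind only ever replaces "move-construct into fresh storage, then destroy the source," which is why it is compatible with both proposals. The following is a minimal sketch under stated assumptions, not libc++'s implementation: sketch_grow is a hypothetical helper, and is_trivially_copyable stands in for whatever relocation trait a real library would consult.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

// Relocate `size` live objects from old_data into a fresh allocation of
// new_cap elements. The caller still owns (and must deallocate) old_data.
template <class T>
T* sketch_grow(T* old_data, std::size_t size, std::size_t new_cap) {
    std::allocator<T> alloc;
    T* new_data = alloc.allocate(new_cap);
    if constexpr (std::is_trivially_copyable_v<T>) {
        // Fast path: one bulk byte copy; the old objects need no destructor.
        if (size != 0) {
            std::memcpy(static_cast<void*>(new_data), old_data,
                        size * sizeof(T));
        }
    } else {
        // Generic path: move-construct each element, then destroy its source.
        for (std::size_t i = 0; i < size; ++i) {
            ::new (static_cast<void*>(new_data + i)) T(std::move(old_data[i]));
            std::destroy_at(old_data + i);
        }
    }
    return new_data;
}

// Usage: grow a two-element int buffer to capacity four.
bool demo_grow() {
    std::allocator<int> alloc;
    int* data = alloc.allocate(2);
    ::new (static_cast<void*>(data)) int(7);
    ::new (static_cast<void*>(data + 1)) int(9);

    int* bigger = sketch_grow(data, 2, 4);
    alloc.deallocate(data, 2);

    bool ok = bigger[0] == 7 && bigger[1] == 9;
    alloc.deallocate(bigger, 4);
    return ok;
}
```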

libc++ doesn’t implement any library API from either P1144 or P2786 (not even for internal use).

libc++ has taken commits and code review from Arthur, but not in this area.

libstdc++

libstdc++ introduced std::__is_bitwise_relocatable in October 2018; author Marc Glisse originally called it std::__is_trivially_relocatable but renamed it in February 2019 at Arthur’s suggestion (thus keeping the name __is_trivially_relocatable available for the core-language builtin, which was added to Clang by Devin Jeanpierre in February 2022).

std::__is_bitwise_relocatable is true for trivial types and for deque<T> specifically, because libstdc++’s deque has a throwing move-constructor (therefore before 2018 it was getting the vector pessimization).

libstdc++ optimizes vector reallocation (compatible with both P2786 and P1144), and nothing else.

libstdc++ defines a generic algorithm __relocate_a(first, last, d_first, alloc) which is roughly analogous to P1144’s uninitialized_relocate(first, last, d_first). It also defines a helper __relocate_object_a(dest, src, alloc) but only for non-trivially-relocatable types (and notice that this parameter order is the opposite of P1144’s std::relocate_at).

libstdc++ has never taken commits from Arthur, Mungo, or Alisdair.

ParlayLib

Carnegie Mellon ParlayLib has supported relocation with P1144 semantics since October 2020; originally the code was based on a draft of P1144R0. In November 2023 the project’s primary maintainer, Daniel Liam Anderson, rewrote the code to match P1144R10; this update was code-reviewed by Arthur O’Dwyer (#67, #1) and merged in February 2024.

ParlayLib implements parlay::is_trivially_relocatable per P1144. If the P1144 feature-test macro __cpp_lib_trivially_relocatable is defined, then it will fall back to std::is_trivially_relocatable. Unusually for this survey, ParlayLib will use Clang’s builtin __is_trivially_relocatable even in the absence of any feature-test macro.

ParlayLib supports the P1144 library algorithms uninitialized_relocate{,_n} and relocate_at. It doesn’t implement P2786 trivially_relocate.

ParlayLib uses the [[trivially_relocatable]] attribute if the feature-test macro __cpp_impl_trivially_relocatable is set. This allows it to warrant to the compiler that parlay::sequence<T> (a kind of multithreading-aware vector) is trivially relocatable.
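The "warrant only when the compiler understands it" pattern can be sketched with a wrapper macro. This is an illustration rather than ParlayLib's actual code: SKETCH_TRIVIALLY_RELOCATABLE and OwningBuffer are hypothetical names, and the attribute spelling and placement are assumed from the description above. On a compiler that does not define the feature-test macro, the wrapper expands to nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Expand to the attribute only where the compiler advertises support for it.
#if defined(__cpp_impl_trivially_relocatable)
#  define SKETCH_TRIVIALLY_RELOCATABLE [[trivially_relocatable]]
#else
#  define SKETCH_TRIVIALLY_RELOCATABLE
#endif

// Hypothetical owning buffer. It has a user-provided move constructor and
// destructor, so a compiler cannot deduce on its own that moving it to a new
// address is equivalent to copying its bytes; the attribute is the author's
// promise that it is.
class SKETCH_TRIVIALLY_RELOCATABLE OwningBuffer {
public:
    explicit OwningBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    OwningBuffer(OwningBuffer&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }
    ~OwningBuffer() { delete[] data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Usage: the type behaves identically with or without the attribute.
bool demo_buffer() {
    OwningBuffer a(3);
    OwningBuffer b(std::move(a));
    return a.size() == 0 && b.size() == 3;
}
```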

ParlayLib supports several parallel sort_copy-style algorithms, each of which takes a template policy parameter controlling the way in which data is copied from the input range to the output range. One such policy is uninitialized_relocate_tag, which relocates elements from the input range to the output range. This is used to implement an in-place sort as:

auto Tmp = uninitialized_sequence<value_type>(In.size());
auto a = count_sort<uninitialized_relocate_tag>(In, make_slice(Tmp), make_slice(Keys), num_buckets);
parlay::uninitialized_relocate(Tmp.begin(), Tmp.end(), In.begin());

ParlayLib’s primary maintainer signed P3236R1, and commented in favor on Clang #84621.

PocketPy

PocketPy in #208 (2024) implemented a small_vector that uses relocation in its move-constructor.

PocketPy implements pkpy::is_trivially_relocatable_v with P1144 semantics, essentially as an implementation detail of pkpy::small_vector. This was introduced in #208 (February 2024).

pkpy::small_vector::reserve uses realloc for trivially relocatable value types (compatible with both P2786 and P1144). pkpy::small_vector doesn’t support insert, erase, or swap at all.

PocketPy has never taken commits or code review from Arthur, Mungo, or Alisdair.

Qt

Qt’s implementation is very old — older than any standards proposal in this area.

It implements Q_IS_RELOCATABLE (née Q_IS_MOVABLE) according to the P1144 definition, and has implemented the P1144-alike library algorithm q_uninitialized_relocate since June 2020.

Like BSL, Qt lacks a public rotate algorithm, but internally it uses a q_rotate optimized for trivially relocatable types (compatible only with P1144).

QVector and QList optimize insert and erase (both compatible only with P1144, not P2786). Qt’s incompatibility with P2786 is one of the main takeaways of the blog series “Qt and Trivial Relocation” mentioned above.

Qt has never taken commits or code-review from Arthur, Mungo, or Alisdair.

Qt contributor Giuseppe D’Angelo authored P3233R0 “Issues with P2786” in April 2024.

Two Qt maintainers signed P3236R1. They commented in favor and “weak ‘leans-to’”, respectively, on Clang #84621.

small_vectors

Artur Bać’s C++23 small_vectors library provides small_vectors::is_relocatable_v with P1144 semantics.

It optimizes only reallocation (compatible with both P2786 and P1144).

It implements (as an implementation detail) a P1144-alike detail::uninitialized_relocate_n(first, n, dest), with no optimization. Besides that algorithm, it also provides three novel algorithms:

  • uninitialized_relocate_with_copy_n(first, n, dest) which does copy-and-destroy instead of move-and-destroy

  • uninitialized_relocate_if_noexcept_n(first, n, dest) which does move-and-destroy if move is nothrow, otherwise copy-and-destroy

  • uninitialized_uneven_range_swap(first1, n1, first2, n2) which swaps the first min(n1, n2) elements and then relocates the rest from the end of the longer range to the end of the shorter range. This is the building block of a small-vector swap (compatible only with P1144); although in fact this is dead code — small_vectors currently provides no swap functionality.

small_vectors’ almost-sole contributor commented in favor on Clang #84621.

stdx::error

Charles Salvia’s stdx::error library defines stdx::is_trivially_relocatable with P1144 semantics; if the P1144 feature-test macro __cpp_lib_trivially_relocatable is defined, it will fall back to std::is_trivially_relocatable. Unusually for this survey, stdx::error will use Clang’s builtin __is_trivially_relocatable even in the absence of any feature-test macro.

stdx::error uses P1144’s [[trivially_relocatable]] attribute to mark error as trivially relocatable, if the feature-test macro __cpp_impl_trivially_relocatable is defined.

Its use of trivial relocation is compatible with both P2786 and P1144.

All of this code was contributed by Arthur O’Dwyer (#1, #2) during the November 2023 Kona meeting. stdx::error has never taken commits or code review from Mungo or Alisdair.

Subspace

Chromium Subspace defines concept sus::mem::TriviallyRelocatable with P1144 semantics: the internal documentation talks about “non-trivial move operations and destructors,” and the concept tests is_trivially_move_assignable.

Alone in this survey, Subspace tests __has_extension(trivially_relocatable) when deciding whether to trust Clang’s __is_trivially_relocatable builtin. This __has_extension flag is defined in Arthur’s P1144 reference implementation but not in Clang trunk, nor in Corentin Jabot’s P2786 reference implementation.

sus::collections::Vec<T> optimizes reserve (compatible with both P2786 and P1144). Trivial relocation is also used as a building block in Vec::drain (compatible with both P2786 and P1144). Vec doesn’t support arbitrary insert or erase.

Subspace’s almost-sole-contributor Dana Jansens wrote a blog post describing Subspace’s use of trivial relocation in sus::mem::swap(T&, T&), which is compatible only with P1144, not P2786.
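A swap built on relocation amounts to three byte copies through a temporary buffer. The sketch below is my own illustration, not Subspace's implementation: sketch_swap_by_relocation and Point are hypothetical names, the function restricts itself to trivially copyable types so that the memcpy calls are well-defined today, and it ignores the tail-padding subtleties a production version would have to handle.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>

// Swap two objects by shuffling their bytes through a scratch buffer.
template <class T>
void sketch_swap_by_relocation(T& a, T& b) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "this sketch only handles types where memcpy is already valid");
    if (std::addressof(a) == std::addressof(b)) {
        return;  // memcpy must not be given overlapping ranges
    }
    alignas(T) unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, std::addressof(a), sizeof(T));
    std::memcpy(static_cast<void*>(std::addressof(a)), std::addressof(b),
                sizeof(T));
    std::memcpy(static_cast<void*>(std::addressof(b)), tmp, sizeof(T));
}

// Hypothetical payload type for the usage example.
struct Point {
    int x;
    int y;
};

bool demo_swap() {
    Point p{1, 2};
    Point q{3, 4};
    sketch_swap_by_relocation(p, q);
    return p.x == 3 && p.y == 4 && q.x == 1 && q.y == 2;
}
```

Both objects stay alive throughout, so the bytewise version is a substitute for move-construction plus two assignments. A trait that only describes "move-construct, then destroy" does not license that substitution, which is the P1144-only part.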

Dana also commented in favor on Clang #84621.

Thermadiag/seq

The seq library (“a collection of original C++14 STL-like containers and related tools”) implements seq::is_relocatable as a synonym for is_trivially_copyable (compatible with P1144), but with code comments suggesting that it is only used to replace move-construction and destruction (compatible with P2786 but not P1144).

However, seq::detail::CircularBuffer does optimize insert and erase (compatible only with P1144). It also provides the novel operation push_back_pop_front_relocatable(v), which is a three-element rotate — v goes into the back of the queue while the front of the queue goes into v — optimized into three memcpys. If we understand this as destroying and reconstructing v, it’s compatible with both P2786 and P1144; if we understand it as an assignment to v, it’s compatible only with P1144.

seq::any has a policy parameter (with good documentation) controlling whether it’s allowed to hold non-trivially-relocatable types in its SBO buffer. This is compatible with both P2786 and P1144.

seq::radix_map optimizes reallocation (compatible with both P2786 and P1144).

Thrust

NVIDIA Thrust implements thrust::is_trivially_relocatable with P1144 semantics and documentation that refers to P1144R0.

It uses the trait in async_copy_n to relocate data from one place to another (compatible with both P1144 and P2786); however, async_copy_n itself is then used as a building block of the in-place async_stable_sort_n analogous to ParlayLib’s code quoted above. Since this uses relocation to permute elements during their lifetime (“swap by relocating”), it is compatible with P1144 but not P2786.

Thrust has never taken commits or code-review from Arthur, Mungo, or Alisdair.

History of P1144 and P2786 in WG21

Links in this section tend to go to the committee minutes on the WG21 wiki, which is private and accessible to committee members only.

  • P1144 “std::is_trivially_relocatable” (Arthur O’Dwyer, 2018–present) remains stuck in EWGI. It has been discussed three times in the six years since 2018:

    • At the February 2019 Kona meeting, EWGI saw P1144R3 and gave feedback.

    • At the February 2020 Prague meeting, EWGI saw P1144R4 (in my absence) and voted to forward to EWG (1–3–4–1–0, votes ordered “Strongly in Favor” to “Strongly Against”). No changes were requested (since everyone seems to have assumed it was being forwarded). P1144R5 was published in March 2020, but was never scheduled again in WG21 for the next three years.

    • At the February 2023 Issaquah meeting, EWGI saw both P2786R0 and P1144R6. An author of P2786 asked that P2786 and P1144 be forwarded together to EWG, or not at all, so that EWG could have a design discussion seeing both designs at once. However, two separate forwarding polls were taken. P1144R6 (with one author present remotely) polled 0–7–4–3–1; P2786R0 (with two) polled 1–8–3–3–1. The former was judged “not consensus” and the latter was judged “consensus.”

    • At the November 2023 Kona meeting, trivial relocatability was discussed in a Friday informational-only session of EWG. No quorum was present and no minutes were taken.

  • P2786 “Trivial Relocatability for C++26” (Mungo Gill and Alisdair Meredith, 2023–present) has passed once through EWG and CWG, although it enters St Louis on LEWG’s plate and also (luckily, although I don’t quite understand how we got here procedurally) back on EWG’s plate for further discussion.

    • P2786R0 was first published in February 2023, seen by EWGI (as above), forwarded 1–8–3–3–1.

    • At the November 2023 Kona meeting, trivial relocatability was discussed in a Friday informational-only session of EWG. No quorum was present and no minutes were taken.

    • At the February 2024 Tokyo meeting, EWG saw P2786 alone (despite the author’s request to discuss it alongside P1144). It was forwarded to CWG (7–9–6–0–2), seen by CWG, and approved. It was then sent to LEWG for consideration by that subgroup.

    • On 2024-04-09, LEWG saw P2786R4 in a Zoom telecon. It was pointed out by several participants that P2786 didn’t match the semantics of any present-day library; even Bloomberg’s own BSL uses the P1144 model.

  • The past year (especially the months since the February 2024 Tokyo meeting) has seen several new papers in the space:

Posted 2024-06-15