Shifting objects by less than sizeof(T)

This morning I watched the video of Christopher Fretz’s talk from C++Now 2024, “Designing a Slimmer Vector of C++ Variants.” I recommend it! I found two interesting things to say about it:

First, around the 40-minute mark, Chris gives a pretty good rundown of what “alignment” means in C++. In my training classes I’ve found that people are generally aware of sizeof(T), and can explain that it’s the number of bytes required to store a T — its meaning is pretty clear and calculable a priori. It often comes as news to them that there’s this second magic number, alignof(T), which represents the number n such that “if a T object’s address is ever nonzero modulo n, then the hardware might make bad stuff happen” — and its value often seems arbitrary, determined a posteriori from the instruction set and sometimes from ABI conventions too. If you’ve never heard of alignment, or you’ve heard of the alignof keyword but don’t know what it does, then I recommend this part of Chris’s talk.

Second, around 43 minutes, Chris shows that his “vector of variants” data structure gives rise to a really interesting situation. Watch the full talk for details; but basically, where an ordinary vector<variant<A,B>> looks like this:

…Chris’s vvector<A,B> looks like this instead (Godbolt):

It manages a buffer of raw memory into which we construct elements of type A or B, plus a separate vector of “metadata” that tells us for each element what type it is (A or B) and where it’s located (as a byte offset into the buffer).

In an ordinary vector<variant<A,B>>, you can change the i-th element from A to B no problem. But in vvector, changing the i-th element from A to B would require shifting the rest of the elements to make room for the new B. Therefore Chris’s vvector, somewhat like std::set, makes its elements immutable as long as they’re in the container.

When we insert a new element at the beginning of the container, we need to shift down the other elements to make room, just like we would when inserting into a std::vector. Each existing element is relocated from its current byte offset to an offset further down in the buffer. In P1144-land, std::vector::insert simply uses std::uninitialized_relocate_backward(p, p+n, q) where the ranges p and q happen to overlap because q == p+1.

But here’s what happens when we insert an A object at the beginning of our vvector from above:

This is the interesting situation I promised you. Notice that the first B object (the original vv[0]) used to be located at offset 0, occupying bytes 0,1,2,3,4,5. After the insertion, it’s now at offset 4, occupying bytes 4,5,6,7,8,9. So the operation here is, again, a relocation from p to q, where p and q happen to overlap… but this time p and q aren’t ranges; they’re single objects! When B is trivially relocatable, that’s fine: memmove automatically handles overlapping ranges of bytes and doesn’t care how many objects are involved. But for non-trivially-relocatable objects, this operation cannot be performed by:

std::construct_at(q, std::move(*p));
std::destroy_at(p);

nor by (P1144):

std::relocate_at(p, q);

No, it has to be done with the help of a temporary object, for example like this (P1144):

std::construct_at(q, std::relocate(p));

The line above relocates from p into a materialized temporary, then move-constructs into q, then (at the end of the full-expression) destroys the temporary.

Watch out! ::new (q) T(std::relocate(p)) would have UB instead, because std::relocate(p) is a prvalue — nothing materializes on the stack unless we pass that prvalue to a function that turns it into an xvalue, such as construct_at or move.

What’s the lesson here? Are we worried enough about this kind of use-case to suggest that std::uninitialized_relocate and/or std::relocate_at ought to guard against this kind of partial-object overlap? For example, instead of specifying that relocate_at move-constructs from src to dest and then destroys src, we could respecify it to move-construct from src to tmp, destroy src, move-construct from tmp to dest, and finally destroy tmp. But in real life — no. That would lose tons of performance in the expected use-cases (such as vector and rotate). I’m quite confident that the real lesson here is: If you’re doing odd things with byte offsets, you should watch out for pitfalls. But this is a really cool use-case and a really fun and interesting situation Chris found himself in! I enjoyed hearing about it.

See my toy implementation of Chris’s vvector here (backup).

Posted 2024-11-08