Shifting objects by less than sizeof(T)
sizeof(T)
This morning I watched the video of Christopher Fretz’s talk from C++Now 2024, “Designing a Slimmer Vector of C++ Variants.” I recommend it! I found two interesting things to say about it:
First, around the 40-minute mark,
Chris gives a pretty good rundown of what “alignment” means in C++. In my training classes
I’ve found that people are generally aware of sizeof(T)
, and can explain that it’s the
number of bytes required to store a T
— its meaning is pretty clear and calculable a priori.
It often comes as news to them that there’s this second magic number, alignof(T)
, which represents
the number n such that “if a T
object’s address is ever nonzero modulo n, then the hardware
might make bad stuff happen” — and its value often seems arbitrary, determined a posteriori from
the instruction set and sometimes from ABI conventions too. If you’ve never heard of alignment,
or you’ve heard of the alignof
keyword but don’t know what it does, then I recommend this
part of Chris’s talk.
Second, around 43 minutes,
Chris shows that his “vector of variants” data structure gives rise to a really interesting
situation. Watch the full talk for details; but basically, where an ordinary vector<variant<A,B>>
looks like this:
…Chris’s vvector<A,B>
looks like this instead (Godbolt):
It manages a buffer of raw memory into which we construct elements of type A
or B
,
plus a separate vector of “metadata” that tells us for each element what type it is (A
or B
)
and where it’s located (as a byte offset into the buffer).
In an ordinary
vector<variant<A,B>>
, you can change thei
-th element fromA
toB
no problem. But invvector
, changing thei
-th element fromA
toB
would require shifting the rest of the elements to make room for the newB
. Therefore Chris’svvector
, somewhat likestd::set
, makes its elements immutable as long as they’re in the container.
When we insert a new element at the beginning of the container, we need to shift down the
other elements to make room, just like we would when inserting into a std::vector
.
Each existing element is relocated from its current byte offset to an offset further down
in the buffer. In P1144-land,
std::vector::insert
simply uses std::uninitialized_relocate_backward(p, p+n, q)
where the
ranges p
and q
happen to overlap because q == p+1
.
But here’s what happens when we insert an A
object at the beginning of our vvector
from above:
This is the interesting situation I promised you. Notice that the first B
object (the original vv[0]
)
used to be located at offset 0
, occupying bytes 0,1,2,3,4,5.
After the insertion, it’s now at offset 4
, occupying bytes 4,5,6,7,8,9.
So the operation here is, again, a relocation from p
to q
, where p
and q
happen to overlap… but
this time p
and q
aren’t ranges; they’re single objects! When B
is trivially relocatable, that’s fine:
memmove
automatically handles overlapping ranges of bytes and doesn’t care how many objects are
involved. But for non-trivially-relocatable objects, this operation cannot be performed by:
std::construct_at(q, std::move(*p));
std::destroy_at(p);
nor by (P1144):
std::relocate_at(p, q);
No, it has to be done with the help of a temporary object, for example like this (P1144):
std::construct_at(q, std::relocate(p));
The line above relocates from p
into a materialized temporary, then move-constructs into q
,
then (at the end of the full-expression) destroys the temporary.
Watch out!
::new (q) T(std::relocate(p))
would have UB instead, becausestd::relocate(p)
is a prvalue — nothing materializes on the stack unless we pass that prvalue to a function that turns it into an xvalue, such asconstruct_at
ormove
.
What’s the lesson here? Are we worried enough about this kind of use-case to suggest that
std::uninitialized_relocate
and/or std::relocate_at
ought to guard against this kind of
partial-object overlap? For example, instead of specifying that relocate_at
move-constructs from src
to dest
and then destroys src
, we could respecify it to move-construct from src
to tmp
, destroy src
, move-construct
from tmp
to dest
, and finally destroy tmp
. But in real life — no. That would lose tons of performance
in the expected use-cases (such as vector
and rotate
). I’m quite
confident that the real lesson here is: If you’re doing odd things with byte offsets, you should watch out for
pitfalls. But this is a really cool use-case and a really fun and interesting situation Chris found himself in!
I enjoyed hearing about it.