MapView
can be faster than MapRef
MapView
can be faster than MapRef
Chandler Carruth gave a pretty great lightning talk
at C++Now 2019 on his new ideas for a
Map
API; that is, the API for a data structure that stores key-value pairs. I disagreed
with some of his intuitions, such as the promiscuous use of overloading — in particular
I recall that some member functions took lambdas and were overloaded based on their
parameter’s parameter type! Yuck! But the thing that made it great was this observation:
Sometimes a
view
type is not just a convenience, but a performance optimization.
Chandler’s library provided not only Map<K,V>
, but also MapRef<K,V>
and MapView<K,V>
.
The difference between MapRef
and MapView
is mutability. MapRef
is mutable,
whereas MapView
is immutable.
A MapView
is more than just a ConstMapRef
. The name ConstMapRef
implies that
the holder of the reference can’t modify the map, but maybe someone else can. That is,
a ConstMapRef
behaves pretty much just like a const Map&
.
The name MapView
implies that the map itself can’t be modified. That is, a MapView
behaves pretty much like an iterator-pair, or a reference to some element of the map.
If I’ve got a MapView
into your map, and then you modify that
map by inserting or deleting elements, your action invalidates my view.
Accessing via an invalid view produces undefined behavior.
Suppose we have a Map
type like Chandler’s. The elements of the map are stored in
contiguous memory (unlike either std::map
or std::unordered_map
) because we like
cache-locality. Therefore, we might as well add a small buffer optimization (SBO).
If the map has few enough elements, we’ll just store them all in-line.
But SBO has a cost! Now every time we call m.lookup(key)
, we have to start by getting
a pointer to the map’s contiguous data. We could pull a libstdc++ std::string
and
store a pointer-to-our-own-buffer, but that would harm our trivial relocatability.
So instead, m
compares its size to the capacity of the SBO buffer and branches on the
result. It does this comparison and branch on every lookup. (Modulo the as-if rule,
of course.)
When we create a MapView
, on the other hand, we know that the underlying map will never
change size, and so its contiguous data will remain in the same place
on every lookup. We can cache the Map
’s data pointer as a member variable of our MapView
object.
(Notice that our MapView
remains trivially relocatable.) This means that when we use
a MapView
, we don’t perform a compare-and-branch on every lookup
! Lookups in a MapView
can be faster than lookups in a fundamentally-mutable Map
.
I wrote a quick benchmark to test this intuition with respect to const std::string&
versus
C++17 std::string_view
. Here are the results
(backup). Superficially, they support the
intuition above.
Iterating over the elements of a string via
int n = s.size();
for (int i=0; i < n; ++i) {
result += s[i];
}
takes noticeably longer when s
is a const std::string&
than when s
is a std::string_view
.
However, I think it would be wrong in this case to blame the compare-and-branch used (on libc++)
to fetch the data pointer; the assembly shows us that the compiler is sufficiently smart to
see that the string is not modified and thus to hoist the compare-and-branch out of the inner loop.
The other weird thing going on there is that using a ranged for
loop actually hurts our performance
a lot. I haven’t figured out what Clang is doing there. (Godbolt.)
Where Chandler’s differentiation of const Map&
from MapView
could really save us a lot of time,
even on a trivial benchmark like this one, is if Map::lookup
and MapView::lookup
are implemented
in a different object file. That would prevent the compiler from hoisting the compare-and-branch
as it has done in this simple case.
Conclusion: You can sometimes think of a view
type as a particularly ergonomic form of manual
loop-invariant code motion.