Some C++20 ranges aren’t const-iterable
Via StackOverflow: Newcomers to C++20 Ranges are often surprised that some ranges cannot be “const-iterated.” But it’s true!
template<std::ranges::range R>
void foreach(const R& rg) { // Wrong!
for (auto&& elt : rg) { }
}
C++98’s tricky iterators liked to store “state” inside the iterator object; e.g. istream_iterator<T>
stores
a whole T
inside the iterator, and copies the T
when you copy the iterator. Instead, C++20 Ranges likes
to store state inside the range object; e.g. ranges::istream_view<T>
stores that T
inside the istream_view
,
and the istream_view
’s iterators merely hold a reference to it. This means the iterators of an istream_view<T>
are safely-and-cheaply movable even when T
is not. (Godbolt.) Thus, incrementing
an istream_view::iterator
modifies the istream_view
itself. So the mere act of handing out an iterator grants
permission to modify the view, and thus can’t be done on a const
view object. istream_view::begin()
is a
non-const member function.
Another classic example of storing “state” inside a view is filter_view
.
When you’re filtering an input range, filter_view
simply stores a view of the underlying range and the
predicate on which to filter. Its begin()
returns a filter_view::iterator
to the first element of the underlying range
that satisfies the predicate. The iterator skips over elements that don’t match the predicate, so its operator++
can take
up to linear time… and so can its begin()
. But you only ever call begin()
once, so that’s not a problem.
When you’re filtering a forward range, it gets tricky! A forward range is multi-pass, so it’s theoretically possible the
user might call begin()
a bunch of times in a row. For efficiency, therefore, filter_view
is required to cache
the computed value of begin()
to make each subsequent call O(1). (In general you still won’t call begin()
more than once;
but if you ever did, you’d see it be faster the second time.)
We can observe the cost of that cache by asking for the sizeof
a filter_view
over an iterator that claims to be single-pass,
versus the same view over an iterator that claims to be multi-pass. Godbolt:
template<bool B>
struct DevZeroIterator {
using value_type = int;
using iterator_category = std::conditional_t<B,
std::forward_iterator_tag,
std::input_iterator_tag>;
using difference_type = int;
int operator*() const { return 0; }
auto& operator++() { return *this; }
auto operator++(int) { return *this; }
bool operator==(const DevZeroIterator&) const = default;
};
auto v1 = std::views::counted(DevZeroIterator<true>(), 10);
auto v2 = std::views::counted(DevZeroIterator<false>(), 10);
auto f1 = v1 | std::views::filter(std::identity());
auto f2 = v2 | std::views::filter(std::identity());
printf("%zu, %zu\n", sizeof(f1), sizeof(f2));
// Microsoft: 28, 16
// libstdc++: 20, 8
// libc++: 12, 4
At first glance, this suggests to me that there might be a market for some kind of views::singlepass(r)
that downgrades
a range’s iterator type all the way to input_iterator_tag
, in order to avoid paying for caching at various levels.
But at second glance, it seems wrong to strip the traversal operations like operator+=
from an iterator simply because
we intend to iterate it in only a single pass. During the C++11 cycle, Boost.Iterator
had some thoughts about how to separate the orthogonal aspects that the C++98/20 iterator model mashes together;
but I don’t think anything ever came of that idea, even within Boost.
Anyway, in the case of filter_view
, incrementing the iterator doesn’t modify the view object. It’s the mere act of
computing begin()
’s return value that modifies the view, by updating the cache.
Yes, we could make the cache
mutable
, but then we’d have to guard it withonce_flag
to prevent data races. You’d have to link against pthreads just toviews::filter
an array. That’d be ridiculous.
A third example of a non-const-iterable range is ranges::subrange
. The job of subrange
is to
temporarily hold onto an iterator/sentinel pair and hand them back out when you call begin()
resp. end()
.
If the iterator type is move-only (like istream_view::iterator
above), then subrange::begin()
will
move it out, which modifies the subrange
object, so again subrange::begin()
must be non-const.
Yes, this happens (Godbolt) even when the
subrange
is an lvalue. In normal C++, lvalue-ness means you might look at that value later. But in Ranges, value category is a proxy for lifetime — that’s why we always take ranges by forwarding reference but treat them as lvalues inside the function body — and “look-at-later-ness” is communicated strictly by the multi-pass-ness (forward-ness) of the iterator type.subrange::begin()
is allowed to modify the subrange in this case because we know it’s an input range — it would be a semantic error to callbegin()
a second time.
template<bool B>
struct DevZeroIterator {
using value_type = int;
using difference_type = int;
explicit DevZeroIterator() = default;
DevZeroIterator(const DevZeroIterator&) requires B = default;
DevZeroIterator(DevZeroIterator&&) = default;
DevZeroIterator& operator=(const DevZeroIterator&) = default;
int operator*() const { return 0; }
auto& operator++() { return *this; }
void operator++(int) {}
bool operator==(std::default_sentinel_t) const { return false; }
};
auto r1 = std::ranges::subrange(DevZeroIterator<true>(), std::default_sentinel);
auto r2 = std::ranges::subrange(DevZeroIterator<false>(), std::default_sentinel);
static_assert(std::ranges::range<const decltype(r1)>);
static_assert(!std::ranges::range<const decltype(r2)>);
Bottom line: Never take arbitrary ranges (not even views, which might be filter_view
;
not even borrowed ranges, which might be subrange
) by const&
. C++20 Ranges are
designed always to be taken by forwarding reference, and then iterated as (possibly-modifiable) lvalues.
template<std::ranges::range R>
void foreach(const R& rg) { // Wrong!
for (auto&& elt : rg) { }
}
template<std::ranges::range R>
void foreach(R&& rg) { // Right!
for (auto&& elt : rg) { }
}
Of course this doesn’t apply if you know what kind of object you’ll receive.
If you know you’re taking a std::vector<int>
, take by const&
. If you know you’re
taking a string_view
, take by value.
The advice to take ranges by forwarding reference applies only to STL-algorithm-like templates
dealing with ranges of completely unknown type R
.
See also:
- “Value category is not lifetime” (2019-03-11)
- “
for (auto&& elt : range)
Always Works” (2018-12-15) - “Iterating and inverting a const
views::filter
” (2023-03-13)