<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://quuxplusone.github.io/blog/feed.xml" rel="self" type="application/atom+xml" /><link href="https://quuxplusone.github.io/blog/" rel="alternate" type="text/html" /><updated>2026-05-12T01:01:42+00:00</updated><id>https://quuxplusone.github.io/blog/feed.xml</id><title type="html">Arthur O’Dwyer</title><subtitle>Stuff mostly about C++</subtitle><entry><title type="html">`std::is_heap` could be faster</title><link href="https://quuxplusone.github.io/blog/2026/05/11/is-heap/" rel="alternate" type="text/html" title="`std::is_heap` could be faster" /><published>2026-05-11T00:01:00+00:00</published><updated>2026-05-11T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/05/11/is-heap</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/05/11/is-heap/"><![CDATA[<p>The other day I was noodling around with some libc++ unit-test code that looked
roughly like this (<a href="https://godbolt.org/z/cGWcrdjcs">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template&lt;class A&gt;
auto extract_container(A&amp; a) {
  struct UnwrapAdaptor : A { A::container_type&amp; cc = A::c; };
  return UnwrapAdaptor(a).cc;
}

template&lt;class Adaptor&gt;
void test_push_range(bool is_heapified) {
  int in1[] = {1,3,7};
  int in2[] = {2,4,5,6};
  int expected[] = {1,3,7,2,4,5,6};
  Adaptor a;
  a.push_range(in1);
  a.push_range(in2);
  if (auto c = extract_container(a); is_heapified) {
    assert(std::ranges::is_heap(c));
    assert(std::ranges::is_permutation(c, expected));
  } else {
    assert(std::ranges::equal(c, expected));
  }
}

int main() {
  test_push_range&lt;std::stack&lt;int&gt;&gt;(false);
  test_push_range&lt;std::queue&lt;int&gt;&gt;(false);
  test_push_range&lt;std::priority_queue&lt;int&gt;&gt;(true);
}
</code></pre></div></div>

<p>I tried to extend <code class="language-plaintext highlighter-rouge">main</code> to also test some non-default containers.
(Incidentally, did you know <code class="language-plaintext highlighter-rouge">stack</code>’s default container is <code class="language-plaintext highlighter-rouge">deque</code>,
not <code class="language-plaintext highlighter-rouge">vector</code>?)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  test_push_range&lt;std::stack&lt;int, std::vector&lt;int&gt;&gt;&gt;(false);
  test_push_range&lt;std::stack&lt;int, std::list&lt;int&gt;&gt;&gt;(false);
</code></pre></div></div>

<p>…And suddenly the unit test no longer compiled!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: no matching function for call to object of type 'const __is_heap'
   22 |     assert(std::ranges::is_heap(c));
      |            ^~~~~~~~~~~~~~~~~~~~
[...]
note: because 'std::list&lt;int&gt; &amp;' does not satisfy 'random_access_range'
</code></pre></div></div>

<p>That caught me by surprise. Why should testing whether a range is heapified
require random access? It’s simple to implement with only forward traversal,
and <code class="language-plaintext highlighter-rouge">make_heap</code>/<code class="language-plaintext highlighter-rouge">is_heap</code> seems to analogize perfectly against <code class="language-plaintext highlighter-rouge">sort</code>/<code class="language-plaintext highlighter-rouge">is_sorted</code>.
<code class="language-plaintext highlighter-rouge">is_sorted</code> doesn’t require random access; why should <code class="language-plaintext highlighter-rouge">is_heap</code>?</p>

<table>
  <thead>
    <tr>
      <th>Ranges algorithm</th>
      <th>Constraint</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">sort</code></td>
      <td><code class="language-plaintext highlighter-rouge">random_access_range</code></td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">is_sorted</code></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">is_sorted_until</code></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
  </tbody>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">make_heap</code></td>
      <td><code class="language-plaintext highlighter-rouge">random_access_range</code></td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">is_heap</code></td>
      <td><code class="language-plaintext highlighter-rouge">random_access_range</code> (could be <code class="language-plaintext highlighter-rouge">forward_range</code>)</td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">is_heap_until</code></td>
      <td><code class="language-plaintext highlighter-rouge">random_access_range</code> (could be <code class="language-plaintext highlighter-rouge">forward_range</code>)</td>
    </tr>
  </tbody>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">partition</code></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">is_partitioned</code></td>
      <td><code class="language-plaintext highlighter-rouge">input_range</code></td>
    </tr>
    <tr>
      <td>  <a href="https://www.boost.org/doc/libs/latest/libs/algorithm/doc/html/algorithm/Misc.html#the_boost_algorithm_library.Misc.misc_inner_algorithms.is_partitioned_until"><code class="language-plaintext highlighter-rouge">is_partitioned_until</code></a></td>
      <td><code class="language-plaintext highlighter-rouge">input_range</code></td>
    </tr>
  </tbody>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">unique</code></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
    <tr>
      <td>  <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2848r1.html"><code class="language-plaintext highlighter-rouge">is_uniqued</code></a></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
    <tr>
      <td>  <code class="language-plaintext highlighter-rouge">adjacent_find</code></td>
      <td><code class="language-plaintext highlighter-rouge">forward_range</code></td>
    </tr>
  </tbody>
</table>

<p>Note that <code class="language-plaintext highlighter-rouge">is_partitioned</code> only needs to view one element at a time, and remember
the boolean value of <code class="language-plaintext highlighter-rouge">pred(elt)</code>. By contrast, <code class="language-plaintext highlighter-rouge">is_sorted</code>, <code class="language-plaintext highlighter-rouge">adjacent_find</code>, and <code class="language-plaintext highlighter-rouge">is_heap</code>
need to view two elements at a time; that’s why they can’t handle <code class="language-plaintext highlighter-rouge">input_range</code>.</p>
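
<p>To make that concrete, here is a minimal sketch (mine, not any vendor’s code) of a single-pass <code class="language-plaintext highlighter-rouge">is_partitioned</code>. The names <code class="language-plaintext highlighter-rouge">is_partitioned_single_pass</code> and <code class="language-plaintext highlighter-rouge">demo_partitioned</code> are made up for illustration. The only state carried from one element to the next is a single <code class="language-plaintext highlighter-rouge">bool</code>, which is why a one-shot <code class="language-plaintext highlighter-rouge">istream_iterator</code> is enough.</p>

```cpp
#include <cassert>
#include <iterator>
#include <sstream>

// Sketch of an is_partitioned that reads each element exactly once.
// The only state carried between elements is one boolean.
template<class InputIt, class Sentinel, class Pred>
bool is_partitioned_single_pass(InputIt first, Sentinel last, Pred pred) {
  bool seen_false = false;
  for (; first != last; ++first) {
    if (pred(*first)) {
      if (seen_false) return false;  // a "true" element after a "false" one
    } else {
      seen_false = true;
    }
  }
  return true;
}

// Hypothetical driver: parses whitespace-separated ints from `text`
// through a single-pass stream iterator and asks "evens before odds?"
inline bool demo_partitioned(const char* text) {
  std::istringstream in(text);
  auto is_even = [](int x) { return x % 2 == 0; };
  return is_partitioned_single_pass(
      std::istream_iterator<int>(in), std::istream_iterator<int>(), is_even);
}
```

<p>With the even-number predicate above, I’d expect <code class="language-plaintext highlighter-rouge">demo_partitioned("2 4 6 1 3")</code> to be true and <code class="language-plaintext highlighter-rouge">demo_partitioned("2 1 4")</code> to be false.</p>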

<p>The two “could bes” in the table above seem to indicate a design defect.</p>

<h2 id="benchmark-it">Benchmark it!</h2>

<p>libc++’s implementation of <code class="language-plaintext highlighter-rouge">std::is_heap</code> looks <a href="https://github.com/llvm/llvm-project/blob/17e0686ab1107a1a675d8783383dedf70fa24033/libcxx/include/__algorithm/is_heap_until.h#L23-L46">shockingly inefficient</a>:
its 24 lines spend a lot of time recomputing subexpressions like <code class="language-plaintext highlighter-rouge">__first + __c</code> (to get to the <code class="language-plaintext highlighter-rouge">c</code>’th element)
and <code class="language-plaintext highlighter-rouge">2 * __p + 1</code> (to compute the child index from the parent). A straight-line implementation,
avoiding all arithmetic and just advancing two iterators in lockstep,
would have taken only 18 lines:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template &lt;class _Compare, class _ForwardIterator, class _Sentinel&gt;
_LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 _ForwardIterator
__is_heap_until(_ForwardIterator __first, _Sentinel __last, _Compare&amp;&amp; __comp) {
  _ForwardIterator __child = __first;
  if (__child == __last) {
    return __child;
  }
  while (true) {
    ++__child;
    if (__child == __last || __comp(*__first, *__child))
      break;
    ++__child;
    if (__child == __last || __comp(*__first, *__child))
      break;
    ++__first;
  }
  return __child;
}
</code></pre></div></div>

<p>Surprisingly, <a href="https://github.com/gcc-mirror/gcc/blob/319c0f0249cb30c2290426a3a9d4ce81a47a684d/libstdc%2B%2B-v3/include/bits/stl_heap.h#L72-L94">libstdc++</a>
and <a href="https://github.com/microsoft/STL/blob/main/stl/inc/algorithm#L7667-L7679">MS STL</a> also spend time adding and dividing
(or shifting) when they don’t need to.</p>
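
<p>For readers who want to try the lockstep idea outside of libc++, here is a portable restatement of the same loop as a free function. The name <code class="language-plaintext highlighter-rouge">forward_is_heap_until</code> is my own invention for this sketch, and it takes an iterator pair rather than an iterator and sentinel; it is meant as an illustration, not as the patch itself.</p>

```cpp
#include <cassert>
#include <forward_list>
#include <functional>
#include <iterator>
#include <list>

// Portable restatement of the lockstep idea: `parent` advances once for
// every two steps of `child`, so no index arithmetic is needed.
template<class ForwardIt, class Compare = std::less<>>
ForwardIt forward_is_heap_until(ForwardIt first, ForwardIt last, Compare comp = {}) {
  ForwardIt parent = first;
  ForwardIt child = first;
  if (child == last) return child;
  while (true) {
    ++child;  // left child of *parent
    if (child == last || comp(*parent, *child)) break;
    ++child;  // right child of *parent
    if (child == last || comp(*parent, *child)) break;
    ++parent;
  }
  return child;
}
```

<p>Because it uses nothing beyond <code class="language-plaintext highlighter-rouge">++</code>, <code class="language-plaintext highlighter-rouge">*</code>, and <code class="language-plaintext highlighter-rouge">==</code>, this version accepts <code class="language-plaintext highlighter-rouge">std::list</code> and <code class="language-plaintext highlighter-rouge">std::forward_list</code> iterators as well as random-access ones.</p>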

<p>Still, just because a piece of code <em>looks</em> inefficient doesn’t always mean that it <em>is</em> inefficient.
So I whipped up a simple benchmark:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void BM_vector_half(benchmark::State&amp; state) {
  auto n = state.range(0);
  auto v = std::vector&lt;unsigned&gt;(n);
  std::mt19937 g;
  std::generate(v.begin(), v.end(), g);
  std::make_heap(v.begin(), v.begin() + (n / 2));
  for (auto _ : state) {
    benchmark::DoNotOptimize(v);
    auto it = std::is_heap_until(v.begin(), v.end());
    benchmark::DoNotOptimize(it);
  }
}
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">vector_full</code> is the same but with <code class="language-plaintext highlighter-rouge">make_heap(v.begin(), v.end())</code>.
<code class="language-plaintext highlighter-rouge">deque_{half,full}</code> are the same but with <code class="language-plaintext highlighter-rouge">deque</code> instead of <code class="language-plaintext highlighter-rouge">vector</code>.</p>

<p>Here’s the benchmark result before and after applying the patch above to libc++.
In this case, our eyes don’t lie: what <em>looks</em> inefficient <em>is</em> inefficient!
(Note that our comparator here is the default <code class="language-plaintext highlighter-rouge">operator&lt;</code> on <code class="language-plaintext highlighter-rouge">unsigned</code>, which is cheap. An expensive comparator
could dwarf the cost of arithmetic, making the benefit of this patch less perceptible.)</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Benchmark</th>
      <th style="text-align: left"><code class="language-plaintext highlighter-rouge">n</code></th>
      <th style="text-align: right">CPU time (before)</th>
      <th style="text-align: right">CPU time (after)</th>
      <th style="text-align: right">%</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_half</code></td>
      <td style="text-align: left">1K</td>
      <td style="text-align: right">4155 ns</td>
      <td style="text-align: right">2658 ns</td>
      <td style="text-align: right">−36%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_half</code></td>
      <td style="text-align: left">100K</td>
      <td style="text-align: right">311560 ns</td>
      <td style="text-align: right">263807 ns</td>
      <td style="text-align: right">−15%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_half</code></td>
      <td style="text-align: left">10M</td>
      <td style="text-align: right">32487789 ns</td>
      <td style="text-align: right">26089364 ns</td>
      <td style="text-align: right">−19%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_full</code></td>
      <td style="text-align: left">1K</td>
      <td style="text-align: right">8813 ns</td>
      <td style="text-align: right">5294 ns</td>
      <td style="text-align: right">−39%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_full</code></td>
      <td style="text-align: left">100K</td>
      <td style="text-align: right">644535 ns</td>
      <td style="text-align: right">535987 ns</td>
      <td style="text-align: right">−16%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">vector_full</code></td>
      <td style="text-align: left">10M</td>
      <td style="text-align: right">59771100 ns</td>
      <td style="text-align: right">52807359 ns</td>
      <td style="text-align: right">−11%</td>
    </tr>
  </tbody>
  <tbody>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_half</code></td>
      <td style="text-align: left">1K</td>
      <td style="text-align: right">4186 ns</td>
      <td style="text-align: right">2662 ns</td>
      <td style="text-align: right">−36%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_half</code></td>
      <td style="text-align: left">100K</td>
      <td style="text-align: right">375844 ns</td>
      <td style="text-align: right">264856 ns</td>
      <td style="text-align: right">−29%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_half</code></td>
      <td style="text-align: left">10M</td>
      <td style="text-align: right">31999750 ns</td>
      <td style="text-align: right">26152052 ns</td>
      <td style="text-align: right">−18%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_full</code></td>
      <td style="text-align: left">1K</td>
      <td style="text-align: right">9002 ns</td>
      <td style="text-align: right">5271 ns</td>
      <td style="text-align: right">−41%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_full</code></td>
      <td style="text-align: left">100K</td>
      <td style="text-align: right">619876 ns</td>
      <td style="text-align: right">523727 ns</td>
      <td style="text-align: right">−15%</td>
    </tr>
    <tr>
      <td style="text-align: left"><code class="language-plaintext highlighter-rouge">deque_full</code></td>
      <td style="text-align: left">10M</td>
      <td style="text-align: right">65697333 ns</td>
      <td style="text-align: right">52488343 ns</td>
      <td style="text-align: right">−20%</td>
    </tr>
  </tbody>
</table>

<p>Even with the current specification of <code class="language-plaintext highlighter-rouge">is_heap</code>, we can achieve this performance today.
The algorithm can remain <em>constrained</em> on <code class="language-plaintext highlighter-rouge">random_access_range</code>
(thus doing nothing to solve the motivating use-case that introduced this post)
while internally <em>using</em> nothing more than forward-iterator operations.
But once the algorithm uses nothing more than forward-iterator operations…
wouldn’t it be nice for the Standard to say so?</p>

<h2 id="conclusions">Conclusions</h2>

<ul>
  <li>
    <p>libc++ should improve its <code class="language-plaintext highlighter-rouge">is_heap</code> implementation along the lines above,
  reaping the performance benefit.</p>
  </li>
  <li>
    <p>So should libstdc++ and Microsoft STL, I expect.</p>
  </li>
  <li>
    <p>WG21 should consider relaxing the constraint on <code class="language-plaintext highlighter-rouge">is_heap</code> and <code class="language-plaintext highlighter-rouge">is_heap_until</code> from “random access” to “forward,”
  incidentally solving my original use-case. I’ll probably bring a paper to this effect at some point.
  Note that if such a paper were adopted, all three vendors would <em>have</em> to change their
  implementations to the faster one.</p>
  </li>
</ul>]]></content><author><name></name></author><category term="benchmarks" /><category term="library-design" /><category term="proposal" /><category term="stl-classic" /><summary type="html"><![CDATA[`is_sorted` doesn't require random access; why should `is_heap`?]]></summary></entry><entry><title type="html">Two-Minute _Iolanthe_</title><link href="https://quuxplusone.github.io/blog/2026/05/08/two-minute-iolanthe/" rel="alternate" type="text/html" title="Two-Minute _Iolanthe_" /><published>2026-05-08T00:01:00+00:00</published><updated>2026-05-08T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/05/08/two-minute-iolanthe</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/05/08/two-minute-iolanthe/"><![CDATA[<p>The other day I came across Connie Kleinjans’ <a href="https://www.misosoup.com/connie/TwoMin/TwoMin.html">page of “two-minute versions”</a>
of G&amp;S shows. She’s got two versions of <em>Gondoliers</em> and one each of <em>Iolanthe</em> and <em>Ruddigore</em>.
The technique is the same as in <a href="https://en.wikipedia.org/wiki/Blackout_poetry">blackout poetry</a>:
take the whole work and black out all but the most important and/or funniest bits.</p>

<p><img src="/blog/images/2026-05-08-iolanthe-blackout.png" alt="" /></p>

<p>Kleinjans’ short scripts are funny in their own right, but I wanted audio versions; so I made one.
Presenting <a href="https://www.youtube.com/watch?v=lhXP-kBlt5k">“Two-Minute <em>Iolanthe</em> in five minutes”</a>.</p>

<iframe width="560" height="315" src="https://www.youtube.com/embed/lhXP-kBlt5k" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>

<p>The <a href="https://www.youtube.com/watch?v=DLYSZN2IqeU">original recording</a> I “blacked out” for this
video is a TV broadcast of a 1976 production at the Sydney Opera House featuring Rosemary Gunn
(Iolanthe), Heather Begg (Fairy Queen), Dennis Olsen (the Lord Chancellor), June Bronhill (Phyllis),
Lyndon Terracini (Strephon), Graeme Ewer (Mountararat), Ronald Maconaghie (Tolloller), and
Alan Light (Private Willis). This is a low-quality VHS rip of an excellent performance.</p>

<p>The VHS rip on YouTube is missing a chunk of the finale, which
prompted some alterations to the script; I made a few other alterations for pacing. Even after
those cuts, this “two-minute” <em>Iolanthe</em> is almost five minutes long; watch at 2.3x speed for
a true two-minute experience.</p>

<hr />

<p>To create the video, I used <code class="language-plaintext highlighter-rouge">ffmpeg</code> to clip and concat the snippets. When concatenating 169 tiny snippets,
your biggest problem will be “timestamp drift” between the audio and video channels. I spent a long
time cajoling ChatGPT into giving me new permutations of command-line switches to try before finally
settling on the programs linked below.</p>

<p>Step 1 was to get the original video (2.4 gigabytes, saved as <code class="language-plaintext highlighter-rouge">input.mkv</code>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew install ffmpeg yt-dlp
yt-dlp 'https://www.youtube.com/watch?v=DLYSZN2IqeU'
</code></pre></div></div>

<p>Step 2 was to snip the constituent bits via <code class="language-plaintext highlighter-rouge">ffmpeg</code> commands. I factored out the common arguments
into shell variables so I didn’t have to keep typing or tabbing over them.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PREFIX="-y"
SUFFIX="-i input.mkv -c:v libx264 -preset veryfast -crf 20 -c:a aac"
ffmpeg $PREFIX -ss 00:07:07.2 -to 00:07:10.5 $SUFFIX part001.mkv
ffmpeg $PREFIX -ss 00:07:45.0 -to 00:07:48.6 $SUFFIX part002.mkv
~~~~
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">ffmpeg</code> turns out to be supremely sensitive to whether you put the input
(<code class="language-plaintext highlighter-rouge">-i input.mkv</code>) before, or after, the <code class="language-plaintext highlighter-rouge">-ss</code> and <code class="language-plaintext highlighter-rouge">-to</code> switches. With <code class="language-plaintext highlighter-rouge">-i input.mkv</code> as
part of the <code class="language-plaintext highlighter-rouge">SUFFIX</code>, my whole script.txt runs in 61 seconds; as part of the <code class="language-plaintext highlighter-rouge">PREFIX</code>,
it takes 98 <em>minutes.</em></p>

<p>To concatenate all those clips and re-encode a “preview” video, we can do this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rm -f list.txt
for i in part*.mkv ; do echo "file $i" &gt;&gt;list.txt ; done
ffmpeg -y -f concat -safe 0 -i list.txt \
  -c:v libx264 -crf 20 -preset veryfast -c:a aac -ar 48000 output.mp4
</code></pre></div></div>

<p>Step 3 was to fight timestamp desynchronization. My solution here was generated entirely
by blind fumbling and incantations, with input from ChatGPT. It seems that there are
basically two ways to get <code class="language-plaintext highlighter-rouge">ffmpeg</code> to “supercut” a video as we’re doing here: either
clip out the clips into temporary files and then concatenate all those little files
(as we did above — this way causes a lot of drift), or do one big “filter” operation
to take just the frames you care about in a single <code class="language-plaintext highlighter-rouge">ffmpeg</code> invocation. That looks like
this, except with 169 clips instead of 3:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -y -i input.mkv -filter_complex \
 "[0:v]trim=start=427.2:end=430.5,setpts=PTS-STARTPTS[v0];
  [0:a]atrim=start=427.2:end=430.5,asetpts=PTS-STARTPTS[a0];
  [0:v]trim=start=465.0:end=468.6,setpts=PTS-STARTPTS[v1];
  [0:a]atrim=start=465.0:end=468.6,asetpts=PTS-STARTPTS[a1];
  [0:v]trim=start=616.5:end=618.7,setpts=PTS-STARTPTS[v2];
  [0:a]atrim=start=616.5:end=618.7,asetpts=PTS-STARTPTS[a2];
  [v0][a0][v1][a1][v2][a2]concat=n=3:v=1:a=1[v][a]" \
  -map "[v]" -map "[a]" \
  -c:v libx264 -crf 20 -preset veryfast \
  -c:a aac -b:a 192k -ar 48000 \
  output.mp4
</code></pre></div></div>
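
<p>Nobody wants to type 169 of those <code class="language-plaintext highlighter-rouge">trim</code>/<code class="language-plaintext highlighter-rouge">atrim</code> pairs by hand, so the graph has to be generated. Here is a rough sketch of such a generator. The function <code class="language-plaintext highlighter-rouge">build_filter_graph</code> and the <code class="language-plaintext highlighter-rouge">Clip</code> alias are hypothetical names for this illustration and are not taken from my script.py. It assumes at least one clip, and it assumes timestamps are already expressed in seconds (e.g. 00:07:07.2 becomes 427.2).</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not taken from script.py): each clip is a
// (start, end) pair of timestamps in seconds, kept as text so that the
// digits reach ffmpeg exactly as written.
using Clip = std::pair<std::string, std::string>;

std::string build_filter_graph(const std::vector<Clip>& clips) {
  std::string graph;   // the per-clip trim/atrim chains
  std::string labels;  // "[v0][a0][v1][a1]..." fed to the concat filter
  for (std::size_t i = 0; i < clips.size(); ++i) {
    const std::string n = std::to_string(i);
    const std::string range = "start=" + clips[i].first + ":end=" + clips[i].second;
    graph += "[0:v]trim=" + range + ",setpts=PTS-STARTPTS[v" + n + "];";
    graph += "[0:a]atrim=" + range + ",asetpts=PTS-STARTPTS[a" + n + "];";
    labels += "[v" + n + "][a" + n + "]";
  }
  graph += labels + "concat=n=" + std::to_string(clips.size()) + ":v=1:a=1[v][a]";
  return graph;
}
```

<p>The returned string is what would go between the double quotes after <code class="language-plaintext highlighter-rouge">-filter_complex</code>; it omits the line breaks shown above, which are there only for readability.</p>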

<p>That’s horribly slow; it seems to have quadratic behavior as the number of clips
increases. Even worse is trying to use the non-“<code class="language-plaintext highlighter-rouge">complex</code>” filter options <code class="language-plaintext highlighter-rouge">-vf</code> and <code class="language-plaintext highlighter-rouge">-af</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -y -i input.mkv \
  -vf "select='between(t,427.2,430.5)+between(t,465.0,468.6)+between(t,616.5,618.7)',setpts=N/FRAME_RATE/TB" \
  -af "aselect='between(t,427.2,430.5)+between(t,465.0,468.6)+between(t,616.5,618.7)',asetpts=N/SR/TB" \
  -c:v libx264 -crf 20 -preset veryfast \
  -c:a aac -b:a 192k -ar 48000 \
  output.mp4
</code></pre></div></div>

<p>That just makes <code class="language-plaintext highlighter-rouge">ffmpeg</code> run out of memory before you’ve even hit 50 clips.
So I ended up using a hybrid approach: I used <code class="language-plaintext highlighter-rouge">-filter_complex</code> to produce
nine intermediate concatenations of 20 clips at a time, and then used</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -y -f concat -i list.txt -c copy output.mp4
</code></pre></div></div>

<p>to paste those nine files together. I’ve saved my programs for posterity:
<a href="/blog/code/2026-05-08-iolanthe-script.txt">script.txt</a> runs as a Bash script
(in just over 1 minute on my machine) to create that “draft preview” video output;
its textual contents also serve as input to <a href="/blog/code/2026-05-08-iolanthe-script.py">script.py</a>,
which creates the final product (in about 9 minutes) using the two-level <code class="language-plaintext highlighter-rouge">-filter_complex</code>
approach. The finished <code class="language-plaintext highlighter-rouge">output.mp4</code> is about 42 megabytes in size.</p>]]></content><author><name></name></author><category term="gilbert-and-sullivan" /><category term="how-to" /><category term="jokes" /><category term="memes" /><category term="television" /><category term="web" /><summary type="html"><![CDATA[The other day I came across Connie Kleinjans’ page of “two-minute versions” of G&amp;S shows. She’s got two versions of Gondoliers and one each of Iolanthe and Ruddigore. The technique is the same as in blackout poetry: take the whole work and black out all but the most important and/or funniest bits.]]></summary></entry><entry><title type="html">C++ Alignment Chart</title><link href="https://quuxplusone.github.io/blog/2026/05/06/cpp-alignment-chart/" rel="alternate" type="text/html" title="C++ Alignment Chart" /><published>2026-05-06T00:01:00+00:00</published><updated>2026-05-06T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/05/06/cpp-alignment-chart</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/05/06/cpp-alignment-chart/"><![CDATA[<p><img src="/blog/images/2026-05-06-alignment-chart.jpg" alt="" class="meme" /></p>]]></content><author><name></name></author><category term="alignment" /><category term="memes" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">ELF’s ways to combine potentially non-unique objects</title><link href="https://quuxplusone.github.io/blog/2026/05/05/potentially-nonunique-strategies/" rel="alternate" type="text/html" title="ELF’s ways to combine potentially non-unique objects" /><published>2026-05-05T00:01:00+00:00</published><updated>2026-05-05T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/05/05/potentially-nonunique-strategies</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/05/05/potentially-nonunique-strategies/"><![CDATA[<p>Previously <a 
href="/blog/2026/04/24/define-static-array/">I wrote</a>:</p>

<blockquote>
  <p>[Template parameter objects of array type] are permitted to overlap or be
coalesced, just like <code class="language-plaintext highlighter-rouge">initializer_list</code>s and string literals. Clang trunk
isn’t smart enough to coalesce potentially non-unique objects [but]
GCC, once it implements <code class="language-plaintext highlighter-rouge">define_static_array</code>, will presumably make them the same.</p>
</blockquote>

<p>Well, GCC 16 has an experimental implementation of <code class="language-plaintext highlighter-rouge">define_static_array</code>
(compile with <code class="language-plaintext highlighter-rouge">g++ -std=c++26 -freflection</code>), and it does <em>not</em> coalesce
template parameter objects of array type in the way I expected. Digging deeper
into why not, I learned that there are at least three ways compilers and linkers
(on ELF — that is, non-Windows — platforms) conspire to “merge”
potentially non-unique objects:</p>

<ul>
  <li>Merging at the compiler level (for <code class="language-plaintext highlighter-rouge">initializer_list</code> backing arrays)</li>
  <li>Sections with <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> (for string literals and backing arrays)</li>
  <li>Sections with <code class="language-plaintext highlighter-rouge">SHF_GROUP</code>, a.k.a. COMDAT sections (for inline variables)</li>
</ul>

<p>Sadly, no combination of these facilities <em>quite</em> achieves
ideal behavior for <code class="language-plaintext highlighter-rouge">define_static_array</code>. Let’s take a look.</p>

<h2 id="the-compiler-can-merge-similar-data">The compiler can merge similar data</h2>

<p>GCC itself merges similar <code class="language-plaintext highlighter-rouge">initializer_list</code> backing arrays. For example
(<a href="https://godbolt.org/z/EnzxoM9ch">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void f(std::initializer_list&lt;int&gt;);
int main() {
  f({1,2,3}); // C.0.0
  f({1,2,3}); // C.1.1
}
</code></pre></div></div>

<p>turns into the assembly directives</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.cst16,"aM",@progbits,16
  .align 8
  .type C.0.0, @object
  .size C.0.0, 12
C.0.0:
  .long 1
  .long 2
  .long 3
  .zero 4
  .set C.1.1,C.0.0
</code></pre></div></div>

<p>The symbols <code class="language-plaintext highlighter-rouge">C.0.0</code> and <code class="language-plaintext highlighter-rouge">C.1.1</code> are set to the same memory address, because GCC
itself can see that the two <code class="language-plaintext highlighter-rouge">initializer_list</code> objects should have the same backing
array.</p>

<p>This is the most powerful approach, but at the same time the least elegant, because
it requires ad-hoc “smarts” built directly into the compiler. For example, we
could imagine GCC generating code that merges one list into the tail of another:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void f(std::initializer_list&lt;int&gt;);
int main() {
  f({1,2,3}); // C.0.0
  f({2,3});   // C.2.2
}

C.0.0:
  .long 1
  .long 2
  .long 3
  .zero 4
  .set C.2.2,C.0.0 + 4
</code></pre></div></div>

<p>but in fact GCC doesn’t generate that code, because nobody has taught GCC that specific
trick. Nor will GCC 16 merge the backing arrays of <code class="language-plaintext highlighter-rouge">{1,2,3}</code> and <code class="language-plaintext highlighter-rouge">{1u,2u,3u}</code> using
this technique, again because it hasn’t been taught to.</p>

<p>Merging things at the compiler level also, by definition, works only within a single
translation unit (a single .cpp file). If you want to merge things between different TUs,
you’ll need help from the linker. Which brings us to…</p>

<h2 id="shf_merge-sections"><code class="language-plaintext highlighter-rouge">SHF_MERGE</code> sections</h2>

<p>I said GCC 16 wouldn’t merge <code class="language-plaintext highlighter-rouge">{1,2,3}</code> and <code class="language-plaintext highlighter-rouge">{1u,2u,3u}</code> in the compiler. But if you
try this program (<a href="https://godbolt.org/z/MqxsPE98z">Godbolt</a>), you’ll see that the
backing arrays are indeed merged at runtime — the same pointer value is printed twice:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template&lt;class... Ts&gt;
void f(std::initializer_list&lt;Ts&gt;... ils) {
  (printf("%p\n", (const void*)ils.begin()) , ...);
}
int main() {
  f({1,2,3},     // C.0.0
    {1u,2u,3u}); // C.1.1
}
</code></pre></div></div>

<p>That happens even though we can see GCC emitting two
different objects into the assembly file:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.cst16,"aM",@progbits,16
  .align 8
  .type C.0.0, @object
  .size C.0.0, 12
C.0.0:
  .long 1
  .long 2
  .long 3
  .zero 4
  .align 8
  .type C.1.1, @object
  .size C.1.1, 12
C.1.1:
  .long 1
  .long 2
  .long 3
  .zero 4
</code></pre></div></div>

<p>The trick here is in the <code class="language-plaintext highlighter-rouge">.section</code> directive, which creates an ELF section in the
object file with the name <code class="language-plaintext highlighter-rouge">.rodata.cst16</code>, the section flag <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> (that’s the <code class="language-plaintext highlighter-rouge">M</code>
in <code class="language-plaintext highlighter-rouge">"aM"</code>), and a <code class="language-plaintext highlighter-rouge">sh_entsize</code> of <code class="language-plaintext highlighter-rouge">16</code> bytes. After concatenating every object file’s
<code class="language-plaintext highlighter-rouge">.rodata.cst16</code> sections as usual, the linker is permitted (but not required) to treat
the contents of this section as an array of 16-byte elements, and to deduplicate any
identical elements it finds. Since the 16-byte region starting at <code class="language-plaintext highlighter-rouge">C.1.1</code> matches the
16-byte region starting at <code class="language-plaintext highlighter-rouge">C.0.0</code>, the linker is allowed to eliminate the 16 bytes
at <code class="language-plaintext highlighter-rouge">C.1.1</code> and point the label <code class="language-plaintext highlighter-rouge">C.1.1</code> at <code class="language-plaintext highlighter-rouge">C.0.0</code> (or vice versa).</p>
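<p>To make that concrete, here is a minimal sketch of fixed-size merging as described above. It is a toy model, not any real linker’s code; the names <code class="language-plaintext highlighter-rouge">MergeResult</code> and <code class="language-plaintext highlighter-rouge">merge_fixed</code> are made up for illustration, and a <code class="language-plaintext highlighter-rouge">std::string</code> stands in for the concatenated section bytes.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model of fixed-size SHF_MERGE deduplication (hypothetical names;
// not taken from GNU ld or any other linker).
struct MergeResult {
    std::string output;                   // deduplicated section contents
    std::vector<std::size_t> new_offset;  // new_offset[i] = output offset of input element i
};

// `section` is the concatenation of every input section's bytes;
// `entsize` plays the role of sh_entsize.
MergeResult merge_fixed(const std::string& section, std::size_t entsize) {
    assert(entsize > 0 && section.size() % entsize == 0);
    MergeResult r;
    std::map<std::string, std::size_t> seen;  // element bytes -> offset in output
    for (std::size_t off = 0; off < section.size(); off += entsize) {
        std::string elt = section.substr(off, entsize);
        auto [it, inserted] = seen.try_emplace(elt, r.output.size());
        if (inserted) r.output += elt;     // first copy: keep it
        r.new_offset.push_back(it->second); // duplicates: point at the first copy
    }
    return r;
}
```

<p>Feeding this model two identical 16-byte elements yields a 16-byte output with both labels at offset zero, which is the single printed pointer value we saw. Note that it compares whole elements only; it never looks inside one.</p>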

<p>The <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> trick works across TUs, and even across types. For example, GCC 14+
makes the following <em>four</em> initializer lists share a single backing array
at runtime by putting them all into <code class="language-plaintext highlighter-rouge">.rodata.cst16</code> sections with the
<code class="language-plaintext highlighter-rouge">SHF_MERGE</code> flag set. This works even if the lists appear in different TUs!
(<a href="https://godbolt.org/z/19ch8oYK9">Godbolt.</a>)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{1,2,3,4}
{1u,2u,3u,4u}
{0x200000001, 0x400000003} // given little-endian int64
{4.2439915824e-314, 8.4879831654e-314} // ditto
</code></pre></div></div>

<p>The major optimization-related downside of the <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> approach — as it is sketched in the
<a href="https://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html#:~:text=The%20data%20in%20the%20section%20may%20be%20merged">System V ABI document</a>
(2001) and as it is implemented in GNU <code class="language-plaintext highlighter-rouge">ld</code> as far as I know — is that you can’t use it to merge data
elements of different sizes or alignments. GCC 16 won’t merge <code class="language-plaintext highlighter-rouge">{1,2}</code> with <code class="language-plaintext highlighter-rouge">{1,2,3}</code>
because GCC puts the former in section <code class="language-plaintext highlighter-rouge">.rodata.cst8</code> and the latter in <code class="language-plaintext highlighter-rouge">.rodata.cst16</code>.
(GCC 14 and 15 put the latter in plain old unmergeable <code class="language-plaintext highlighter-rouge">.rodata</code> instead, because
its size — 12 bytes — isn’t precisely 16 bytes. GCC 16 fixed that.) Basically, GCC has to
precommit every data element to a specific “bucket”; the linker will consider merging
it <em>only</em> with other elements in its own bucket.</p>

<p>And <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> cannot merge parts of elements; it can merge only full elements.
So while you might think a “sufficiently smart linker” could merge <code class="language-plaintext highlighter-rouge">{2,3}</code> across the
conjunction of <code class="language-plaintext highlighter-rouge">{1,2}</code> and <code class="language-plaintext highlighter-rouge">{3,4}</code>, the <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> algorithm by itself will never do that.
Some users might even rely on that guarantee (somehow), so I don’t imagine that
any linker will ever gain the smarts to do that.</p>

<h2 id="shf_merge--shf_strings"><code class="language-plaintext highlighter-rouge">SHF_MERGE | SHF_STRINGS</code></h2>

<p>When an ELF section specifies both <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> and <code class="language-plaintext highlighter-rouge">SHF_STRINGS</code>, then instead of chunking
the section into elements of size <code class="language-plaintext highlighter-rouge">sh_entsize</code> bytes, the linker chunks the section into
variable-length elements each of which is a null-terminated C-style string. It then
deduplicates those strings.</p>

<p>As in the previous subsection, it seems that no linker will merge
<code class="language-plaintext highlighter-rouge">"ello"</code> into the tail of <code class="language-plaintext highlighter-rouge">"Hello"</code>; the <code class="language-plaintext highlighter-rouge">SHF_MERGE|SHF_STRINGS</code> algorithm alone
will merge only full elements. (In this case, full null-terminated strings.)</p>

<p>The <code class="language-plaintext highlighter-rouge">SHF_MERGE|SHF_STRINGS</code> algorithm finds the “elements” of the section by simplemindedly
scanning for null bytes; no additional metadata is involved. Therefore, such a section must
never contain strings with embedded null bytes. Try the following on your machine:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat &gt;x.c &lt;&lt;EOF
#include &lt;stdio.h&gt;
int main() {
  puts("p");
  puts(&amp;"qxp"[2]);
  puts("r");
}
EOF
gcc -S x.c -o - | sed 's/qxp/q\\0p/' &gt; x.s
gcc x.s
./a.out
</code></pre></div></div>

<p>The assembly file <code class="language-plaintext highlighter-rouge">x.s</code> should end up containing something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.str1.1,"aMS",@progbits
.LC0:
  .string "p"
.LC1:
  .string "q\0p"  # Danger!
.LC2:
  .string "r"
</code></pre></div></div>

<p>and when executed, will print not <code class="language-plaintext highlighter-rouge">p p r</code> but rather <code class="language-plaintext highlighter-rouge">p r r</code>, because the <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>
algorithm understands <code class="language-plaintext highlighter-rouge">"q\0p"</code> as <code class="language-plaintext highlighter-rouge">"q" "p"</code> and eliminates the second <code class="language-plaintext highlighter-rouge">"p"</code> as a duplicate.
Therefore, a C++ compiler must go out of its way never to store a string literal containing
embedded null bytes into such a section.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const char *p1 = "hello world"; // literal in .rodata.str1.1 (SHF_MERGE)
const char *p2 = "hell\0world"; // literal in .rodata (not SHF_MERGE)
</code></pre></div></div>
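<p>The <code class="language-plaintext highlighter-rouge">p r r</code> result can be reproduced with a toy model of the string-merging algorithm. This is only a sketch with made-up names. It assumes the linker resolves a reference by remapping the label first and applying the addend afterwards, which is at least consistent with the output described above.</p>

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model of SHF_MERGE|SHF_STRINGS (hypothetical; not any real linker's code):
// split the section at null bytes, keep the first copy of each string,
// and record where each input string ends up in the output.
struct StringMerge {
    std::string output;                        // deduplicated section
    std::map<std::size_t, std::size_t> moved;  // input start offset -> output offset
};

StringMerge merge_strings(const std::string& section) {
    StringMerge r;
    std::map<std::string, std::size_t> seen;   // string -> offset in output
    std::size_t start = 0;
    while (start < section.size()) {
        std::size_t end = section.find('\0', start);
        if (end == std::string::npos) end = section.size();  // unterminated tail
        std::string s = section.substr(start, end - start);
        auto [it, inserted] = seen.try_emplace(s, r.output.size());
        if (inserted) { r.output += s; r.output += '\0'; }
        r.moved[start] = it->second;
        start = end + 1;
    }
    return r;
}

// What puts(label + addend) would see after merging, under the assumption
// that the label is remapped first and the addend is added afterwards.
std::string read_at(const StringMerge& m, std::size_t label, std::size_t addend) {
    return std::string(m.output.c_str() + m.moved.at(label) + addend);
}
```

<p>With the section bytes <code class="language-plaintext highlighter-rouge">p\0q\0p\0r\0</code> and labels at input offsets 0, 2, and 6, the model sees four strings, drops the second <code class="language-plaintext highlighter-rouge">"p"</code>, and leaves <code class="language-plaintext highlighter-rouge">.LC1+2</code> pointing at <code class="language-plaintext highlighter-rouge">"r"</code>.</p>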

<p>In theory, a compiler could place the backing array for an <code class="language-plaintext highlighter-rouge">initializer_list&lt;char&gt;</code>
into a mergeable string section, and potentially merge the backing arrays for
<code class="language-plaintext highlighter-rouge">{'x','y','z','\0'}</code> and <code class="language-plaintext highlighter-rouge">"xyz"</code>. In practice, neither GCC nor Clang does this (yet).</p>

<hr />

<p>The big disadvantage of <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> merging from the C++ compiler’s point of view —
the thing that makes it unsuitable for certain use-cases such as merging the duplicate definitions
of template parameter objects — is that it is completely optional. It’s legal for a
dumb linker to just ignore the <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> flag. It would be unwise to rely on <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>
to take care of merging objects that the C++ standard <em>requires</em> us to merge, such as the
duplicate definitions of inline variables or template parameter objects.</p>

<p>And, of course, it would be wrong to place an inline variable into an <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> section
anyway, because the standard (and common sense) <em>forbids</em> us to merge unrelated inline
variables just because they happen to have the same value!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inline constexpr char a[] = "hello world";
inline constexpr char b[] = "hello world";
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">&amp;a == &amp;b</code> is guaranteed to be false; an implementation that made it true would be
non-conforming. (Even <code class="language-plaintext highlighter-rouge">gcc -fmerge-all-constants</code> will make it true, non-conformingly,
only if you remove the <code class="language-plaintext highlighter-rouge">inline</code> keyword in both places.)</p>

<p>To handle inline variables, which are deduplicated according to their <em>symbol names</em>
rather than their <em>data contents</em>, we need the next approach, which is…</p>

<h2 id="shf_group-aka-comdat-sections"><code class="language-plaintext highlighter-rouge">SHF_GROUP</code>, a.k.a. COMDAT sections</h2>

<p>For the past twenty-some years, C and C++ compilers have traditionally compiled
inline functions, inline variables, and implicit instantiations of function and variable
templates into what are called “COMDAT sections.” This feature came late to ELF;
the name “COMDAT” <a href="https://itanium-cxx-abi.github.io/cxx-abi/abi/prop-72-comdat.html">apparently comes</a>
from Windows NT. For more than you ever wanted to know about COMDAT, see
<a href="https://maskray.me/blog/2021-07-25-comdat-and-section-group">“COMDAT and section group”</a>
(Fangrui Song, July 2021).</p>

<p>ELF’s version of COMDAT was basically designed to do <em>exactly</em> what a C++ compiler needs
in order to implement inline functions. The compiler can take C++ code such as</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inline int f() {
  static int i = 42;
  return ++i;
}
</code></pre></div></div>

<p>and turn it into a whole group of sections (text, data, rodata, whatever else it needs) —
basically a whole mini object file of its own — something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .text._Z1fv,"axG",@progbits,fgroup,comdat
  .globl _Z1fv
_Z1fv:
  movl _ZZ1fvE1i(%rip), %eax
  addl $1, %eax
  movl %eax, _ZZ1fvE1i(%rip)
  ret

  .section .data._ZZ1fvE1i,"awG",@progbits,fgroup,comdat
  .globl _ZZ1fvE1i
_ZZ1fvE1i:
  .long 42
</code></pre></div></div>

<p>The assembler emits an ELF section of type <code class="language-plaintext highlighter-rouge">SHT_GROUP</code> representing the group of sections
with “group identifier” <code class="language-plaintext highlighter-rouge">fgroup</code>. The linker, at link time, will pick one object file’s
<code class="language-plaintext highlighter-rouge">fgroup</code> group and throw away the rest.</p>

<blockquote>
  <p>Now, I simplified that codegen quite a bit. Really, GCC doesn’t make such a human-friendly
section group; it dumps each section into its own <em>individual</em> section group (so in this
example there will be two different group identifiers, not just one); and GCC also marks
both <code class="language-plaintext highlighter-rouge">f</code> and <code class="language-plaintext highlighter-rouge">i</code> as <code class="language-plaintext highlighter-rouge">.weak</code> symbols rather than global. I’m not sure why GCC does these
things; I conjecture “intermediate codegen targeting an object format less powerful than ELF”
and “compatibility with very old ELF linkers lacking <code class="language-plaintext highlighter-rouge">SHT_GROUP</code> support” respectively,
but I don’t know. Email and tell me!</p>
</blockquote>

<p>COMDAT sections are exactly what you need to implement the definitions of (1) inline functions;
(2) implicit instantiations of function templates; (3) the static local variables of
inline functions and implicitly instantiated function templates; (4) implicit instantiations
of variable templates; (5) inline variables and static inline data members of classes;
and probably a few more things I’m forgetting. All of these are entities with <em>names</em>,
and C++ requires them to be properly deduplicated by name: <code class="language-plaintext highlighter-rouge">&amp;myInlineFunc</code> must have the same pointer
value no matter what translation unit you’re in.</p>

<p>Another such entity is one we talked about the other day in
<a href="/blog/2026/04/24/define-static-array/">“Things <code class="language-plaintext highlighter-rouge">define_static_array</code> can’t do”</a> (2026-04-24):
the template parameter object of class type. Code like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template&lt;A t&gt;
const A *f() { return &amp;t; }
</code></pre></div></div>

<p>will produce a template parameter object “variable” in its own COMDAT section,
like this (<a href="https://godbolt.org/z/Pb9WdKezh">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata._ZTAXtl1AEE,"aG",@progbits,_ZTAXtl1AEE,comdat
  .weak _ZTAXtl1AEE
_ZTAXtl1AEE:
  .zero 1
</code></pre></div></div>

<p>That’s exactly the same strategy the compiler would use for a simple inline variable like</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inline const A v;
</code></pre></div></div>

<p>Now, when the linker deduplicates COMDAT sections, it looks at the group identifier (a symbol name)
to decide if two sections are “duplicates” or not. It doesn’t care whether they have the same
bytewise contents. That makes sense, because usually the text sections corresponding to
instantiations of the same inline function in different TUs <em>won’t</em> be byte-for-byte identical
(especially if the two TUs were compiled with different optimization levels). For inline functions,
that’s exactly what we need: deduping by name, not by contents.</p>
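<p>In the same toy-model spirit, name-addressed selection can be sketched in a few lines. The names here are hypothetical, and keeping the <em>first</em> group seen is an arbitrary choice of this sketch; the text above says only that the linker picks one.</p>

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model of COMDAT group selection (hypothetical; not any real linker's code).
// Each input is (group identifier, section contents).
using Group = std::pair<std::string, std::string>;

// Keep one definition per group identifier. Contents are never compared:
// same-named groups collapse even if their bytes differ, and
// different-named groups stay separate even if their bytes are identical.
std::map<std::string, std::string> select_comdat(const std::vector<Group>& inputs) {
    std::map<std::string, std::string> kept;
    for (const auto& [id, contents] : inputs) {
        kept.try_emplace(id, contents);  // no-op if `id` is already present
    }
    return kept;
}
```

<p>Two inputs named <code class="language-plaintext highlighter-rouge">_Z1fv</code> with different bytes collapse to one; two inputs with identical bytes but different names both survive. That second property is exactly what content-addressed merging would not give us.</p>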

<hr />

<p>You might imagine abusing COMDAT sections to do content-addressed deduplication à la <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>.
We just have to put each element in its own individual section group, with a
group identifier based on (the hash of) its contents. For example, instead of</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.str1.1,"aMS",@progbits
.LC0:
  .string "hello"
.LC1:
  .string "world"
.LC2:
  .string "hello"
</code></pre></div></div>

<p>we could emit</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.str1.hello,"aG",@progbits,group_hello,comdat
  .globl str_hello
str_hello:
  .string "hello"
  .section .rodata.str1.world,"aG",@progbits,group_world,comdat
  .globl str_world
str_world:
  .string "world"
</code></pre></div></div>

<p>But that would vastly increase the linker’s burden: not just processing all those tiny
sections, but keeping track of all those new global symbols. (We couldn’t use local, internal-linkage
symbols anymore, because the linker wouldn’t be helping us to repoint one symbol’s relocations
at another.) And the compiler would have to dedupe <em>within</em> the TU: we could no longer
refer to <code class="language-plaintext highlighter-rouge">"hello"</code> by local symbols like <code class="language-plaintext highlighter-rouge">.LC0</code> and <code class="language-plaintext highlighter-rouge">.LC2</code>, but at the same time we couldn’t emit the
global symbol <code class="language-plaintext highlighter-rouge">str_hello</code> twice in the same TU.</p>

<blockquote>
  <p>By the way, making symbol names that incorporate hashes of user-provided data is just asking for trouble.
See <a href="/blog/2022/12/31/mid-snow-and-ice/">“Hash-colliding string literals on MSVC”</a> (2022-12-31).
A hash collision is disastrous; and when someone discovers how to generate hash collisions and starts
writing blog posts like the above, you can’t switch out your hash function for a better one because
that would break ABI.</p>
</blockquote>

<p>So it’s very good that we have different, essentially custom-fitted, tools for deduping inline functions
versus string literals.</p>

<ul>
  <li>
    <p>Duplicate inline functions <em>must</em> be merged (we forbid false negatives);
  non-duplicates <em>must not</em> be merged (we forbid false positives). Dupe-ness is decided by symbol name;
  contents can be different. Duplication need be detected only across TUs; duplication within a single TU
  is impossible. Use COMDAT sections.</p>
  </li>
  <li>
    <p>Duplicate strings can be left unmerged (we permit false negatives), although we still forbid false positives.
  Dupe-ness is decided by bytewise contents. Duplication should be detected both within and across TUs.
  Use <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>.</p>
  </li>
</ul>

<p>Templates and inline variables can also use COMDAT. A downside is that we’d like to merge the
definitions of e.g. <code class="language-plaintext highlighter-rouge">f&lt;char*&gt;</code> and <code class="language-plaintext highlighter-rouge">f&lt;void*&gt;</code> when they’re identical (if nothing in the program compares their
addresses), but if dupe-ness is decided by symbol name and those two have different symbols, we’re out
of luck. MSVC has a thing called <a href="https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations">“Identical Comdat Folding”</a>
that helps with that. (MSVC’s ICF is a non-conforming optimization, because it never checks whether the program compares those addresses.
This can <a href="https://developercommunity.visualstudio.com/t/Safe-Identical-COMDAT-folding-ICF/10888506">break code</a>:
<a href="https://godbolt.org/z/5TM15Wzcx">Godbolt</a>.)</p>

<p>Initializer-list backing arrays can use <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> too. A downside is that we’d <em>really</em> like to
merge <code class="language-plaintext highlighter-rouge">{2,3,4}</code> into the tail of <code class="language-plaintext highlighter-rouge">{1,2,3,4}</code>, and the <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> algorithm will never do that.
At least the compiler can do that, if someone teaches it to. That compiler optimization “composes”
properly with <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>. The compiler could even merge <code class="language-plaintext highlighter-rouge">{1,2}</code>, <code class="language-plaintext highlighter-rouge">{3,4}</code>, and <code class="language-plaintext highlighter-rouge">{2,3}</code> into a single
16-byte element <code class="language-plaintext highlighter-rouge">{1,2,3,4}</code> with three embedded labels, which could then merge with <code class="language-plaintext highlighter-rouge">{1u,2u,3u,4u}</code>
at link time:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .section .rodata.cst16,"aM",@progbits,16
.C.0.0:
  .long 1
.C.1.1:
  .long 2
.C.2.2:
  .long 3
  .long 4
</code></pre></div></div>
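<p>The compiler-side search needed for that layout is not exotic. Here is a minimal sketch, assuming the compiler keeps a per-TU pool of already-emitted elements; <code class="language-plaintext highlighter-rouge">find_overlap</code> is a hypothetical helper, not anything GCC or Clang actually contains.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: if `needle` already appears as a contiguous run inside
// `pool`, return its starting index so that a new label can point there
// instead of emitting a fresh backing array.
std::optional<std::size_t> find_overlap(const std::vector<int>& pool,
                                        const std::vector<int>& needle) {
    auto it = std::search(pool.begin(), pool.end(), needle.begin(), needle.end());
    if (it == pool.end() && !needle.empty()) return std::nullopt;  // not found
    return static_cast<std::size_t>(it - pool.begin());
}
```

<p>Against the pool <code class="language-plaintext highlighter-rouge">{1,2,3,4}</code> this finds <code class="language-plaintext highlighter-rouge">{1,2}</code> at index 0, <code class="language-plaintext highlighter-rouge">{2,3}</code> at index 1, and <code class="language-plaintext highlighter-rouge">{3,4}</code> at index 2, matching the three labels in the listing above. A real implementation would also have to respect alignment and element type, which this sketch ignores.</p>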

<p>Of course it would be better if the linker could work this out itself; the linker has more
information than the compiler and therefore can do a better job of merging elements. It would also
be nice if we could do something like <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> for literals larger than 32 bytes, and/or literals
of lengths that aren’t powers of two. <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119153">GCC bug #119153</a>
is related.</p>

<p>Contrariwise, C++26 <code class="language-plaintext highlighter-rouge">std::define_static_array</code> seemingly cannot use <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>, because it cannot tolerate
false negatives: C++26 at present <em>requires</em> us to merge all copies of <code class="language-plaintext highlighter-rouge">define_static_array(std::array{1,2,3})</code>
into the same place in memory. That’s too bad, because forcing it to use COMDAT (name-addressed)
instead of <code class="language-plaintext highlighter-rouge">SHF_MERGE</code> (data-addressed) means we cannot take advantage of the latitude C++26 gives us to
merge <code class="language-plaintext highlighter-rouge">define_static_array(array{1,2,3})</code> with <code class="language-plaintext highlighter-rouge">define_static_array(array{1u,2u,3u})</code>.</p>

<blockquote>
  <p>Well, we could merge <code class="language-plaintext highlighter-rouge">define_static_array(array{1,2,3})</code> with <code class="language-plaintext highlighter-rouge">define_static_array(array{1u,2u,3u})</code>
if we used group identifiers based on a hash of the data! When we tried that on string data
<a href="#you-might-imagine-abusing-comdat">above</a>
we paid a huge performance penalty as well as a correctness penalty. Here we’d pay only
the correctness penalty. It’s still not worth it, IMHO.</p>
</blockquote>

<h2 id="conclusion">Conclusion</h2>

<p>Figuring out the best way to “compress” static data using only the tools we’ve got — single-TU ad-hoc compiler
smarts, data-addressed <code class="language-plaintext highlighter-rouge">SHF_MERGE</code>, and name-addressed COMDAT sections — is a very hard problem. The implementation
won’t be optimal in every case.</p>

<p>Maybe we could give future linkers even better tools. For example, now that
“potentially non-unique object” is a term of art in C++, maybe we could just dump all PNU objects’ initializers into
a single section (<code class="language-plaintext highlighter-rouge">.rodata.pnu</code>?) and in one more special section (<code class="language-plaintext highlighter-rouge">.pnu_symtab</code>, storing just a list of indices
into the real <code class="language-plaintext highlighter-rouge">.symtab</code>?) specify their starts and sizes — I think that’s all the information the linker needs
in order to overlap them any way it sees fit and repair <code class="language-plaintext highlighter-rouge">.symtab</code> accordingly.</p>

<p>Something like that might already exist. If it does, I’d certainly like to hear about it.
And if it doesn’t, I’d certainly like someone to build it!</p>]]></content><author><name></name></author><category term="abi" /><category term="elf" /><category term="inline-functions" /><category term="language-design" /><category term="proposal" /><category term="rant" /><category term="sufficiently-smart-compiler" /><category term="templates" /><summary type="html"><![CDATA[Previously [I wrote](/blog/2026/04/24/define-static-array/): > [Template parameter objects of array type] are permitted to overlap or be > coalesced, just like `initializer_list`s and string literals. Clang trunk > isn't smart enough to coalesce potentially non-unique objects [but] > GCC, once it implements `define_static_array`, will presumably make them the same. Well, GCC 16 has an experimental implementation of `define_static_array` (compile with `g++ -std=c++26 -freflection`), and it does _not_ coalesce template parameter objects of array type in the way I expected. Digging deeper into why not, I learned that there are at least three ways compilers and linkers (on ELF — that is, non-Windows — platforms) conspire to "merge" potentially non-unique objects: * Merging at the compiler level (for `initializer_list` backing arrays) * Sections with `SHF_MERGE` (for string literals and backing arrays) * Sections with `SHF_GROUP`, a.k.a. COMDAT sections (for inline variables)]]></summary></entry><entry><title type="html">_Adventure:_ Is there light in the cobble crawl?</title><link href="https://quuxplusone.github.io/blog/2026/04/30/cobble-crawl-truncation/" rel="alternate" type="text/html" title="_Adventure:_ Is there light in the cobble crawl?" /><published>2026-04-30T00:01:00+00:00</published><updated>2026-04-30T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/04/30/cobble-crawl-truncation</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/04/30/cobble-crawl-truncation/"><![CDATA[<p>The original <em>Colossal Cave Adventure</em> consists basically of a
Fortran source file and a textual data file. These files would
often travel from one installation to another via paper printouts:
printed out at one site, typed in by hand at another.</p>

<p>The lines of WOOD0350’s Fortran source (intentionally or not)
never exceed 80 columns regardless of your tab stop:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ detab -4 advent.for | awk '{print length}' | sort -n | uniq -c | tail -4
  48 77
  52 78
  35 79
   3 80
$ detab -8 advent.for | awk '{print length}' | sort -n | uniq -c | tail -4
  58 77
  62 78
  39 79
   3 80
</code></pre></div></div>

<p>But the data file fits within 80 columns only with a tab stop of four.
With an eight-space tab stop, four lines of the data file exceed 80 columns:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ detab -4 advent.dat | awk '{print length}' | sort -n | uniq -c | tail -4
  55 71
  59 72
  67 73
  69 74
$ detab -8 advent.dat | awk '{print length}' | sort -n | uniq -c | tail -4
  59 76
  66 77
  69 78
   4 82
</code></pre></div></div>

<p>These are the four offending lines. The first comes from section 3 (travel table)
and the rest from section 9 (bit flags).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>108     95556   43      45      46      47      48      49      50      29      30
0       1       2       3       4       5       6       7       8       9       10
7       42      43      44      45      46      47      48      49      50      51
7       52      53      54      55      56      80      81      82      86      87
</code></pre></div></div>
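<p>To see why each of these lines loses exactly its last number, here is a small sketch of tab expansion followed by truncation. It assumes that the fields in the data file are separated by single tab characters, which is my reading of the listing above rather than something shown in it; the helper names are made up.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expand each tab to the next multiple of `tabstop`.
// (Illustrative helper; not taken from any Adventure port.)
std::string expand_tabs(const std::string& line, std::size_t tabstop) {
    assert(tabstop > 0);
    std::string out;
    for (char ch : line) {
        if (ch == '\t') {
            do { out += ' '; } while (out.size() % tabstop != 0);
        } else {
            out += ch;
        }
    }
    return out;
}

// Keep only the first `width` columns, as a fixed-width printout would.
std::string truncate_columns(const std::string& expanded, std::size_t width) {
    return expanded.substr(0, width);
}
```

<p>Under that assumption, the first offending line has ten tab-terminated fields, so with an eight-space tab stop the eleventh field starts at column 81 and the line is 82 columns long. Truncating at 80 columns drops the final <code class="language-plaintext highlighter-rouge">30</code> and nothing else.</p>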

<p>If your version of <em>Adventure</em> has suffered truncation at the 80th column
at any point in its pedigree, you’d expect to see
(1) going DOWN from Witt’s End is impossible; (2) the Cobble Crawl
does not have light; (3) entering maze room 51 or 87 resets the maze
hint counter.</p>

<p>I first became aware of this possibility in December 2025, when Mike Willegal
sent me a fan-fold printout of HORV0350 (a
<a href="https://en.wikipedia.org/wiki/Systems_Engineering_Laboratories">SEL32</a>/<a href="http://mnembler.com/ragooman/computers_mini_history.html">RTM</a>
port of WOOD0350 most recently touched by Ned Horvath) that he’d kept since March 1979.
In that fan-fold printout, all four of the lines above are truncated and
thus missing the last number.</p>

<p><img src="/blog/images/2026-04-30-horv0350.jpg" alt="Three of the truncated lines" /></p>

<p>Now, I have no evidence that HORV0350’s own data file was actually truncated.
But anyone retyping the game from that printout could easily have assumed that
the Cobble Crawl was meant to be dark, and that Witt’s End was meant
to have no DOWN exit.</p>

<p>I see evidence that such truncation really did happen between LONG0501 and
ANON0501/OSKA0501.</p>

<ul>
  <li>ANON0501 reflows line <a href="https://github.com/Quuxplusone/Advent/blob/master/ANON0501/adv.data.2#L353-L354">(1)</a>,
  preserving the exits from Witt’s End,
  but truncates <a href="https://github.com/Quuxplusone/Advent/blob/master/ANON0501/adv.data.3#L742">(2)</a>
  and <a href="https://github.com/Quuxplusone/Advent/blob/master/ANON0501/adv.data.3#L759-L760">(3)</a>.</li>
  <li>OSKA0501 reflows <a href="https://github.com/Quuxplusone/Advent/blob/master/OSKA0501/src/adventure.text#L1583-L1584">(1)</a>
  but truncates <a href="https://github.com/Quuxplusone/Advent/blob/master/OSKA0501/src/adventure.text#L3218">(2)</a>
  and <a href="https://github.com/Quuxplusone/Advent/blob/master/OSKA0501/src/adventure.text#L3235-L3236">(3)</a>.</li>
  <li>MCDO0551 reflows <a href="https://github.com/Quuxplusone/Advent/blob/master/MCDO0551/ADVDAT#L1606-L1607">(1)</a>
  but truncates <a href="https://github.com/Quuxplusone/Advent/blob/master/MCDO0551/ADVDAT#L3172">(2)</a>
  and <a href="https://github.com/Quuxplusone/Advent/blob/master/MCDO0551/ADVDAT#L3191-L3192">(3)</a>.</li>
  <li>ROBE0665 reflows all three.</li>
</ul>

<p>My decompilation of <a href="/blog/2025/12/29/long0751/">the recovered LONG0751’s</a> data file indicates
that it did not truncate any of these lines; and my playtesting of the recovered LONG0501
indicates that it did not truncate (1) or (2) either. (It’s inconvenient to test (3) by playtesting.)</p>

<p>These observations are consistent with the hypothesis that first LONG0501 reflowed (1) but left (2) and (3)
untouched; then ANON0501/OSKA0501 and MCDO0551 descended from truncated paper printouts of LONG0501
(or via a now-lost common ancestor that was itself descended from LONG0501 by truncation).
Meanwhile LONG0751 and ROBE0665 descended, independently, from non-truncated copies of LONG0501.</p>]]></content><author><name></name></author><category term="adventure" /><category term="digital-antiquaria" /><category term="war-stories" /><summary type="html"><![CDATA[The original _Colossal Cave Adventure_ consists basically of a Fortran source file and a textual data file. These files would often travel from one installation to another via paper printouts: printed out at one site, typed in by hand at another. The lines of WOOD0350's Fortran source (intentionally or not) never exceed 80 columns regardless of your tab stop. But the data file fits within 80 columns only with a tab stop of four. With an eight-space tab stop, four lines of the data file exceed 80 columns:]]></summary></entry><entry><title type="html">_Adventure:_ Walking on the ceiling</title><link href="https://quuxplusone.github.io/blog/2026/04/28/walking-on-the-ceiling/" rel="alternate" type="text/html" title="_Adventure:_ Walking on the ceiling" /><published>2026-04-28T00:01:00+00:00</published><updated>2026-04-28T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/04/28/walking-on-the-ceiling</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/04/28/walking-on-the-ceiling/"><![CDATA[<p>On 2012-12-01 I wrote to Don Woods (in a postscript to a production update
on <a href="https://boardgamegeek.com/boardgame/121751/colossal-cave-the-board-game"><em>Colossal Cave: The Board Game</em></a>):</p>

<blockquote>
  <p>By the way, I just noticed last week that in “Adventure”, in the Hall
of the Mountain King, the directions NORTH and LEFT are synonyms, as
are SOUTH and RIGHT… as are WEST and FORWARD!  West being forward
makes sense, if the Hall of Mists is back to the east; but for the
rest I suppose the adventurer must be walking on the ceiling. :)  This
little mixup is present all the way back to
<a href="https://github.com/Quuxplusone/Advent/blob/master/CROW0005/advdat.77-03-11#L249-L253">Crowther’s code</a>.
I just thought it was funny that nobody had commented on it before, as
far as I know.</p>
</blockquote>

<p>Don Woods wrote back the same day:</p>

<blockquote>
  <p>Yes, I vaguely recall finding that at some point.  It was wrong in my
version 1 but got fixed somewhere between there and my version 2.5.</p>
</blockquote>

<p>Indeed, where WOOD0350 (circa 1977) <a href="https://github.com/Quuxplusone/Advent/blob/master/WOOD0350/advent.dat#L447-L454">has</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>19    311028  45  36   # 45=N 36=LEFT
19    311029  46  37   # 46=S 37=RIGHT
19    311030  44  7    # 44=W  7=FORWA
</code></pre></div></div>

<p>WOOD0430 (his expanded <em>Adventure 2.5</em>, circa 1995) <a href="https://github.com/Quuxplusone/Advent/blob/master/WOOD0430/adventure.text#L526-L533">has</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>19    311028  45  37   # 45=N 37=RIGHT
19    311029  46  36   # 46=S 36=LEFT
19    311030  44  7    # 44=W  7=FORWA
</code></pre></div></div>

<p>The original inconsistency is preserved, without comment, in
KNUT0350, GIBI0375 (<em>Original Adventure</em>), LUPI0440, PLAT0550,
LONG0751, and SMIT0370 (Georgia Tech <em>FunAdv</em>).</p>

<p><a href="https://cs.stanford.edu/people/eroberts/Adventure/">ROBE0665</a> (<em>Wellesley Adventure</em>)
and <a href="https://mipmip.org/adv770/adv770.php">ARNA0770</a> eliminate the
inconsistency — albeit not in an obviously purposeful way — by simply
not recognizing LEFT, RIGHT, and FORWARD as exits from the Hall of the
Mountain King.</p>

<p>ROBE0665 has a superficially similar slip-up at Three-Opening Arch:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>265     To the east stands a wide dark arch opening into three
265     passages.  All lead eastwards; but the left-handed passage
265     plunges down, while the right-hand climbs up, and the middle
265     way seems to run on, smooth and level but very narrow.
265     To the north of the great arch stands a stone door, half open.
265     To the west the passage fades into darkness.

265     0 66   LEFT  NE UP
265     0 34   RIGHT SE DOWN
265     0 273  EAST
</code></pre></div></div>

<p>Here LEFT correctly matches NE, but goes UP where the room description says “plunges down”;
and RIGHT correctly matches SE, but goes DOWN where the room description says “climbs up.”
I reported that bug to Eric Roberts on 2025-04-23, although at that time I don’t think I’d noticed
that LEFT and RIGHT actually worked correctly in that location, and that it was only the UP
and DOWN directions that were wrong.</p>

<hr />

<p>See also:</p>

<ul>
  <li><a href="/blog/2020/02/06/water-bottle-bug/">“A bug in <em>Adventure</em>’s endgame”</a> (2020-02-06)</li>
</ul>]]></content><author><name></name></author><category term="adventure" /><summary type="html"><![CDATA[On 2012-12-01 I wrote to Don Woods (in a postscript to a production update on [_Colossal Cave: The Board Game_](https://boardgamegeek.com/boardgame/121751/colossal-cave-the-board-game)): > By the way, I just noticed last week that in "Adventure", in the Hall > of the Mountain King, the directions NORTH and LEFT are synonyms, as > are SOUTH and RIGHT... as are WEST and FORWARD!  West being forward > makes sense, if the Hall of Mists is back to the east; but for the > rest I suppose the adventurer must be walking on the ceiling. :)  This > little mixup is present all the way back to > [Crowther's code](https://github.com/Quuxplusone/Advent/blob/master/CROW0005/advdat.77-03-11#L249-L253). > I just thought it was funny that nobody had commented on it before, as > far as I know.]]></summary></entry><entry><title type="html">Things C++26 `define_static_array` can’t do</title><link href="https://quuxplusone.github.io/blog/2026/04/24/define-static-array/" rel="alternate" type="text/html" title="Things C++26 `define_static_array` can’t do" /><published>2026-04-24T00:01:00+00:00</published><updated>2026-04-24T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/04/24/define-static-array</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/04/24/define-static-array/"><![CDATA[<p>We’ve <a href="/blog/2023/10/13/constexpr-string-round-2/">seen previously</a>
that it’s not possible to create a <code class="language-plaintext highlighter-rouge">constexpr</code> global variable of container type,
when that container holds a pointer to a heap allocation. It’s fine to create a
global constexpr <code class="language-plaintext highlighter-rouge">std::array</code>, or even a <code class="language-plaintext highlighter-rouge">std::string</code> that uses only its SSO buffer;
but you can’t create a global constexpr <code class="language-plaintext highlighter-rouge">std::vector</code> or <code class="language-plaintext highlighter-rouge">std::list</code> (unless it’s
empty) because it would have to hold a pointer to a heap allocation.</p>

<p>Think of constexpr evaluation as taking place “in the compiler’s imagination.”
Since C++20 it’s fine to use <code class="language-plaintext highlighter-rouge">new</code> and <code class="language-plaintext highlighter-rouge">delete</code> at constexpr time; but there’s a firewall between
constexpr evaluation and real, material runtime existence. You can’t, at runtime, get
a pointer to a heap allocation that was made only “in the compiler’s imagination,” any more
than you can get a pointer to a local variable of a stack frame that was made only
“in the compiler’s imagination.” So none of these snippets will compile:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr int *f() { int i = 42; return &amp;i; }
constinit int *p = f(); // error

constexpr int *f() { return new int(42); }
constinit int *p = f(); // error

constexpr std::vector&lt;int&gt; f() { return {1,2,3}; }
constinit std::vector&lt;int&gt; p = f(); // error
</code></pre></div></div>

<p>But if you can compute a <code class="language-plaintext highlighter-rouge">std::vector&lt;int&gt;</code> at constexpr time, then you can persist its contents
into a global constexpr <code class="language-plaintext highlighter-rouge">std::array</code> of the appropriate size. The appropriate size is just
the <code class="language-plaintext highlighter-rouge">.size()</code> of the vector you computed, of course. So we have what’s become known as the
“constexpr two-step” (<a href="https://godbolt.org/z/fMdfPe14f">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr std::vector&lt;int&gt; f() { return {1,2,3}; }

constinit auto a = []() {
  std::array&lt;int, f().size()&gt; a;
  std::ranges::copy(f(), a.begin());
  return a;
}();
</code></pre></div></div>

<p>Thanks to Barry Revzin’s <a href="https://isocpp.org/files/papers/P3491R3.html">P3491</a> (June 2025)
and Jason Turner’s <a href="https://www.youtube.com/watch?v=_AefJX66io8">“Understanding the Constexpr 2-Step”</a> (C++ On Sea 2024)
for the term “constexpr two-step.” Jason’s talk deals with a specific formula in which
instead of <em>repeating</em> — and repeatedly evaluating — <code class="language-plaintext highlighter-rouge">f()</code> in the body of the lambda,
we factor it out into a template argument (<a href="https://godbolt.org/z/dff31Y6E9">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr std::vector&lt;int&gt; f() { return {1,2,3}; }

template&lt;auto B&gt;
consteval auto to_array() {
  // MAGIC NUMBER WARNING!
  constexpr auto v = B() | std::ranges::to&lt;std::inplace_vector&lt;int, 999&gt;&gt;();
  std::array&lt;int, v.size()&gt; a;
  std::ranges::copy(v, a.begin());
  return a;
}

constinit auto a = to_array&lt;[]() { return f(); }&gt;();
</code></pre></div></div>

<p>C++26 will introduce a new and improved tool for this kind of compile-time array generation.
It’s spelled <code class="language-plaintext highlighter-rouge">std::define_static_array</code>. In C++26 you can just write this
(<a href="https://godbolt.org/z/5dc3EGraq">Godbolt</a>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr std::vector&lt;int&gt; f() { return {1,2,3}; }
constinit std::span&lt;const int&gt; sp = std::define_static_array(f());
</code></pre></div></div>

<p>This call to <code class="language-plaintext highlighter-rouge">define_static_array</code> returns a <code class="language-plaintext highlighter-rouge">span</code> over a static-storage constant array of three ints.
Basically this is asking the compiler to take the data it’s come up with “in its imagination” and
write down a copy of it in the object file. This is much cleaner and more compile-time-efficient than
the “two-step”!</p>

<p>Unfortunately, if I understand it correctly, C++26 <code class="language-plaintext highlighter-rouge">define_static_array</code> does not (yet?) support
several things that you <em>can</em> do using the “two-step.” Here are a few such things.</p>

<h3 id="1-non-structural-types">1. Non-structural types</h3>

<p><code class="language-plaintext highlighter-rouge">std::define_static_array</code> is defined in terms of <code class="language-plaintext highlighter-rouge">std::meta::reflect_constant(e)</code>,
which <a href="https://eel.is/c++draft/meta.reflection.result">C++26 defines</a> as
<code class="language-plaintext highlighter-rouge">std::meta::template_arguments_of(^^TCls&lt;e&gt;)[0]</code> for some invented template <code class="language-plaintext highlighter-rouge">TCls</code>. That is,
<code class="language-plaintext highlighter-rouge">reflect_constant</code> (and thus <code class="language-plaintext highlighter-rouge">define_static_array</code>) is defined only for
<a href="https://eel.is/c++draft/temp.param#def:type,structural">structural types</a>. <code class="language-plaintext highlighter-rouge">int</code> is a structural
type, and thus we can write the code above. But we cannot write</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using OInt = std::optional&lt;int&gt;;
constexpr std::vector&lt;OInt&gt; f() { return {1,2,3}; }
std::span&lt;const OInt&gt; sp = std::define_static_array(f());
</code></pre></div></div>

<p>because <code class="language-plaintext highlighter-rouge">optional&lt;int&gt;</code> is not a structural type. Nor are <code class="language-plaintext highlighter-rouge">string</code>, <code class="language-plaintext highlighter-rouge">string_view</code>, <code class="language-plaintext highlighter-rouge">span</code> itself…
There are many types that can’t be materialized using <code class="language-plaintext highlighter-rouge">define_static_array</code>, even though
they work fine with the “constexpr two-step” (<a href="https://godbolt.org/z/48fE48Mz6">Godbolt</a>).</p>

<h3 id="2-pointers-to-string-literals">2. Pointers to string literals</h3>

<p>Because <code class="language-plaintext highlighter-rouge">reflect_constant</code> is defined in terms of <code class="language-plaintext highlighter-rouge">TCls&lt;e&gt;</code>, not only must the
<em>type</em> of <code class="language-plaintext highlighter-rouge">e</code> be structural, but each particular <em>value</em> <code class="language-plaintext highlighter-rouge">e</code> in the array must be suitable for
use as a template argument. <code class="language-plaintext highlighter-rouge">const char*</code> is a structural type, but if that pointer points to
a string literal, then it’s not suitable for use as a template argument. So we can use
<code class="language-plaintext highlighter-rouge">define_static_array</code> to make an array of null pointers:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr std::vector&lt;const char*&gt; f() { return {nullptr, nullptr, nullptr}; }
std::span&lt;const char *const&gt; sp = std::define_static_array(f());
</code></pre></div></div>

<p>but it cannot make an array of pointers to literals:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr std::vector&lt;const char*&gt; f() { return {"a", "b", "c"}; }
std::span&lt;const char *const&gt; sp = std::define_static_array(f());
</code></pre></div></div>

<p>On the other hand, the “constexpr two-step” has no problem with string literals
(<a href="https://godbolt.org/z/7Tfo9Krb3">Godbolt</a>).</p>

<h3 id="3-move-only-types">3. Move-only types</h3>

<p>In order to create a template parameter object representing <code class="language-plaintext highlighter-rouge">e</code>, we must make
a copy of <code class="language-plaintext highlighter-rouge">e</code> (<a href="https://eel.is/c++draft/temp.arg.nontype#4">[temp.arg.nontype]/4</a>).
Therefore NTTP types must be copyable. You can (with care) use the two-step to create
a static array of move-only type:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constexpr auto a = []() {
  std::array&lt;MoveOnly, f().size()&gt; a;
  std::ranges::copy(f() | std::views::as_rvalue, a.begin());
  return a;
}();
</code></pre></div></div>

<p>but you cannot do the same with <code class="language-plaintext highlighter-rouge">define_static_array</code>. (<a href="https://godbolt.org/z/nMv34E1qq">Godbolt.</a>)</p>

<p>The above snippet, like all my other examples of the “two-step,” never actually uses move-construction;
it uses default construction followed by assignment. This is unsatisfying, and prevents the two-step
from creating e.g. an array of <code class="language-plaintext highlighter-rouge">reference_wrapper</code>. <code class="language-plaintext highlighter-rouge">define_static_array</code>, on the other hand, does
not use default-construction (<a href="https://godbolt.org/z/Wbn73MaGM">Godbolt</a>).
Can we rework the two-step to eliminate the default-constructibility requirement?
I imagine we can, but at the moment I don’t see how.</p>

<h3 id="4-make-the-array-mutable">4. Make the array mutable</h3>

<p><code class="language-plaintext highlighter-rouge">define_static_array</code> allocates its array in rodata and gives you a <code class="language-plaintext highlighter-rouge">span&lt;const T&gt;</code>
over it. This allows the compiler to do cool things, like point multiple invocations of
<code class="language-plaintext highlighter-rouge">define_static_array</code> at the same backing array (<a href="https://godbolt.org/z/zKTMs8bd4">Godbolt</a>).
In fact, the compiler is actually <em>required</em> to do that, because
<code class="language-plaintext highlighter-rouge">reflect_constant</code> is defined in terms of a <a href="https://eel.is/c++draft/temp.param#13">template parameter object</a>
which for all intents and purposes behaves like an inline variable: there is guaranteed
to be only one template parameter object with a given type and value in the whole program
(<a href="https://godbolt.org/z/KnzT1qhae">Godbolt</a>).</p>

<p>Treating template parameter objects as inline variables means the compiler <em>must</em> combine
such objects when they have the same type and value (optimization! hooray!) but sadly also
<em>forbids</em> an otherwise sufficiently smart compiler from combining such objects when their
types are merely similar. <a href="https://godbolt.org/z/ea9f943rK">Godbolt</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template&lt;auto V&gt; auto tpo() { return std::span(V); }
template&lt;auto V&gt; auto tpo2() { return std::span(V); }

const void *p1 = tpo&lt;std::array&lt;signed char,3&gt;{1,2,3}&gt;().data();
const void *p2 = tpo2&lt;std::array&lt;signed char,3&gt;{1,2,3}&gt;().data();
const void *p3 = tpo&lt;std::array&lt;unsigned char,3&gt;{1,2,3}&gt;().data();
const void *p4 = tpo&lt;std::array&lt;char,3&gt;{1,2,3}&gt;().data();
</code></pre></div></div>

<p>All four of these pointers point to arrays of the three bytes <code class="language-plaintext highlighter-rouge">01 02 03</code>. <code class="language-plaintext highlighter-rouge">p1</code> and <code class="language-plaintext highlighter-rouge">p2</code>
are required to point to the same byte; <code class="language-plaintext highlighter-rouge">p3</code> and <code class="language-plaintext highlighter-rouge">p4</code>, since they point to <code class="language-plaintext highlighter-rouge">std::array</code>
objects of different types, are required to point to different arrays. The compiler
isn’t allowed to coalesce <code class="language-plaintext highlighter-rouge">p3</code> and <code class="language-plaintext highlighter-rouge">p4</code>, the way it’s allowed to coalesce
the backing arrays of differently typed <code class="language-plaintext highlighter-rouge">initializer_list</code>s (<a href="https://godbolt.org/z/8EPae4cPo">Godbolt</a>).</p>

<p>But (hooray! and thanks to Tim Song for correcting me on this!) there is a special case
specifically for the “template parameter objects of array type” created by <code class="language-plaintext highlighter-rouge">reflect_constant_array</code>
and <code class="language-plaintext highlighter-rouge">define_static_array</code>. <em>These</em> objects <em>are</em> permitted
(<a href="https://eel.is/c++draft/intro.object#def:object,potentially_non-unique">[intro.object]/9.3</a>)
to overlap or be coalesced, just like <code class="language-plaintext highlighter-rouge">initializer_list</code>s and string literals. Clang trunk
isn’t smart enough to coalesce potentially non-unique objects; therefore the Clang reference
implementation of C++26 Reflection doesn’t coalesce these array objects either; but it’s
not the paper standard’s fault. <a href="https://godbolt.org/z/o6r73rP3W">Godbolt</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const void *p1 = std::define_static_array(std::vector&lt;signed char&gt;{1,2,3}).data();
const void *p2 = std::define_static_array(std::list&lt;signed char&gt;{1,2,3}).data();
const void *p3 = std::define_static_array(std::vector&lt;unsigned char&gt;{1,2,3}).data();
const void *p4 = std::define_static_array(std::vector&lt;char&gt;{1,2,3}).data();
</code></pre></div></div>

<p>All four of these pointers point to arrays of the three bytes <code class="language-plaintext highlighter-rouge">01 02 03</code>. <code class="language-plaintext highlighter-rouge">p1</code> and <code class="language-plaintext highlighter-rouge">p2</code>
are required to point to the same byte; <code class="language-plaintext highlighter-rouge">p3</code> and <code class="language-plaintext highlighter-rouge">p4</code> are permitted, but not required,
to point to different arrays. In practice Clang makes them different; GCC, once it implements
<code class="language-plaintext highlighter-rouge">define_static_array</code>, will presumably make them the same.</p>

<p>However, template parameter objects are invariably const! Therefore, you cannot use
<code class="language-plaintext highlighter-rouge">define_static_array</code> to produce a <code class="language-plaintext highlighter-rouge">constinit</code>-but-mutable array, the way you can
with the “constexpr two-step.” It seems to me perfectly reasonable to want a magic consteval
function that says, “Please generate me a mutable array in static storage with these
contents” — specified as a constexpr-time <code class="language-plaintext highlighter-rouge">vector&lt;int&gt;</code> — “and give me a <code class="language-plaintext highlighter-rouge">span</code> over it”:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>template&lt;class R&gt;
consteval auto define_mutable_static_storage_array(R&amp;&amp; r)
    -&gt; std::span&lt;std::ranges::range_value_t&lt;R&gt;&gt;;
</code></pre></div></div>

<p>Perfectly reasonable to <em>want</em> such an API; but C++26 <code class="language-plaintext highlighter-rouge">define_static_array</code> fundamentally
isn’t that API. It can’t produce mutable data: it can’t produce <em>anything</em> except pointers
into (potentially non-unique) template parameter objects, which behave like const inline variables.</p>

<h2 id="conclusion">Conclusion</h2>

<p>In short, <code class="language-plaintext highlighter-rouge">define_static_array</code> is constitutionally unsuited for some conspicuous use-cases.
I’m not sure what this means for the future. I’m sure we don’t want to require people to
use the “constexpr two-step” forever; but <code class="language-plaintext highlighter-rouge">define_static_array</code> doesn’t seem suited to
replace <em>all</em> of its uses — certainly not in C++26, and I don’t see how it could be extended
in the future to solve any of the problems I outlined above.</p>

<p>I imagine the answer is not “<code class="language-plaintext highlighter-rouge">define_static_array</code> will solve all your problems today,”
nor “a new and improved <code class="language-plaintext highlighter-rouge">define_static_array</code> will solve all your problems in C++XY,”
but rather “C++XY will introduce a new and different facility for manipulating static storage” —
possibly related to the as-yet-unstandardized code-generation side of reflection —
and we’ll use that new facility to solve some (but perhaps not all) of the above problems.</p>

<hr />

<p>UPDATE: Actually, problems (1), (2), and (3) all stem from <code class="language-plaintext highlighter-rouge">define_static_array</code>’s
requirement that each element be usable as an NTTP. Barry Revzin’s
<a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3380r1.html">P3380R1 “Extending support for class types as NTTPs”</a>
(December 2024) lays out a plan that would permit the programmer to mark their own types
as <em>explicitly structural</em>, thus (if accepted) addressing all three of those problems.
On the other hand, making a user-defined type
<em>explicitly structural</em> per P3380R1 seems to involve pretty arcane programming.
The “constexpr two-step” stays general by staying above the fray: it simply never requires
anything to be encoded as a template argument.</p>]]></content><author><name></name></author><category term="constexpr" /><category term="metaprogramming" /><category term="reflection" /><summary type="html"><![CDATA[We’ve seen previously that it’s not possible to create a constexpr global variable of container type, when that container holds a pointer to a heap allocation. It’s fine to create a global constexpr std::array, or even a std::string that uses only its SSO buffer; but you can’t create a global constexpr std::vector or std::list (unless it’s empty) because it would have to hold a pointer to a heap allocation.]]></summary></entry><entry><title type="html">`auto{x} != auto(x)`</title><link href="https://quuxplusone.github.io/blog/2026/04/11/auto-x-with-curly-braces/" rel="alternate" type="text/html" title="`auto{x} != auto(x)`" /><published>2026-04-11T00:01:00+00:00</published><updated>2026-04-11T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/04/11/auto-x-with-curly-braces</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/04/11/auto-x-with-curly-braces/"><![CDATA[<p>Recently it was asked: What’s the difference between the expressions <code class="language-plaintext highlighter-rouge">auto(x)</code>
and <code class="language-plaintext highlighter-rouge">auto{x}</code> in C++23?</p>

<p>The construct <code class="language-plaintext highlighter-rouge">auto(x)</code> arrived via
<a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p0849r8.html">P0849 “<em>decay-copy</em> in the language”</a>.
We could already write direct-initialization to a <em>named</em> type as either a declaration
or a cast-expression:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>T y(x); // declaration
return T(x); // cast-expression
</code></pre></div></div>

<p>P0849 just extended this syntax to work for a <em>placeholder</em> type as well:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto y(x); // declaration
return auto(x); // cast-expression
</code></pre></div></div>

<p>Both of the latter lines mean “Deduce the type of <code class="language-plaintext highlighter-rouge">auto</code> from the type of <code class="language-plaintext highlighter-rouge">x</code> (decaying to
a non-array object type if necessary) — let’s call that type <code class="language-plaintext highlighter-rouge">T</code> — and then explicitly cast <code class="language-plaintext highlighter-rouge">x</code> to
that type exactly as if the user had written <code class="language-plaintext highlighter-rouge">T</code> in place of <code class="language-plaintext highlighter-rouge">auto</code>.”</p>

<p>This <em>usually</em> means we’re just making a copy of <code class="language-plaintext highlighter-rouge">x</code> using its copy constructor.
If <code class="language-plaintext highlighter-rouge">x</code> stands for an xvalue expression, we’re calling the move constructor, and if
<code class="language-plaintext highlighter-rouge">x</code> is a prvalue, we’re probably not doing anything at all. <code class="language-plaintext highlighter-rouge">auto(x)</code> is simple.</p>
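
<p>If you don’t have a C++23 compiler handy, you can approximate <code class="language-plaintext highlighter-rouge">auto(x)</code> with the library-style helper that P0849’s title alludes to. This is only a sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">decay_copy</code> here is a hypothetical free function written for illustration (the Standard’s own <em>decay-copy</em> is exposition-only), and unlike <code class="language-plaintext highlighter-rouge">auto(x)</code> it always costs a move or copy, even when <code class="language-plaintext highlighter-rouge">x</code> is a prvalue.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string&gt;
#include &lt;type_traits&gt;
#include &lt;utility&gt;

// Hypothetical helper, not from P0849 itself: a library-level "decay-copy".
template&lt;class T&gt;
constexpr std::decay_t&lt;T&gt; decay_copy(T&amp;&amp; t) {
  return std::forward&lt;T&gt;(t);
}

int arr[3] = {1, 2, 3};
const std::string s = "hello";

// Arrays decay to pointers; top-level const is dropped.
static_assert(std::is_same_v&lt;decltype(decay_copy(arr)), int*&gt;);
static_assert(std::is_same_v&lt;decltype(decay_copy(s)), std::string&gt;);
</code></pre></div></div>

<p>In C++23, <code class="language-plaintext highlighter-rouge">decltype(auto(arr))</code> and <code class="language-plaintext highlighter-rouge">decltype(auto(s))</code> should name the same two types.</p>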

<p>But <code class="language-plaintext highlighter-rouge">auto{x}</code> is more complicated, because curly braces produce an initializer list.
This means the same thing as <code class="language-plaintext highlighter-rouge">T{x}</code>: “given the list of elements <code class="language-plaintext highlighter-rouge">{x}</code>, make me a <code class="language-plaintext highlighter-rouge">T</code>
with those elements.” As <a href="https://eel.is/c++draft/dcl.init.list#3.7">[dcl.init.list]/3.7</a> shows,
that’s not always the same thing as “make me a copy of <code class="language-plaintext highlighter-rouge">x</code>.” <a href="https://godbolt.org/z/75MEveMPW">Godbolt</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto paren() {
    std::vector&lt;std::any&gt; v;
    return auto(v);
}

auto curly() {
    std::vector&lt;std::any&gt; v;
    return auto{v};
}
</code></pre></div></div>

<p>The former means “make an empty vector, then return a copy of that vector (with no elements).”
The latter means “make an empty vector, then return a vector <em>containing</em> that vector (with one element).”</p>

<ul>
  <li>
    <p>As of this writing, MSVC gets this wrong: it calls the copy constructor in both cases.
  But <a href="https://eel.is/c++draft/dcl.init.list#3.7">[dcl.init.list]/3.7</a> (last improved by
  <a href="https://cplusplus.github.io/CWG/issues/2638.html">CWG2638</a>) makes it very clear that
  MSVC is in the wrong.</p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">return (x);</code> implicitly moves from <code class="language-plaintext highlighter-rouge">x</code> (see <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2266r3.html">P2266 “Simpler implicit move”</a>),
  but <code class="language-plaintext highlighter-rouge">return auto(x);</code> does not. This makes sense, because <code class="language-plaintext highlighter-rouge">return T(x);</code> doesn’t move-from <code class="language-plaintext highlighter-rouge">x</code> either.
  Remember, all <code class="language-plaintext highlighter-rouge">auto</code> does here is hold the place of an explicitly specified <code class="language-plaintext highlighter-rouge">T</code>.</p>
  </li>
</ul>
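
<p>Since <code class="language-plaintext highlighter-rouge">auto</code> merely holds the place of <code class="language-plaintext highlighter-rouge">T</code>, the same difference shows up when the type is spelled out, which makes it easy to check even without C++23. Here is a minimal sketch (the alias and function names are hypothetical, chosen purely for illustration); on GCC and Clang I’d expect the first function to return 0 and the second to return 1:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;any&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

using V = std::vector&lt;std::any&gt;;

std::size_t size_with_parens() {
  V v;
  return V(v).size();  // copy constructor: a copy of an empty vector
}

std::size_t size_with_curlies() {
  V v;
  return V{v}.size();  // initializer_list&lt;any&gt; constructor: one element, holding a copy of v
}
</code></pre></div></div>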

<p>Consider also these variations:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto p = (v); // copy
auto c = {v}; // initialize an initializer_list&lt;vector&lt;any&gt;&gt; of 1 element

auto p(v); // copy
auto c{v}; // initialize a new vector of 1 element (MSVC gets this wrong)
</code></pre></div></div>

<p>Of course <code class="language-plaintext highlighter-rouge">vector&lt;any&gt;</code> is a pathological case. Its benefit is that it’s also a <em>simple</em> case
using only STL types, in case you ever need to demonstrate the difference between <code class="language-plaintext highlighter-rouge">(x)</code> and <code class="language-plaintext highlighter-rouge">{x}</code>
to anyone else.</p>

<p>The takeaway: As always, you should use curly braces when you have a sequence of elements
(such as when initializing an aggregate or a container); if you aren’t in that situation (such as when
you’re writing generic code) you should use ordinary parentheses.
See <a href="/blog/2019/02/18/knightmare-of-initialization/">“The Knightmare of Initialization in C++”</a> (2019-02-18).</p>

<hr />

<p>I suspect there is no situation where it ever makes sense to use <code class="language-plaintext highlighter-rouge">auto{x}</code> in real code.
But I’m glad it exists in the language, for symmetry and consistency with <code class="language-plaintext highlighter-rouge">T{x}</code>.</p>

<p>Note that all of these lines—</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto a(1,2,3);
auto a = auto(1,2,3);
auto a{1,2,3};
auto a = auto{1,2,3};
</code></pre></div></div>

<p>—are invalid C++. You can never direct-initialize <code class="language-plaintext highlighter-rouge">auto</code> with multiple arguments.
However, both of the following copy-initializations are legal and silly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto i = (1,2,3); // comma operator; i is int
auto i = {1,2,3}; // i is initializer_list&lt;int&gt;
</code></pre></div></div>

<p>The latter is a historical accident which is supported these days, as far as I know, <em>only</em> so that we can specify the behavior
of <code class="language-plaintext highlighter-rouge">for (int i : {1,2,3})</code> without having to write a special case into <a href="https://eel.is/c++draft/stmt.ranged">[stmt.ranged]</a>.</p>
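
<p>The deduced types of <code class="language-plaintext highlighter-rouge">auto i = (1,2,3)</code> and <code class="language-plaintext highlighter-rouge">auto i = {1,2,3}</code> can be checked mechanically. Here is a small sketch, with the variables renamed so that they can coexist in one translation unit; the single-element <code class="language-plaintext highlighter-rouge">auto k{1}</code> line is an addition of this sketch, for contrast. Expect a compiler warning about the unused operands of the comma operator.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;initializer_list&gt;
#include &lt;type_traits&gt;

auto i = (1,2,3);  // parenthesized comma expression
auto j = {1,2,3};  // copy-list-initialization
auto k{1};         // direct-list-initialization with a single element

static_assert(std::is_same_v&lt;decltype(i), int&gt;);
static_assert(std::is_same_v&lt;decltype(j), std::initializer_list&lt;int&gt;&gt;);
static_assert(std::is_same_v&lt;decltype(k), int&gt;);
</code></pre></div></div>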

<p><a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3922.html">N3922 “New rules for auto deduction from braced-init-list”</a>
is the paper that removed <code class="language-plaintext highlighter-rouge">auto a{1,2,3}</code> from the language. N3922 came in 2014, at the height of the “Almost Always Auto” and “Uniform Initialization”
fads; it was widely assumed that newbies would write <code class="language-plaintext highlighter-rouge">auto a{1,2}</code> and shoot themselves in the foot, but writing <code class="language-plaintext highlighter-rouge">auto a = {1,2}</code>
wasn’t so attractive to newbies and thus wasn’t treated so urgently as a footgun. At the same time, N3922 changed
<code class="language-plaintext highlighter-rouge">auto a{1}</code> to deduce <code class="language-plaintext highlighter-rouge">int</code> rather than <code class="language-plaintext highlighter-rouge">initializer_list&lt;int&gt;</code>. Only the copy-initialization forms <code class="language-plaintext highlighter-rouge">auto a = {1}</code> and <code class="language-plaintext highlighter-rouge">auto a = {1,2}</code> remain as a special case
inconsistent with the rest of the language. <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3912.html">N3912 §1</a> says
this special case will be useful to “advanced users,” which I think in hindsight was a bad justification for keeping it.</p>]]></content><author><name></name></author><category term="c++-learner-track" /><category term="c++-style" /><category term="initialization" /><category term="initializer-list" /><summary type="html"><![CDATA[Recently it was asked: What’s the difference between the expressions auto(x) and auto{x} in C++23?]]></summary></entry><entry><title type="html">The “macro overloading” idiom</title><link href="https://quuxplusone.github.io/blog/2026/04/02/macro-overloading/" rel="alternate" type="text/html" title="The “macro overloading” idiom" /><published>2026-04-02T00:01:00+00:00</published><updated>2026-04-02T00:01:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/04/02/macro-overloading</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/04/02/macro-overloading/"><![CDATA[<p>Here’s a neat trick to create an “overloaded macro” in C, such that
<code class="language-plaintext highlighter-rouge">M(x)</code> does one thing and <code class="language-plaintext highlighter-rouge">M(x, y)</code> does something else. For example,
we could make a macro <code class="language-plaintext highlighter-rouge">ARCTAN</code> such that <code class="language-plaintext highlighter-rouge">ARCTAN(v)</code> calls <code class="language-plaintext highlighter-rouge">atan(v)</code>
and <code class="language-plaintext highlighter-rouge">ARCTAN(y,x)</code> calls <code class="language-plaintext highlighter-rouge">atan2(y,x)</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define GET_ARCTAN_MACRO(_1, _2, x, ...) x
#define ARCTAN(...) GET_ARCTAN_MACRO(__VA_ARGS__, atan2, atan)(__VA_ARGS__)
</code></pre></div></div>

<p>So <code class="language-plaintext highlighter-rouge">ARCTAN(1)</code> expands to <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO(1, atan2, atan)(1)</code> expands to <code class="language-plaintext highlighter-rouge">atan(1)</code>,
while <code class="language-plaintext highlighter-rouge">ARCTAN(2,3)</code> expands to <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO(2,3, atan2, atan)(2,3)</code> expands to <code class="language-plaintext highlighter-rouge">atan2(2,3)</code>.</p>
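
<p>To see the idiom end to end, here’s a minimal sketch that wraps those two definitions in a complete program.
The test values and <code class="language-plaintext highlighter-rouge">main</code> are mine, not part of the idiom. It’s compiled here as C++, though nothing in it is
C++-specific. One assumption to flag: in the one-argument case, the <code class="language-plaintext highlighter-rouge">...</code> of <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO</code>
receives no arguments at all. Before C++20 the standard required at least one argument there, although GCC and Clang
have long accepted the code (with a warning under <code class="language-plaintext highlighter-rouge">-pedantic</code>).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;math.h&gt;
#include &lt;stdio.h&gt;

#define GET_ARCTAN_MACRO(_1, _2, x, ...) x
#define ARCTAN(...) GET_ARCTAN_MACRO(__VA_ARGS__, atan2, atan)(__VA_ARGS__)

int main() {
    // One argument: the selector picks atan, so this is atan(1.0).
    double one_arg = ARCTAN(1.0);
    assert(one_arg == atan(1.0));

    // Two arguments: the selector picks atan2, so this is atan2(1.0, -1.0).
    double two_args = ARCTAN(1.0, -1.0);
    assert(two_args == atan2(1.0, -1.0));

    printf("%f %f\n", one_arg, two_args);
}
</code></pre></div></div>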

<p>Or again, to make an “overloaded” <code class="language-plaintext highlighter-rouge">HYPOT</code> macro:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define GET_HYPOT_MACRO(_1, _2, _3, x, ...) x
#define HYPOT(...) GET_HYPOT_MACRO(__VA_ARGS__, hypot3, hypot, )(__VA_ARGS__)
</code></pre></div></div>

<p>So <code class="language-plaintext highlighter-rouge">HYPOT(x)</code> expands to <code class="language-plaintext highlighter-rouge">(x)</code>, <code class="language-plaintext highlighter-rouge">HYPOT(x,y)</code> expands to <code class="language-plaintext highlighter-rouge">hypot(x,y)</code>, and <code class="language-plaintext highlighter-rouge">HYPOT(x,y,z)</code>
expands to <code class="language-plaintext highlighter-rouge">hypot3(x,y,z)</code>.</p>
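
<p>Here’s a sketch of <code class="language-plaintext highlighter-rouge">HYPOT</code> in a complete program. Since <code class="language-plaintext highlighter-rouge">hypot3</code>
isn’t a standard library function, the definition below is a hypothetical one supplied just for illustration; the test values are
likewise my own.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;math.h&gt;

// Hypothetical helper: the three-argument function that HYPOT dispatches to.
// This definition is an assumption made for the sake of the example.
static double hypot3(double x, double y, double z) {
    return sqrt(x*x + y*y + z*z);
}

#define GET_HYPOT_MACRO(_1, _2, _3, x, ...) x
#define HYPOT(...) GET_HYPOT_MACRO(__VA_ARGS__, hypot3, hypot, )(__VA_ARGS__)

int main() {
    assert(HYPOT(7.0) == 7.0);            // selector is empty: expands to (7.0)
    assert(HYPOT(3.0, 4.0) == 5.0);       // selector is hypot: hypot(3.0, 4.0)
    assert(HYPOT(2.0, 3.0, 6.0) == 7.0);  // selector is hypot3: hypot3(2.0, 3.0, 6.0)
}
</code></pre></div></div>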

<p>What about calls with too many or too few arguments?</p>

<ul>
  <li>
    <p><code class="language-plaintext highlighter-rouge">HYPOT(1,2,3,4)</code> expands to <code class="language-plaintext highlighter-rouge">GET_HYPOT_MACRO(1,2,3,4, hypot3, hypot,)(1,2,3,4)</code>
  expands to <code class="language-plaintext highlighter-rouge">4(1,2,3,4)</code>, which is garbage. It’s likely to be ill-formed garbage, though,
  so that’s not too user-unfriendly.</p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">HYPOT()</code> expands to <code class="language-plaintext highlighter-rouge">GET_HYPOT_MACRO(,hypot3,hypot,)()</code> expands to <code class="language-plaintext highlighter-rouge">()</code>. <code class="language-plaintext highlighter-rouge">ARCTAN()</code>
  expands to <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO(, atan2, atan)()</code> expands to <code class="language-plaintext highlighter-rouge">atan()</code>. These are less
  user-friendly.</p>
  </li>
</ul>

<p>If you don’t mind relying on a C23/C++20 preprocessor feature, you can improve the latter
experience:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define GET_ARCTAN_MACRO(_1, _2, x, ...) x
#define ARCTAN(...) GET_ARCTAN_MACRO(__VA_ARGS__ __VA_OPT__(,) atan2, atan)(__VA_ARGS__)
</code></pre></div></div>

<p>Now <code class="language-plaintext highlighter-rouge">ARCTAN()</code> expands to <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO(atan2, atan)()</code>, which is more cleanly ill-formed:
the call supplies only two of the three arguments that <code class="language-plaintext highlighter-rouge">GET_ARCTAN_MACRO</code> requires.</p>

<blockquote>
  <p>You might think you could use <a href="https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html#:~:text=has%20a%20special%20meaning">a well-known GCC extension</a>
to write <code class="language-plaintext highlighter-rouge">__VA_ARGS__##,</code> — but no, the token-paste operator <code class="language-plaintext highlighter-rouge">##</code> has its special meaning
only within <code class="language-plaintext highlighter-rouge">,##__VA_ARGS__</code>, not within <code class="language-plaintext highlighter-rouge">__VA_ARGS__##,</code>.</p>
</blockquote>

<p>Boost.Preprocessor implements <a href="https://www.boost.org/doc/libs/latest/libs/preprocessor/doc/ref/variadic_size.html"><code class="language-plaintext highlighter-rouge">BOOST_PP_VARIADIC_SIZE</code></a>
via a minor variation on this idiom:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define GET_SIZE_MACRO(_1, _2, _3, _4, _5, x, ...) x
#define SIZE(...) GET_SIZE_MACRO(__VA_ARGS__ __VA_OPT__(,) 5,4,3,2,1,0)
</code></pre></div></div>
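
<p>A quick way to sanity-check the counting macro is with a handful of <code class="language-plaintext highlighter-rouge">static_assert</code>s.
This is only a sketch: it assumes a C++20 compiler (or one that accepts <code class="language-plaintext highlighter-rouge">__VA_OPT__</code> as an extension),
and the identifiers <code class="language-plaintext highlighter-rouge">a</code> through <code class="language-plaintext highlighter-rouge">e</code> are placeholders
that never need to be declared, because the macro discards them.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define GET_SIZE_MACRO(_1, _2, _3, _4, _5, x, ...) x
#define SIZE(...) GET_SIZE_MACRO(__VA_ARGS__ __VA_OPT__(,) 5,4,3,2,1,0)

// With no arguments, __VA_OPT__(,) vanishes and the sixth argument is 0.
static_assert(SIZE() == 0, "empty argument list");

// Each extra argument shifts the countdown list one place to the right.
static_assert(SIZE(a) == 1, "one argument");
static_assert(SIZE(a, b) == 2, "two arguments");
static_assert(SIZE(a, b, c, d, e) == 5, "five arguments, the most this version can count");

int main() {}
</code></pre></div></div>

<p>Past five arguments the sixth parameter is one of the caller’s own arguments, so the result is garbage;
counting higher means lengthening both the parameter list and the countdown.</p>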

<p>Hat tip to <a href="https://www.quarterstar.tech/2026/03/30/rust-to-cpp-implementing-the-question-mark-operator#the-implementation">this blog post by “Quarterstar”</a>
(March 2026); the technique is also shown by <a href="https://codelucky.com/c-macros/#6_Macro_Overloading">CodeLucky</a> (September 2024),
on <a href="https://stackoverflow.com/questions/11761703/overloading-macro-on-number-of-arguments">Stack Overflow</a> (2012),
and presumably in much older places.
<a href="https://github.com/search?q=%2FGET_.*_MACRO%5B%28%5D__VA_ARGS__%2F&amp;type=code">GitHub search</a> turns up many cases of the pattern,
even without considering variations in the <code class="language-plaintext highlighter-rouge">GET_*_MACRO</code> naming convention.</p>

<p>Caveat: As of 2026, MSVC’s preprocessor can’t handle this trick by default.
You have to tell it to behave conformingly, by adding <a href="https://learn.microsoft.com/en-us/cpp/build/reference/zc-preprocessor"><code class="language-plaintext highlighter-rouge">-Zc:preprocessor</code></a>
to your command line. (This is also how you get it to recognize <code class="language-plaintext highlighter-rouge">__VA_OPT__</code>!)
Alternatively, MSVC’s old non-conforming preprocessor will accept the code as long as it’s wrapped in
an additional layer of indirection. See
<a href="/blog/2018/06/18/fundamental-theorem-of-software-engineering/">“The Fundamental Theorem of Software Engineering”</a> (2018-06-18).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Work around MSVC's non-conforming preprocessor
#define EXPAND(x) x
#define GET_ARCTAN_MACRO(_1, _2, x, ...) x
#define ARCTAN(...) EXPAND(GET_ARCTAN_MACRO(__VA_ARGS__, atan2, atan)(__VA_ARGS__))
</code></pre></div></div>]]></content><author><name></name></author><category term="msvc" /><category term="preprocessor" /><summary type="html"><![CDATA[Here's a neat trick to create an "overloaded macro" in C, such that `M(x)` does one thing and `M(x, y)` does something else. For example, we could make a macro `ARCTAN` such that `ARCTAN(v)` calls `atan(v)` and `ARCTAN(y,x)` calls `atan2(y,x)`. #define GET_ARCTAN_MACRO(_1, _2, x, ...) x #define ARCTAN(...) GET_ARCTAN_MACRO(__VA_ARGS__, atan2, atan)(__VA_ARGS__)]]></summary></entry><entry><title type="html">Chromium’s `span`-over-initializer-list success story</title><link href="https://quuxplusone.github.io/blog/2026/03/19/p2447-success-story/" rel="alternate" type="text/html" title="Chromium’s `span`-over-initializer-list success story" /><published>2026-03-19T00:02:00+00:00</published><updated>2026-03-19T00:02:00+00:00</updated><id>https://quuxplusone.github.io/blog/2026/03/19/p2447-success-story</id><content type="html" xml:base="https://quuxplusone.github.io/blog/2026/03/19/p2447-success-story/"><![CDATA[<p>Previously: <a href="/blog/2021/10/03/p2447-span-from-initializer-list/">“<code class="language-plaintext highlighter-rouge">span</code> should have a converting constructor from <code class="language-plaintext highlighter-rouge">initializer_list</code>”</a>
(2021-10-03). This converting constructor was added by <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2447r6.html">P2447</a>
for C++26. Way back in 2024, Peter Kasting added the same constructor to Chromium’s
<a href="https://github.com/chromium/chromium/blob/4aa6967/base/containers/span.h#L1166-L1171"><code class="language-plaintext highlighter-rouge">base::span</code></a> —
he emailed me about it at the time — but I was only recently reminded that in
<a href="https://old.reddit.com/r/cpp/comments/1n8fbg8/showcasing_underappreciated_proposals/">the /r/cpp thread</a>
about the feature he’d written:</p>

<blockquote>
  <p>Yup, this change was so useful it led to me doing a ton of reworking of Chromium’s
<code class="language-plaintext highlighter-rouge">base::span</code> just so I could implement it there.</p>
</blockquote>

<p>Speaking of <a href="/blog/2026/03/19/seven-types-of-ambiguity/">ambiguity</a>: out of context that comment <em>could</em> be taken
as sarcasm. What programmer enjoys “doing a ton of reworking just” to implement a single new constructor?
Did he mean the change was so <em>useful</em>, or, like, “<em>so</em> useful”? :) So it’s worthwhile to track down
<a href="https://github.com/chromium/chromium/commit/7a129f92f54dafe6c3ef98030ebbdbc2704d3411">pkasting’s actual commit</a>
from November 2024 and see all the places he sincerely did clean up as a result.</p>

<p>What follows is a “close reading” of all the client call sites changed in Chromium commit
<a href="https://github.com/chromium/chromium/commit/7a129f92f54dafe6c3ef98030ebbdbc2704d3411">7a129f92f5</a>.</p>

<hr />

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>std::vector&lt;scoped_refptr&lt;const Cert&gt;&gt; certs(
    {kcer_cert_0, kcer_cert_1, kcer_cert_2, kcer_cert_3, kcer_cert_3,
     kcer_cert_2, kcer_cert_1, kcer_cert_0, kcer_cert_0, kcer_cert_2,
     kcer_cert_3, kcer_cert_1});
CertCache cache(certs);
</code></pre></div></div>

<p>becomes simply</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CertCache cache({kcer_cert_0, kcer_cert_1, kcer_cert_2, kcer_cert_3,
                 kcer_cert_3, kcer_cert_2, kcer_cert_1, kcer_cert_0,
                 kcer_cert_0, kcer_cert_2, kcer_cert_3, kcer_cert_1});
</code></pre></div></div>

<p>This is the poster-child use-case: the new code directly views a stack-allocated
<code class="language-plaintext highlighter-rouge">initializer_list</code>, where the old code had wasted time and memory copying the contents
of that <code class="language-plaintext highlighter-rouge">initializer_list</code> into a heap-allocated <code class="language-plaintext highlighter-rouge">vector</code>.
This being test code, we don’t really care about the new code’s improved efficiency,
but we do care about its improved readability and convenience.</p>
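
<p>To make the difference concrete, here’s a minimal sketch using a hypothetical <code class="language-plaintext highlighter-rouge">MiniSpan</code>.
It is neither Chromium’s <code class="language-plaintext highlighter-rouge">base::span</code> nor <code class="language-plaintext highlighter-rouge">std::span</code>,
just a read-only view with enough constructors to show both call styles. The consumer function <code class="language-plaintext highlighter-rouge">sum</code>
is likewise invented for the example.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;initializer_list&gt;
#include &lt;vector&gt;

// Hypothetical stand-in for a read-only span.
template&lt;class T&gt;
class MiniSpan {
    const T* data_;
    std::size_t size_;
public:
    // View the elements of an existing vector.
    MiniSpan(const std::vector&lt;T&gt;&amp; v) : data_(v.data()), size_(v.size()) {}
    // P2447-style constructor: view the initializer_list's backing array directly.
    MiniSpan(std::initializer_list&lt;T&gt; il) : data_(il.begin()), size_(il.size()) {}
    const T* begin() const { return data_; }
    const T* end() const { return data_ + size_; }
};

// Hypothetical consumer that takes a view by value.
int sum(MiniSpan&lt;int&gt; s) {
    int total = 0;
    for (int x : s) total += x;
    return total;
}

int main() {
    // Old style: copy the elements into a heap-allocated vector, then view the vector.
    std::vector&lt;int&gt; v({1, 2, 3, 4});
    assert(sum(v) == 10);

    // New style: view the braced list directly, with no vector in between.
    // The backing array lives until the end of the full-expression,
    // so sum() can safely read it during the call.
    assert(sum({1, 2, 3, 4}) == 10);
}
</code></pre></div></div>

<p>The usual caveat for any view applies: the braced list is safe as a function argument, but storing the
resulting view past the end of the full-expression would leave it dangling.</p>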

<hr />

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ASSERT_TRUE(ConfigureAppContainerSandbox(
    std::array&lt;const base::FilePath*, 2&gt;{&amp;pathA, &amp;pathB}));
</code></pre></div></div>

<p>becomes simply</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ASSERT_TRUE(ConfigureAppContainerSandbox({&amp;pathA, &amp;pathB}));
</code></pre></div></div>

<hr />

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>EXPECT_THAT(MapThenFilterStrings(
                {{"en", "de"}},
                base::BindRepeating(~~~~)),
            IsEmpty());
</code></pre></div></div>

<p>replaces its double-braces with single-braces.</p>

<hr />

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FetchImagesForURLs(base::span_from_ref(card_art_url),
                   base::span({AutofillImageFetcherBase::ImageSize::kSmall,
                               AutofillImageFetcherBase::ImageSize::kLarge}));
</code></pre></div></div>

<p>becomes simply</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FetchImagesForURLs(base::span_from_ref(card_art_url),
                   {AutofillImageFetcherBase::ImageSize::kSmall,
                    AutofillImageFetcherBase::ImageSize::kLarge});
</code></pre></div></div>

<p>Notice that these preceding three examples all had the same <em>intent</em> — to view a
fixed list of two items — but in the absence of natural syntax they invented three
different workarounds to imperfectly express their intent. (Temporary <code class="language-plaintext highlighter-rouge">std::array</code>;
doubled curly braces; explicit cast to <code class="language-plaintext highlighter-rouge">base::span</code>.) All three converged on the natural
syntax as soon as it became available.
Two of them benefit from <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2752r3.html">P2752</a>, too.</p>

<hr />

<p>There were two “failure stories” in Peter’s commit, both due to the
new constructor’s lack of CTAD. (I still don’t think anyone should
ever use CTAD, and LEWG was a little scared of adding it here anyway.)
For example Peter rewrote</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (base::span(box.type) == base::span({'f', 't', 'y', 'p'}))
</code></pre></div></div>

<p>into</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (base::span(box.type) == base::span&lt;const char&gt;({'f', 't', 'y', 'p'}))
</code></pre></div></div>

<p>Now, you might think after P2447 this could have become simply</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (base::span(box.type) == {'f', 't', 'y', 'p'})
</code></pre></div></div>

<p>but sadly no, for <a href="https://stackoverflow.com/questions/11420448/initializer-lists-and-rhs-of-operators">historical reasons</a>
a braced initializer list is grammatically disallowed
after most C++ operators (the exceptions being <a href="https://eel.is/c++draft/expr.yield#nt:yield-expression"><code class="language-plaintext highlighter-rouge">co_yield</code></a>
and <a href="https://eel.is/c++draft/expr.assign#nt:assignment-expression">the assignment operators</a>).
I myself would probably have written one of</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (std::string_view(box.type, 4) == "ftyp")

if (memcmp(box.type, "ftyp", 4) == 0)
</code></pre></div></div>
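
<p>Both alternatives can be tried in a small self-contained sketch. The <code class="language-plaintext highlighter-rouge">Box</code> struct
below is hypothetical: the post doesn’t show the real declaration of <code class="language-plaintext highlighter-rouge">box.type</code>, so I’m assuming
a plain array of four <code class="language-plaintext highlighter-rouge">char</code>s.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;cstring&gt;
#include &lt;string_view&gt;

// Hypothetical stand-in for the parsed box; a 4-char array is assumed.
struct Box {
    char type[4];
};

int main() {
    Box box = {{'f', 't', 'y', 'p'}};  // note: not NUL-terminated

    // View exactly four chars, then compare against the string literal.
    assert(std::string_view(box.type, 4) == "ftyp");

    // Or compare the four bytes directly.
    assert(std::memcmp(box.type, "ftyp", 4) == 0);

    // By contrast, a braced list on the right-hand side of == does not parse:
    //   if (std::string_view(box.type, 4) == {'f', 't', 'y', 'p'}) ...
}
</code></pre></div></div>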

<p>In the other “failure case,” Peter rewrote</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hosts[DnsHostsKey("localhost", ADDRESS_FAMILY_IPV4)] =
    IPAddress({192, 168, 1, 1});
</code></pre></div></div>

<p>into</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hosts[DnsHostsKey("localhost", ADDRESS_FAMILY_IPV4)] =
    IPAddress(base::span&lt;const uint8_t&gt;({192, 168, 1, 1}));
</code></pre></div></div>

<p>The trick here is that the old code (<a href="https://godbolt.org/z/hfYTqh3GW">Godbolt</a>)
wasn’t actually constructing a <code class="language-plaintext highlighter-rouge">span</code> at all; it was calling
<a href="https://github.com/chromium/chromium/blob/3afe88cc17a748340a53c3eea07fb706a5054af7/net/base/ip_address.h#L135-L137"><code class="language-plaintext highlighter-rouge">IPAddress</code>’s four-argument converting constructor</a>
followed by a redundant explicit cast to <code class="language-plaintext highlighter-rouge">IPAddress</code>.
Personally I would have preserved the old behavior and improved readability
at the same time by simply removing the curly braces:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hosts[DnsHostsKey("localhost", ADDRESS_FAMILY_IPV4)] =
    IPAddress(192, 168, 1, 1);
</code></pre></div></div>
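
<p>The overload-resolution story can be reproduced with a hypothetical <code class="language-plaintext highlighter-rouge">Addr</code> class.
This is a sketch, not Chromium’s <code class="language-plaintext highlighter-rouge">IPAddress</code>: it models only the four-argument converting
constructor, and the call counter is my own addition so that the constructor calls can be observed. It assumes C++17
(for the <code class="language-plaintext highlighter-rouge">inline</code> static data member).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;cstdint&gt;

// Hypothetical stand-in for an IP-address class with a four-octet constructor.
struct Addr {
    static inline int four_arg_calls = 0;  // counts calls to the constructor below
    std::uint8_t octets[4];

    Addr(std::uint8_t a, std::uint8_t b, std::uint8_t c, std::uint8_t d)
        : octets{a, b, c, d} { ++four_arg_calls; }
};

int main() {
    // With braces: the braced list first builds a temporary Addr via the
    // four-argument constructor; Addr(...) then move-constructs from that temporary.
    Addr braced = Addr({192, 168, 1, 1});
    assert(Addr::four_arg_calls == 1);

    // Without braces: the four-argument constructor is called directly.
    Addr plain = Addr(192, 168, 1, 1);
    assert(Addr::four_arg_calls == 2);

    // Either way, the resulting value is the same.
    assert(braced.octets[0] == 192);
    assert(plain.octets[0] == 192);
}
</code></pre></div></div>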

<p>UPDATE, 2026-03-27: After reading this blog post, Peter improved both of these “failure cases”
in the suggested ways; see <a href="https://github.com/chromium/chromium/commit/7107d5e85797db9ddc5382b90c6d0d3f0dde5509">commit 7107d5e857</a>!</p>]]></content><author><name></name></author><category term="c++-style" /><category term="initializer-list" /><category term="parameter-only-types" /><category term="proposal" /><summary type="html"><![CDATA[Previously: ["`span` should have a converting constructor from `initializer_list`"](/blog/2021/10/03/p2447-span-from-initializer-list/) (2021-10-03). This converting constructor was added by [P2447](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2447r6.html) for C++26. Way back in 2024, Peter Kasting added the same constructor to Chromium's [`base::span`](https://github.com/chromium/chromium/blob/4aa6967/base/containers/span.h#L1166-L1171) — he emailed me about it at the time — but I was only recently reminded that in [the /r/cpp thread](https://old.reddit.com/r/cpp/comments/1n8fbg8/showcasing_underappreciated_proposals/) about the feature he'd written: > Yup, this change was so useful it led to me doing a ton of reworking of Chromium's > `base::span` just so I could implement it there. Speaking of [ambiguity](/blog/2026/03/19/seven-types-of-ambiguity/): out of context that comment _could_ be taken as sarcasm. What programmer enjoys "doing a ton of reworking just" to implement a single new constructor? Did he mean the change was so _useful_, or, like, "_so_ useful"? :) So it's worthwhile to track down [pkasting's actual commit](https://github.com/chromium/chromium/commit/7a129f92f54dafe6c3ef98030ebbdbc2704d3411) from November 2024 and see all the places he sincerely did clean up as a result.]]></summary></entry></feed>