volatile means it really happens

Thanks to Fatih Bakir for reviewing a draft of this post. All remaining errors are mine, not his.
During my CppCon 2020 talk “Back to Basics: Concurrency,” someone asked, “Do we ever need
to use volatile?” To which I said, “No. Please don’t use volatile for anything ever.”
This was perhaps a bit overstated, but at least in the context of basic concurrency it was
accurate. To describe the role of volatile in C and C++, I often use the following slogan:

Marking a variable as volatile means that reads and writes to that variable really happen.
If you don’t know what this means, then you shouldn’t use volatile.
Consider the following snippet of C++, compiled on an average desktop system:
volatile int g = 0;

int test() {
    g = 1;
    g = 2;
    return g;
}
According to our slogan above, because g is declared volatile, we can be sure that the writes
to g really happen. But what does “happen” really mean, on a desktop system? For one thing,
it means that the compiler can’t just coalesce those two writes into a single write — it
can’t throw out the “dead” write g = 1. So it must emit the machine code for two writes to memory.
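For contrast, here is a minimal sketch of my own (not part of the original example) showing
the non-volatile case, where a typical optimizer is free to coalesce the two stores:

int h = 0;  // an ordinary, non-volatile global; hypothetical counterpart to g

int test_nonvolatile() {
    h = 1;      // dead store: an optimizer may delete it entirely
    h = 2;
    return h;   // and may fold this to "return 2" without reloading h
}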
But what is “memory” in this context? We probably have three levels of cache between
the CPU and main memory — L1 cache, L2 cache, L3 cache. So a “write to memory” really
means loading a cache line into L1 and then writing our new value into that cache line.
Does that count as “really happening,” for the purposes of volatile int g? Or does a write
only count as “really happening” if it makes it all the way back out to L2, or L3, or main
memory? What if we’re on a multiprocessor system and some other CPU has loaded a copy of
that cache line at the same time — does cache coherency need to be involved here?
Vice versa, what if volatile int g were a local variable; could it be stored in a register?
And what does it mean to “really write” to a register, in the presence of hardware techniques
like register renaming?
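As a concrete sketch of that scenario (my own illustration, not from the post’s examples),
a local volatile typically ends up with a real stack slot, and the compiler performs every
one of its loads and stores:

int count_to_1000() {
    volatile int i = 0;   // most compilers give i a stack slot, not just a register
    while (i < 1000) {
        i = i + 1;        // each iteration reloads and re-stores i through that slot
    }
    return i;
}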
This isn’t just a software issue; it’s a hardware issue. In order to know what it means for a read or write to “really happen,” you must first invent the universe.
The volatile keyword probably isn’t much use to you, unless you have deep knowledge of the
relevant pathway through your hardware platform. This is not the natural environment in
which most desktop programmers find themselves these days.
For an example where volatile is useful, consider the following low-level C++ code.
I’ve shamelessly condensed it from this StackOverflow question about BeagleBone GPIO:
#include <unistd.h>  // for sleep()

void blink_twice() {
    // Memory-mapped GPIO registers: writes to these addresses control the pin.
    volatile int *off = reinterpret_cast<int*>(0x40225190);
    volatile int *on = reinterpret_cast<int*>(0x40225194);
    int flag = (1 << 22);  // the bit for our particular pin
    *on = flag;
    sleep(1);
    *off = flag;
    sleep(1);
    *on = flag;
    sleep(1);
    *off = flag;
}
This code is intended for use specifically on a board
(specs)
where some ranges of memory are mapped to hardware activities such as turning on and off the
electrical current to an LED.
Just like on a desktop system, when you ask to read or write at a given memory address,
you’re speaking of virtual addresses; it’s the job of a thing called the
MMU to receive every read or write
request coming out of the CPU and translate each virtual address into its corresponding
physical address by consulting the page table.
But, on this particular system, when the MMU looks up address 0x40225194
in the page table,
it sees that that address is magic: it skips the L1 and L2 caches, and traffic to that
address on the bus isn’t handled by RAM at all, but instead by a specific hardware peripheral
(a GPIO pin connected to our
LED light).
So, on this particular system, what it means to “really write” to address 0x40225194
is simply
to get the write out of the CPU and onto the MMU. From there, the MMU will reliably trigger
the hardware activity we actually care about — turning on our LED light.
Notice that even on this system, it’s still unclear what it would mean to “really write” to
an address that isn’t magic to the MMU — say, 0xFFFF0042. For example, if the CPU issues
two writes to address 0xFFFF0042 in quick succession, they will both hit L1 cache.
When that cache line is eventually written back to main memory, there will be only one
atomically visible change (and it’ll be a whole 32-byte cache line at once). So, did
both writes “really” happen, or did only the second one happen?
Remember, volatile means nothing more than that our reads and writes will really happen.
If you can’t explain what it would mean for a read or write at some address to really happen,
then you shouldn’t bother qualifying that object with volatile.
Distinguish between the efficient and the teleological cause;
between the is and the ought.
It is not good enough to say, “I need this write to ‘really happen’ in the sense that
I need it to turn on that LED light; therefore I will use volatile here.”
The hardware doesn’t care about your needs. You must instead start from the efficient side:
“If I use volatile here, my compiler will ensure that writes to this variable
are not coalesced; therefore each write will make it to the MMU; the MMU will observe
that the written address is in an uncached range and divert it to the GPIO pin controlling
my LED.”
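To make that chain of reasoning concrete, here is a hedged sketch (my own wrapper, not part
of the original StackOverflow code) of how such a memory-mapped write might be packaged once
you have verified that pathway; the address and bit value are the same hypothetical ones as
in the snippet above:

#include <cstdint>

// Each call emits exactly one store: volatile keeps the compiler from coalescing
// or eliminating it. Whether the store "really happens" beyond that point depends
// on how the MMU maps this address range.
inline void write_reg(std::uintptr_t addr, std::uint32_t value) {
    *reinterpret_cast<volatile std::uint32_t*>(addr) = value;
}

void led_on() {
    write_reg(0x40225194, 1u << 22);  // the GPIO "set" register from the example above
}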
If you cannot explain the pathways involved at this level of detail — especially if your
explanations fail to survive their first encounter with caching or multiprocessing —
then volatile is not an appropriate tool for what you’re trying to do.
Marking a variable as volatile means that reads and writes to that variable really happen.
If you don’t know what this means, then you shouldn’t use volatile.