Data race when catching by non-const reference
People say “always catch exceptions by const
reference,” which I think is good advice.
I’ve worked on at least one codebase that catches by non-const
reference, presumably
just because it’s shorter to type. This has never caused any problem for them in practice,
as far as I know. But in theory, it’s a bad idea, because in C++ it is possible to
construct a program where two different catch-blocks, executing concurrently, actually
reference the exact same exception object. In that situation, any mutation to the
exception object will cause a data race. The easiest way to prevent the race is to
prevent mutation: to catch by const reference!
Here’s my toy test program.
We start with our exception object, which has a non-const
-qualified method
that mutates the object and checks to see if anyone else is mutating it at the
same time (because that would mean a data race in a real program):
struct Atomic {
std::atomic<int> value_;
void detect_racey_accesses() {
for (int i=0; i < 10000; ++i) {
int last_value = value_.load();
++value_;
int next_value = value_.load();
if (next_value != last_value + 1) {
printf("Detected a race: %d != %d+1\n", next_value, last_value);
}
}
}
};
(Now, technically, concurrent calls to detect_racey_accesses
do not cause data race
in the C++ sense, because I’m using an atomic<int>
; the data race here is at the logical
level but not the physical level. In real code, where value_
would be a plain old int
,
we would have a physical data race here.)
And here’s the program that throws an Atomic
, catches the same object by reference
from two different threads, and then detects racey mutation of that object:
int main()
{
std::promise<std::exception_ptr> prom;
std::thread T1(
[fut = prom.get_future()]() mutable {
std::exception_ptr ex = fut.get();
try {
std::rethrow_exception(ex);
} catch (Atomic& a) {
a.detect_racey_accesses(); // A
}
}
);
std::thread T2(
[prom = std::move(prom)]() mutable {
try {
try {
throw Atomic();
} catch (...) {
prom.set_value(std::current_exception());
throw;
}
} catch (Atomic& b) {
b.detect_racey_accesses(); // B
}
}
);
T1.join();
T2.join();
}
Lines A
and B
run concurrently, and the object referenced by Atomic& a
on
line A
is the exact same object as the object referenced by Atomic& b
on line B
.
So, catch your exceptions by const
reference — and follow the “const
means thread-safe”
rule in your own code — and you will avoid this extremely contrived but extremely
surprising situation.
This also explains why, in contrived code such as
catch (Foo& f) {
return f;
}
we do not, and cannot, get either copy elision or
implicit move
on return f
— it must make a copy, because the original exception object might still be in use
by another thread.