What breaks without implicit T*
-to-bool
conversion?
T*
-to-bool
conversion?Pop quiz: Does this program return 1
or 2
?
int f(bool) { return 1; }
int f(std::string_view) { return 2; }
int main() { return f("hello"); }
It returns 1
. Overload resolution sees that we have a choice of converting "hello"
(effectively a const char*
) to either bool
or string_view
. The former is a
built-in standard conversion, while the latter is a user-defined conversion (which
calls the constructor of the user-defined string_view
class type), so overload resolution
considers the former to be a better match.
However, this is surprising and usually unwanted. C++20 took a small step by deciding
(via P1957)
that a T*
-to-bool
conversion should be considered narrowing, and thus forbidden
in list-initialization. But could we go further and make T*
-to-bool
conversion
non-implicit?
The idea is to forbid implicit conversions—
void take(bool);
bool give(int *p) {
bool b = p; // OUTLAWED
take(p); // OUTLAWED
return p; // OUTLAWED
}
while still permitting explicit and contextual conversions—
bool okay(int *p) {
bool b(p); // OK
if (p) blah(); // OK
bool c = (p && p); // OK
bool d = (p || p); // OK
int e = (p ? 1 : 2); // OK
}
So, I patched my local copy of Clang to forbid implicit conversion from T*
to bool
.
See the patch here.
Here’s all the things that broke when I compiled LLVM/Clang itself with my patched compiler.
Unless you’re an LLVM maintainer, you might want to skip to the end. The majority of this post is a catalog of bugs and unfortunate code patterns, similar to what you’ll find on the PVS-Studio blog.
The “hypocritical operator bool
” pattern
As we know, you should always mark your operator bool
as explicit
, so that it will be available
for contextual conversions but not implicit ones. (That is, your class type will have the behavior
that T*
today fails to have.) But it seems to be very common that a programmer writing such
an operator bool
will rely on implicit conversion in the return
statement! “Rules for thee but
not for me,” you know.
struct MyPtr {
int *ptr_;
explicit operator bool() const { return ptr_; }
};
The Clang/LLVM codebase itself contains 30 instances of this
“hypocritical operator bool
” pattern. To eliminate the implicit T*
-to-bool
conversion here,
I recommend return ptr_ != nullptr
, return bool(ptr_)
, or return !!ptr_
.
(This audit incidentally turned up incorrectly non-explicit
operator bool
s
here
here,
here,
here;
and incorrectly non-const
-qualified operator bool
s
here,
here.
A good linter would flag all of these.)
Classes convertible to T*
but not bool
Consider:
std::atomic<int*> a;
if (a) {
My patched Clang complains: no viable conversion from 'std::atomic<int *>' to 'bool'
.
That’s because the conversion from atomic<T*>
to bool
happens in two steps: first
the user-defined conversion via atomic<T*>::operator T*() const
, and then a standard conversion
from T*
to bool
(which we chose to outlaw). The workaround here is to write
explicitly if (a.load())
.
Operations on atomics (even loads and stores) should always be explicit. Limit yourself to one atomic operation per source line.
The Clang/LLVM codebase contains two instances of this pattern with std::atomic
, and nine instances
involving user-defined “smart-pointer-like” types such as
class TypedTrackingMDRef
and
class MDOperand
which provide a converting operator T*
but no dedicated operator bool
.
Class types which provide
operator T*
should probably also provideexplicit operator bool
. However, your class type almost certainly should not provideoperator T*
to begin with. Prefer a named method such as.get()
,.value()
, or.load()
.
Also, classes whose operator T*
is never-null should probably not provide an
operator bool
, because that would just mask misuses; see “Probable bug #1” below.
assert
-like functions
The standard library’s assert
macro is defined
to wrap its argument in a contextual conversion to bool
.
But if you’ve written your own assert
-like function which is not a macro
and which takes bool
as its first parameter…
void my_assert(bool cond);
int main(int, char **argv) {
assert(argv); // OK, no problem
my_assert(argv); // OUTLAWED
}
The LLVM/Clang codebase has one such function, used twice.
A surprising T *isFoo()
antipattern
The usual convention in LLVM/Clang — and I hope in general — is to have paired accessors like this:
bool hasFoo() const;
const Foo *getFoo() const;
But in this one place the programmer has “cleverly” combined both accessors into one:
const FunctionDecl *isLocalClass() const;
Since pointers implicitly convert to bools, this lets the programmer write:
if (RD->isLocalClass())
const FunctionDecl *FD = RD->isLocalClass();
But it means that our patched compiler complains here:
error: assigning to 'bool' from incompatible type 'FunctionDecl *'
6902 | NeedInstantiate = RD->isLocalClass();
| ~~~~^~~~~~~~~~~~~~
The same pattern recurs here in LLVM.
Intentional use in bool
initializers and assignments
LLVM/Clang relies on pointer-to-boolean conversion pretty heavily, for things like this:
bool EHa = M->getModuleFlag("eh-asynch");
const bool HaveSpace = ::strchr(ArgV[I], ' ');
bool IgnoreErrors = Errs;
This kind of thing comes up about 73 times in LLVM/Clang. Each instance is trivial (if tedious) to patch:
bool EHa = (M->getModuleFlag("eh-asynch") != nullptr);
bool HaveSpace = (::strchr(ArgV[I], ' ') != nullptr);
bool IgnoreErrors = (Errs != nullptr);
A special case comes up at least 6 times, e.g. here and here:
isInvariantLoad = LI->getMetadata(LLVMContext::MD_invariant_load);
would be better written as
isInvariantLoad = LI->hasMetadata(LLVMContext::MD_invariant_load);
This is why these boolean accessors exist, after all!
Actual bug #1
Consider this code:
MachineBasicBlock &Src = *MI.getParent();
for (auto &Succ : Src.successors())
addInsertPoint(Src, Succ);
The patched compiler complains (in part):
llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:850:7:
error: no matching member function for call to 'addInsertPoint'
850 | addInsertPoint(Src, Succ);
| ^~~~~~~~~~~~~~
llvm/include/llvm/CodeGen/GlobalISel/RegBankSelect.h:375:10:
note: candidate function not viable: no known conversion from 'llvm::MachineBasicBlock *'
to 'MachineBasicBlock &' for 2nd argument; dereference the argument with *
375 | void addInsertPoint(MachineBasicBlock &Src, MachineBasicBlock &Dst);
| ^ ~~~~~~~~~~~~~~~~~~~~~~
llvm/include/llvm/CodeGen/GlobalISel/RegBankSelect.h:371:10:
note: candidate function not viable: no known conversion from 'llvm::MachineBasicBlock *'
to 'bool' for 2nd argument
371 | void addInsertPoint(MachineBasicBlock &MBB, bool Beginning);
| ^ ~~~~~~~~~~~~~~
In vanilla C++, it’s unambiguously that second overload of addInsertPoint
that gets called.
But circumstantial evidence suggests that the programmer meant to write
MachineBasicBlock &Src = *MI.getParent();
for (auto *Succ : Src.successors())
addInsertPoint(Src, *Succ);
to call the first overload instead.
Actual bug #2
Here:
error: no matching member function for call to 'shouldSignReturnAddress'
125 | if (!MFI->shouldSignReturnAddress(MF))
| ~~~~~^~~~~~~~~~~~~~~~~~~~~~~
note: candidate function not viable: no known conversion from 'const MachineFunction *'
to 'const MachineFunction' for 1st argument; dereference the argument with *
551 | bool shouldSignReturnAddress(const MachineFunction &MF) const;
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate function not viable: no known conversion from 'const MachineFunction *'
to 'bool' for 1st argument
552 | bool shouldSignReturnAddress(bool SpillsLR) const;
| ^ ~~~~~~~~~~~~~
Again, in vanilla C++ MF
would unambiguously convert to bool
and call the second overload.
The programmer wrote MF
when he meant *MF
.
Don’t gratuitously give two different functions the same name. These functions should have been named distinctly, e.g.
shouldSignReturnAddressOf(MF)
andshouldSignReturnAddressGivenSpill(bool)
. But even within an unavoidable overload set, you should avoid overloading on abool
argument, because of how promiscuously things convert tobool
. Prefer to take an enum or maybe even anint
, rather than abool
.
Actual bug #3
Here:
if (!UnrollRuntimeLoopRemainder(L, Count, /*AllowExpensiveTripCount*/ false,
/*UseEpilogRemainder*/ true,
UnrollRemainder, /*ForgetAllSCEV*/ false,
LI, SE, DT, AC, TTI, true,
SCEVCheapExpansionBudget, EpilogueLoop)) {
The patched compiler complains:
error: no matching function for call to 'UnrollRuntimeLoopRemainder'
note: candidate function not viable: no known conversion
from 'Loop **' to 'bool' for 14th argument
When you see the compiler talking about a function’s “14th argument,” you know you’re in for a fun time. The intended callee’s signature is:
LLVM_ABI bool UnrollRuntimeLoopRemainder(
Loop *L, unsigned Count, bool AllowExpensiveTripCount,
bool UseEpilogRemainder, bool UnrollRemainder, bool ForgetAllSCEV,
LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
const TargetTransformInfo *TTI, bool PreserveLCSSA,
unsigned SCEVExpansionBudget, bool RuntimeUnrollMultiExit,
Loop **ResultLoop = nullptr);
Default function arguments are the devil.
In this case, it looks like the programmer forgot to pass a boolean value for
the 14th argument RuntimeUnrollMultiExit
, and so his intended 15th argument,
ResultLoop
, was quietly coerced to bool instead. This should be fixed by eliminating
the default function argument and passing all 15 arguments explicitly at each call-site.
Probable bug #1: Testing never-null raw pointers
LLVM/Clang is full of places where a function takes a simple boolean, but the programmer apparently
thought something more structured was required. For example,
here
the programmer writes Builder.getTrue()
when a simple true
would suffice. Builder.getTrue()
returns a pointer to a compiler-internal data structure representing truthiness — which, since it is
a non-null pointer, implicitly converts to true
! Note that Builder.getFalse()
would also
evaluate to true
.
error: cannot initialize a parameter of type 'bool' with an rvalue of type 'ConstantInt *'
1678 | Builder.CreateFlagStore(Builder.getTrue(), NRVOFlag);
| ^~~~~~~~~~~~~~~~~
error: cannot initialize a parameter of type 'bool' with an lvalue of type 'llvm::IntegerType *'
4581 | ? EmitScalarExpr(Filter, CGM.Int32Ty)
| ^~~~~~~~~~~
And here
(along a codepath where Value
is invariably non-null):
error: cannot initialize a parameter of type 'bool' with an lvalue of type 'APValue *'
7061 | MaterializedType.isConstantStorage(getContext(), /*ExcludeCtor*/ Value,
| ^~~~~
Probable bug #2: Testing never-null smart pointers
error: no viable conversion from 'DirectoryEntryRef' to 'bool'
10255 | if (Cache->OrigEntry && Cache->OrigEntry->getDir()) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate function
87 | operator const DirectoryEntry *() const { return &getDirEntry(); }
| ^
This is the same pattern as “Classes convertible to T*
but not bool
” above; but in this case
the existing operator T*
invariably returns non-null. So this if
test was never testing anything.
This bug was detectable only because DirectoryEntryRef
defined operator T*
without operator bool
.
It certainly should not define either of those conversions (and my understanding is that that’s the
maintainers’ long-term plan, too).
Surprising usage in checkEnvVar
Here the Clang programmer has done something very weird. Essentially, it’s:
template<class T>
T checkEnvVar(const char *varname) {
const char *p = ::getenv(varname);
if (p == nullptr) return T();
return value;
}
The checkEnvVar
template is instantiated in only two ways. checkEnvVar<std::string>
returns ""
if the variable
is unset, or the contents of the variable if it’s set. checkEnvVar<bool>
returns false
if the
variable is unset, or true
if it’s set (even if it’s set to the empty string, or to 0
).
The same template code happens to implement both behaviors.
I was surprised to see this, but it really does seem to be intentional.
The workaround is to change return value
to return T(value)
.
The final tally
In the LLVM/Clang codebase as of this writing — I very roughly estimate 3.8 million lines of code —
here’s the total count of places that rely on implicit T*
-to-bool
conversion
in various ways:
Pattern | Count |
---|---|
getMetadata for hasMetadata |
6 |
checkEnvVar |
1 |
Actual bug #1 | 1 |
Actual bug #2 | 1 |
Actual bug #3 | 1 |
Probable bug #1 | 3 |
Probable bug #2 | 3 |
bool x = p; |
48 |
Other x = p; |
19 |
atomic<T*> |
2 |
Other operator T* without operator bool |
9 |
Hypocritical operator bool |
30 |
Other return p; |
117 |
Hash.AddBoolean(p) |
21 |
Diag(Loc, diag::err_riscv_type_requires_extension, D) |
8 |
assert_with_loc |
2 |
Other f(p) |
50 |
Total | 322 |
I also audited another, proprietary, C++17 codebase — very roughly 1.2 million lines of code.
Its base rate of T*
-to-bool
conversions per line is only 30% of LLVM/Clang’s. The audit
turned up one non-explicit
operator bool
, and zero bugs.
Pattern | Count |
---|---|
bool x = getenv(s); |
2 |
Other bool x = p; |
3 |
atomic<T*> |
3 |
Hypocritical operator bool |
2 |
Other return p; |
16 |
f(p) |
4 |
Total | 30 |