Image result for titanic

Via @johnregehr this further tale of  compiler “optimizations” that break correct C code without even a warning by deleting a necessary null pointer check.

In this example, Bar contains a stack of Foos. Sometimes the Foo stack is empty, so sometimes getLastFoo returns null. The caller is supposed to check for that case, but the bug is that doUB does not check for null.
The implementation of SmallVector::back checks that the vector must be non-empty, so you would think that enabling assertions would protect us from this bug, but it does not. Because of the way LLVM’s inliner works, empty() and back() are inlined into getLastFoo first. Some optimization (probably GVN) notices that the empty() check in the inlined code for back() is redundant with the empty check in the ternary, so it folds it to true. Now we have lost our assertion, but we still have the empty check.
Next we inline getLastFoo() into doUB(). At this point, SimplifyCFG (IMO a strange pass to do this) realizes that we always dereference the result of getLastFoo(). It simplifies the conditional branch to an unconditional branch that always goes down the “non-empty” path and removes the phi node with nullptr.
OK, so we’ve gotten rid of the null check, and the assertion is gone. But what about ASan? Why can’t it catch this bug? First, it doesn’t do any instrumentation to catch null dereferences because they are usually trivial bugs. Adding more instrumentation would be expensive, and we can usually rely on the processor to catch these bugs for us. At least, we can when the compiler isn’t actively making it harder to find bugs.
Second, ASan instruments the code after optimization, so by the time ASan shows up, it’s harder to catch the bug.

 

See also this and this and this. The Titanic illustration is there to commemorate the origin of much of the worst of UB in accommodating the Itanium.

More UB madness from LLVM/Clang
Tagged on: