Tag Archives: C

undefined behavior and the purpose of C

C undefined behavior. From one of the LLVM developers:

This behavior enables an analysis known as “Type-Based Alias Analysis” (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code. For example, this rule allows clang to optimize this function:

float *P;
 void zero_array() {
   int i;
   for (i = 0; i < 10000; ++i)
     P[i] = 0.0f;
 }

into “memset(P, 0, 40000)“. This optimization also allows many loads to be hoisted out of loops, common subexpressions to be eliminated, etc. This class of undefined behavior can be disabled by passing the -fno-strict-aliasing flag, which disallows this analysis. When this flag is passed, Clang is required to compile this loop into 10000 4-byte stores (which is several times slower), because it has to assume that it is possible for any of the stores to change the value of P, as in something like this:

int main() {
  P = (float*)&P;  // cast causes TBAA violation in zero_array.
  zero_array();
}

This sort of type abuse is pretty uncommon, which is why the standard committee decided that the significant performance wins were worth the unexpected result for “reasonable” type casts.

The permitted optimization is very effective: the code is 80 (EIGHTY!) times faster when clang O3 optimization replaces the loop with a memset (some of that is other optimization, but still).  But the programmer has the option of using memset directly – which produces exactly the same optimization. In fact, the programmer has the option of using memset directly because fundamentally C wants to expose the underlying memory to the programmer.

The original motivation for leaving some C behavior undefined was that different processor architectures would produce different behaviors and the expert C programmer was supposed to know about those.  Now compiler writers and standards developers claim it is good to introduce “unexpected results” (that surprise the experienced programmer) because these permit a certain kind of optimization.

Violating Type Rules: It is undefined behavior to cast an int* to a float* and dereference it (accessing the “int” as if it were a “float”). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don’t want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc). This behavior enables an analysis known as “Type-Based Alias Analysis” (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code.

To me: “The rules for this are quite nuanced and I don’t want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc)” means, “we mucked up the standard and we are going to cause many systems to fail as these nuanced rules confuse and surprise otherwise careful and highly expert programmers”. Compiler writers like undefined behavior because, in their interpretation of the standard, these permit any arbitrary code transformation. Anything. The well known controversial results include removing checks for null pointers due to an unreliable compiler inference about dereference behavior.  These uses of “undefined” and limitations on reasonable type coercion are based on an incorrect idea of the purpose of the C language.  Unexpected results are a catastrophe for a C programmer. Limitations on compiler optimizations  based on second guessing the programmer are not catastrophes ( and nothing prevents compiler writers from adding suggestions about optimizations).  There are two cases for this loop:

  1. The C programmer is an expert who used a loop instead of memset for a good reason or because this is not a performance critical part of the code.
  2. The C is programmer is not an expert and program almost certainly contains algorithmic weaknesses that are more significant than the loop –

Neither case benefits from the optimization. Programmers who want the compiler to optimize their algorithms using clever transformations should use programming languages that are better suited to large scale compiler transformations where type information is clear indication of purpose. As Chris Lattner notes:

It is worth pointing out that Java gets the benefits of type-based optimizations without these drawbacks because it doesn’t have unsafe pointer casting in the language at all.

Java hides the memory model from the programmer to make programming safer and also to permit compilers to do clever transformations because the programmer is not permitted to interact with the low level system.  Optimization strategies for the C compiler should take into account that C does not put the programmer inside a nice abstract machine.  The  C compiler doesn’t need to be a 5th rate Mathematica or APL or FORTRAN or  Haskell.  Unexpected results are far more serious a problem for C than missed minor optimizations.

Some reading:

DJT on boring C.

Regehr’s guide for the perplexed.