Types considered harmful II

Russell introduced what is often viewed as his most original contribution to mathematical logic: his theory of types—which in essence tries to distinguish between sets, sets of sets, etc. by considering them to be of different “types”, and then restricts how they can be combined. I must say that I consider types to be something of a hack. And indeed I have always felt that the related idea of “data types” has very much served to hold up the long-term development of  programming languages. (Mathematica, for example, gets great flexibility precisely from avoiding the use of types—even if internally it does use something like them to achieve various practical efficiencies.) – Wolfram.

There are two different reasons for being interested in types. The first comes out of metamathematics and it’s not really all that interesting. The second is to prevent type errors and to facilitate efficient compilation and execution in programming languages. Type errors mostly involve illicit combination of elements that have incompatible semantics but compatible representations. For example, the Mars Climate Orbiter spacecraft was lost because quantities in metric units and English units were mixed. If x is an integer representing centimeters and y is an integer representing inches, x+y should either provoke a type conversion (which may be impossible to get right) or an error, preferably at the compilation stage, not the combustion stage. Types can also facilitate efficient computation by making it possible to compile, for example, an addition of C unsigned ints into a single machine instruction, or to use SIMD instructions. Unfortunately, the major uses of types in programming languages are to saddle programmers with silly, cumbersome bureaucracy, to make some weird point about category theory or lambda calculus (e.g. the type system in something like Haskell), or to apply apparently random logic (the C standard).
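As a small illustration of the second point (the function below and its name are mine, purely for illustration), fixed operand types let a compiler translate an addition directly into a machine add, with unsigned arithmetic defined to wrap:

#include <stdio.h>

/* Because the operand types are known at compile time, no runtime
   dispatch or boxing is needed; unsigned arithmetic wraps modulo 2^N. */
unsigned int add_u(unsigned int a, unsigned int b){
    return a + b;   /* typically a single add (or lea) instruction at -O1 and above */
}

int main(void){
    /* with the usual 32-bit unsigned int this wraps and prints 705032704 */
    printf("%u\n", add_u(3000000000u, 2000000000u));
    return 0;
}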

C as developed by the C standard committees and C compiler language lawyers has become an illogical hodgepodge. For example, typedefs were added to the standard but apparently are not to be taken seriously: a typedef only introduces another name for an existing type, not a distinct type, so the compiler has nothing to enforce. The following program compiles with not a single error or warning.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv){
    /* metric_t and english_t are mere aliases for int, so the compiler
       happily accepts mixing them. */
    typedef int metric_t;
    typedef int english_t;
    metric_t x;
    english_t y;

    if (argc < 3){
        printf("test inches centimeters\n");
        exit(0);
    }
    y = atoi(argv[1]);  /* inches */
    x = atoi(argv[2]);  /* centimeters */

    printf("x= %d y=%d x+y = %d\n", x, y, x + y);  /* no warning for x+y */
    return 0;
}
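For contrast, here is a sketch of my own (single-member struct wrappers, not anything the standard requires) in which the same mixing is rejected at compile time, because distinct struct types cannot be combined by accident:

#include <stdio.h>

/* Single-member structs are distinct types, so the compiler rejects
   accidental mixing; any conversion has to be written out explicitly. */
typedef struct { int cm; } metric_t;
typedef struct { int in; } english_t;

int main(void){
    metric_t x = { 30 };
    english_t y = { 12 };
    /* int sum = x + y;   -- error: invalid operands to binary + */
    int sum = x.cm + (int)(y.in * 2.54);  /* the conversion is now explicit and visible */
    printf("sum in cm = %d\n", sum);
    return 0;
}

The cost is that every conversion must be written out by hand, which is much of the bureaucracy complained about above; the point is only that the language can express the distinction where typedef cannot.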

Yet, as I complained in an earlier post, GCC/Clang compiler authors believe they can generate code that makes “int” variables overflow, but then remove checks for overflow on the grounds that overflow is “impossible” under the standard. So if x is a signed int, Clang/GCC will compile “x+1” into a single add instruction, which wraps on overflow, but will feel free to “optimize”

if (x + 1 < x) dooverflow();

by deleting it, since overflow is “impossible” in the standard according to a stupid interpretation of a woolly specification. The result is a beautiful combination of lax type enforcement, arbitrary promotion of types, and complex type rules nobody understands, producing “optimizations” that are not optimizations at all but just the introduction of random errors. And yet, flawed as it is, C remains enormously useful because, with care, it can still provide an honest model of the computer.
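For reference, the check itself can be rewritten in a form whose behavior the standard does define, so the optimizer has no license to delete it. A minimal sketch (the function name is my own):

#include <limits.h>
#include <stdio.h>

/* Test *before* adding, so signed overflow never occurs in the abstract
   machine and the check cannot be "optimized" away. GCC and Clang also
   provide __builtin_add_overflow() as a non-standard alternative. */
int increment_or_report(int x){
    if (x > INT_MAX - 1){
        fprintf(stderr, "would overflow\n");
        return INT_MAX;
    }
    return x + 1;
}

int main(void){
    printf("%d\n", increment_or_report(INT_MAX));  /* reports overflow, prints INT_MAX */
    return 0;
}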