The difference between unspecified, undefined, and non-deterministic

There is too much confusion in the “formal methods” computer science literature between these three different terms. Let me start with what this means for a state machine and then move on to engineering objects such as threads. Suppose we have a map α: X → Y  where X is the set of finite sequences over X. In an earlier post I explain how these maps are equivalent to state machines.  For a finite sequence z we can leave  α(z) unspecified – meaning, we don’t make any assurances about the value other than that  α(z)∈ X.  Alternatively, we can say that α(z) is undefined which is a definite assertion that α does not associate any element of X with z. If α is considered as a set of pairs (s,x) with the function property that (s,x)∈α and (s,x’)∈α implies x=x’, then specifying that  α(z) is undefined specifies that (z,x) ∉ α for any x.  So  “α(z) is undefined ” is, in fact, a quite precise specification. Finally, if we say “α(z) is non-deterministic ” we mean that α is not a function at all, it is a relation on E* cross X so that (s,x)∈α and (s,x’)∈α  and  x=x’ is true for at least one s, x and x’. These are all different, but it’s common to mix them up in “formal methods” which has bad mathematical effects and worse effects when trying to understand how computer systems work.

Programs are purely deterministic systems. Given the same inputs, they always compute the same outputs. Without this property, they would not be useful at all. Generally, when researchers say a program is non-deterministic, they are discussing a program that has unspecified or undefined behavior or inputs. It’s that last part that really introduces confusion. For example, it is common to say threaded programs are non-deterministic. But that’s not correct. The apparent non-determinism is due to differences in i/o and scheduling – both of which are inputs. If the operating system schedules a multi-threaded program the same way twice and the input data is the same, the computations will be identical.  The same can be said about i/o.

In fact, the underlying error is sloppy definition of system state. We can see this most clearly in Robin Milner’s CCS book where he speaks of the non-deterministic characteristic  of a device where he has somehow included the choices of the external operator or the weather conditions as part of system state. The consequence is that his state machines are non-deterministic, and this makes the math really complicated and slippery – for no good reason.