formal methods considered harmful and more on soft real-time

[fixed a couple of typos, Dec. 20 2007]

John Regehr writes:

On the other hand, there is plenty of useful work to be done on supporting time sensitive applications (I’ll just avoid saying “soft real-time”) even when no guarantees are made. Let’s take the example of an OS that disables interrupts when any process is executing inside the kernel. This is not hypothetical, I was recently talking to people designing such an OS because they are formally verifying it; they did not believe they were capable of verifying an OS that accepts interrupts more liberally. Anyway, needless to say the average-case response time to interrupts of this OS will suck. Now if we have some breakthrough in verification that permits interrupts to be accepted most places in this kernel, average case responsiveness will improve though worst-case may
not. Is this useless? Of course not, if it makes it possible to watch videos or listen to music or similar, on this OS. For applications of this kind a mathematical guarantee is simply overkill and what we really care about is more like “works almost all of the time in practice, given actual workloads.”

This is perfectly reasonable, in fact, there are many interesting properties of operating systems that are not real-time properties at all if we define real-time properly so that it does not end up meaning “any performance properties.” What’s unreasonable is this idea that crippling a piece of software to make it possible to “prove” that it “works” gains much of anything. That story about the drunk looking for his keys under the light-post is supposed to be a joke, not a design methodology. It’s fine to argue that by simplifying a system one can validate it while not losing too much performance or functionality for the purpose – but all too often the second part of the argument is omitted. What is the point of a system that is provably “correct” at a performance level that assures it cannot be used for its ostensible purpose? Worse, this methodology is often used in order to make it possible to apply “mathematical” methods which have not been ever shown to be more reliable than testing or code reviews or even good luck charms (the claim that “formally validated code is better than other code is, at best, unproven).

The phrase “mathematical guarantee ” is worth some deeper analysis.

  • Mathematical. The types of “hard specifications” that I am advocating do not need to be overly “mathematical” and certainly don’t need to be “formal” to be useful. On the contrary. The “formal methods” advocates have made a fetish of formalization to the detriment of useful engineering specification and that causes people to shy away from attempting to be more precise. What we want from a specification, at minimal, is that it should be falsifiable. Perhaps an incremental approach to adding fast paths to operating system services to reduce interrupt latencies can make the system described above “more responsive” for “standard workload”. If so, we should be able to characterize “more responsive” in some falsifiable way. If the specification is just “there is no wall-clock observable delay on the echoing of typed characters while stress program S runs in the background and a tester types a file in the ed program” or that “programs A,B, and C start within 5 seconds of double clicking the icon” then it is at least a falsifiable specification and is a concrete goal for the software developer. In any case, we can see that “average” is not what is needed, since “average” implies that if the system runs for 1 hour, a 5 minute total stop can be amortized away by 55 minutes of high speed operation. Perhaps what we want is as simple as “the video player on an otherwise unloaded system will not stop more than 10 seconds cumulatively over a one hour period where stops are delays in frame display that are over 1/60th of a second.” But this “soft” real-time requirement carries a real-time property with it – as argued previously.
  • Guarantee. There is a difference between a design guarantee and an evaluation guarantee. A TCP/IP network stack is designed to produce an in-order, reliable, stream of data between sender and receiver. But it is possible that in some situations, under some loads and with some definition of “reliable”, a simple datagram stack can produce the same property. The TCP/IP stack is designed to guarantee a property that the simpler stack may be shown to have in some circumstances. An evaluation guaranty is an attempt to show that a system has some property. One of the most useful ideas in Common Criteria is the evaluation assurance level (EAL) which is supposed to indicate the reliability of such a guaranty. One might find “we ran it a couple of times and it looked good” to be less or more reassuring than “we produced a 300 page document that strenuously exercised the symbol production features of LaTex” but these are both “evaluation guarantees”. Note that design guarantees should be backed up by evaluation guarantees that the design is correct and implemented correctly.

“Real-time software” normally refers to software that offers design guarantees on latency and scheduling times, data rates, or other time sensitive operations. Validating that the design succeeds requires some combination of testing, mathematical proof, and design and code review. But we may also be able to validate real-time properties of code that has no design real-time properties at all. One could take an instance of a LAMP stack (Linux + Apache + MySQL + PHP) and subject it to extensive testing on a particular hardware platform and observe that in practice it satisfies a real-time specification. Suppose our measurements show that all pages of a certain limited complexity will be “served” within 1 second over a 1GB network that is reserved for the LAMP system and where requests are at a rate bounded by R and there are no more than 30 seconds cumulative failure on this property over 1 hour. That’s a falsifiable, concrete, real-time specification – mathematical, even [this is in response to a note from B., a very knowledgeable system designer who may or may not be quotable].