Soft real-time and QOS (revised)

[revised version of an older post]

“Soft real-time” is a perfect example of the “soft design” noted in an earlier post. There are perfectly good ways to characterize quality of service (QOS) assurances precisely. Doug Jensen proposes one possible definition:

The general case of a deadline (which is a soft deadline) has utility measured in terms of lateness (completion time minus deadline), tardiness (positive lateness), or earliness (negative lateness). Larger positive values of lateness or tardiness represent lower utility, and consequently larger positive values of earliness represent greater utility.

That is, he says a real-time system will have a function

0 <= Utility(Error) <= 1

where Error > 0 for a late computation and Error < 0 for an early computation. A hard real-time system will have Utility(Error) = 0 whenever |Error| is greater than some acceptable limit. That’s a good start, and there are other obvious ways to quantify.

In practice, utility functions may require history: dropping the fourth frame of video during a 1 second interval is different from dropping the first frame during that interval. We might specify 75 frames/second, which means about 13 milliseconds per frame, for flicker-free video. Then we could say that the average is x frames per second and that there are no outliers more than n standard deviations from the mean. Or we could require that over any interval of t seconds there will be at most n delays of more than E milliseconds. There will be a big difference between the requirements for editing (which may permit no frame delays of more than 100 microseconds) and those for consumer viewing, which will be a moving target but may allow dropping frames that are more than 2 milliseconds late while not permitting more than 1 frame to be dropped every 2 seconds, or something like that.

Note that one of the distinguishing characteristics of any type of real-time system is that timing is not subject to amortization. Being 10 seconds early and then 10 seconds late does not mean that the system is perfectly on time.
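To make the utility function and the amortization point concrete, here is a minimal sketch in Python. The linear shape of the function, the name utility, and the 2 millisecond limit are assumptions chosen for illustration; the only properties taken from the text are that the result stays between 0 and 1 and that it is 0 once |Error| exceeds the limit.

```python
from statistics import mean


def utility(error_ms, limit_ms):
    """Hypothetical utility function: 1.0 when exactly on time, falling
    linearly to 0.0 as |error| approaches limit_ms, and 0.0 beyond it."""
    magnitude = abs(error_ms)
    if magnitude > limit_ms:
        return 0.0
    return 1.0 - magnitude / limit_ms


# Ten seconds early, then ten seconds late (values in milliseconds).
errors_ms = [-10_000.0, 10_000.0]

# The errors average out to zero...
mean_error = mean(errors_ms)

# ...but each computation, judged on its own, has no utility at all.
utilities = [utility(e, limit_ms=2.0) for e in errors_ms]
```

Here mean_error is 0.0 while both entries of utilities are 0.0, which is the sense in which timing cannot be amortized: the specification has to be checked per computation, not on the average.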

In practice, we rarely see quantitative specifications of real-time behavior in “soft real-time” systems, probably because such specifications would reveal the engineering flaw in most soft real-time systems. In order to make any QOS assurances you need to be able to make hard assurances, and as soon as specifications are written down in any detail at all, this inconvenient problem becomes all too clear. If our specification is that over a one second period no more than 2 frames will be more than 100 microseconds late, then if the first two frames come in 110 microseconds after deadline, all of the remaining frames in that second must be no more than 100 microseconds late. The distinguishing characteristic of “soft” real-time, in practice, is that a soft real-time system can tolerate some timing errors before falling back on a more rigid timing requirement. Here’s a summary:
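The frame example can be sketched as a small sliding-window check. This is only an illustration under stated assumptions: the class name LatenessBudget, its parameters, and the choice to timestamp each late frame by its deadline are all hypothetical, and the 13,333 microsecond period simply corresponds to the 75 frames/second figure used earlier.

```python
from collections import deque


class LatenessBudget:
    """Hypothetical checker for a rule of the form: within any window of
    window_us microseconds, at most max_late frames may be more than
    threshold_us microseconds late."""

    def __init__(self, window_us, max_late, threshold_us):
        self.window_us = window_us
        self.max_late = max_late
        self.threshold_us = threshold_us
        self._late_deadlines = deque()  # deadlines of recent late frames

    def record(self, deadline_us, completion_us):
        """Record one frame; return True while the rule still holds."""
        lateness_us = completion_us - deadline_us
        # Forget late frames that have aged out of the window.
        while (self._late_deadlines
               and deadline_us - self._late_deadlines[0] >= self.window_us):
            self._late_deadlines.popleft()
        if lateness_us > self.threshold_us:
            self._late_deadlines.append(deadline_us)
        return len(self._late_deadlines) <= self.max_late


# One-second window, at most 2 frames more than 100 microseconds late.
budget = LatenessBudget(window_us=1_000_000, max_late=2, threshold_us=100)
period_us = 13_333  # roughly 75 frames/second

# Two frames 110 us late use up the budget; a third late frame fails.
lateness_us = [110, 110, 50, 120]
results = []
for i, late in enumerate(lateness_us):
    deadline = i * period_us
    results.append(budget.record(deadline, deadline + late))
```

After the first two frames the budget is spent, so for the rest of the window the 100 microsecond threshold behaves as a hard deadline: the frame that is 50 microseconds late is fine, and the one that is 120 microseconds late makes record return False.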

A hard real-time system has firm worst-case timing properties that must always be met to avoid failure.

A soft real-time system is a hard real-time system with some recoverable error modes.

But if we accept the above definition, then designing “soft real-time” systems turns out to be more demanding than designing hard real-time systems. Specifications like “seems peppy” will no longer be acceptable, and the types of sloppy mechanisms that can be shown to reduce the tail of distributions in some circumstances will be less appealing.