The Enterprise Profile for PTP and TimeKeeper

One of the most interesting things we saw in the proposed IEEE 1588 enterprise profile was a bold suggestion on fault tolerance that looked familiar. Here’s the FSMLabs press release from September 2011:

TimeKeeper 5.0 offers the ability to monitor multiple time distribution channels, even those operating on different time distribution standards or of different quality due to distance or network issues. As an example, a TimeKeeper client may monitor two different Precision Time Protocol (PTP) “master clocks” and three different Network Time Protocol (NTP) servers. In addition, if the time quality of TimeKeeper’s primary sources becomes questionable, TimeKeeper can now switch from tracking one time source to another, according to a fail-over list provided at configuration time.

This press release described products that were already in the field in production. I remember that although customers liked this capability, talks at timing conferences often provoked complaints from engineers who insisted that the PTP “Best Master Clock” protocol already solved the problem. Anyway, it was gratifying to see that by February 2015 a similar, scaled-down capability was being proposed for the PTP Enterprise Profile:

Clocks SHOULD include support for multiple domains.  The purpose is to support multiple simultaneous masters for redundancy. Leaf devices (non-forwarding devices) can use timing information from multiple masters by combining information from multiple instantiations of a PTP stack, each operating in a different domain. Redundant sources of timing can be ensembled, and/or compared to check for faulty master clocks. The use of multiple simultaneous masters will help mitigate faulty masters reporting as healthy, network delay asymmetry, and security problems.  Security problems include man-in-the-middle attacks such as delay attacks, packet interception / manipulation attacks. Assuming the path to each master is different, failures malicious or otherwise would have to happen at more than one path simultaneously. Whenever feasible, the underlying network transport technology SHOULD be configured so that timing messages in different domains traverse different network paths.

Note that there are three things missing from this proposal that were in TimeKeeper 5.0 back in 2011: the ability to use NTP sources as well as PTP, the ability to use multiple PTP sources in the same domain, and working software. Stating “SHOULD” in a standard is a long way from “works in the field” but recognition of the problem is a good step.
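To make the idea of comparing redundant sources and walking a fail-over list concrete, here is a minimal sketch. It is not TimeKeeper's actual algorithm: the source names, the offsets, the tolerance, and the median-based sanity check are all assumptions made up for illustration.

```python
from statistics import median

def pick_source(failover_list, offsets_ns, tolerance_ns=1_000):
    """Return the first source in the fail-over list that agrees with the others.

    offsets_ns maps a (hypothetical) source name to the clock offset, in
    nanoseconds, that the source currently implies for the local clock.
    A source is treated as suspect when it disagrees with the median of
    all sources by more than tolerance_ns.
    """
    consensus = median(offsets_ns.values())
    for name in failover_list:
        if name in offsets_ns and abs(offsets_ns[name] - consensus) <= tolerance_ns:
            return name
    return None  # nothing trustworthy; caller decides what to do

# Two PTP masters and three NTP servers, as in the press release example.
offsets_ns = {
    "ptp-a": 35_000_000_000,  # made-up faulty master that still reports healthy
    "ptp-b": 120,
    "ntp-1": 900,
    "ntp-2": -400,
    "ntp-3": 650,
}
failover_list = ["ptp-a", "ptp-b", "ntp-1", "ntp-2", "ntp-3"]
chosen = pick_source(failover_list, offsets_ns)
print(chosen)
```

The primary source is skipped because it disagrees with everything else, which is exactly the kind of decision a single-source client cannot make.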


What does the UNIX file system do?

Unix, Linux, Windows and other operating systems and the world wide web all support file systems with the familiar path file names  like

 or "/system/passwords/secret/dontread.txt"

although sometimes with different separator characters between the individual “flat” file names. For example, Windows uses “\”. As long as we know how to separate flat file names in the sequence, it doesn’t matter. The flat file names are chained together in a path through the file system that shows “where” a file can be found. URLs in the world wide web are just path file names with some more information around them. Constructing the file system involves a clever technique for embedding a tree in a simpler file system where file names are just numbers.

For historical reasons, the base file system uses numbers called “inode numbers” to name files. Ignoring modifications, this file system looks like a function F: InodeNumbers → FileData. The tree emerges from information stored in some of the files. FileData includes some files that are just data and some files that are maps called “directories” (or “folders”). Directory maps have the form d: SimpleFileNames → InodeNumbers. If we have a path file name “a/b/c” and a starting inode number i, we can first get d_1 = F(i), the contents of file i, which should be a directory, then get i_a = d_1(a), the inode number of the file named a, and then d_a = F(i_a) and i_b = d_a(b) and d_b = F(i_b) and i_c = d_b(c), and then the contents of the file “a/b/c” is F(i_c), assuming that the path is defined. More concisely, we can write i_a = F(i)(a) and i_b = F(i_a)(b) and so on, where functions are resolved left to right: for example, F(i) is a map which is then applied to a.

Computing the translation of a path file name to an inode number can be defined recursively in terms of a function usually called namei (for names to inode numbers). If the path file name is the empty path, then we are already where it leads: namei(i,Empty) = i. If the path file name is not empty, it has the form a/p where a is a simple file name (of any length) and p is a path file name with one less simple file name in it than the original path: namei(i,a/p) = namei(F(i)(a),p). It’s possible that namei(i,p) is not defined: for example, F(i) might not even be a directory function, or it might be one but d=F(i) might not be defined on the leftmost simple file name in the path. In that case, we have “file not found”, or “404” in the case of a URL.
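The recursive definition translates almost directly into code. The following is a minimal sketch, assuming F is modelled as a Python dict from inode numbers to either bytes (plain data) or a dict (a directory map); the inode numbers and file contents are invented for illustration.

```python
# F: InodeNumbers -> FileData. Directories are dicts from simple names to inodes.
F = {
    1: {"system": 2},            # root directory
    2: {"passwords": 3},
    3: {"secret": 4},
    4: {"dontread.txt": 5},
    5: b"top secret",            # a plain data file
}
ROOT = 1

def namei(i, path):
    """Translate a path (a list of simple file names) to an inode number.

    namei(i, Empty) = i
    namei(i, a/p)   = namei(F(i)(a), p)
    """
    if not path:
        return i
    d = F.get(i)
    if not isinstance(d, dict) or path[0] not in d:
        raise FileNotFoundError("/".join(path))  # "file not found" / "404"
    return namei(d[path[0]], path[1:])

def U(path_string):
    """Contents of the file named by an absolute path: U(p) = F(namei(root, p))."""
    parts = [name for name in path_string.split("/") if name]
    return F[namei(ROOT, parts)]

print(U("/system/passwords/secret/dontread.txt"))
```

Each recursive call drops one simple file name from the front of the path, so the lookup always terminates.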

A UNIX-type file system has a special inode number for the “root” directory. For any path p, the file contents are then U(p) = F(namei(root,p)). A consistent file system will have at least the following properties.

  1. No orphans. For every i in InodeNumbers, if F(i) is defined there must be a path p so that namei(root, p) = i.
  2. No dangling references. For every i so that F(i) is a directory function and for every simple file name a so that F(i)(a) is defined, F(F(i)(a)) must also be defined (that is, if F(i)=d and d(a)=j, it must be the case that F(j) is defined).
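Both properties can be checked mechanically. This is a rough sketch of such a check, again assuming F is a dict whose directory entries are themselves dicts; the function names and the sample file system are made up for the example.

```python
def reachable(F, root):
    """Inode numbers i for which some path p gives namei(root, p) = i."""
    seen, stack = set(), [root]
    while stack:
        i = stack.pop()
        if i in seen or i not in F:
            continue
        seen.add(i)
        if isinstance(F[i], dict):       # directory: follow every entry
            stack.extend(F[i].values())
    return seen

def orphans(F, root):
    """Property 1 violations: F(i) is defined but no path from root leads to i."""
    return set(F) - reachable(F, root)

def dangling(F):
    """Property 2 violations: directory entries (i, a) where F(F(i)(a)) is undefined."""
    return {(i, a)
            for i, d in F.items() if isinstance(d, dict)
            for a, j in d.items() if j not in F}

# A deliberately broken file system: inode 3 is an orphan, entry "b" dangles.
broken = {1: {"a": 2, "b": 9}, 2: b"data", 3: b"lost"}
print(orphans(broken, 1), dangling(broken))
```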

Another useful property limits cycles or loops through the file system and aliases. If U(p) is a directory, let Children(p) = {a: U(p)(a) is defined} where a is a variable over flat file names. Then define find(p) = {p} if U(p) is not a directory or Children(p) = emptyset, and define find(p) = union{find(p/a): a in Children(p)} otherwise. If there are no loops, this is a well-defined function that terminates with the set of leaf nodes reachable from p. For example, if one were in an organization concerned about security, there might be regular monitoring of find(/home/snowden) to see if any unauthorized data had been collected.
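Here is a small sketch of find under the same dict-based model of F. As a shortcut it carries the current inode number alongside the path instead of recomputing U(p) at every step, and it assumes there are no loops (no “.” or “..” entries in the directory maps). The sample file system is invented.

```python
def find(F, i, path=""):
    """Return the set of leaf paths reachable from inode i, whose path is `path`."""
    node = F[i]
    if not isinstance(node, dict) or not node:
        # Not a directory, or a directory with no children: the path itself is a leaf.
        return {path or "/"}
    leaves = set()
    for a, j in node.items():
        leaves |= find(F, j, path + "/" + a)
    return leaves

F = {
    1: {"home": 2},
    2: {"snowden": 3},
    3: {"notes.txt": 4, "empty": 5},
    4: b"notes",
    5: {},                       # an empty directory counts as a leaf
}
print(sorted(find(F, 3, "/home/snowden")))
```

If the file system did contain a loop, this recursion would never bottom out, which is why the no-loop property matters.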

The most stringent non-alias requirement would be that if namei(root,p) = namei(root,q) then p=q. There can be no loops if there are no aliases. This requirement is usually relaxed to accommodate the “parent” and “self” pseudo file names, and hard and soft links. The simple file name “.” is usually reserved to mean “self” so that if F(i) is a directory F(i)(“.”) = i. The pseudo-file-name “..” is used for “parent” so that if F(i)(a)=j and F(j) is also a directory, then F(j)(“..”) = i. These pseudo-file names introduce both loops and aliases so we could just limit the requirement for no aliases to the cases where p and q don’t contain any pseudo-file-names. Note that the definition of the parent pseudo-file-name limits many kinds of loops because it cannot be that a directory points back at two different parents.

Soft links, a later addition to UNIX files, are a more complex problem. For soft links we add file contents that are path file names and modify namei so that if j=F(i)(a) is a soft link with F(j)=q, then namei(i,a/p) = namei(root,concat(q,p)). The original definition of namei has the nice property that the path shrinks by one flat name at every step; this change loses that property and makes it easy to create loops that never finish. The solution is to count soft links and just give up if resolving a path takes us through more than some set limit of soft links.
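The modified namei with a link counter might look like the sketch below. The SoftLink class, the limit of 8, and the sample file system are assumptions for illustration; real systems pick their own limits and error codes.

```python
from dataclasses import dataclass

@dataclass
class SoftLink:
    target: list  # a path from root, as a list of simple file names

MAX_LINKS = 8  # hypothetical limit

def namei(F, root, i, path, links_left=MAX_LINKS):
    """namei with soft links: give up after following MAX_LINKS links."""
    if not path:
        return i
    d = F.get(i)
    if not isinstance(d, dict) or path[0] not in d:
        raise FileNotFoundError("/".join(path))
    j = d[path[0]]
    if isinstance(F.get(j), SoftLink):
        if links_left == 0:
            raise OSError("too many levels of soft links")
        # namei(i, a/p) = namei(root, concat(q, p)) where q = F(j)
        return namei(F, root, root, F[j].target + path[1:], links_left - 1)
    return namei(F, root, j, path[1:], links_left)

F = {
    1: {"data": 2, "latest": 3, "loop": 4},
    2: {"x": 5},
    3: SoftLink(["data"]),   # /latest -> /data
    4: SoftLink(["loop"]),   # /loop -> /loop, a link that points at itself
    5: b"hi",
}
print(namei(F, 1, 1, ["latest", "x"]))
```

Resolving `["loop"]` in this example raises OSError once the counter runs out instead of recursing forever.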



Smart and dumb clients and the “so-called” Best Master Clock Algorithm in PTP IEEE 1588

The Best Master Clock (BMC) algorithm is a key part and key weakness of the PTP standard. The proposed enterprise profile for PTP calls it the “so-called ‘best master clock’” algorithm because it doesn’t actually pick the best master clock in an enterprise or telco environment. The BMC requires that each clock advertise a level of accuracy and the clients pick the best one (hence the name). In enterprise and telco environments, however, the accuracy of the delivered time usually has much more to do with packet delays than with the accuracy of the clock itself, so the BMC uses the wrong metric.

If you have two grand masters, one in the same data center as the clients, connected to them over 10G networks, and one that is connected by a dial-up modem on a phone at 3800 baud, and the first advertises itself as being 100 nanoseconds from GPS time while the second advertises itself as 50 nanoseconds from GPS time, the BMC says the client is stuck with the second one. The original PTP standard does not have any smart way of improving this situation because it’s designed for dumb clients on 1 wire networks, not smart clients on complex enterprise networks.

Even worse, the original PTP standard has no way to deal with bad grand master clocks that send out bad time but claim to be accurate. When the Euronext system failed 2 years ago, a broken grand master clock went off time by 35  seconds but kept claiming to be accurate down to the nanosecond. According to the original PTP standard, the clients would be forced to keep using that bad time even if they could detect the problem.  In fact, if you interpret the basic PTP standard in a way that makes it easy to write dumb clients, the Grand Masters themselves are supposed to turn off if they see a “better” time source.  So in the situation described in the preceding paragraph, the clock on the 10G network is supposed to go silent and let the packets coming over the dial up telephone modem control the clients. That’s why the proposed Enterprise PTP Profile has that snide line about the “so-called best master clock” algorithm. BMC often chooses a clock that is far from the best.
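The two-grand-master example can be put in a few lines of code. This is only a sketch of the simplified selection rule described in this post (pick the best advertised accuracy), not the full data set comparison in the standard, and the delay-aware alternative is just one plausible metric. The path jitter figures are made up.

```python
from dataclasses import dataclass

@dataclass
class Master:
    name: str
    advertised_error_ns: float  # what the clock claims about itself
    path_jitter_ns: float       # hypothetical: delay variation seen by the client

def bmc_style_choice(masters):
    """Dumb client: trust whatever accuracy each master advertises."""
    return min(masters, key=lambda m: m.advertised_error_ns)

def delay_aware_choice(masters):
    """Smart client: also account for what the network does to the packets."""
    return min(masters, key=lambda m: m.advertised_error_ns + m.path_jitter_ns)

local = Master("local-10G", advertised_error_ns=100, path_jitter_ns=200)
remote = Master("remote-modem", advertised_error_ns=50, path_jitter_ns=5_000_000)

print(bmc_style_choice([local, remote]).name)    # the modem-connected clock wins
print(delay_aware_choice([local, remote]).name)  # the local clock wins
```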

TimeKeeper is based on a smart client approach. Our clients will switch to a better PTP or NTP source from a clock that goes wrong because the primary goal is to track time accurately even when the standard is inadequate. TimeKeeper stays compatible with the standards but makes decisions in a wider context.  We run multiple client contexts for both PTP and NTP at the same time and use real-time data analysis to figure out which one gets to control the clock.  This system is used in some of the most demanding transaction systems in the financial trading markets.

PTP-2008, the latest version of the standard, makes some halting steps towards fixing the BMC problem through complex (untested and unimplemented) ideas like Grand Master Clustering. PTP-2008 also adds the possibility of “profiles” that can replace or amend BMC at the cost of making the standard less standard. Both the Telecom Profile and the proposed Enterprise Profile explicitly develop ways to defeat the BMC. Here’s something from the Enterprise Profile proposal:

 Slave clocks MUST be able to operate properly in a network which contains multiple Masters in multiple domains.  Slaves SHOULD make use of information from the all Masters in their clock control subsystems.

We think that’s a great idea – which is why our client software has been  managing multiple PTP and NTP sources for years.  I used to give talks at time conferences on our fault-tolerance methods and encounter either incomprehension or lectures about how the BMC solved the problem already. It is gratifying to see wider appreciation of the limitations of the BMC even if our solutions have not yet been appreciated (except by our customers).

German banking

Germany has a complex banking system that provides a lot of support for SME manufacturing.

The main strength of German economy is the so-called ‘Mittelstand’, a dense fabric of SMEs that manage to export high-quality manufactured products. In France, policymakers from all parties are obsessed with building a ‘French Mittelstand’. Economists are repeatedly asked to produce reports on what France could do to foster its SMEs the way Germany does (see for ex. Kohler D., Weisz J.-D. & FSI, 2012). At first glance, it does seem that the Mittelstand has done quite well throughout the financial crisis. This implies that German SMEs have enjoyed continued access to affordable financing since 2008. Indeed, at the macroeconomic level, studies show “no signs of credit crunch” (Ziebarth, 2013, see also Friderichs & Körting, 2011).

Right to private jet ski use while collecting disability

New York authorities said the warrants led to the indictments of firefighters, police officers, and civil servants on disability fraud charges. The Facebook data, which included user photos and videos, showed employees who claimed they were disabled performing a variety of activities, including fishing, martial arts, and even jet ski riding. – Ars Technica

Ikea and RedHat

This is state of the art for systems software now – which is not all that impressive.

Glantz explained that Ikea has more than 3,500 Red Hat Enterprise Linux (RHEL) servers deployed in Sweden and around the world. With Shellshock, every single one of those servers needed to be patched and updated to limit the risk of exploitation. So how did Ikea patch all those servers? Glantz showed a simple one-line Linux command and then jokingly walked away from the podium stating “That’s it, thanks for coming,” as the audience erupted into boisterous applause. On a more serious note, Glantz said that it took approximately 2.5 hours to test, deploy and upgrade Ikea’s entire IT infrastructure to defend against Shellshock.  Eweek.

Gilbert and Sullivan’s innovative business model

From the Financial Times: (and you should buy a subscription)

Piracy is a problem as old as the music industry itself. In Victorian times, it was illicitly copied sheet music that was the avowed enemy of the artist, and the operetta team Gilbert and Sullivan paid toughs to go round London pubs smashing up pianos with sledge hammers whenever they found bootlegged scores.