Category Archives: timekeeper

Time out of joint

Financial trading venues and trading systems operate so quickly and rely on clocks so deeply that events like the one noted in this FINRA report are more common than many understand:

The findings stated that the firm transmitted to OATS New Order Reports and related subsequent reports where the timestamp for the related subsequent report occurred prior to the receipt of the order,

In electronic trading such errors are easy to make. Two computer servers split the work in some data center and the clock on one is 10 milliseconds faster than the clock on the second. The faster device sends an order to a market and stamps it with the time. The slower device gets the response from the market and stamps it with the time.

Real time    Server One                      Server Two
12:00.000    Send order. Clock=12:00.010     Clock=12:00.000
12:00.005    Clock=12:00.015                 Get confirmation. Clock=12:00.005

In fact, for many trading organizations this scenario does not even require two servers because their clocks can jump backward.
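The two-server scenario above can be sketched in a few lines of Python. This is only an illustration: the `Server` class, its `offset_ms` field and the millisecond values are assumptions chosen to match the example, not anything taken from a real trading system.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """A server whose clock runs a fixed number of milliseconds ahead of true time."""
    name: str
    offset_ms: int  # hypothetical fixed clock error relative to true time

    def stamp(self, true_time_ms: int) -> int:
        # The timestamp a server records is true time plus its own clock error.
        return true_time_ms + self.offset_ms


server_one = Server("one", offset_ms=10)  # 10 ms fast
server_two = Server("two", offset_ms=0)   # on time

order_sent = server_one.stamp(0)        # order leaves at true time 0 ms
confirm_received = server_two.stamp(5)  # confirmation arrives at true time 5 ms

# The confirmation appears to precede the order it confirms.
inverted = confirm_received < order_sent
print(order_sent, confirm_received, inverted)
```

The recorded confirmation time (5 ms) is earlier than the recorded order time (10 ms), even though the confirmation really arrived 5 ms after the order was sent.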

 

IEEE 1588 PTP is a mess

IEEE 1588 was not designed for modern enterprise computer networks and contains many hacks to make it sort of work. The standard also suffers from being overly explicit on some things and too vague on others. One marker of the flawed process is that IEEE 1588 transparent clocks don’t really comply with Ethernet standards because they modify packets without changing the source MAC address. So in 2012 the 802.1 and 1588 standards groups started discussing what could be done. The 1588 committee notes that the “intent” (and practice) violates OSI layering but that 1588 doesn’t “mandate” that intent! Oy vey.

Questions have been raised concerning an IEEE 1588-2008 Transparent Clock layer 2 bridge modifying the CorrectionField of Ethernet transported PTP frames without changing the Ethernet source MAC address.  The question is if this operation is permitted by IEEE 802.1Q [1].  The original intent of the IEEE 1588-2008 standard was that a Transparent Clock will forward PTP event frames with no modifications except for the CorrectionField and FCS updates, however IEEE 1588-2008 does not mandate that.

ESMA clarifies time sources for MiFID II

isaacsz

ESMA just released guidelines that reinforce what was already clear in the MiFID II regulation: GPS time is a perfectly acceptable source of “traceable” time. There is a lot else of interest in this report, but it’s a good reminder not to be panicked by marketing scare tactics.

As per Article 1 of the MiFIR RTS 25, systems that provide direct traceability to the UTC time issued and maintained by a timing centre listed in the BIPM Annual Report on Time Activities are considered as acceptable to record reportable events. The use of the time source of the U.S. Global Positioning System (GPS) or any other global navigation satellite system such as the Russian GLONASS or European Galileo satellite system when it becomes operational is also acceptable to record reportable events. GPS time is different to UTC. However, the GPS time message also includes an offset from UTC (the leap seconds) and this offset should be combined with the GPS timestamp to provide a UTC timestamp.
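The last step in the guideline, combining the GPS timestamp with the broadcast offset, is simple arithmetic. Here is a minimal sketch, assuming the receiver reports GPS time as a week number plus seconds into the week. The function name and parameters are made up for illustration. The GPS-minus-UTC offset is passed in rather than hard-coded because it changes whenever a leap second is added (it has been 18 seconds since the start of 2017).

```python
from datetime import datetime, timedelta, timezone

# GPS time counts continuously from this epoch and ignores leap seconds.
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)


def gps_to_utc(week: int, seconds_of_week: float, gps_minus_utc_s: int) -> datetime:
    """Convert a GPS week/seconds pair to UTC using the broadcast leap-second offset.

    gps_minus_utc_s is the offset carried in the GPS time message
    (how many seconds GPS time is ahead of UTC).
    """
    gps_elapsed = timedelta(weeks=week, seconds=seconds_of_week)
    return GPS_EPOCH + gps_elapsed - timedelta(seconds=gps_minus_utc_s)


# Start of GPS week 2000 with an 18 second offset.
example = gps_to_utc(2000, 0, 18)
print(example.isoformat())
```

A timestamp that skips the subtraction is off by the full leap-second count, which is enormous next to the accuracy the regulation asks for.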

MiFID II, GPS and UTC time

I have a post up on the FSMLabs web site about the use of GPS and other satellite time for MiFID II timestamp compliance. It’s fascinating how much effort has recently gone into trying to convince people that MiFID II will require direct time from a national lab or certified via a national lab, despite the clear wording in the proposed MiFID II regulations. To me, the deal is sealed in the Cost Benefit Analysis, in which the ESMA regulators write:

“The final draft RTS also reduces costs of the initial draft RTS proposed in the CP by allowing UTC disseminated via satellite systems (i.e. GPS receiver or the use of other satellite systems when available)”

That is not a promise one can easily walk away from. ESMA justifies the regulations with a cost/benefit analysis in which the costs for time stamping are limited by license to use GPS time.  Of course, legal reasoning and logic are not always the same, but I’m trying to figure out how ESMA regulators could claim that they didn’t mean it, or why they would have such a motivation.

“South Sea Bubble” by Edward Matthew Ward, via Wikimedia: https://commons.wikimedia.org/wiki/File:South_Sea_Bubble.jpg#/media/File:South_Sea_Bubble.jpg

Windows support in TimeKeeper and MiFID II Compliance

TimeKeeper® PTP/NTP time synchronization software now works on Microsoft Windows® operating systems, including Windows 7, Windows 8 and Windows Server 2012. This has been a lot longer in coming than we originally expected because most of our early customers turned out to be Linux only. But bigger customers meant an increasing push for TK support for Windows, and tightening regulation of timestamp accuracy means customers now have more urgent reasons to upgrade time on all their systems. You can see how well it works if you can pick out that straight lime-green line in the graph above: that is TK running on one Windows box, with alternative time sync software running on the others.

Production tests (not lab tests) show that synchronization to within 10 microseconds or better is easily achievable. TimeKeeper Windows also provides high accuracy for cloud-based and other virtual machine instances.

MiFID2 and security – keeping track of the money

A shorter version of this post is on the  FSMLabs web site.  MiFID2 is a new set of regulations for the financial services industry in Europe that includes a much more rigorous approach to timestamps.  Timestamps are in many ways the foundation for data integrity in modern processing systems – which are distributed, high speed, and generally gigantic. But when regulations or business or other constraints require timestamps to really work, the issues of fault tolerance and security come up. It doesn’t matter how precise your time distribution is if a mistake or a hacker can easily turn it off or control it.

TimeKeeper incorporates a defense-in-depth design to protect it from deliberate security attacks and errors due to equipment failure or misconfiguration. This engineering approach was born out of a conviction that precise time synchronization would become a business and regulatory imperative.

  1. Recent disclosures of still more security problems in the NTPd implementation of NTP show how vulnerable time synchronization can be without proper attention to security. PTPd and related implementations of the PTP standard have similar vulnerabilities.
  2. Security and general failure tolerance should be on the minds of firms that are considering how to comply with the MiFID2 rules because time synchronization provides both a broad attack surface and a single point of failure unless properly implemented.

The first step towards non-naive time synchronization is a skeptical attitude on the part of IT managers and developers. Ask the right questions at acquisition and design time to prevent unpleasant surprises later.

One of the most dangerous aspects of the just-disclosed NTPd exploit is that NTPd will accept a message from any random source telling it to stop synchronizing with its actual time sources. Remember, NTPd is an implementation of NTP; other implementations may not suffer from the same flaw. That d is easy to overlook, but it’s key. TimeKeeper’s NTP and PTP implementations will, for example, ignore commands that do not come from the associated time source and will apply analytical skepticism to commands that do appear to come from the source. TimeKeeper dismisses many of these types of attacks immediately and will start throwing off alerts to provoke automated and human counter-measures. The strongest protection TimeKeeper offers, however, comes from its multi-source capabilities, which allow it to compare multiple time sources in real time and reject a primary source that has strayed.
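To make the idea of cross-checking concrete, here is a minimal sketch of one way a client could decide that its primary source has strayed: compare the offset measured against the primary with the median of the offsets measured against the other sources. This is not TimeKeeper's actual algorithm. The function name, the median test and the tolerance are all assumptions made for illustration.

```python
from statistics import median


def primary_has_strayed(primary_offset_us: float,
                        other_offsets_us: list[float],
                        tolerance_us: float) -> bool:
    """Return True if the primary disagrees with the other sources.

    Each offset is the local clock's measured offset from one source, in
    microseconds. The median of the secondary sources is used as the
    reference so that a single bad secondary cannot sway the decision.
    """
    reference = median(other_offsets_us)
    return abs(primary_offset_us - reference) > tolerance_us


# Hypothetical measurements against three secondary sources.
secondaries = [2.0, -1.5, 3.0]

print(primary_has_strayed(1.0, secondaries, tolerance_us=50.0))         # False
print(primary_has_strayed(35_000_000.0, secondaries, tolerance_us=50.0))  # True
```

A primary that is a microsecond away from the pack passes, while one that is 35 seconds away is flagged, whatever accuracy it claims for itself.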

Correct time travels a long, complex path from a source such as a GPS receiver or a feed like the one British Telecom is now providing. Among the questions system designers need to ask are the following two.

  1. Is the chain between source and client safeguarded comprehensively and instrumented end-to-end?
  2. Is there a way of cross-checking sources against other sources and rejecting bad sources?

Without positive answers to both of these questions, the time distribution technology is inherently fragile and robust MiFID2 timestamp compliance will be unavailable.

The painting is: “Quentin Massys 001” by Quentin Matsys (1456/1466–1530), from The Yorck Project: 10.000 Meisterwerke der Malerei. DVD-ROM, 2002. ISBN 3936122202. Distributed by DIRECTMEDIA Publishing GmbH. Licensed under Public Domain via Commons.

MiFID2 Timestamp regulations

There are a number of places in the new guidelines that increase the rigor required for timestamping data. One key part covers SIs (systematic internalizers), which operate somewhat like private exchanges. TimeKeeper’s ability to produce a traceable audit trail and to use multiple sources is designed for precisely this kind of application.

Moreover, the inclusion of the timestamp in the pre-trade information published by the SI is a key information for the client to better analyse ex-post the quality of prices quoted by SIs, and in particular to assess with accuracy the responsiveness of the SI and the validity periods of quotes. Without a timestamp assigned by the SI itself, market participants would need to rely on the information potentially provided by data vendors, the timestamps of which would be less accurate, especially when quotes are published through a website as pointed out by some respondents to the question on access to the quotes of SIs

Image is by Hans Holbein the Younger (1497/

The Enterprise Profile for PTP and TimeKeeper

One of the most interesting things we saw in the proposed IEEE 1588 enterprise profile was a bold suggestion on fault tolerance that looked familiar. Here’s the FSMLabs press release from September 2011:

TimeKeeper 5.0 offers the ability to monitor multiple time distribution channels, even those operating on different time distribution standards or of different quality due to distance or network issues. As an example, a TimeKeeper client may monitor two different Precision Time Protocol (PTP) “master clocks” and three different Network Time Protocol (NTP) servers. In addition, if the time quality of TimeKeeper’s primary sources becomes questionable, TimeKeeper can now switch from tracking one time source to another, according to a fail-over list provided at configuration time.
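The fail-over behaviour described in the press release can be pictured with a small sketch. This is not TimeKeeper code or configuration. The `Source` record, its `healthy` flag and the source names are hypothetical, and how a source's quality gets judged is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Source:
    name: str
    protocol: str  # "PTP" or "NTP"
    healthy: bool  # hypothetical result of some time-quality check


def select_source(failover_list: list[Source]) -> Optional[Source]:
    """Return the first healthy source in configured fail-over order."""
    for source in failover_list:
        if source.healthy:
            return source
    return None  # no usable source; a real client would hold over and alert


# Two PTP masters and three NTP servers, as in the press release example.
sources = [
    Source("ptp-master-a", "PTP", healthy=False),  # primary has gone bad
    Source("ptp-master-b", "PTP", healthy=True),
    Source("ntp-1", "NTP", healthy=True),
    Source("ntp-2", "NTP", healthy=True),
    Source("ntp-3", "NTP", healthy=True),
]

chosen = select_source(sources)
print(chosen.name if chosen else "none")
```

The list order expresses the operator's preference, so the client falls back to the second PTP master before it considers any NTP server.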

This press release described products that were already in the field in production. I remember that although customers liked this capability, talks at timing conferences often provoked complaints from engineers who insisted that the PTP “Best Master Clock” protocol already solved the problem. Anyway, it was gratifying to see that by February 2015 a similar, scaled-down capability was being proposed for the PTP Enterprise Profile.

Clocks SHOULD include support for multiple domains.  The purpose is to support multiple simultaneous masters for redundancy. Leaf devices (non-forwarding devices) can use timing information from multiple masters by combining information from multiple instantiations of a PTP stack, each operating in a different domain. Redundant sources of timing can be ensembled, and/or compared to check for faulty master clocks. The use of multiple simultaneous masters will help mitigate faulty masters reporting as healthy, network delay asymmetry, and security problems.  Security problems include man-in-the-middle attacks such as delay attacks, packet interception / manipulation attacks. Assuming the path to each master is different, failures malicious or otherwise would have to happen at more than one path simultaneously. Whenever feasible, the underlying network transport technology SHOULD be configured so that timing messages in different domains traverse different network paths.

Note that there are three things missing from this proposal that were in TimeKeeper 5.0 back in 2011: the ability to use NTP sources as well as PTP, the ability to use multiple PTP sources in the same domain, and working software. Stating “SHOULD” in a standard is a long way from “works in the field” but recognition of the problem is a good step.

 

Smart and dumb clients and the “so-called” Best Master Clock Algorithm in PTP IEEE 1588

The Best Master Clock (BMC) algorithm is a key part and key weakness of the PTP standard. The proposed enterprise profile for PTP calls it “the so-called ‘best master clock’” algorithm because it doesn’t actually pick the best master clock in an enterprise or telco environment. The BMC requires that each clock advertise a level of accuracy, and the clients pick the best one (hence the name). In enterprise and telco environments, however, the accuracy of the delivered time usually has much more to do with packet delays than with the accuracy of the clock itself, so the BMC uses the wrong metric.

Suppose you have two grand masters: one in the same data center as the clients, connected to them over 10G networks, and one connected by a dial-up modem on a phone line at 3800 baud. If the first advertises itself as being 100 nanoseconds from GPS time while the second advertises itself as 50 nanoseconds from GPS time, the BMC says the client is stuck with the second one. The original PTP standard does not have any smart way of improving this situation because it’s designed for dumb clients on 1 wire networks, not smart clients on complex enterprise networks.
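The two-grand-master example is easy to put into code. The sketch below is a deliberate simplification: the real BMC compares several advertised fields (priorities and clock class among them), and here everything is reduced to the advertised accuracy, as in the example. The `path_jitter_ns` field and the numbers in it are assumptions standing in for whatever a smart client could measure about the network path.

```python
from dataclasses import dataclass


@dataclass
class Master:
    name: str
    advertised_error_ns: float  # what the master claims about itself
    path_jitter_ns: float       # hypothetical client-side measurement of the path


def bmc_like_choice(masters: list[Master]) -> Master:
    """Pick purely on advertised accuracy, as a dumb client would."""
    return min(masters, key=lambda m: m.advertised_error_ns)


def delay_aware_choice(masters: list[Master]) -> Master:
    """Pick on advertised accuracy plus measured path quality."""
    return min(masters, key=lambda m: m.advertised_error_ns + m.path_jitter_ns)


masters = [
    Master("local-10G", advertised_error_ns=100, path_jitter_ns=500),
    Master("dial-up-modem", advertised_error_ns=50, path_jitter_ns=20_000_000),
]

print(bmc_like_choice(masters).name)     # dial-up-modem
print(delay_aware_choice(masters).name)  # local-10G
```

The advertised figure favors the modem-connected clock by 50 nanoseconds, while the path to it adds errors many orders of magnitude larger, which is exactly the metric the BMC never looks at.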

Even worse, the original PTP standard has no way to deal with bad grand master clocks that send out bad time but claim to be accurate. When the Euronext system failed 2 years ago, a broken grand master clock went off time by 35 seconds but kept claiming to be accurate down to the nanosecond. According to the original PTP standard, the clients would be forced to keep using that bad time even if they could detect the problem. In fact, if you interpret the basic PTP standard in a way that makes it easy to write dumb clients, the grand masters themselves are supposed to turn off if they see a “better” time source. So in the situation described in the preceding paragraph, the clock on the 10G network is supposed to go silent and let the packets coming over the dial-up telephone modem control the clients. That’s why the proposed Enterprise PTP Profile has that snide line about the “so-called best master clock” algorithm. BMC often chooses a clock that is far from the best.

TimeKeeper is based on a smart client approach. Our clients will switch to a better PTP or NTP source from a clock that goes wrong because the primary goal is to track time accurately even when the standard is inadequate. TimeKeeper stays compatible with the standards but makes decisions in a wider context.  We run multiple client contexts for both PTP and NTP at the same time and use real-time data analysis to figure out which one gets to control the clock.  This system is used in some of the most demanding transaction systems in the financial trading markets.

PTP-2008, the latest version of the standard, makes some halting steps towards fixing the BMC problem through complex (untested and unimplemented) ideas like Grand Master Clustering. PTP-2008 also adds the possibility of “profiles” that can replace or amend BMC at the cost of making the standard less standard. Both the Telecom Profile and the proposed Enterprise Profile explicitly develop ways to defeat the BMC. Here’s something from the Enterprise Profile proposal:

 Slave clocks MUST be able to operate properly in a network which contains multiple Masters in multiple domains.  Slaves SHOULD make use of information from the all Masters in their clock control subsystems.

We think that’s a great idea – which is why our client software has been  managing multiple PTP and NTP sources for years.  I used to give talks at time conferences on our fault-tolerance methods and encounter either incomprehension or lectures about how the BMC solved the problem already. It is gratifying to see wider appreciation of the limitations of the BMC even if our solutions have not yet been appreciated (except by our customers).

Project Roseline

Accurate and reliable knowledge of time is fundamental to cyber-physical systems for sensing, control, performance, and energy efficient integration of computing and communications. This simple statement underlies the RoseLine project.  Emerging CPS [Cyber Physical Systems – vy]  applications depend on precise knowledge of time to infer location and control communication.  There is a diversity of semantics used to describe time, and quality of time varies as we move up and down the system stack.  System designs tend to overcompensate for these uncertainties and the result is systems that may be over designed, in-efficient, and fragile.  The intellectual merit derives from the new  and fundamental concept of time and the holistic measure of quality of time (QoT) that captures metrics including resolution, accuracy, and stability.  

The project will build a system stack that enables new ways for clock hardware, OS, network services, and applications to learn, maintain and exchange information about time, influence component behavior, and robustly adapt to dynamic QoT requirements, as well as to benign and adversarial changes in operating conditions.  Application areas that will benefit from Quality of Time will include: smart grid, networked and coordinated control of aerospace systems, underwater sensing, and industrial automation.  The broader impact of the proposal is due to the foundational nature of the work which builds a robust and tunable quality of time that can be applied across a broad spectrum of applications that pervade modern life. Roseline.