Distributed consensus involves organizing a collection of independent agents – processes or network sites – to agree on some value or sequence of values.  Many distributed consensus methods depend on a leader-follower scheme in which the leader is an agent that essentially tells the followers what the values are. The challenges in such methods are to determine when enough of the followers have accepted the value and how to recover from failures of agents. In particular, failures of the leader trigger some procedure to select a new leader.  Leader election, however, is a distributed consensus problem. In fact, leader election is the harder problem. Once there is a leader, consensus in the followers can be produced by a dead simple protocol (see the second part of this ).  Oddly, leader election is generally treated as a minor issue. For example, in “Paxos made simple” we read:

The famous result of Fischer, Lynch, and Patterson [1] implies that a reliable algorithm for electing a proposer must use either randomness or real time—for example, by using timeouts. However, safety is ensured regardless of the success or failure of the election.

The FLP result is essentially a tautology: if an agent doesn’t ever get any information that reliably distinguishes between failure and slow response in a second agent, the first agent cannot reliably distinguish between failure of the second agent and slow response.  So the import of the first sentence is that leader election depends on timeouts or “randomness” (perhaps this means some analysis of probability of failure scenarios).  I don’t think this is correct, but it’s an interesting claim. The second sentence says nothing more than that an algorithm that fails to progress will never produce a false result – which I think is also a dubious claim.

Algorithm P solves problem X by assuming some other mechanism solves X and then by using that mechanism to make problem X simpler.  Ok.

 

circularity problems in distributed consensus
Tagged on: