Paxos

Use two consensus-building phases to reach safe consensus even when nodes disconnect

Problem

When multiple nodes share state, they often need to agree among themselves on a particular value. With Leader and Followers, the leader decides and passes its value to the followers. But if there is no leader, the nodes need to determine a value themselves. (Even with Leader and Followers, they may need to do this to elect a leader.)

A leader can ensure that replicas safely acquire an update by using Two-Phase Commit, but without a leader, competing nodes may each attempt to gather a Majority Quorum. This process is further complicated because any node may fail or disconnect. A node may achieve a majority quorum on a value, but disconnect before it is able to communicate this value to the entire cluster.
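As a small illustration of why a Majority Quorum matters here, the sketch below computes the quorum size for a cluster and checks, by brute force, that any two majorities share at least one node. The function names `majority` and `quorums_always_overlap` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that is more than half the cluster."""
    return cluster_size // 2 + 1


def quorums_always_overlap(nodes: list) -> bool:
    """Check every pair of majority-sized groups for a shared node."""
    size = majority(len(nodes))
    groups = [set(g) for g in combinations(nodes, size)]
    return all(a & b for a in groups for b in groups)


nodes = ["n1", "n2", "n3", "n4", "n5"]
print(majority(len(nodes)))           # quorum size for five nodes
print(quorums_always_overlap(nodes))  # whether any two quorums intersect
```

Because any two majorities intersect, a value that reached a majority is always visible to at least one node in any later majority, even if the node that gathered the first quorum has disconnected.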

Solution

The Paxos algorithm was developed by Leslie Lamport and published in his 1998 paper "The Part-Time Parliament". Paxos works in three phases to make sure multiple nodes agree on the same value in spite of partial network or node failures. The first two phases build consensus around a value, and the last phase communicates that consensus to the remaining replicas.

  • Prepare phase: Establish the latest Generation Clock and gather any already accepted values.
  • Accept phase: Propose a value for this generation for replicas to accept.
  • Commit phase: Let all the replicas know that a value has been chosen.
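The three phases above can be sketched as a minimal, in-memory, single-value example. This is an illustration under simplifying assumptions, not the implementation from the book: calls are synchronous, nothing is persisted, generation numbers are plain integers supplied by the caller, and the names `Acceptor` and `propose` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    promised: int = 0                     # highest generation promised so far
    accepted_gen: int = 0                 # generation of the accepted value
    accepted_value: Optional[str] = None  # value accepted, if any
    committed: Optional[str] = None       # value learned in the commit phase

    def prepare(self, gen: int):
        """Promise to ignore older generations; report any accepted value."""
        if gen > self.promised:
            self.promised = gen
            return True, self.accepted_gen, self.accepted_value
        return False, self.accepted_gen, self.accepted_value

    def accept(self, gen: int, value: str) -> bool:
        """Accept the value unless a newer generation has been promised."""
        if gen >= self.promised:
            self.promised = gen
            self.accepted_gen = gen
            self.accepted_value = value
            return True
        return False

    def commit(self, value: str) -> None:
        self.committed = value


def propose(acceptors: List[Acceptor], gen: int, value: str) -> Optional[str]:
    """Run prepare, accept, commit; return the chosen value or None."""
    quorum = len(acceptors) // 2 + 1

    # Prepare phase: gather promises and any already accepted values.
    replies = [a.prepare(gen) for a in acceptors]
    granted = [(g, v) for ok, g, v in replies if ok]
    if len(granted) < quorum:
        return None
    # If some value was already accepted, adopt the one with the
    # highest generation instead of our own.
    _, prior_value = max(granted, key=lambda reply: reply[0])
    if prior_value is not None:
        value = prior_value

    # Accept phase: ask replicas to accept the value for this generation.
    acks = sum(a.accept(gen, value) for a in acceptors)
    if acks < quorum:
        return None

    # Commit phase: let all replicas know the value has been chosen.
    for a in acceptors:
        a.commit(value)
    return value


# A proposer gets "blue" accepted by two of three nodes, then disconnects
# before committing. A later proposer wanting "green" recovers "blue".
cluster = [Acceptor() for _ in range(3)]
for node in cluster:
    node.prepare(1)
cluster[0].accept(1, "blue")
cluster[1].accept(1, "blue")
print(propose(cluster, 2, "green"))  # prints "blue"
```

The key design point in this sketch is that a proposer never overrides a value it discovers during the prepare phase, which is what keeps a value safe once a majority has accepted it. A real implementation would also need durable state, unique generation numbers per proposer, and retries over an unreliable network.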

For more details, see Chapter 11 of the online ebook at oreilly.com.

This pattern is part of Patterns of Distributed Systems

23 November 2023