Paxos (computer science)

from Wikipedia, the free encyclopedia

Paxos is a group of protocols designed to achieve consensus on a network of unreliable processors. Consensus refers to the agreement on a common result in a group of participants. The solution to this problem can be made more difficult if errors occur with the participants or their communication medium.

Consensus protocols are the basis for the state machine approach in Distributed Systems , proposed by Leslie Lamport and reviewed by Fred B. Schneider .

The Paxos Protocol was first published in 1989 and named after a fictional legislative consensus system that was used on the island of Paxos in Greece. In 1998 it was published as a journal article.

Paxos makes various compromises with regard to the number of processors, the number of message delays before learning the agreed value, the activity level of the individual participants, the number of messages sent and the types of errors. Although no deterministic fault tolerant consensus protocol can guarantee progress in an asynchronous network (this has been proven in a scientific article by Fischer, Lynch, and Paterson), Paxos guarantees consistency, and the conditions that could prevent progress are difficult to establish.

Paxos is usually used because of its robustness (for example to replicate a file or a database). The protocol tries to make progress even when there are times when a limited number of replicas are inaccessible. Paxos also has a mechanism to permanently remove faulty replicas or to add new ones.

history

In 1988 Lynch , Dwork and Stockmeyer demonstrated the solvability of the consensus problem in a broad group of so-called "semi-synchronous" systems. Paxos has many similarities with a protocol used for matching in viewstamped replication , first published in 1988 by Oki and Liskov .

Assumptions

In order to be able to present the functionality of Paxos more simply, the following assumptions and definitions are made explicitly:

Processors

  • Processors work at arbitrary speeds.
  • Processors can malfunction.
  • Processors with stable memory can reconnect to the protocol after a fault.
  • Processors do not try to bypass the protocol, which means that Byzantine errors do not occur .

network

  • Processors can send messages to any other processor.
  • Messages are sent asynchronously and require an arbitrary reception time.
  • Messages can be lost, rearranged, or duplicated.
  • Messages are transmitted without corruption, that is, there are no Byzantine errors .

Number of processors

In general, a consensus algorithm can progress by using 2F + 1 processors even though there is a simultaneous failure of F processors. However, through reconfiguration, a protocol can be used that can withstand any number of failures as long as no more than F processors fail at the same time.

roll

Paxos describes the actions of the processes through their respective roles in the protocol: Client, Acceptor, Proposer, Learner and Leader. In typical implementations, a single processor can serve one or more of these roles concurrently. This does not affect the correctness of the protocol, it is common to combine roles to improve the latency or the number of messages in the protocol.

Client

The client sends a request to the distributed system and waits for a response. For example, this could be a write request for a file on a server in the distributed system.

Acceptor

The acceptor acts as the fault tolerant "store" of the protocol. Acceptors can form quorums. Any message sent to an acceptor must be sent to a quorum of acceptors. Messages to an acceptor are ignored until copies have been received from each acceptor in a quorum.

Proposer

A proposer represents a client request and tries to get the acceptors to agree on it. They also act as coordinators in the event of a conflict.

Learner

Learners act as replication factors in the protocol. Once the acceptors have agreed on a client request, the learners can take action (for example, executing the request and sending a response to the client). It is possible to add more learners to improve processing availability.

leader

In order to ensure progress, Paxos needs a selected proposer, which is called a leader. Processes can believe they are the leaders, but progress is only guaranteed if one of them is chosen. If two processes believe they are leaders, they can delay the process by continuously suggesting clashing updates. But even in this case, the safety properties are maintained.

Quorums

Quorums ensure that at least a few processors and with them knowledge of the result survive. Quoras are defined as partial sums of the sum of all acceptors in such a way that each pair of partial sums shares at least one participant. Usually, a quorum is any majority of participating acceptors.

Suggested Number & Agreed Value

Any attempt to define an agreed value is carried out through proposals which acceptors can accept or reject. Each proposal has a unique number for the respective proposer. The value associated with the numbered proposal can optionally be calculated as part of the Paxos protocol.

Security features

In order to guarantee security, Paxos defines three security properties and ensures that these are adhered to regardless of errors that occur:

Non-triviality
Only suggested values ​​can be learned.
safety
Only one single value can be learned at a given point in time, that is, two different learners cannot learn two different values ​​at the same time.
Liveliness (C; L)
If a value C is proposed, a learner L will ultimately learn a value at some point, as long as enough processes remain error-free.

Typical use

In most Paxos applications, each participating process has three roles: Proposer, Acceptor and Learner. This significantly reduces the complexity of the messages without compromising their correctness:

"In Paxos, clients send commands to leaders. During normal operation, leaders receive the clients' commands, determine a new number of commands i and then begin the i-th instance of the consensus algorithm by sending messages to a group of acceptor processes."

By merging roles, the protocol works in the style of an efficient "client-master-replica" model, such as is typically used with databases. The advantage of Paxos is that it guarantees its security properties .

Individual evidence

  1. ^ Marshall Pease, Robert Shostak, Leslie Lamport: Reaching Agreement in the Presence of Faults . In: Journal of the Association for Computing Machinery . 27, No. 2, April 1980, pp. 228-234. doi : 10.1145 / 322186.322188 . Retrieved February 2, 2007.
  2. ^ Leslie Lamport: Time, Clocks and the Ordering of Events in a Distributed System . In: Communications of the ACM . 21, No. 7, July 1978, pp. 558-565. doi : 10.1145 / 359545.359563 . Retrieved February 2, 2007.
  3. ^ Fred Schneider: Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial . In: ACM Computing Surveys . 22, 1990, pp. 299-319. doi : 10.1145 / 98163.98167 .
  4. ^ Leslie Lamport's history of the paper
  5. ^ A b Leslie Lamport: The Part-Time Parliament . In: ACM Transactions on Computer Systems . 16, No. 2, May 1998, pp. 133-169. doi : 10.1145 / 279227.279229 . Retrieved February 2, 2007.
  6. M. Fischer, N. Lynch, M. Paterson: Impossibility of distributed consensus with one faulty process . In: Journal of the ACM . 32, No. 2, April 1985, pp. 374-382. doi : 10.1145 / 3149.214121 .
  7. , Mike Massa: Cheap Paxos . In: Proceedings of the International Conference on Dependable Systems and Networks (DSN 2004) ..
  8. a b c Leslie Lamport: Fast Paxos . 2005.
  9. a b c d Leslie Lamport: Generalized Consensus and Paxos . 2005.
  10. ^ Miguel Castro: Practical Byzantine Fault Tolerance . 2001.
  11. Cynthia Dwork, Nancy Lynch, Larry Stockmeyer: Consensus in the Presence of Partial Synchrony . In: Journal of the ACM . 35, April 1988, pp. 288-323. doi : 10.1145 / 42282.42283 .
  12. , Barbara Liskov: Viewstamped Replication: A New Primary Copy Method to Support Highly-Available Distributed Systems . In: PODC '88: Proceedings of the seventh annual ACM Symposium on Principles of Distributed Computing ., Pp. 8-17. doi : 10.1145 / 62546.62549
  13. ^ Leslie Lamport: Lower Bounds for Asynchronous Consensus . 2004.
  14. ^ Tushar Chandra, Robert Griesemer, Joshua Redstone: Paxos Made Live - An Engineering Perspective . In: PODC '07: 26th ACM Symposium on Principles of Distributed Computing . 2007.