Consensus. Consensus problems


Consensus problems
- all correct computers controlling a spaceship should decide to proceed with landing, or all of them should decide to abort (after each has proposed one action or the other)
- in an electronic money transfer transaction, all involved processes must consistently agree on whether to perform the transaction (debit and credit) or not
- in mutual exclusion, processes need to agree on which process enters the critical section
- in election, processes need to agree on the elected process
- in totally ordered multicast, processes need to agree on a consistent message delivery order
Distributed Systems - Fall 2001 IV - 65 Stefan Leue 2001

Recall process failure models
- crash failures: processes stop (fail) and remain silent afterwards
- byzantine failures: processes fail, but may still respond to the environment with arbitrary, erratic behavior (e.g., send false acknowledgements)
Addison-Wesley Publishers 2000

Factors threatening consensus
- failures
  - communication link or process failures
  - crash failures (fail-silent) or byzantine failures (arbitrary)
    - (named after the Byzantine Empire, 330-1453, in which unfaithfulness and untruthfulness have allegedly been very common)
- network characteristics
  - synchronous or asynchronous
- failure detectors
  - reliable or unreliable
- are messages authenticated (digitally signed) or not?
  - can a process lie about the content of a message that it received from a correct process?
  - can an adversary claim to send a message under a false sender's id?

Model
- processes communicating by message passing
- desirable: reaching consensus even in the presence of faults
  - assumption: communication is reliable, but processes may fail

The Consensus Problem (C)
- agreement on the value of a decision variable amongst all correct processes
  - p_i is in state undecided and proposes a single value v_i
  - next, processes communicate with each other to exchange values
  - in doing so, p_i sets its decision variable d_i and enters the decided state, after which the value of d_i remains unchanged
[Figure: consensus algorithm for three processes. P_1 proposes v_1 = proceed, P_2 proposes v_2 = proceed, P_3 proposes v_3 = abort and crashes; P_1 and P_2 decide d_1 := proceed and d_2 := proceed. Addison-Wesley Publishers 2000]

The Consensus Problem (C)
- properties of a consensus algorithm
  - termination: eventually, each correct process sets its decision variable
  - agreement: for all correct p_i and p_k, state(p_i) = state(p_k) = decided ⇒ d_i = d_k
  - integrity: if the correct processes all proposed the same value, then any correct process in the decided state has chosen that value
    - variation: ... then some correct process in the decided state has chosen that value
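
The three properties above can be checked mechanically on the outcome of a single finite run. The following is a minimal sketch, not part of the original slides: the function name `check_consensus` and the dictionary layout are assumptions made for illustration, and "termination" is approximated as "every correct process has decided by the end of the observed run".

```python
from typing import Dict, Hashable, Optional


def check_consensus(proposed: Dict[str, Hashable],
                    decided: Dict[str, Optional[Hashable]]) -> Dict[str, bool]:
    """Check termination, agreement and integrity for the correct processes.

    proposed: value v_i proposed by each correct process p_i
    decided:  decision d_i of each correct process (None = still undecided)
    Hypothetical helper for illustration only.
    """
    decisions = [decided.get(p) for p in proposed]
    done = [d for d in decisions if d is not None]

    # termination (finite-run approximation): every correct process decided
    termination = len(done) == len(decisions)
    # agreement: all decided correct processes hold the same value
    agreement = len(set(done)) <= 1
    # integrity: if all correct processes proposed the same value,
    # every decided correct process must have chosen that value
    proposals = set(proposed.values())
    integrity = len(proposals) != 1 or all(d in proposals for d in done)

    return {"termination": termination,
            "agreement": agreement,
            "integrity": integrity}
```

For the spaceship example, `check_consensus({"p1": "proceed", "p2": "proceed"}, {"p1": "proceed", "p2": "abort"})` reports a violation of both agreement and integrity.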

The Consensus Problem (C)
- algorithm to solve consensus in a failure-free environment
  - each process reliably multicasts its proposed value
  - after receiving the values of all processes, each process evaluates the consensus function majority(v_1, .., v_N), which returns the most often proposed value, or undefined if no majority exists [remark: other problem-specific functions are possible]
  - properties
    - termination: guaranteed by the reliability of the multicast
    - agreement, integrity: follow from the definition of majority and the integrity of reliable multicast (all processes evaluate the same function on the same data)
- when crashes occur
  - how to detect failure?
  - will the algorithm terminate?
- when byzantine failures occur
  - processes communicate random values
  - evaluation of the consensus function may be inconsistent
  - malevolent processes may deliberately propose false or inconsistent values
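
A minimal sketch of the failure-free algorithm described above. Two assumptions are made here that the slide does not fix: "majority" is read as a strict majority (more than half of the values), and `None` stands for "undefined". Reliable multicast is modelled simply by handing every process the same list of proposals; the function names are illustrative.

```python
from collections import Counter
from typing import Dict, Hashable, Optional, Sequence


def majority(values: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value proposed by more than half of the processes,
    or None (undefined) if no such value exists."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def failure_free_consensus(proposals: Dict[str, Hashable]) -> Dict[str, Optional[Hashable]]:
    """Every process receives all proposals (reliable multicast, no failures)
    and evaluates the same function on the same data."""
    delivered = list(proposals.values())
    return {p: majority(delivered) for p in proposals}
```

Because every process applies the same deterministic function to the same multiset of values, agreement holds by construction; this is exactly the argument that breaks once crashes or byzantine behavior make the delivered sets differ.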

The Byzantine Generals Problem (BG)
- three or more generals are to agree on an attack or a retreat
- commander issues the order
  - the others (lieutenants to the commander) have to decide to attack or retreat
- one of the generals may be treacherous
  - if the commander is treacherous, it proposes attacking to one general and retreating to the other
  - if lieutenants are treacherous, they tell one of their peers that the commander ordered to attack, and others that the commander ordered to retreat
- difference to the consensus problem: one process supplies a value that the others have to agree on
- properties
  - termination: eventually each correct process sets its decision variable
  - agreement: the decision value of all correct processes is the same
  - integrity: if the commander is correct, then all correct processes decide on the value that the commander proposes
    - note: integrity implies agreement only if the commander is correct, but the commander need not be correct (see above)

Interactive Consistency (IC)
- each process suggests one value
- goal: all correct processes agree on a vector of values, each component corresponding to one process's agreed value
  - example: agreement about each process's local state
- requirements
  - termination: eventually each correct process sets its decision variable
  - agreement: the decision vector of all correct processes is the same
  - integrity: if p_i is correct, then all correct processes decide on v_i as the i-th component of their vector

Relationship of Consensus to Other Problems
- assume that the previous problems could be solved, yielding the following decision variables
  - C_i(v_1, .., v_N) returns the decision value of p_i
  - BG_i(k, v) returns the decision value of p_i, where p_k is the commander, which proposes value v
  - IC_i(v_1, .., v_N)[k] returns the k-th value in the decision vector of p_i, where v_1, .., v_N are the values that the processes propose
- possibilities to derive solutions from these problem solutions
  - IC from BG
    - run BG N times, once with each p_k acting as commander
    - IC_i(v_1, .., v_N)[k] = BG_i(k, v_k)
  - C from IC
    - run IC to produce a vector of values at each process
    - apply an appropriate function on the vector's values to derive a single value
    - C_i(v_1, .., v_N) = majority(IC_i(v_1, .., v_N)[1], .., IC_i(v_1, .., v_N)[N])
  - BG from C
    - commander p_k sends its proposed value v to itself and each of the remaining processes
    - all processes run C with the values v_1, .., v_N that they receive
    - BG_i(k, v) = C_i(v_1, .., v_N)
  - termination, agreement and integrity are preserved in each case
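
The three reductions above can be written as higher-order functions. This is only a sketch under loud assumptions: `ideal_bg` is a hypothetical failure-free stand-in for a real BG solution, process and component indices are 0-based rather than 1-based, and `majority` is read as a strict majority with `None` for "undefined".

```python
from collections import Counter


def majority(values):
    """Value held by more than half of the entries, else None (undefined)."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def ideal_bg(i, k, v):
    """Hypothetical failure-free BG_i(k, v): p_i adopts commander p_k's value v."""
    return v


def ic_from_bg(bg, i, proposals):
    """IC_i(v_1..v_N)[k] = BG_i(k, v_k): one BG run per commander p_k."""
    return [bg(i, k, v_k) for k, v_k in enumerate(proposals)]


def c_from_ic(ic, i, proposals):
    """C_i(v_1..v_N) = majority of the components of p_i's IC vector."""
    return majority(ic(i, proposals))


def bg_from_c(c, i, k, v, n):
    """BG_i(k, v) = C_i(v_1..v_N), where every process received v from p_k.
    k is unused here because the stand-in commander is assumed correct."""
    received = [v] * n  # commander sends v to itself and all other processes
    return c(i, received)


# Chain the reductions: BG -> IC -> C -> BG
def ic(i, proposals):
    return ic_from_bg(ideal_bg, i, proposals)


def c(i, proposals):
    return c_from_ic(ic, i, proposals)


def bg(i, k, v, n=4):
    return bg_from_c(c, i, k, v, n)
```

With these definitions, `ic(0, ["a", "b", "a"])` yields the full vector, `c(1, ["a", "b", "a"])` yields `"a"`, and `bg(2, 0, "attack")` returns the commander's value.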

Relationship of Consensus to Other Problems
- solving consensus is equivalent to solving reliable, totally ordered multicast (RTO-multicast)
  - implementing consensus with RTO-multicast
    - collect all processes in one group g
    - each p_i performs RTO-multicast(g, v_i)
    - each p_i chooses d_i = m_i, where m_i is the first value that the RTO-multicast delivers
    - properties
      * termination follows from reliability of multicast
      * agreement and integrity follow from reliability and total ordering
  - implementing RTO-multicast from consensus can be shown as well
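
A small sketch of consensus built on RTO-multicast as outlined above. The function `rto_multicast_order` is a hypothetical stand-in for a real RTO-multicast layer: it merely picks one arbitrary global delivery order (here via a seeded shuffle) that every process then observes identically.

```python
import random


def rto_multicast_order(messages, seed=0):
    """Hypothetical stand-in for RTO-multicast: choose ONE global delivery
    order; reliability and total order mean all processes see this sequence."""
    order = list(messages)
    random.Random(seed).shuffle(order)
    return order


def consensus_via_rto(proposals, seed=0):
    """Each p_i multicasts v_i and decides on the first delivered value m_i."""
    messages = [(p, v) for p, v in proposals.items()]
    order = rto_multicast_order(messages, seed)
    first_value = order[0][1]
    # every process delivers the same first message, so d_i is identical
    return {p: first_value for p in proposals}
```

Which proposal wins depends on the delivery order, but because the order is the same at every process, all decisions coincide and the decided value is always one that was actually proposed.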

Consensus in Synchronous Networks
- assumption: no more than f of the N processes crash
- algorithm proceeds in f+1 rounds
  - processes B-multicast values between them
  - at the end of f+1 rounds, all surviving processes are in a position to agree

Consensus in Synchronous Networks
- Dolev-Strong algorithm
  - Values_i^r: set of proposed values known to process i at the beginning of round r
  - in each round, every process multicasts the set of values it has not sent in previous rounds
  - then takes delivery of values from other processes
  - round is potentially terminated by timeout
  - at the end of f+1 rounds, each process chooses the minimum value
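
The round structure above can be simulated in a few lines. This is a sketch under stated assumptions, not the original presentation of the algorithm: rounds are lock-step, a crash is described by a hypothetical `crashes` map `{pid: (round, recipients_reached)}` meaning that the process reaches only the listed recipients in that round and is silent afterwards, and proposed values must be comparable so that `min` is defined.

```python
def synchronous_consensus(proposals, f, crashes=None):
    """Simulate f+1 rounds of value exchange with crash failures.

    proposals: {pid: value}
    crashes:   {pid: (round, recipients_reached)}  (hypothetical crash schedule)
    Returns the decisions of the processes that survive all rounds.
    """
    crashes = crashes or {}
    values = {p: {v} for p, v in proposals.items()}   # Values_i^1
    sent = {p: set() for p in proposals}              # already multicast
    alive = set(proposals)

    for r in range(1, f + 2):                         # rounds 1 .. f+1
        inbox = {p: set() for p in proposals}
        for p in sorted(alive):
            new = values[p] - sent[p]                 # values not sent before
            sent[p] |= new
            if p in crashes and crashes[p][0] == r:
                targets = set(crashes[p][1])          # partial multicast, then crash
            else:
                targets = set(proposals)
            for q in targets:
                inbox[q] |= new
        # processes scheduled to crash in this round stop participating
        alive -= {p for p, (cr, _) in crashes.items() if cr == r}
        for p in alive:
            values[p] |= inbox[p]

    return {p: min(values[p]) for p in alive}
```

With proposals `{1: 3, 2: 1, 3: 2}` and process 2 crashing in round 1 after reaching only process 1, running two rounds (f = 1) lets process 1 forward the value 1 to process 3, so both survivors decide 1. Running a single round with the same crash leaves them with different sets, which illustrates why f+1 rounds are needed.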

Consensus in Synchronous Networks
- Dolev-Strong algorithm
  - termination: guaranteed through the synchrony property of the system
  - correctness: will every process arrive at the same set of values at the end of the final round?
    - if proven, integrity and agreement will follow, since processes consistently apply the minimum function to this set
  - proof sketch
    - assume two processes differ in their final set of values
    - hence, some correct process i possesses a value v that another correct process k (i ≠ k) does not possess
    - the only way to explain this is that some other process m, which sent v to i in the final round, crashed before v could be delivered to k
    - in turn, any process sending v in the previous round must have crashed, since otherwise k would already have received v
    - continuing this argument, we have to assume at least one crash per round
    - we have f+1 rounds, but at most f crashes, hence contradiction
- it can be shown that in synchronous systems, any algorithm to reach consensus, tolerating up to f crash or byzantine failures, requires at least f+1 rounds

Byzantine Generals Problem in Synchronous Network
- allow arbitrary (byzantine) failures
- up to f faulty processes
- correct processes can detect the absence of a message through timeout, but cannot conclude that the sender has crashed, since it may be silent for some time and then start sending messages again
- assume private communication channels
  - a fourth process cannot detect if one process sends messages with different content to two peers
  - no faulty process can inject messages into channels connecting correct processes
- assume that messages are not digitally signed (authenticated and verifiable)
- general result (Lamport, Shostak and Pease)
  - no solution if N ≤ 3f
  - give an algorithm for N ≥ 3f+1
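
The bound above can be turned into a quick arithmetic check: N ≥ 3f+1 is equivalent to f ≤ (N-1)/3. The helper names below are illustrative and not taken from the slides; the bound applies to the unsigned-message setting assumed here.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1 (unsigned messages)."""
    return max((n - 1) // 3, 0)


def bg_solvable(n: int, f: int) -> bool:
    """True if the N >= 3f + 1 condition for a solution holds."""
    return n >= 3 * f + 1
```

For example, three generals tolerate no traitor at all, four tolerate one, and seven tolerate two.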

Byzantine Generals Problem in Synchronous Network
[Figure: three Byzantine generals, two scenarios; faulty processes are shown shaded. Left: p_3 is faulty; commander p_1 sends 1:v to p_2 and p_3, p_2 relays 2:1:v, p_3 relays 3:1:u. Right: commander p_1 is faulty; it sends 1:w to p_2 and 1:x to p_3, p_2 relays 2:1:w, p_3 relays 3:1:x. Addison-Wesley Publishers 2000]
- impossibility for N = 3 processes
  - read 3:1:u as "three says one says u"
  - both scenarios show two rounds of messages
  - left: all p_2 knows is that it has received two different values
  - right: same situation, even though now the commander is faulty
  - assume a solution existed
    - in the left-hand scenario, p_2 would have to decide on value v, by the integrity condition of BG
  - no algorithm can distinguish locally for p_2 between the two scenarios
    - hence p_2 would need to decide on w (the value sent by the commander) in the right-hand scenario
  - same reasoning for p_3
    - it will have to decide on the commander's value x, which is a violation of agreement in the right-hand scenario, hence contradiction

Byzantine Generals Problem in Synchronous Network
- sketch of impossibility for N ≤ 3f (Pease, Shostak and Lamport)
  - assume a solution existed for N ≤ 3f
  - let each of three processes p_1, p_2 and p_3 simulate n_1, n_2 and n_3 generals, where n_1 + n_2 + n_3 = N and n_1, n_2, n_3 ≤ N/3
  - assume that one of the processes is faulty
  - correct processes simulate correct generals
    - internal interaction of own generals
    - send messages from own generals to those generals simulated by other processes
  - the generals simulated by the faulty process are faulty and may emit spurious messages
  - since n_1 + n_2 + n_3 = N and n_1, n_2, n_3 ≤ N/3, at most f generals are faulty
  - since the algorithm that is run on the generals is correct, the simulation will terminate
  - however, now there is a way for the two correct processes out of three to reach consensus: each process decides on the value chosen by all of its simulated generals
  - contradicts impossibility for N = 3

Byzantine Generals Problem in Synchronous Network
- solution for N ≥ 3f+1
  - solution by Pease, Shostak and Lamport too complex to present here
  - therefore: presentation of solution for N = 4, f = 1
  - correct generals reach agreement in two rounds:
    - first, the commander sends a value to each lieutenant
    - second, each lieutenant sends the value it received to all its peers
  - each lieutenant receives
    - a value from the commander
    - N-2 values from its peers
  - if the commander is faulty, then all lieutenants are correct, and each will have gathered exactly the set of values that the commander sent out
  - if one lieutenant is faulty, each of its correct peers receives N-2 copies of the value the commander sent out, plus the faulty lieutenant's value
  - to reach agreement, a simple majority function suffices
    - since N ≥ 4, N-2 ≥ 2, the majority function will ignore the value of the faulty lieutenant and produce the value of the commander if the commander is correct (it may produce ⊥ if the commander is faulty)
  - note: BG requires agreement on the commander's value only if the commander is correct
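
The two-round scheme for f = 1 described above can be simulated directly. This is a sketch with illustrative parameter names (`faulty_commander`, `faulty_relay` are assumptions, not from the slides); process 1 is the commander, processes 2..N are lieutenants, and `None` plays the role of ⊥ when no strict majority exists.

```python
from collections import Counter


def majority(values):
    """Value held by more than half of the entries, else None (stands for ⊥)."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def byzantine_generals(n, order, faulty_commander=None, faulty_relay=None):
    """Two-round BG for at most one faulty process.

    order:            value a correct commander sends to every lieutenant
    faulty_commander: optional {lieutenant: value} sent by a faulty commander
    faulty_relay:     optional (j, {peer: value}): what faulty lieutenant j
                      claims to its peers in round 2
    Returns the decisions of the correct lieutenants.
    """
    lieutenants = range(2, n + 1)

    # Round 1: commander sends a value to each lieutenant
    received = {i: (faulty_commander[i] if faulty_commander else order)
                for i in lieutenants}

    # Round 2: each lieutenant relays what it received to all its peers
    decisions = {}
    for i in lieutenants:
        if faulty_relay and i == faulty_relay[0]:
            continue  # the decision of a faulty lieutenant is irrelevant
        collected = [received[i]]
        for j in lieutenants:
            if j == i:
                continue
            if faulty_relay and j == faulty_relay[0]:
                collected.append(faulty_relay[1][i])  # arbitrary claim
            else:
                collected.append(received[j])         # honest relay
        decisions[i] = majority(collected)
    return decisions
```

With N = 4 and a faulty lieutenant 3 that tells lieutenant 2 "u" and lieutenant 4 "w", both correct lieutenants still decide the commander's value "v". If instead the commander sends three different values, every lieutenant gathers the same three-element set and all obtain `None` (⊥), so they still agree.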

Byzantine Generals Problem in Synchronous Network
[Figure: four Byzantine generals (N = 4, f = 1), three scenarios, each with commander p_1 and lieutenants p_2, p_3 and p_4 exchanging two rounds of messages. Addison-Wesley Publishers 2000]

Byzantine Generals Problem in Synchronous Network
[Figure: four Byzantine generals with the value sets gathered by each lieutenant, three scenarios. Addison-Wesley Publishers 2000]
- left: lieutenant p_3 is faulty and relays 3:1:u to p_2 and 3:1:w to p_4
  - p_2: majority({v,u,v}) = v
  - p_4: majority({v,v,w}) = v
- middle: the commander is faulty and sends w to p_3 and v to the other lieutenants
  - p_2: majority({v,w,v}) = v
  - p_3: majority({v,v,w}) = v
  - p_4: majority({w,v,v}) = v
- right: the commander is faulty and sends a different value to each lieutenant
  - p_2, p_3, p_4: majority({v,u,w}) = ⊥

Impossibility of Agreement in Asynchronous Systems
- previous algorithms: synchrony assumption
  - message exchanges in rounds
  - timeouts
- in asynchronous systems, no algorithm can guarantee reaching consensus, even with just one process crash failure (Fischer, Lynch and Paterson, 1985)
  - proof idea
    - show that there is always some continuation of the processes' execution that avoids consensus being reached


Impossibility of Agreement in Asynchronous Systems
- consequences
  - in asynchronous systems, no guaranteed solution to BG, IC, or reliable totally ordered (RTO) multicast
- of course, in practice consensus can often be reached, but a residual probability that consensus cannot be reached remains
- possible approaches to reaching consensus by weakening system assumptions
  - partial synchrony
  - masking faults
  - modified failure detectors
  - randomized algorithms

Impossibility of Agreement in Asynchronous Systems
- partial synchrony
  - message delays are bounded, but the bound is unknown, or
  - the bound is known, but transmission delays may be longer for some finite initial period of time
- masking faults
  - design the system so that failures appear like an intermittent slowdown in the processing of messages
    - store system state on persistent storage before crash
    - restart system in that state after recovery
- modified failure detectors
  - in ISIS system (Birman, 1993)
    - deem a process that has not responded as failed
    - treat this process as fail-silent, i.e., discard any subsequent messages from this process
    - problems:
      * long timeouts necessary
      * false suspicions of correct processes possible, which reduce the effectiveness of the system
  - eventually weak failure detector (Chandra and Toueg, 1996)
    - consensus can be solved, even with a weak failure detector, if fewer than N/2 processes crash and communication is reliable
    - eventually weak failure detector
      * eventually weakly complete: each faulty process is eventually suspected permanently by some correct process
      * eventually weakly accurate: after some time, at least one correct process is never suspected by any correct process
    - an eventually weak failure detector cannot be implemented in an asynchronous system based on message passing; however, failure detectors that adapt their timeout values can come close to an eventually weak failure detector
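
The last point, a failure detector that adapts its timeout values, can be sketched as follows. This is an illustrative heartbeat-based design and not an implementation from the cited work: the class name, the doubling backoff and the explicit `now` timestamps (passed in so the example is deterministic) are all assumptions.

```python
class AdaptiveFailureDetector:
    """Heartbeat-based detector that lengthens a process's timeout each time
    a suspicion turns out to be false (hypothetical sketch)."""

    def __init__(self, initial_timeout=1.0, backoff=2.0):
        self.initial_timeout = initial_timeout
        self.backoff = backoff
        self.timeout = {}      # per-process timeout
        self.last_heard = {}   # time of the most recent heartbeat
        self.suspected = set()

    def heartbeat(self, pid, now):
        """Record a heartbeat from pid received at time `now`."""
        if pid in self.suspected:
            # the suspicion was wrong: unsuspect and be more patient next time
            self.suspected.discard(pid)
            self.timeout[pid] = self.timeout.get(pid, self.initial_timeout) * self.backoff
        self.timeout.setdefault(pid, self.initial_timeout)
        self.last_heard[pid] = now

    def check(self, now):
        """Suspect every process whose last heartbeat is older than its timeout."""
        for pid, t in self.last_heard.items():
            if now - t > self.timeout[pid]:
                self.suspected.add(pid)
        return set(self.suspected)
```

A crashed process stops sending heartbeats and therefore stays suspected, while a slow but correct process is unsuspected on its next heartbeat and granted a longer timeout. If message delays eventually stay below some bound, the timeouts eventually exceed it and false suspicions stop; in a fully asynchronous system no such bound is guaranteed, which is why this only approximates an eventually weak failure detector.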