Agreement Protocols. CS60002: Distributed Systems. Pallab Dasgupta Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur

Classification of Faults
Based on components that failed:
- Program / process
- Processor / machine
- Link
- Storage
- Clock
Based on behavior of faulty component:
- Crash: just halts
- Failstop: crash with additional conditions
- Omission: fails to perform some steps
- Byzantine: behaves arbitrarily
- Timing: violates timing constraints

Classification of Tolerance
Types of tolerance:
- Masking: the system always behaves as per specifications, even in the presence of faults
- Non-masking: the system may violate specifications in the presence of faults, but should at least behave in a well-defined manner
A fault-tolerant system should specify:
- The class of faults tolerated
- What tolerance is given for each class

Core problems
- Agreement (multiple processes agree on some value)
- Clock synchronization
- Stable storage (data accessible after crash)
- Reliable communication (point-to-point, broadcast, multicast)
- Atomic actions

Overview of Consensus Results
Let f be the maximum number of faulty processors.
Tight bounds for message passing:

                              Crash failures   Byzantine failures
Number of rounds              f + 1            f + 1
Total number of processors    f + 1            3f + 1
Message size                  polynomial       polynomial

Overview of Consensus Results
- Consensus is impossible in the asynchronous case, even if we only want to tolerate a single crash failure.
- This is true both for message passing and for shared read-write memory.

Consensus Algorithm for Crash Failures
Code for each processor:
    v := my input
    at each round 1 through f+1:
        if I have not yet sent v then send v to all
        wait to receive messages for this round
        v := minimum among all received values and current value of v
        if this is round f+1 then decide on v
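The round structure of this algorithm can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `crash_consensus` and the `crashes` failure schedule (which processor crashes in which round, and which recipients still get its last message) are hypothetical and not part of the slides.

```python
def crash_consensus(inputs, f, crashes=None):
    """Simulate the (f+1)-round min-flooding algorithm for crash failures.

    inputs  : initial values, one per processor (index = processor id)
    f       : number of crash failures the run is configured to tolerate
    crashes : hypothetical schedule {pid: (round, recipients)}; pid crashes
              in `round`, and its message of that round reaches only
              `recipients`
    Returns {pid: decision} for the processors that never crashed.
    """
    crashes = crashes or {}
    n = len(inputs)
    v = list(inputs)                  # current value held by each processor
    sent = [set() for _ in range(n)]  # values each processor has already sent
    alive = set(range(n))

    for rnd in range(1, f + 2):       # rounds 1 .. f+1
        inbox = [[] for _ in range(n)]
        for p in sorted(alive):
            crash = crashes.get(p)
            crashing = crash is not None and crash[0] == rnd
            if v[p] not in sent[p]:   # "if I have not yet sent v"
                targets = crash[1] if crashing else range(n)
                for q in targets:
                    inbox[q].append(v[p])
                sent[p].add(v[p])
            if crashing:
                alive.discard(p)
        for p in alive:               # v := min(received values, current v)
            v[p] = min([v[p]] + inbox[p])

    return {p: v[p] for p in sorted(alive)}
```

For example, `crash_consensus([3, 1, 2], 1, {1: (1, [0])})` lets processor 1 crash in round 1 after reaching only processor 0; the second round then spreads the value 1 to processor 2. Running the same schedule with `f = 0` (a single round) leaves the two survivors with different values, which illustrates why f + 1 rounds are used.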

Correctness of Crash Consensus Algo
Termination: By the code, every processor finishes in round f + 1.
Validity: Holds since processors do not introduce spurious messages: if all inputs are the same, then that is the only value ever in circulation.

Correctness of Crash Consensus Algo
Agreement: Suppose, for contradiction, that p_j decides on a smaller value, x, than p_i does. Then x was hidden from p_i by a chain of faulty processors:
    q_1 --(round 1)--> q_2 --(round 2)--> ... --> q_f --(round f)--> q_(f+1) --(round f+1)--> p_j    (x never reaches p_i)
There are f + 1 faulty processors in this chain, a contradiction.

Performance of Crash Consensus Algo
- Number of processors: n > f
- f + 1 rounds
- At most n^2 * |V| messages, each of size log |V| bits, where V is the input set.

Lower Bound on Rounds
Assumptions:
- n > f + 1
- Every processor is supposed to send a message to every other processor in every round
- Input set is {0, 1}

Byzantine Agreement Problems
Model:
- Total of n processes, at most m of which can be faulty
- Reliable communication medium
- Fully connected
- Receiver always knows the identity of the sender of a message
- Byzantine faults
- Synchronous system: in each round, a process receives messages, performs computation, and sends messages.

Byzantine Agreement
- Also known as the Byzantine Generals problem
- One process x broadcasts a value v
- Agreement Condition: All non-faulty processes must agree on a common value.
- Validity Condition: The agreed-upon value must be v if x is non-faulty.

Variants
Consensus:
- Each process broadcasts its initial value
- Satisfy the agreement condition
- If the initial value of all non-faulty processes is v, then the agreed-upon value must be v
Interactive Consistency:
- Each process k broadcasts its own value v_k
- All non-faulty processes agree on a common vector (v_1, v_2, ..., v_n)
- If the k-th process is non-faulty, then the k-th value in the vector agreed upon by non-faulty processes must be v_k
A solution to the Byzantine agreement problem implies a solution to the other two.
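One common way to see the last claim is to run one Byzantine agreement instance per process and then apply a deterministic rule to the resulting vector. The sketch below assumes a hypothetical primitive `broadcast(k)` that returns the value all non-faulty processes agree on for sender k; the function names and the `default` fallback are illustrative assumptions, not taken from the slides.

```python
from collections import Counter


def interactive_consistency(n, broadcast):
    """Build the common vector from n Byzantine agreement instances.

    `broadcast(k)` is a hypothetical primitive: the value that every
    non-faulty process agrees on when process k acts as the sender.
    """
    return [broadcast(k) for k in range(n)]


def consensus(n, broadcast, default=0):
    """Derive a consensus value from the common vector.

    Every non-faulty process sees the same vector, so applying the same
    deterministic rule (strict majority, else `default`) gives agreement.
    If all non-faulty processes start with v and they form a majority,
    the rule returns v.
    """
    vector = interactive_consistency(n, broadcast)
    value, count = Counter(vector).most_common(1)[0]
    return value if count > n // 2 else default
```

With `broadcast = lambda k: [1, 1, 0, 1][k]`, the common vector is `[1, 1, 0, 1]` and `consensus(4, broadcast)` returns 1.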

Byzantine Agreement Problem
No solution possible if:
- the system is asynchronous, or
- n < 3m + 1
Lower bound: needs at least (m + 1) rounds of message exchanges.
Oral messages: messages can be forged / changed in any manner, but the receiver always knows the sender.
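The two bounds on this slide translate into simple arithmetic. The helper names below are hypothetical and serve only to make the bounds concrete.

```python
def max_byzantine_faults(n):
    """Largest m with n >= 3m + 1, i.e. the most faults n processes can tolerate."""
    return (n - 1) // 3


def min_rounds(m):
    """Lower bound on rounds of message exchange when tolerating m faults."""
    return m + 1
```

For instance, 3 processes tolerate no Byzantine fault, 4 processes tolerate one, and 7 processes tolerate two; tolerating two faults needs at least three rounds.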

Proof
Theorem: There is no t-Byzantine-robust broadcast protocol for t >= N/3.
(Figure: three groups of processes, S, T and U, with the values exchanged between them in each scenario.)
Scenario-0: T must decide 0.
Scenario-1: U must decide 1.
Scenario-2:
- looks similar to Scenario-0 for T
- looks similar to Scenario-1 for U
- so T decides 0 and U decides 1, which violates agreement

Lamport-Shostak-Pease Algorithm
Algorithm Broadcast(N, t), where t is the resilience.
For t = 0, Broadcast(N, 0):
Pulse 1:
    The general sends its value x_g to all processes; the lieutenants do not send.
    Receive messages of pulse 1.
    The general decides on x_g.
    Lieutenants decide as follows:
        if a message with value x was received from g in pulse 1
            then decide on x
            else decide on udef

Lamport-Shostak-Pease Algorithm (contd.)
For t > 0, Broadcast(N, t):
Pulse 1:
    The general sends its value x_g to all processes; the lieutenants do not send.
    Receive messages of pulse 1.
    Lieutenant p acts as follows:
        if a message with value x was received from g in pulse 1
            then x_p = x
            else x_p = udef
        Announce x_p to the other lieutenants by acting as the general in Broadcast_p(N - 1, t - 1), starting in the next pulse.
Pulse t + 1:
    Receive messages of pulse t + 1.
    The general decides on x_g.
    For lieutenant p:
        A decision occurs in Broadcast_q(N - 1, t - 1) for each lieutenant q.
        W_p[q] = decision in Broadcast_q(N - 1, t - 1)
        y_p = major(W_p), the majority value in W_p
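The recursion can be made concrete with a small simulation. This is a sketch under several assumptions that are not in the slides: faulty behavior is modeled by a hypothetical callback `lie(sender, receiver, value)`, a lieutenant includes its own x_p in W_p, the decision rule is a strict majority, and a missing majority yields `UDEF`. All pulses are collapsed into nested function calls, so timing is not modeled.

```python
from collections import Counter

UDEF = "udef"  # placeholder for the undefined value


def major(values):
    """Strict-majority value of `values`, or UDEF if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else UDEF


def broadcast(general, lieutenants, value, t, faulty, lie):
    """Simulate Broadcast(N, t); returns {lieutenant: decision}.

    faulty : set of faulty process ids
    lie    : hypothetical adversary, lie(sender, receiver, value) gives the
             value a faulty sender actually delivers to `receiver`
    Decisions are also computed for faulty lieutenants (as if they followed
    the protocol), but only those of correct lieutenants are meaningful.
    """
    # Pulse 1: the general sends its value to every lieutenant.
    received = {
        p: lie(general, p, value) if general in faulty else value
        for p in lieutenants
    }
    if t == 0:
        return received

    # Each lieutenant q announces x_q as general of Broadcast_q(N-1, t-1).
    sub = {
        q: broadcast(q, [p for p in lieutenants if p != q],
                     received[q], t - 1, faulty, lie)
        for q in lieutenants
    }

    # Lieutenant p collects W_p and decides on its majority.
    decisions = {}
    for p in lieutenants:
        w_p = [received[p]] + [sub[q][p] for q in lieutenants if q != p]
        decisions[p] = major(w_p)
    return decisions
```

With N = 4, t = 1 and one faulty lieutenant that flips bits, `broadcast(0, [1, 2, 3], 1, 1, {3}, lambda s, r, v: 1 - v)` gives the correct lieutenants 1 and 2 the general's value 1. With N = 3 and one faulty lieutenant, the single correct lieutenant sees one vote each way and ends up with `UDEF`, consistent with the requirement t < N/3.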

Features
- Termination: If Broadcast(N, t) is started in pulse 1, every process decides in pulse t + 1.
- Dependence: If the general is correct, if there are f faulty processes, and if N > 2f + t, then all correct processes decide on the input of the general.
- Agreement: All correct processes decide on the same value.
The Broadcast(N, t) protocol is a t-Byzantine-robust broadcast protocol for t < N/3.
- Time complexity: t + 1 pulses, i.e. O(t + 1)
- Message complexity: exponential in t, O(N^(t+1))
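The growth of the message count can be checked with a short recurrence, assuming the recursive structure of Broadcast(N, t): the general sends N - 1 messages, and each of the N - 1 lieutenants then runs Broadcast(N - 1, t - 1). The function name `message_count` is a hypothetical helper for illustration.

```python
def message_count(n, t):
    """Messages sent by Broadcast(n, t) under the assumed recursion.

    M(n, 0) = n - 1
    M(n, t) = (n - 1) + (n - 1) * M(n - 1, t - 1)   for t > 0
    """
    if t == 0:
        return n - 1
    return (n - 1) + (n - 1) * message_count(n - 1, t - 1)
```

For example, `message_count(4, 1)` is 9 and `message_count(7, 2)` is 156; the leading term is (N - 1)(N - 2)...(N - t - 1), a product of t + 1 factors, so the count grows on the order of N^(t+1).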