Reliable Broadcast for Broadcast Busses

Similar documents
Early stopping: the idea. TRB for benign failures. Early Stopping: The Protocol. Termination

AGREEMENT PROBLEMS (1) Agreement problems arise in many practical applications:

Agreement Protocols. CS60002: Distributed Systems. Pallab Dasgupta Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur

Asynchronous Models For Consensus

Fault-Tolerant Consensus

CS505: Distributed Systems

Section 6 Fault-Tolerant Consensus

Distributed Systems Byzantine Agreement

Degradable Agreement in the Presence of. Byzantine Faults. Nitin H. Vaidya. Technical Report #

Failure detectors Introduction CHAPTER

Consensus when failstop doesn't hold

C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Consensus. One Reason: Impossibility of Consensus. Let s Consider This

Coordination. Failures and Consensus. Consensus. Consensus. Overview. Properties for Correct Consensus. Variant I: Consensus (C) P 1. v 1.

Optimal Resilience Asynchronous Approximate Agreement

Byzantine Agreement. Gábor Mészáros. Tatracrypt 2012, July 2 4 Smolenice, Slovakia. CEU Budapest, Hungary

Distributed Consensus

Impossibility of Distributed Consensus with One Faulty Process

Deterministic Consensus Algorithm with Linear Per-Bit Complexity

Lower Bounds for Achieving Synchronous Early Stopping Consensus with Orderly Crash Failures

Byzantine Agreement. Gábor Mészáros. CEU Budapest, Hungary

CS505: Distributed Systems

Verification of clock synchronization algorithm (Original Welch-Lynch algorithm and adaptation to TTA)

Time. To do. q Physical clocks q Logical clocks

Common Knowledge and Consistent Simultaneous Coordination

ROBUST & SPECULATIVE BYZANTINE RANDOMIZED CONSENSUS WITH CONSTANT TIME COMPLEXITY IN NORMAL CONDITIONS

Private and Verifiable Interdomain Routing Decisions. Proofs of Correctness

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit

Unreliable Failure Detectors for Reliable Distributed Systems

Slides for Chapter 14: Time and Global States

Fault Tolerant Weighted Voting Algorithms Azad Azadmanesh, Alireza Farahani, Lotfi Najjar

Time Free Self-Stabilizing Local Failure Detection

Eventually consistent failure detectors

Time in Distributed Systems: Clocks and Ordering of Events

Chapter 11 Time and Global States

Time. Lakshmi Ganesh. (slides borrowed from Maya Haridasan, Michael George)

Easy Consensus Algorithms for the Crash-Recovery Model

Implementing Uniform Reliable Broadcast with Binary Consensus in Systems with Fair-Lossy Links

Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms. CS 249 Project Fall 2005 Wing Wong

Self-stabilizing Byzantine Agreement

Towards optimal synchronous counting

Shared Memory vs Message Passing

Signature-Free Broadcast-Based Intrusion Tolerance: Never Decide a Byzantine Value

Simple Bivalency Proofs of the Lower Bounds in Synchronous Consensus Problems

The Weakest Failure Detector to Solve Mutual Exclusion

Replication predicates for dependent-failure algorithms

Real-Time Course. Clock synchronization. June Peter van der TU/e Computer Science, System Architecture and Networking

Modeling and Stability Analysis of a Communication Network System

Network Algorithms and Complexity (NTUA-MPLA) Reliable Broadcast. Aris Pagourtzis, Giorgos Panagiotakos, Dimitris Sakavalas

Do we have a quorum?

Tolerating Permanent and Transient Value Faults

Realistic Failures in Secure Multi-Party Computation

Consensus. Consensus problems

Early consensus in an asynchronous system with a weak failure detector*

Model Checking of Fault-Tolerant Distributed Algorithms

Crashed router. Instructor s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 Pearson Education

Agreement. Today. l Coordination and agreement in group communication. l Consensus

The Byzantine Generals Problem Leslie Lamport, Robert Shostak and Marshall Pease. Presenter: Jose Calvo-Villagran

The Weighted Byzantine Agreement Problem

Round-Efficient Multi-party Computation with a Dishonest Majority

On the Number of Synchronous Rounds Required for Byzantine Agreement

Byzantine behavior also includes collusion, i.e., all byzantine nodes are being controlled by the same adversary.

Byzantine Agreement. Chapter Validity 190 CHAPTER 17. BYZANTINE AGREEMENT

Resolving Message Complexity of Byzantine. Agreement and Beyond. 1 Introduction

Early-Deciding Consensus is Expensive

Round-Efficient Perfectly Secure Message Transmission Scheme Against General Adversary

Ability to Count Messages Is Worth Θ( ) Rounds in Distributed Computing

DEXON Consensus Algorithm

CS 347 Parallel and Distributed Data Processing

Timeliness, Failure-Detectors, and Consensus Performance ALEXANDER SHRAER

Failure Detectors. Seif Haridi. S. Haridi, KTHx ID2203.1x

All-to-All Gradecast using Coding with Byzantine Failures

Distributed Systems Principles and Paradigms. Chapter 06: Synchronization

EFFICIENT COUNTING WITH OPTIMAL RESILIENCE. Lenzen, Christoph.

Benny Pinkas. Winter School on Secure Computation and Efficiency Bar-Ilan University, Israel 30/1/2011-1/2/2011

The Weakest Failure Detector for Wait-Free Dining under Eventual Weak Exclusion

Time. Today. l Physical clocks l Logical clocks

Optimal and Player-Replaceable Consensus with an Honest Majority Silvio Micali and Vinod Vaikuntanathan

Distributed Systems Principles and Paradigms

Byzantine Vector Consensus in Complete Graphs

Eventual Leader Election with Weak Assumptions on Initial Knowledge, Communication Reliability, and Synchrony

6.852: Distributed Algorithms Fall, Class 24

cs/ee/ids 143 Communication Networks

Radio Network Clustering from Scratch

CS505: Distributed Systems

Efficient Construction of Global Time in SoCs despite Arbitrary Faults

On Stabilizing Departures in Overlay Networks

TTA and PALS: Formally Verified Design Patterns for Distributed Cyber-Physical

Abstract. The paper considers the problem of implementing \Virtually. system. Virtually Synchronous Communication was rst introduced

Towards Optimal Synchronous Counting

CSE 123: Computer Networks

Byzantine agreement with homonyms

Resilient Distributed Optimization Algorithm against Adversary Attacks

Introduction to Cryptography Lecture 13

Lecture 7: Fingerprinting. David Woodruff Carnegie Mellon University

Uniform consensus is harder than consensus

On Equilibria of Distributed Message-Passing Games

Variations on Itai-Rodeh Leader Election for Anonymous Rings and their Analysis in PRISM

Our Problem. Model. Clock Synchronization. Global Predicate Detection and Event Ordering

Silence. Guy Goren Viterbi Faculty of Electrical Engineering, Technion

Overcoming observability problems in distributed test architectures

Transcription:

Reliable Broadcast for Broadcast Busses Ozalp Babaoglu and Rogerio Drummond. Streets of Byzantium: Network Architectures for Reliable Broadcast. IEEE Transactions on Software Engineering SE- 11(6):546-554, June 1985. CSE 223 Winter 2001 Streets of Byzantium 1

System Model (I) process fault: crash, arbitrary channel fault: can drop incoming message link fault: can drop messages between channel and process n processes synchronized clocks, bounded message delivery, digital signatures R channels R links CSE 223 Winter 2001 Streets of Byzantium 2

System Model (II) xcast(v, i) sends v on link/channel i a null message φ is generated by a timeout. all processes that receive a non-null message receive the same message. xcast(v, S) for S {1, 2, R} is a parallel execution of xcast(v, 1) xcast(v, 2).. xcast(v, R) done in no particular order and not done atomically. xcast(v, *) is shorthand for xcast(v, {1,2, R}) CSE 223 Winter 2001 Streets of Byzantium 3

System Model (III) A process is properly connected to a channel if its link to that channel is nonfaulty. Let P = {p 1, p 2, p n } be the set of processes where p 1 is the transmitter. V is the set of possible message values (excluding φ) and v 0 V is the default value. CSE 223 Winter 2001 Streets of Byzantium 4

System Model (IV) In a run, let π = the number of processes that are faulty 0 π n ψ = the number of channels that are faulty λ = the number of links that are faulty 0 ψ and 0 λ λ + ψ < R, which we show next... CSE 223 Winter 2001 Streets of Byzantium 5

Being Connected If R > λ + ψ then there will always be a direct path connecting any pair of processes. n R Remove ψ right nodes. Each node on the left has R ψ > λ edges incident. CSE 223 Winter 2001 Streets of Byzantium 6

Send Omission: 2 rounds Round 1: p 1 : xcast(v, *); decide(v); Round 2: p i, i 1: if ( C 1 i 0) then xcast(m i, {1,2, R} - C 1 i ); decide(m i ); After round 2: p i, i not yet decided: if ( C 2 i 0) then decide(m i ); else decide(v 0 ); C r i is set of channels over which p i receives non-null message m i V in round r Since there are only crash and omission failures, m i is unique. CSE 223 Winter 2001 Streets of Byzantium 7

Send Omission: Proof (I) If n λ + π then the protocol implements reliable broadcast. Proof by case analysis: 1. Nonfaulty p 1. All nonfaulty p receive v in Round 1 because p 1 and p are connected. All nonfaulty p decide on v in Round 2 CSE 223 Winter 2001 Streets of Byzantium 8

Send Omission: Proof (II) 2. Faulty p 1 that broadcasts no messages over nonfaulty channels to which it is properly connected. No value from p 1 is received by anyone. By end of Round 2, all nonfaulty p decide on v 0. CSE 223 Winter 2001 Streets of Byzantium 9

Send Omission: Proof (III) 3. Faulty p 1 and some nonfaulty p j receives initial broadcast. Thus m i = v. Nonfaulty p j will either receive in Round 1, if p j is well connected to channel over which p j received v, or in Round 2 because p i and p j are connected. CSE 223 Winter 2001 Streets of Byzantium 10

Send Omission: Proof (IV) 4. Faulty p 1 and only faulty processes receive v in Round 1 p 1 sent v on at least one channel because some processes received v. Hence all n π nonfaulty processes have faulty links to that channel, so λ n π. p 1 v φ v φ CSE 223 Winter 2001 Streets of Byzantium 11

Send Omission: Proof (V) But by assumption n λ + π or λ n π and so there can be no further link failures: only these n π links are faulty. If any faulty process succeeds in sending v on a channel that does not drop the message, then all nonfaulty p will receive and decide on v in Round 2. Else they decide on v 0. CSE 223 Winter 2001 Streets of Byzantium 12

Send Omission: Summary 0 λ + ψ < R and 0 λ + π n Choose R based on probability of link failures and channel failures. Then choose n based on probability of process crashes. Some boundary cases: when π = 0 and R < n then λ + ψ < R when λ = 0 then ψ < R and π n when ψ = 0 then λ < R and λ + π n CSE 223 Winter 2001 Streets of Byzantium 13

Arbitrary Failures: 2 rounds Requires n > t + π + 2λ where t is the maximum possible number of process failures. Values sent are in W = V {φ, } where denotes bad value. S k denotes all multisets of k elements drawn from S. For example, {0, 1} 2 = {{0, 0}, {0, 1}, {1, 1}} CSE 223 Winter 2001 Streets of Byzantium 14

Arbitrary Failures (II) Every nonfaulty process broadcasts at most one message over a channel in any single round. B r i(j) W R is the multiset of messages p i received from p j in Round r, one message per channel. If p i received more than one message per channel in Round r from p j, then B r i (j) = { }R. B r i(j) is consistent in v if B r i(j) {φ, v} R for some v W and v. CSE 223 Winter 2001 Streets of Byzantium 15

Arbitrary Failures (III) Some useful functions: σ: W R V: v if parameter is consistent in v; v 0 otherwise. γ z : W n V: v if v is most frequent non-φ value in parameter and there are at least z of them; v 0 otherwise.... well-defined only if 2z > n. CSE 223 Winter 2001 Streets of Byzantium 16

Arbitrary Failures: Protocol Round 1: p 1 : xcast(v, *); Round 2: p i : m i = σ(b 1 i (1)); xcast(m i, *); After round 2: p i : M i = { j: 1 j n: σ(b 2 i (j))}; decide on γ t+1 (M i ) CSE 223 Winter 2001 Streets of Byzantium 17

Arbitrary Failures: Proof (I) Sending process p j : is overtly malicious when its multiset of message values as observed on channels is not consistent; p i is a witness if B r i (j) is not consistent in v for some v. exhibits overt omission failure if it is not overtly malicious and broadcasts null messages over at least one channel to which it is well-connected; p i is a witness if B r i (j) = {φ}r. p j crashing subsequently exhibits overt omission failure. is correct if it does neither. CSE 223 Winter 2001 Streets of Byzantium 18

Arbitrary Failures: Proof (II) Lemma: Let k be the number of witnesses to an overt failure of p 1 by the end of Round 1 (not counting the transmitter). 0 k λ or k = n 1 for overt omission failure; n λ 1 k n 1 for overtly malicious failure. CSE 223 Winter 2001 Streets of Byzantium 19

Arbitrary Failures: Proof (II) Overt omission: Let x be the number of nonfaulty channels to which p 1 is properly connected and to which p 1 sends non-null messages: 0 x R. If x = 0 then all of the remaining n 1 processes witness the failure. If x > 0 then each process that witnesses the failure has x faulty links. Given k witnesses, there were kx λ link failures kx may be less than λ because there may be subsequent link failures. So, k is bounded from above by λ/x, and thus is in the range from 0 to λ. CSE 223 Winter 2001 Streets of Byzantium 20

Arbitrary Failures: Proof (III) Proof for overtly malicious failure: p 1 failed by successfully broadcasting two different non-null messages over two channels. p j will witness unless it is not well-connected to at least one of these channels. At most λ processes can be not well-connected in this manner. Thus, between n λ 1 and n 1 processes will witness p 1 s overtly malicious failure. CSE 223 Winter 2001 Streets of Byzantium 21

Arbitrary Failures: Proof (IV) Proof of correctness when n > t + π + 2λ by case analysis. 1. Nonfaulty p 1. By end of Round 1 and because all processes are connected, all B 1 i (1) are consistent in v. Thus all m i = v and are sent in Round 2 by nonfaulty processes p i. After Round 2, at all nonfaulty processes p i, at least n π occurrences of v are in in M i along with at most π occurrences of some other value or. Since n > t + π + 2λ and t π, n > 2π and so n π > π. Thus, the v values dominate. And, n π > t + 2λ t. So n π t + 1. Thus γ t+1 (M i ) computes v. CSE 223 Winter 2001 Streets of Byzantium 22

Arbitrary Failures: Proof (V) 2. Overt omission failure by p 1. There will be k {0, 1, λ, n 1} witnesses. If k = n 1 then at least n π broadcast φ in Round 2. As in Case 1, the nonfaulty processes decide on v 0. If k λ, then at least n π k 1 nonfaulty processes do not witness the failure. These, along with p 1, set m i = v and send v-consistent messages in Round 2. All nonfaulty processes will have at least n π k v-values in M i. n > t + π + 2λ or n π k > t + (2λ k) t or n > t Thus γ t+1 (M i ) computes v. CSE 223 Winter 2001 Streets of Byzantium 23

Arbitrary Failures: Proof (VI) 3. Overtly malicious p 1. Number of witnesses k n λ 1. At least k (π 1) processes send v 0 -values in Round 2. All nonfaulty processes p i have at least k π + 1 v 0 -values in M i. k π + 1 n λ π n > t + π + 2λ or n λ π t + λ + 1 t + 1 So k π + 1 t + 1 Thus γ t+1 (M i ) computes v 0. CSE 223 Winter 2001 Streets of Byzantium 24

Arbitrary Failures: Summary 0 λ + ψ < R and 0 t + π + 2λ < n More simply, n 2(t + λ) + 1 Compare with n t + λ for send omission failure model. CSE 223 Winter 2001 Streets of Byzantium 25