Distributed Algorithms (CAS 769) Dr. Borzoo Bonakdarpour


Distributed Algorithms (CAS 769) Week 1: Introduction, Logical clocks, Snapshots Dr. Borzoo Bonakdarpour Department of Computing and Software McMaster University Dr. Borzoo Bonakdarpour Distributed Algorithms (CAS 769) - McMaster University 1/44

Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)

Acknowledgments
Most of the contents of these slides are obtained from the following books:
- Distributed Algorithms: An Intuitive Approach, by Wan Fokkink
- Elements of Distributed Computing, by Vijay K. Garg

Distributed Systems: Some Definitions
There is no universally accepted definition of a distributed system. What makes a system distributed?
- "One man's constant is another man's variable." - Alan Perlis
- "A distributed system is a system where I can't get my work done because a computer has failed that I've never even heard of."
- "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." - Leslie Lamport

Distributed Systems: Some Definitions
A distributed system is one that:
- has multiple machines,
- is connected by a network, and
- is cooperating on some task.
Communication in Distributed Systems
- Message passing
- Shared memory

Distributed Systems
We begin with message passing systems.

Preliminaries: Message Passing Framework
In a message passing framework, a distributed system consists of a finite graph of N processes (a process is a running program, and each process has its own local state).
- Each process carries a unique ID.
- Processes communicate through FIFO channels.
Characteristics of Communication
- Communication is asynchronous; i.e., sending a message and receiving it are distinct events.
- Delay in channels is arbitrary but finite.
- There are no garbled, duplicated, or lost messages.

Preliminaries: Other Assumptions
- Absence of a shared clock
- Absence of shared memory
- Absence of accurate failure detection

Example
{x1=0}
Process P1() {
    e_1^0 : send(p2,m1);
    e_1^1 : x1=5;
    e_1^2 : x1=10;
    e_1^3 : recv(m2);
}
{x2=0}
Process P2() {
    e_2^0 : recv(m1);
    e_2^1 : x2=15;
    e_2^2 : x2=20;
    e_2^3 : send(p1,m2);
}

Preliminaries: Transition Systems
The behavior of a distributed algorithm, which runs on a distributed system, is often captured by a transition system, which consists of:
- A set $C$ of configurations (i.e., the composition of the local states of its processes plus the messages in transit)
- A binary transition relation $\to$ on $C$
- A set $I \subseteq C$ of initial configurations
A configuration $\gamma$ is terminal if there does not exist $\gamma' \in C$ such that $\gamma \to \gamma'$.
An execution of the distributed system is a sequence $\gamma_0 \to \gamma_1 \to \gamma_2 \to \cdots$ such that:
- $\gamma_0 \in I$
- for all $i \ge 0$, we have $\gamma_i \to \gamma_{i+1}$
A configuration $\delta$ is reachable if there is a $\gamma_0 \in I$ and a finite execution $\gamma_0 \to \gamma_1 \to \cdots \to \gamma_k$ such that $\gamma_k = \delta$.

Example
For example, in the distributed algorithm on Slide 9:
- Configuration (x1 = 0, x2 = 0) is the only initial configuration.
- Configuration (x1 = 10, x2 = 20) is the only terminal configuration.
- (x1 = 0, x2 = 0) → (x1 = 5, x2 = 0) → (x1 = 10, x2 = 0) → (x1 = 10, x2 = 15) → (x1 = 10, x2 = 20) is a valid execution.
- And so is (x1 = 0, x2 = 0) → (x1 = 5, x2 = 0) → (x1 = 5, x2 = 15) → (x1 = 10, x2 = 15) → (x1 = 10, x2 = 20).
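
The transition-system view can also be explored mechanically. The following is a minimal Python sketch, not part of the original slides, that encodes the two processes of Slide 9 as a transition system and enumerates the reachable configurations by breadth-first search. The configuration layout (two program counters, the two variables, and two FIFO channels) and all names such as `successors` and `reachable` are assumptions made for illustration.

```python
from collections import deque

# Each process is a list of (kind, argument) events, following Slide 9.
P1 = [("send", "m1"), ("assign", 5), ("assign", 10), ("recv", "m2")]
P2 = [("recv", "m1"), ("assign", 15), ("assign", 20), ("send", "m2")]
PROGRAMS = (P1, P2)

# Assumed configuration layout: (pc1, pc2, x1, x2, chan_1_to_2, chan_2_to_1).
# Channels are tuples used as FIFO queues, so configurations stay hashable.
INITIAL = (0, 0, 0, 0, (), ())

def successors(cfg):
    """Return every configuration reachable from cfg in one transition."""
    pcs, xs, chans = list(cfg[0:2]), list(cfg[2:4]), list(cfg[4:6])
    result = []
    for p in (0, 1):                      # p = 0 is P1, p = 1 is P2
        if pcs[p] == len(PROGRAMS[p]):
            continue                      # this process has finished
        kind, arg = PROGRAMS[p][pcs[p]]
        out_ch, in_ch = p, 1 - p          # chans[0] is 1->2, chans[1] is 2->1
        new_pcs, new_xs, new_chans = pcs[:], xs[:], chans[:]
        if kind == "send":
            new_chans[out_ch] = chans[out_ch] + (arg,)
        elif kind == "assign":
            new_xs[p] = arg
        elif kind == "recv":
            if not chans[in_ch] or chans[in_ch][0] != arg:
                continue                  # receive is blocked
            new_chans[in_ch] = chans[in_ch][1:]
        new_pcs[p] += 1
        result.append((*new_pcs, *new_xs, *new_chans))
    return result

def reachable(initial):
    """Breadth-first search over the transition relation."""
    seen, queue = {initial}, deque([initial])
    while queue:
        cfg = queue.popleft()
        for nxt in successors(cfg):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

configs = reachable(INITIAL)
terminal = [c for c in configs if not successors(c)]
print(len(configs), "reachable configurations; terminal:", terminal)
```

Under this encoding the search finds a single terminal configuration, with x1 = 10 and x2 = 20, in line with the claim above. The search terminates here because the example has finitely many configurations; that is a property of this example, not of distributed algorithms in general.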

Preliminaries
Question: Is configuration reachability decidable?

Preliminaries
A transition between two configurations is associated with an event. A process can perform an internal event (i.e., a change of the local state of the process), a send event, or a receive event.
A process is called an initiator if its first event is either internal or send.
An assertion is a predicate on the configurations of an algorithm (e.g., a comparison of $x$ against $y + 1$). We use assertions to define safety properties.
An assertion $P$ is an invariant if:
- $P(\gamma)$ for all $\gamma \in I$, and
- if $\gamma \to \gamma'$ and $P(\gamma)$, then $P(\gamma')$.

Example
For example, in the distributed algorithm on Slide 9:
- Instruction x1 = 5 is an internal event.
- Process P1 is an initiator.
- $(x_1 \le 100 \wedge x_2 \le 50)$ is an invariant.
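
The two invariant conditions can be checked by brute force on a small system. The sketch below is illustrative only and rests on two assumptions: the invariant on this slide is read as x1 <= 100 and x2 <= 50 (the comparison symbols are missing from the transcript), and the Slide 9 algorithm is simplified to its (x1, x2) values, ignoring messages. The names `CONFIGS`, `successors`, and `is_invariant` are hypothetical.

```python
from itertools import product

# Simplified model (assumption): only the values of x1 and x2 are tracked.
X1_STEPS = {0: 5, 5: 10}      # x1 goes 0 -> 5 -> 10
X2_STEPS = {0: 15, 15: 20}    # x2 goes 0 -> 15 -> 20
CONFIGS = list(product((0, 5, 10), (0, 15, 20)))
INITIAL = [(0, 0)]

def successors(cfg):
    """Yield the configurations reachable from cfg in one transition."""
    x1, x2 = cfg
    if x1 in X1_STEPS:
        yield (X1_STEPS[x1], x2)
    if x2 in X2_STEPS:
        yield (x1, X2_STEPS[x2])

def is_invariant(pred):
    """Check both conditions: pred holds initially and is preserved by every step."""
    holds_initially = all(pred(c) for c in INITIAL)
    preserved = all(pred(n) for c in CONFIGS if pred(c) for n in successors(c))
    return holds_initially and preserved

print(is_invariant(lambda c: c[0] <= 100 and c[1] <= 50))
print(is_invariant(lambda c: c[0] < 10))
```

The second predicate holds in the initial configuration but is broken by the step that sets x1 to 10, so it fails the preservation condition.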

Preliminaries: Properties
A property is a set of executions.
Safety Properties
A safety property typically expresses that something bad will never happen. For example:
- The temperature of a boiler never reaches 100 degrees.
- If an interrupt occurs, a message will be printed within one second.
Formally, a safety property is a set $S$ of infinite executions where:
$\forall \gamma \notin S : \exists \alpha \preceq \gamma : \forall \gamma' : \alpha \preceq \gamma' \Rightarrow \gamma' \notin S$
where $\alpha \preceq \gamma$ denotes the fact that $\alpha$ is a prefix of $\gamma$. In other words, every violation of $S$ has a finite prefix after which no continuation can satisfy $S$.
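
Because every violation of a safety property is witnessed by a finite prefix, a monitor can report the violation after observing finitely many steps. The following sketch illustrates this for the boiler example; the function name `first_bad_prefix`, the default limit, and the list-of-readings encoding of an execution are assumptions made for illustration.

```python
def first_bad_prefix(readings, limit=100):
    """Return the shortest prefix that already violates
    'the temperature never reaches limit', or None if no reading does."""
    for i, temperature in enumerate(readings):
        if temperature >= limit:
            # No continuation of this prefix can satisfy the property.
            return readings[: i + 1]
    return None

print(first_bad_prefix([20, 80, 100, 60]))
print(first_bad_prefix([20, 80, 99]))
```

A result of None only means that the observed prefix is not yet bad; it says nothing about how the execution continues.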

Preliminaries: Liveness Properties
A liveness property typically expresses that something good will eventually happen.
Formally, if $L$ is a liveness property, then the following holds:
$\forall \alpha : \exists \gamma : \alpha\gamma \in L$
where $\alpha$ is a finite execution and $\gamma$ is an infinite execution.
Examples of liveness properties:
- Non-starvation.
- If an interrupt occurs, a message will be printed.

Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)

Causal Order
In an asynchronous distributed system, in each configuration, different events can occur in different processes. Such occurrences of events are independent.
The causal order $\prec$ is a binary relation on the events in an execution such that $a \prec b$ iff event $a$ happened before event $b$. That is, if $a \prec b$, the events of the execution cannot be reordered so that $a$ happens after $b$.

Causal Order (Happened Before)
Formally, the causal order $\prec$ (also called happened before) is the smallest binary relation where:
- if $a$ and $b$ are events at the same process and $a$ occurs before $b$, then $a \prec b$,
- if $a$ is a send event and $b$ the corresponding receive event, then $a \prec b$, and
- if $a \prec b$ and $b \prec c$, then $a \prec c$.
Notice that the happened before relation is a partial order.
We write $a \preceq b$ if either $a \prec b$ or $a = b$.
If $a \not\prec b$ and $b \not\prec a$, then we say $a$ and $b$ are concurrent events.
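
The three rules above can be turned into a small computation. The sketch below builds the happened before relation for the Slide 9 example as a transitive closure and then tests pairs of events for concurrency. Encoding an event as a (process, index) pair and the names `happened_before` and `concurrent` are assumptions made for illustration.

```python
from itertools import product

# Events of the Slide 9 example, written as (process, index).
EVENTS = [(p, i) for p in (1, 2) for i in range(4)]
# (send event, receive event) for m1 and m2.
MESSAGES = [((1, 0), (2, 0)), ((2, 3), (1, 3))]

def happened_before(events, messages):
    """Smallest relation closed under the three rules of the definition."""
    # Rule 1: order of events within the same process.
    hb = {(a, b) for a in events for b in events
          if a[0] == b[0] and a[1] < b[1]}
    # Rule 2: a send happens before the corresponding receive.
    hb |= set(messages)
    # Rule 3: transitivity, repeated until nothing new is added.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb

HB = happened_before(EVENTS, MESSAGES)

def concurrent(a, b):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in HB and (b, a) not in HB

print(((1, 0), (2, 3)) in HB)       # send of m1 versus send of m2
print(concurrent((1, 1), (2, 1)))   # x1 = 5 versus x2 = 15
```

The naive closure loop is quadratic in the size of the relation per pass, which is acceptable for eight events but not a recommendation for large traces.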

Computation
A permutation of concurrent events in an execution does not affect the result of the execution.
[Figure: space-time diagram of processes P1 and P2 with events e_1^0, ..., e_1^3 and e_2^0, ..., e_2^3]

Computation
The set of all permutations forms the computation lattice.
[Figure: computation lattice for the example, ranging from the empty set {} at the bottom to (e_1^3, e_2^3) at the top]

Happened Before vs. Physical Time
Question: If a safety property holds in the happened before relation, does it hold in physical time as well?

Logical Clocks
Since a physical shared clock does not exist in a distributed system, we use logical clocks.
A logical clock $C$ maps occurrences of events in a computation to a partially ordered set such that
$a \prec b \Rightarrow C(a) < C(b)$
Lamport's clock $LC$ assigns to each event $a$ the length $k$ of a longest causality chain $a_1 \prec \cdots \prec a_k = a$ in the computation. Clearly, $a \prec b \Rightarrow LC(a) < LC(b)$.

Logical Clocks: Algorithm for Handling Lamport's Clocks
Consider an event $a$, and let $k$ be the clock value of the previous event at the same process ($k = 0$ if there is no such previous event).
- If $a$ is an internal or send event, then $LC(a) = k + 1$.
- If $a$ is a receive event and $b$ the corresponding send event, then $LC(a) = \max\{k, LC(b)\} + 1$.
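
The two update rules translate almost directly into code. The sketch below applies them to the events of the Slide 9 example; the class name `LamportClock`, its method names, and the string labels used for the events are assumptions made for illustration.

```python
class LamportClock:
    """Logical clock of one process; k = 0 before the first event."""

    def __init__(self):
        self.value = 0

    def internal_or_send(self):
        # LC(a) = k + 1
        self.value += 1
        return self.value

    def receive(self, send_timestamp):
        # LC(a) = max{k, LC(b)} + 1, where b is the matching send event
        self.value = max(self.value, send_timestamp) + 1
        return self.value

p1, p2 = LamportClock(), LamportClock()
lc = {}
lc["e_1^0"] = p1.internal_or_send()      # send(p2, m1)
lc["e_1^1"] = p1.internal_or_send()      # x1 = 5
lc["e_1^2"] = p1.internal_or_send()      # x1 = 10
lc["e_2^0"] = p2.receive(lc["e_1^0"])    # recv(m1)
lc["e_2^1"] = p2.internal_or_send()      # x2 = 15
lc["e_2^2"] = p2.internal_or_send()      # x2 = 20
lc["e_2^3"] = p2.internal_or_send()      # send(p1, m2)
lc["e_1^3"] = p1.receive(lc["e_2^3"])    # recv(m2)
print(lc)
```

Note that e_1^1 and e_2^0 receive the same value although they are concurrent: the implication runs only from the causal order to the clock values, not back.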

Vector Clocks
The vector clock $VC$ has the property
$a \prec b \Leftrightarrow VC(a) < VC(b)$
Let a distributed system consist of processes $p_0, \ldots, p_{N-1}$. The vector clock assigns to events in a computation values in $\mathbb{N}^N$, where this set is provided with a partial order defined by:
$(k_0, \ldots, k_{N-1}) \le (l_0, \ldots, l_{N-1}) \Leftrightarrow k_i \le l_i$ for all $i \in \{0, \ldots, N-1\}$
The vector clock is defined as follows: $VC(a) = (k_0, \ldots, k_{N-1})$, where $k_i$ is the length of a longest causality chain $a_1^i \prec \cdots \prec a_{k_i}^i$ of events at process $p_i$ with $a_{k_i}^i \preceq a$.

Example
Demonstrate the evolution of the vector clock for this computation:
[Figure: space-time diagram of processes P1 and P2 with events e_1^0, ..., e_1^3 and e_2^0, ..., e_2^3]
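
One way to work this exercise is to simulate the usual vector clock update rule, which the slides do not spell out: a process increments its own component at every event, and on a receive it first takes the componentwise maximum with the sender's vector. The sketch below assumes that the computation in the figure has the message pattern of Slide 9 (m1 from P1 to P2, then m2 from P2 to P1) and uses index 0 for P1 and index 1 for P2. The helper names are hypothetical.

```python
def tick(vc, i):
    """Increment component i (the process performing the event)."""
    return tuple(v + 1 if j == i else v for j, v in enumerate(vc))

def merge(vc, other):
    """Componentwise maximum, applied when a message is received."""
    return tuple(max(a, b) for a, b in zip(vc, other))

def leq(u, v):
    return all(a <= b for a, b in zip(u, v))

def less(u, v):
    """Strict order induced by the componentwise partial order."""
    return leq(u, v) and u != v

vc = {}
p1 = p2 = (0, 0)
p1 = tick(p1, 0); vc["e_1^0"] = p1                        # send(p2, m1)
p1 = tick(p1, 0); vc["e_1^1"] = p1                        # x1 = 5
p1 = tick(p1, 0); vc["e_1^2"] = p1                        # x1 = 10
p2 = tick(merge(p2, vc["e_1^0"]), 1); vc["e_2^0"] = p2    # recv(m1)
p2 = tick(p2, 1); vc["e_2^1"] = p2                        # x2 = 15
p2 = tick(p2, 1); vc["e_2^2"] = p2                        # x2 = 20
p2 = tick(p2, 1); vc["e_2^3"] = p2                        # send(p1, m2)
p1 = tick(merge(p1, vc["e_2^3"]), 0); vc["e_1^3"] = p1    # recv(m2)

for name in sorted(vc):
    print(name, vc[name])
```

Unlike Lamport's clock, the vectors of concurrent events are incomparable: for e_1^1 and e_2^1, neither `less` test succeeds.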

Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)

Snapshot (Global State): Definitions
A snapshot cannot be defined based on physical time (e.g., as the composition of all local states at the same time instant). We use the happened before relation to compute concurrent local states and, hence, snapshots.
A (global) snapshot of an execution of a distributed algorithm is a configuration of this execution, consisting of the local states of the processes and the messages in transit.
Intuitively, a snapshot is consistent if it represents a configuration of the current execution or a configuration of an execution in the same computation.
Snapshots are useful to determine stable properties of a distributed system (i.e., properties that, once they become true, remain true), e.g., deadlock, termination, or loss of a token.

Snapshot: The Challenge
Why is it difficult to compute a snapshot of a distributed system at run time?
Taking a global snapshot is like taking a picture of the sky: the scene is so big that it cannot be captured by a single photograph. The challenge is that taking multiple photographs at exactly the same time is not quite possible.

Snapshot: Terminology
Suppose we design an algorithm that takes a snapshot of another distributed algorithm. We call the messages of the underlying algorithm basic messages and the messages of the snapshot algorithm control messages.
An event is called presnapshot if it occurs at a process before the local snapshot at this process is taken. Otherwise, it is called postsnapshot.
Consistent Snapshot
A snapshot is consistent if:
- for each presnapshot event $a$, all events that are causally before $a$ are also presnapshot, and
- a basic message is included in a channel state iff the corresponding send event is presnapshot while the corresponding receive event is postsnapshot.
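
Both conditions can be checked directly once a snapshot is described by how many events each process has executed before its local snapshot. The sketch below uses the Slide 9 example; representing a cut as a dictionary from process to the number of presnapshot events, and the names `is_pre`, `is_consistent`, and `channel_state`, are assumptions made for illustration. Since each process contributes a prefix of its own events, causal closure reduces to checking that no message is received presnapshot but sent postsnapshot.

```python
# Basic messages of the Slide 9 example: name -> (send event, receive event),
# where an event is written as (process, index).
MESSAGES = {"m1": ((1, 0), (2, 0)), "m2": ((2, 3), (1, 3))}

def is_pre(event, cut):
    """An event is presnapshot if it lies before the local snapshot point."""
    process, index = event
    return index < cut[process]

def is_consistent(cut, messages):
    """No message may be received presnapshot but sent postsnapshot."""
    return all(is_pre(send, cut) or not is_pre(recv, cut)
               for send, recv in messages.values())

def channel_state(cut, messages):
    """Messages sent presnapshot and received postsnapshot are in transit."""
    return sorted(name for name, (send, recv) in messages.items()
                  if is_pre(send, cut) and not is_pre(recv, cut))

print(is_consistent({1: 1, 2: 0}, MESSAGES), channel_state({1: 1, 2: 0}, MESSAGES))
print(is_consistent({1: 0, 2: 1}, MESSAGES))
```

In the second cut P2 has already received m1 while P1 has not yet sent it, so the cut does not correspond to any configuration of the computation.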


Example
[Figure: space-time diagram of processes P1, P2, and P3 exchanging messages m1, m2, and m3, with two cuts G1 and G2]
G1 is not a consistent snapshot, but G2 is.

Chandy-Lamport Algorithm
Assumption
- All channels are FIFO.
Challenges
- All recorded local states must be mutually concurrent.
- The states of all channels must be captured correctly.

Chandy-Lamport Algorithm: Solution
We associate with each process a variable called color that is either red or white. All processes are initially white.
Intuitively, the computed global snapshot corresponds to the state of the system just before the processes turn red.
The algorithm relies on special control messages called markers.
- Once a process turns red, it sends a marker along all its outgoing channels before it sends out any other message.
- A process turns red on receiving a marker if it has not already done so.
No white process receives a basic message sent by a red process. Why?
This guarantees that the recorded local states are mutually concurrent. Why?
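
The marker rules can be exercised in a small simulation. The sketch below is a simplified illustration under stated assumptions, not the slides' own presentation: local state is a single integer balance, basic messages are integer transfers, channels are FIFO queues, and the classes `Process` and `Network` with their method names are hypothetical. A red process records the basic messages that arrive on an incoming channel until the marker arrives on that channel.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, balance):
        self.balance = balance
        self.red = False
        self.recorded_state = None
        self.channel_state = {}   # source -> basic messages recorded in transit
        self.recording = set()    # incoming channels still awaiting a marker

class Network:
    def __init__(self, balances):
        self.procs = {n: Process(b) for n, b in balances.items()}
        self.chan = {(s, d): deque() for s in balances for d in balances if s != d}

    def send(self, src, dst, amount):
        """Basic message: transfer amount from src to dst."""
        self.procs[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def turn_red(self, name):
        """Record the local state, then send a marker on every outgoing channel."""
        p = self.procs[name]
        p.red, p.recorded_state = True, p.balance
        p.recording = {s for (s, d) in self.chan if d == name}
        p.channel_state = {s: [] for s in p.recording}
        for (s, d), queue in self.chan.items():
            if s == name:
                queue.append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message on the FIFO channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg == MARKER:
            if not p.red:
                self.turn_red(dst)
            p.recording.discard(src)          # this channel's state is complete
        else:
            if p.red and src in p.recording:
                p.channel_state[src].append(msg)   # a wr message
            p.balance += msg

net = Network({"A": 100, "B": 50})
net.send("A", "B", 10)      # basic message sent while A is white
net.turn_red("A")           # A initiates the snapshot
net.send("B", "A", 5)       # B is still white
net.deliver("A", "B")       # B receives 10 while white
net.deliver("A", "B")       # B receives the marker and turns red
net.deliver("B", "A")       # A is red and records 5 as in transit
net.deliver("B", "A")       # A receives B's marker
a, b = net.procs["A"], net.procs["B"]
print(a.recorded_state, b.recorded_state, a.channel_state, b.channel_state)
```

In this run the recorded balances plus the recorded in-transit amount add up to the initial total of 150, which is the kind of stable quantity a consistent snapshot should preserve.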


Chandy-Lamport Algorithm: Classification of Basic Messages
- ww messages: These messages are sent by a white process and received by a white process. They correspond to the messages sent and received before the global snapshot.
- rr messages: These messages are sent by a red process and received by a red process. They correspond to the messages sent and received after the global snapshot.
- rw messages: These messages cross the global snapshot in the backward direction. Such a message would make the snapshot inconsistent. It is not possible to have such messages if markers are used. Why?
- wr messages: These messages cross the global snapshot in the forward direction and form part of the state of the channel in the snapshot, because they are in transit when the snapshot is taken.

Chandy-Lamport Algorithm
[Figure: processes P1 and P2 exchanging messages labeled ww, rw, wr, and rr relative to the snapshot]

Chandy-Lamport Algorithm (Example)
[Figures: successive steps of a run on three processes A, B, and C, with basic messages m1 and m2 and markers (mkr) in transit; only the captions below are recoverable.]
- B computes the state of channel AB as {}.
- C computes the state of channel AC as {}.
- B computes the state of channel CB as {m2}.

Chandy-Lamport Algorithm (Example)
Question: Is the computed snapshot a configuration of the actual execution?

Lai-Yang Algorithm: Assumptions
This algorithm does not assume FIFO channels, but it does assume message piggybacking.

Lai-Yang Algorithm: The Algorithm
- Any initiator can decide to take a local snapshot.
- As long as a process has not taken a local snapshot, it appends false to its outgoing basic messages.
- When a process has taken its local snapshot, it appends true to each outgoing basic message.
- When a process that has not yet taken a snapshot receives a message with true or a control message (see next slide) for the first time, it takes a local snapshot of its state before reception of this message.
- A process q computes as the channel state of pq the basic messages without the tag true that it receives via pq after its local snapshot.

Lai-Yang Algorithm: The Algorithm
Question: How does q know when it can determine the channel state of pq?
p sends a control message to q, informing q how many basic messages without the tag true p has sent into pq.
These control messages also ensure that all processes eventually take a local snapshot.
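
The rules of the Lai-Yang algorithm can be sketched in a few lines. The following simulation is illustrative only: the class `Process`, its method names, the tuple encoding of messages, and the use of a message log as the local state are assumptions. The scenario delivers messages out of order to show that FIFO channels are not needed: q receives the true-tagged m2 before the false-tagged m1.

```python
class Process:
    def __init__(self, name, peers):
        self.name = name
        self.taken = False                       # local snapshot taken?
        self.snapshot = None
        self.log = []                            # delivered payloads (local state)
        self.sent_false = {p: 0 for p in peers}  # false-tagged messages sent per channel
        self.recv_false = {p: 0 for p in peers}  # false-tagged messages received per channel
        self.expected = {p: None for p in peers} # counts announced by control messages
        self.channel_state = {p: [] for p in peers}

    def send_basic(self, dst, payload):
        if not self.taken:
            self.sent_false[dst] += 1
        return ("basic", self.name, dst, payload, self.taken)  # tag is piggybacked

    def take_snapshot(self):
        """Record the local state and return one control message per channel."""
        self.taken, self.snapshot = True, list(self.log)
        return [("control", self.name, dst, n) for dst, n in self.sent_false.items()]

    def receive(self, msg):
        """Process one message; return any control messages this triggers."""
        out, src = [], msg[1]
        if msg[0] == "control":
            if not self.taken:
                out = self.take_snapshot()
            self.expected[src] = msg[3]
            return out
        payload, tag = msg[3], msg[4]
        if tag and not self.taken:
            out = self.take_snapshot()           # snapshot precedes the reception
        if not tag:
            self.recv_false[src] += 1
            if self.taken:
                self.channel_state[src].append(payload)  # in transit at snapshot
        self.log.append(payload)
        return out

    def channel_done(self, src):
        """True once all false-tagged messages announced by src have arrived."""
        return self.expected[src] is not None and self.recv_false[src] == self.expected[src]

p, q = Process("p", ["q"]), Process("q", ["p"])
m1 = p.send_basic("q", "m1")        # tagged false
control_p = p.take_snapshot()       # p initiates; announces 1 false-tagged message
m2 = p.send_basic("q", "m2")        # tagged true
control_q = q.receive(m2)           # m2 overtakes m1; q snapshots before reception
q.receive(control_p[0])
done_before_m1 = q.channel_done("p")
q.receive(m1)                       # late false-tagged message joins channel state
p.receive(control_q[0])
print(q.snapshot, q.channel_state, done_before_m1, q.channel_done("p"))
```

Counting false-tagged messages is what lets q decide that channel pq is complete, since without FIFO delivery there is no marker position in the channel to rely on.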