A subtle problem. An obvious problem. No!

Transcription:

A subtle problem
"when LC = t do S" doesn't make sense for Lamport clocks!
  there is no guarantee that LC will ever be exactly t;
  S is in any case executed after LC = t.
Fixes:
  if e is internal/send and LC = t − 2: execute e and then S;
  if e = receive(m) ∧ (TS(m) ≥ t) ∧ (LC ≤ t − 1): put the message back in the channel, re-enable e, set LC = t − 1, and execute S.

An obvious problem
No t_ss!
Choose Ω large enough that it cannot be reached by applying the update rules of logical clocks.
mmmmhhhh...
Doing so assumes:
  an upper bound on message delivery time;
  an upper bound on relative process speeds.
We'd better relax it...
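The first point above can be illustrated with a minimal sketch of a Lamport clock, assuming the usual update rules (increment on internal/send events, max-plus-one on receive). The class name, the helper `values_taken`, and the trace are hypothetical; the sketch only shows that the clock can jump over a target value t.

```python
class LamportClock:
    """Minimal Lamport clock (illustrative, not taken from the slides)."""

    def __init__(self):
        self.value = 0

    def local_or_send(self):
        # Internal and send events advance the clock by one.
        self.value += 1
        return self.value

    def receive(self, ts):
        # Receive rule: move past the timestamp carried by the message.
        self.value = max(self.value, ts) + 1
        return self.value


def values_taken(events):
    """Replay events and return every clock value the process actually held."""
    clock = LamportClock()
    seen = [clock.value]
    for kind, ts in events:
        if kind == "recv":
            seen.append(clock.receive(ts))
        else:
            seen.append(clock.local_or_send())
    return seen


# Hypothetical trace: two local events, then a message stamped 7 arrives.
trace = [("local", None), ("local", None), ("recv", 7)]
seen = values_taken(trace)
t = 5
print(seen)       # [0, 1, 2, 8]
print(t in seen)  # False: the clock never reads exactly t = 5
```

A guard of the form "when LC = t" would never fire on this trace, which is why the fixes above test for LC = t − 2 or LC ≤ t − 1 instead.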

Snapshot II
Processor p_0 selects Ω.
p_0 sends "take a snapshot at Ω" to all processes; it waits for all of them to reply and then sets its logical clock to Ω.
When the clock of p_i reads Ω, p_i:
  records its local state σ_i;
  sends an empty message along its outgoing channels;
  starts recording messages received on each incoming channel;
  stops recording a channel when it receives the first message with timestamp greater than or equal to Ω.

Relaxing synchrony
[Figure: timeline of p_i, from receipt of "take a snapshot at Ω" to the point where it records local state σ_i, sends the empty message (TS(m) ≥ Ω), and monitors its channels.]
The process does nothing for the protocol during this time!
Use the empty message to announce the snapshot!

Snapshot III
Processor p_0 sends itself "take a snapshot".
When p_i receives "take a snapshot" for the first time, from p_j, it:
  records its local state σ_i;
  sends "take a snapshot" along its outgoing channels;
  sets the channel from p_j to empty;
  starts recording messages received over each of its other incoming channels.
When p_i receives "take a snapshot" beyond the first time, from p_k, it:
  stops recording the channel from p_k.
When p_i has received "take a snapshot" on all channels, it sends the collected state to p_0 and stops.
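The marker-based protocol (Snapshot III) can be sketched as a small single-threaded simulation. This is an illustration under stated assumptions, not the slides' implementation: channels are FIFO queues, the application state is a hypothetical account balance, and `Process`, `run`, and `snapshot_total` are made-up names. The final step (shipping collected state to the initiator) is omitted.

```python
from collections import deque

MARKER = "take a snapshot"


class Process:
    def __init__(self, pid, peers, balance):
        self.pid = pid
        self.peers = peers            # ids of all other processes
        self.balance = balance        # hypothetical application state
        self.recorded_state = None    # sigma_i, once recorded
        self.channel_log = {}         # sender -> messages recorded in transit
        self.recording = set()        # incoming channels still being recorded

    def record(self, network, first_from=None):
        """First marker: save local state, relay markers, start recording."""
        self.recorded_state = self.balance
        self.channel_log = {q: [] for q in self.peers}
        # The channel the first marker arrived on is recorded as empty.
        self.recording = {q for q in self.peers if q != first_from}
        for q in self.peers:
            network[(self.pid, q)].append(MARKER)

    def deliver(self, sender, msg, network):
        if msg == MARKER:
            if self.recorded_state is None:
                self.record(network, first_from=sender)
            else:
                self.recording.discard(sender)   # later marker: stop recording
        else:
            self.balance += msg
            if sender in self.recording:
                self.channel_log[sender].append(msg)

    def done(self):
        return self.recorded_state is not None and not self.recording


def run():
    ids = [0, 1, 2]
    procs = {i: Process(i, [j for j in ids if j != i], 100) for i in ids}
    network = {(i, j): deque() for i in ids for j in ids if i != j}

    def send(src, dst, amount):
        procs[src].balance -= amount
        network[(src, dst)].append(amount)

    send(1, 0, 30)             # in flight when the snapshot starts
    send(2, 1, 5)              # also in flight
    procs[0].record(network)   # p_0 sends itself "take a snapshot"
    send(0, 2, 10)             # sent after p_0 recorded its state
    while any(network.values()):                 # drain channels round-robin
        for (src, dst), chan in network.items():
            if chan:
                procs[dst].deliver(src, chan.popleft(), network)
    return procs


def snapshot_total(procs):
    """Money in recorded local states plus money recorded on channels."""
    return sum(
        p.recorded_state + sum(sum(msgs) for msgs in p.channel_log.values())
        for p in procs.values()
    )


procs = run()
print(snapshot_total(procs))   # 300, the amount in the system at the start
```

In this run the two transfers that were in flight (30 and 5) end up in the channel logs, while the transfer sent after p_0 recorded its state is excluded, so the recorded total matches the 300 units in the system.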

Snapshots: a perspective
The global state Σ^s saved by the snapshot protocol is a consistent global state.
But did it ever occur during the computation?
  a distributed computation provides only a partial order of events;
  many total orders (runs) are compatible with that partial order;
  all we know is that Σ^s could have occurred.
We are evaluating predicates on states that may have never occurred!

[Figure: lattice of global states Σ^{kl}, built up step by step over several slides.]

Reachability
Σ' is reachable from Σ if there is a path from Σ to Σ' in the lattice.
[Figure: lattice of global states Σ^{kl}.]
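Reachability in the lattice can be made concrete with a small sketch. It assumes a hypothetical two-process computation (the event counts and the `recv_from` table are invented for illustration) and treats a lattice node as a pair counting how many events of each process are included.

```python
from collections import deque
from itertools import product

# Hypothetical computation: processes 0 and 1, three events each.
# recv_from[(p, k)] = (q, j) means the k-th event of process p receives
# the message sent by the j-th event of process q.
num_events = [3, 3]
recv_from = {(1, 2): (0, 1), (0, 3): (1, 3)}


def consistent(cut):
    """A cut is consistent if every included receive has its send included."""
    for (p, k), (q, j) in recv_from.items():
        if k <= cut[p] and j > cut[q]:
            return False
    return True


def lattice():
    """All consistent cuts, as tuples (events of p0, events of p1)."""
    ranges = [range(n + 1) for n in num_events]
    return {cut for cut in product(*ranges) if consistent(cut)}


def reachable(src, dst):
    """Breadth-first search along lattice edges (one more event on one process)."""
    nodes = lattice()
    if src not in nodes:
        return False
    frontier, seen = deque([src]), {src}
    while frontier:
        cut = frontier.popleft()
        if cut == dst:
            return True
        for p in range(len(cut)):
            nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
            if nxt in nodes and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


print(len(lattice()))               # 11 consistent cuts out of 16
print(reachable((0, 0), (3, 3)))    # True
print(reachable((2, 1), (1, 2)))    # False: paths never undo an event
```

Each path from the initial to the final node corresponds to one run of the computation, which is why a state on the lattice "could have occurred" without having occurred in the actual run.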

So, why do we care about Σ^s again?
Deadlock is a stable property: Deadlock ⇒ □Deadlock.
If a run R of the snapshot protocol starts in Σ^i and terminates in Σ^f, then Σ^i ⇝_R Σ^f.
The recorded state Σ^s lies between the two: it is reachable from Σ^i, and Σ^f is reachable from it.
Deadlock in Σ^s implies deadlock in Σ^f.
No deadlock in Σ^s implies no deadlock in Σ^i.

Same problem, different approach
The monitor process does not query explicitly.
Instead, it passively collects information and uses it to build an observation (reactive architectures, Harel and Pnueli [1985]).
An observation is an ordering of the events of the distributed computation based on the order in which the receiver is notified of the events.

Observations: a few observations
An observation puts no constraint on the order in which the monitor receives notifications.
[Figure: notifications for events e_1^1 and e_1^2 arriving at the monitor.]
To obtain a run, messages must be delivered to the monitor in FIFO order.

What about consistent runs?

Causal delivery
FIFO delivery guarantees:
  send_i(m) → send_i(m') ⇒ deliver_j(m) → deliver_j(m')
Causal delivery generalizes FIFO:
  send_i(m) → send_k(m') ⇒ deliver_j(m) → deliver_j(m')
[Figure: space-time diagram of processes p_1, p_2, p_3 exchanging messages m and m', distinguishing send, receive, and deliver events.]
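The difference between the two guarantees can be checked mechanically. The sketch below assumes the happens-before pairs between send events are given explicitly; the message names, the sender table, and the function names are hypothetical.

```python
def fifo_ok(delivery, sender, before):
    """FIFO: sends ordered by happens-before at the SAME sender are delivered in order."""
    pos = {m: i for i, m in enumerate(delivery)}
    return all(
        pos[a] < pos[b]
        for a, b in before
        if a in pos and b in pos and sender[a] == sender[b]
    )


def causal_ok(delivery, before):
    """Causal: the same requirement, regardless of which process sent each message."""
    pos = {m: i for i, m in enumerate(delivery)}
    return all(pos[a] < pos[b] for a, b in before if a in pos and b in pos)


# Hypothetical scenario: p1 sends m to p3, and send(m) happens-before
# p2's send of m2 to p3 (for example via a message from p1 to p2).
sender = {"m": 1, "m2": 2}
before = [("m", "m2")]

at_p3 = ["m2", "m"]                       # m2 overtakes m on its way to p3
print(fifo_ok(at_p3, sender, before))     # True: different senders, FIFO is silent
print(causal_ok(at_p3, before))           # False: causal delivery is violated
```

Any delivery order that passes `causal_ok` also passes `fifo_ok`, since the FIFO check only looks at a subset of the same pairs.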

Causal Delivery in Synchronous Systems
We use the upper bound Δ on message delivery time.
DR1: At time t, p delivers all messages it received with timestamp up to t − Δ, in increasing timestamp order.

Causal Delivery with Lamport Clocks
DR1.1: Deliver all received messages in increasing (logical clock) timestamp order.
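A minimal sketch of rule DR1, assuming real-time timestamps, a known delivery bound (called `delta` here), and reading the rule as "timestamps up to t − Δ". The function name and the sample messages are hypothetical.

```python
def dr1_deliverable(received, t, delta):
    """Messages rule DR1 allows delivering at time t, oldest timestamp first.

    `received` holds (timestamp, payload) pairs. Anything stamped later than
    t - delta is held back, because a message with a smaller timestamp could
    still be in transit.
    """
    ready = [m for m in received if m[0] <= t - delta]
    return sorted(ready)


received = [(7, "c"), (3, "a"), (5, "b")]
print(dr1_deliverable(received, t=10, delta=4))   # [(3, 'a'), (5, 'b')]
```

The message stamped 7 stays buffered until time 11, by which point every message stamped 7 or earlier must have arrived.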

[Figure: p has received messages with timestamps 1 and 4. Should p deliver?]
Wouldn't one see LC 2 and 3? If so, I should know when I can deliver... Or is it that there may be multiple 4s?

Problem: Lamport clocks don't provide gap detection.
Gap detection: given two events e and e' and their clock values LC(e) and LC(e'), where LC(e) < LC(e'), determine whether some event e'' exists s.t. LC(e) < LC(e'') < LC(e').

Stability
DR2: Deliver all received stable messages in increasing (logical clock) timestamp order.
A message m received by p is stable at p if p will never receive a future message m' s.t. TS(m') < TS(m).
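One way rule DR2 could be realized with Lamport clocks is sketched below. It assumes FIFO channels and per-sender timestamps that only increase, so a message is treated as stable once every other channel has shown a larger timestamp. The class name and the sample messages are hypothetical.

```python
import heapq


class StableDelivery:
    """Buffer at the receiving process p (illustrative sketch of DR2)."""

    def __init__(self, senders):
        self.latest = {s: 0 for s in senders}   # highest timestamp seen per channel
        self.pending = []                       # min-heap of (timestamp, sender, payload)

    def receive(self, sender, ts, payload):
        self.latest[sender] = ts
        heapq.heappush(self.pending, (ts, sender, payload))
        return self._deliver_stable()

    def _deliver_stable(self):
        delivered = []
        while self.pending:
            ts, sender, _ = self.pending[0]
            # Stable once every OTHER channel has carried a timestamp above ts;
            # FIFO then rules out a later arrival with a smaller timestamp.
            if all(self.latest[s] > ts for s in self.latest if s != sender):
                delivered.append(heapq.heappop(self.pending))
            else:
                break
        return delivered


p = StableDelivery(senders=[1, 2])
print(p.receive(1, 1, "a"))   # []  nothing from sender 2 yet
print(p.receive(1, 4, "b"))   # []  timestamps 1 and 4 received, still cannot deliver
print(p.receive(2, 3, "c"))   # [(1, 1, 'a'), (3, 2, 'c')]
```

The message stamped 4 remains buffered: sender 2 has only shown timestamp 3, so a message stamped below 4 could still arrive from it.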

Implementing Stability
Real-time clocks: wait for Δ time units.
Lamport clocks: wait on each channel for m s.t. TS(m) > LC(e).
Design better clocks!

Clocks and STRONG Clocks
Lamport clocks implement the clock condition:
  e → e' ⇒ LC(e) < LC(e')
We want new clocks that implement the strong clock condition:
  e → e' ≡ SC(e) < SC(e')

Causal Histories
The causal history of an event e in (H, →) is the set
  θ(e) = {e' ∈ H | e' → e} ∪ {e}
[Figure: space-time diagram with events e_1^1 through e_1^5 of p_1 and e_3^1 through e_3^4 of p_3.]

How to build θ(e)
Each process p_i:
  initializes θ: θ := ∅;
  if e_i^k is an internal or send event, then θ(e_i^k) := {e_i^k} ∪ θ(e_i^{k−1});
  if e_i^k is a receive event for message m, then θ(e_i^k) := {e_i^k} ∪ θ(e_i^{k−1}) ∪ θ(send(m)).
For distinct events e and e': e → e' ≡ θ(e) ⊂ θ(e').

Pruning causal histories
Prune segments of history that are known to all processes (Peterson, Buchholz and Schlichting).
Use a more clever way to encode θ(e).
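The construction rules for θ(e) translate almost directly into code. The sketch below assumes each process is given as a list of `(kind, message_id)` events and that every receive has a matching send; the input format, the function names, and the sample computation are hypothetical.

```python
def causal_histories(procs):
    """Return theta as a dict mapping (pid, k) to a frozenset of events."""
    theta = {}                      # (pid, k) -> causal history of the k-th event
    sent = {}                       # message id -> theta(send(m))
    nxt = {p: 0 for p in procs}     # index of the next unprocessed event
    progress = True
    while progress:
        progress = False
        for p, events in procs.items():
            while nxt[p] < len(events):
                kind, msg = events[nxt[p]]
                if kind == "recv" and msg not in sent:
                    break           # matching send not processed yet; retry later
                k = nxt[p] + 1
                # theta(e_i^k) := {e_i^k} U theta(e_i^(k-1))
                hist = theta.get((p, k - 1), frozenset()) | {(p, k)}
                if kind == "recv":
                    hist = hist | sent[msg]    # ... U theta(send(m))
                theta[(p, k)] = hist
                if kind == "send":
                    sent[msg] = hist
                nxt[p] += 1
                progress = True
    return theta


def happens_before(theta, e, e2):
    """Strong clock condition: e -> e2 iff theta(e) is a proper subset of theta(e2)."""
    return theta[e] < theta[e2]


# Hypothetical computation: p1 sends m1 to p2, which then sends m2 to p3.
procs = {
    1: [("send", "m1"), ("local", None)],
    2: [("recv", "m1"), ("send", "m2")],
    3: [("local", None), ("recv", "m2")],
}
theta = causal_histories(procs)
print(happens_before(theta, (1, 1), (3, 2)))   # True: via m1 and then m2
print(happens_before(theta, (1, 2), (3, 2)))   # False: the two are concurrent
```

The histories grow with every event, which is the cost that pruning and more compact encodings of θ(e) aim to remove.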