Slides for Chapter 14: Time and Global States

Similar documents
Figure 10.1 Skew between computer clocks in a distributed system

Chapter 11 Time and Global States

Time is an important issue in DS

Clocks in Asynchronous Systems

Agreement. Today. l Coordination and agreement in group communication. l Consensus

Time. To do. q Physical clocks q Logical clocks

Clock Synchronization

Our Problem. Model. Clock Synchronization. Global Predicate Detection and Event Ordering

Time. Today. l Physical clocks l Logical clocks

CS505: Distributed Systems

CS505: Distributed Systems

Distributed Computing. Synchronization. Dr. Yingwu Zhu

416 Distributed Systems. Time Synchronization (Part 2: Lamport and vector clocks) Jan 27, 2017

Distributed systems Lecture 4: Clock synchronisation; logical clocks. Dr Robert N. M. Watson

Time in Distributed Systems: Clocks and Ordering of Events

DISTRIBUTED COMPUTER SYSTEMS

Today. Vector Clocks and Distributed Snapshots. Motivation: Distributed discussion board. Distributed discussion board. 1. Logical Time: Vector clocks

Cuts. Cuts. Consistent cuts and consistent global states. Global states and cuts. A cut C is a subset of the global history of H

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Oct. 5, 2017 Lecture 12: Time and Ordering All slides IG

Logical Time. 1. Introduction 2. Clock and Events 3. Logical (Lamport) Clocks 4. Vector Clocks 5. Efficient Implementation

Time, Clocks, and the Ordering of Events in a Distributed System

Distributed Systems Time and Global State

Chandy-Lamport Snapshotting

Consistent Global States of Distributed Systems: Fundamental Concepts and Mechanisms. CS 249 Project Fall 2005 Wing Wong

7680: Distributed Systems

Snapshots. Chandy-Lamport Algorithm for the determination of consistent global states <$1000, 0> <$50, 2000> mark. (order 10, $100) mark

CptS 464/564 Fall Prof. Dave Bakken. Cpt. S 464/564 Lecture January 26, 2014

Absence of Global Clock

Distributed Systems Fundamentals

Distributed Systems 8L for Part IB

CS 347 Parallel and Distributed Data Processing

Distributed Systems. Time, Clocks, and Ordering of Events

Causality and physical time

A subtle problem. An obvious problem. An obvious problem. An obvious problem. No!

Distributed Algorithms Time, clocks and the ordering of events

Ordering and Consistent Cuts Nicole Caruso

Distributed Systems Principles and Paradigms. Chapter 06: Synchronization

Distributed Systems Principles and Paradigms

Distributed Systems. 06. Logical clocks. Paul Krzyzanowski. Rutgers University. Fall 2017

Coordination. Failures and Consensus. Consensus. Consensus. Overview. Properties for Correct Consensus. Variant I: Consensus (C) P 1. v 1.

Distributed Algorithms (CAS 769) Dr. Borzoo Bonakdarpour

arxiv: v1 [cs.dc] 22 Oct 2018

Distributed Systems. Time, clocks, and Ordering of events. Rik Sarkar. University of Edinburgh Spring 2018

Distributed Algorithms

Clock Synchronization

MODELING TIME AND EVENTS IN A DISTRIBUTED SYSTEM

Time. Lakshmi Ganesh. (slides borrowed from Maya Haridasan, Michael George)

Agreement Protocols. CS60002: Distributed Systems. Pallab Dasgupta Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur

MAD. Models & Algorithms for Distributed systems -- 2/5 -- download slides at

Crashed router. Instructor s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 Pearson Education

1 Introduction During the execution of a distributed computation, processes exchange information via messages. The message exchange establishes causal

1 Introduction During the execution of a distributed computation, processes exchange information via messages. The message exchange establishes causal

C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Consensus. One Reason: Impossibility of Consensus. Let s Consider This

Causality and Time. The Happens-Before Relation

TTA and PALS: Formally Verified Design Patterns for Distributed Cyber-Physical

Do we have a quorum?

CS5412: REPLICATION, CONSISTENCY AND CLOCKS

Information System Design IT60105

Reliable Broadcast for Broadcast Busses

Early consensus in an asynchronous system with a weak failure detector*

S1 S2. checkpoint. m m2 m3 m4. checkpoint P checkpoint. P m5 P

Seeking Fastness in Multi-Writer Multiple- Reader Atomic Register Implementations

Exclusive Access to Resources in Distributed Shared Memory Architecture

cs/ee/ids 143 Communication Networks

Time-Bounding Needham-Schroeder Public Key Exchange Protocol

Section 6 Fault-Tolerant Consensus

Parallel Performance Evaluation through Critical Path Analysis

A Communication-Induced Checkpointing Protocol that Ensures Rollback-Dependency Trackability

Using Happens-Before Relationship to debug MPI non-determinism. Anh Vo and Alan Humphrey

Consensus. Consensus problems

Determining Consistent States of Distributed Objects Participating in a Remote Method Call

Recap. CS514: Intermediate Course in Operating Systems. What time is it? This week. Reminder: Lamport s approach. But what does time mean?

Real-Time Course. Clock synchronization. June Peter van der TU/e Computer Science, System Architecture and Networking

Outline F eria AADL behavior 1/ 78

Asynchronous Communication 2

A VP-Accordant Checkpointing Protocol Preventing Useless Checkpoints

An Indian Journal FULL PAPER. Trade Science Inc. A real-time causal order delivery approach in largescale ABSTRACT KEYWORDS

INF Models of concurrency

Efficient Dependency Tracking for Relevant Events in Shared-Memory Systems

Estimation of clock offset from one-way delay measurement on asymmetric paths

AGREEMENT PROBLEMS (1) Agreement problems arise in many practical applications:

Distributed Mutual Exclusion Based on Causal Ordering

MAD. Models & Algorithms for Distributed systems -- 2/5 -- download slides at h9p://people.rennes.inria.fr/eric.fabre/

Causal Consistency for Geo-Replicated Cloud Storage under Partial Replication

Early stopping: the idea. TRB for benign failures. Early Stopping: The Protocol. Termination

TECHNICAL REPORT YL DISSECTING ZAB

On Equilibria of Distributed Message-Passing Games

Unreliable Failure Detectors for Reliable Distributed Systems

Approximate Synchrony: An Abstraction for Distributed Time-Synchronized Systems

Causal Broadcast Seif Haridi

Design of a Sliding Window over Asynchronous Event Streams

Verification of clock synchronization algorithm (Original Welch-Lynch algorithm and adaptation to TTA)

Efficient Dependency Tracking for Relevant Events in Concurrent Systems

Accurate Event Composition

Design of discrete-event simulations

SDS developer guide. Develop distributed and parallel applications in Java. Nathanaël Cottin. version

Generating Random Variates II and Examples

CS505: Distributed Systems

Computer Networks ( Classroom Practice Booklet Solutions)

I Can t Believe It s Not Causal! Scalable Causal Consistency with No Slowdown Cascades

Transcription:

Slides for Chapter 14: Time and Global States From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, Addison-Wesley 2012

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states (skip) Distributed debugging (skip) 2

Introduction Some applications need to record the time when events occur (e.g. ecommerce, banking) May need to determine the relative order in which certain events occurred Physical events order based on observer s frame of reference Multiple computer clocks may be skewed and need to by synchronized no absolute global time for all computers in a distributed systems 3

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states Distributed debugging 4

Clocks, events, process states Each process can observe/cause multiple events Some events may change process states (data) History of a process: the series of events that take place within the process ordered in a total ordering Clocks: Each computer has a physical clock Timestamp: date/time that an event occurred Clock skew: instantaneous difference between two clocks Clock drift: different clocks count time at slightly different speeds UTC (Coordinated Universal Time): based on atomic time international standard for time keeping 5

Figure 14.1 Skew between computer clocks in a distributed system

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states Distributed debugging 7

Synchronizing physical clocks External synchronization: Synchronizes each clock in the distributed system with a UTC source clocks must be within drift bound D of UTC Internal synchronization: Synchronizes the clocks in the distributed system with one another any two physical clocks must be within drift bound D of one another May drift from UTC but are synchronized together Faulty clock: does not stay within specified drift bound 8

Synchronizing physical clocks Synchronization in a synchronous system: Synchronous system has upper bound on message transmission time max also min is the minimum message transmission time between two machines A message sent from node p at (local clock) time t arrives at another node q at (local clock) time t Can now set clock at node q to t + (max + min)/2 to synchronize clock at node q with clock in node p 9

Synchronizing physical clocks using a time server node NTP (Network Time Protocol): Synchronize clock at each node with clock at time server Assumes asynchronous system Christian s algorithm: Process p requests time from time server in message m r, receives time value t in message m t P records round trip time T round between sending and receiving P sets local clock time to t + T round /2 Probabilistic method All processes (nodes) synchronize with clock server node Used for synchronizing nodes on a local intranet Variation called Berkeley algorithm 10

Figure 14.2 Clock synchronization using a time server m r p m t Time server,s

Synchronizing physical clocks using NTP NTP (Network Time Protocol): Synchronize clocks of client nodes on the internet with UTC Employs statistical techniques to factor in network latency Other features: Can survive lengthy losses of connectivity Enables clients to synchronize frequently to offset drift Protects against interference with time services Three modes of synchronization: Multicast mode (on a high speed LAN) Procedure call mode (similar to Christian s algorithm) Symmetric mode (used by time servers) 12

Figure 14.3 An example synchronization subnet in an NTP implementation 1 2 2 3 3 3 Note: Arrows denote synchronization control, numbers denote strata.

Figure 14.4 Messages exchanged between a pair of NTP peers Server B T i-2 T i-1 Time m m' Server A T i- 3 T i Time

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states Distributed debugging 15

Logical time and logical clocks Ordering events that occur in different processes (nodes): Within each process pi events are ordered (->i) Sending a message from one process occurs before the message is received at another process Happened-before relationship -> based on above two observations: If e ->i e, then e -> e For any message m, send(m) -> receive(m) If e -> e and e -> e, then e -> e In next figure: a -> b, c -> d (same process); b -> c, d - > f (send/receive) a e (concurrent; cannot tell which occurred first) 16

Figure 14.5 Events occurring at three processes

Logical time and logical clocks Logical clock (based on Lamport timestamp): Timestamp of event e at pi denoted Li(e) Timestamp of event e at p denoted L(e) Timestamp rules: Li := Li + 1 after each event at pi When pi sends m it piggybacks timestamp t of send as t = Li and sends (m, t) When pj receives (m, t) it computes Lj = max (Lj, t) then increments Lj as timestamp of receive 18

Figure 14.6 Lamport timestamps for the events shown in Figure 14.5

Logical time and logical clocks Totally ordered logical clock: Events can have same clock at two processes pi, pj By appending process id (t, i), a total order is achieved Order of processes not significant Vector clocks: Keep a vector Vi[j], j = 1, 2,, n of all time points of processes that have communicated with pi Process vector always piggybacked with a sent message Receiving process can use piggybacked vector to update its vector clock 20

Figure 14.7 Vector timestamps for the events shown in Figure 14.5

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states (skip) Distributed debugging 22

Figure 14.8 Detecting global properties

Figure 14.9 Cuts e 1 0 e 1 1 e 1 2 e 1 3 p 1 m 1 m 2 p 2 e 2 0 e 2 1 e 2 2 Physical time Inconsistent cut Consistent cut

Figure 14.10 Chandy and Lamport s snapshot algorithm Marker receiving rule for process p i On p i s receipt of a marker message over channel c: if (p i has not yet recorded its state) it records its process state now; records the state of c as the empty set; turns on recording of messages arriving over other incoming channels; else p i records the state of c as the set of messages it has received over c since it saved its state. end if Marker sending rule for process p i After p i has recorded its state, for each outgoing channel c: p i sends one marker message over c (before it sends any other message over c).

Figure 14.11 Two processes and their initial states

Figure 14.12 The execution of the processes in Figure 14.11

Figure 14.13 Reachability between states in the snapshot algorithm actual execution e 0,e 1,... S init recording begins ' recording ends S final S snap pre-snap: e' 0,e ' 1,...e ' R-1 post-snap: e ' R,e ' R+1,...

Overview of Chapter Introduction Clocks, events, process states Synchronizing physical clocks Logical time and logical clocks Global states Distributed debugging (skip) 29

Figure 14.14 Vector timestamps and variable values for the execution of Figure 14.9 p 1 (1,0) (2,0) (3,0) (4,3) x 1 = 1 x 1 = 100 x 1 = 105 x 1 = 90 m 1 m 2 p 2 x 2 = 100 x 2 = 95 x 2 = 90 (2,1) (2,2) (2,3) Cut C 1 Cut C 2 Physical time

Figure 14.15 The lattice of global states for the execution of Figure 14.14 Level 0 S 00 1 S 10 2 3 S 30 S 20 S 21 Sij = global state after i events at process 1 and j events at process 2 4 S 31 S 22 5 S 32 S 23 6 S 33 7 S 43

Figure 14.16 Algorithms to evaluate possibly and definitely

Figure 14.17 Evaluating definitely