Distributed systems Lecture 4: Clock synchronisation; logical clocks. Dr Robert N. M. Watson

Last time
- Started to look at time in distributed systems
  - Coordinating actions between processes
- Physical clocks tick based on physical processes (e.g. oscillations in quartz crystals, atomic transitions)
  - Imperfect, so they gain or lose time relative to a nominal perfect reference clock (such as UTC)
  - The process of gaining or losing time is clock drift
  - The difference between two clocks is called clock skew
- Clock synchronization aims to minimize clock skew between two (or a set of) different clocks

From last lecture: the clock synchronization problem
- In distributed systems, we'd like all the different nodes to have the same notion of time, but quartz oscillators oscillate at slightly different frequencies (time, temperature, manufacture)
- Hence clocks tick at different rates and create an ever-widening gap in perceived time: this is called clock drift
- The difference between two clocks at a given point in time is called clock skew
- Clock synchronization aims to minimize clock skew between two (or a set of) different clocks

Dealing with drift
- A clock can have positive or negative drift with respect to a reference clock (e.g. UTC)
  - Need to [re]synchronize periodically
- Can't just set the clock to the correct time
  - Jumps (particularly backward!) can confuse apps
- Instead aim for gradual compensation
  - If the clock is fast, make it run slower until correct
  - If the clock is slow, make it run faster until correct

Compensation
- Most systems relate real-time to cycle counters or periodic interrupt sources
  - E.g. calibrate the CPU Time-Stamp Counter (TSC) against the CMOS Real-Time Clock (RTC) at boot, and compute a scaling factor (e.g. cycles per ms)
  - Can now convert TSC differences to real-time
  - Similarly can determine how much real-time passes between periodic interrupts: call this delta
  - On interrupt, add delta to the software real-time clock
- Making small changes to delta gradually adjusts time
  - Once synchronized, change delta back to its original value
  - (Or try to estimate drift & continually adjust delta)
  - Minimises the time discontinuities that stepping would cause
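The delta adjustment described above can be sketched as a small simulation. This is a minimal illustration, not how any particular kernel does it: the class name, the 10 ms interrupt period and the 500 µs per-tick slew limit are all assumptions made for the example.

```python
class SoftwareClock:
    """Software real-time clock advanced by a periodic interrupt (units: microseconds)."""

    def __init__(self, nominal_delta_us=10_000, max_slew_us=500):
        # Both defaults are illustrative assumptions; max_slew_us must stay
        # below nominal_delta_us so that time never runs backward.
        self.nominal_delta_us = nominal_delta_us  # real time between interrupts
        self.max_slew_us = max_slew_us            # largest per-tick change to delta
        self.now_us = 0
        self.pending_us = 0                       # correction still to be absorbed

    def request_correction(self, offset_us):
        """Positive offset: the clock is slow and must gain time; negative: it is fast."""
        self.pending_us = offset_us

    def on_interrupt(self):
        # Clamp this tick's share of the correction, so the clock runs slightly
        # fast or slow instead of being stepped.
        step = max(-self.max_slew_us, min(self.max_slew_us, self.pending_us))
        self.now_us += self.nominal_delta_us + step
        self.pending_us -= step
        return self.now_us


clock = SoftwareClock()
clock.request_correction(2_000)          # we are 2 ms behind
readings = [clock.on_interrupt() for _ in range(10)]
```

With these numbers the first four ticks each advance the clock by 10.5 ms, after which delta returns to its original 10 ms.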

Obtaining accurate time
- Of course, we need some way to know the correct time (e.g. UTC) in order to adjust the clock!
  - Could attach a GPS receiver (or GOES receiver) to the computer, and get ±1ms (or ±0.1ms) accuracy
  - ... but too expensive/clunky for general use (RF in server rooms and data centres is non-ideal)
- Instead can ask some machine with a more accurate clock over the network: a time server
  - e.g. send RPC gettime() to the server
- What's the problem here?

Cristian's Algorithm (1989)
[Figure: timeline in which the client sends a request at T0, the server puts Ts into its reply, and the client receives the reply at T1]
- Attempt to compensate for network delays
  - Remember local time just before sending: T0
  - Server gets the request, and puts Ts into the response
  - When the client receives the reply, it notes local time: T1
  - Correct time is then approximately (Ts + (T1 - T0) / 2) (assumes symmetric behaviour...)

Cristian's Algorithm: Example
- T0 = 08:02:01.670 (client sends request)
- Ts = 08:02:04.325 (server timestamp)
- T1 = 08:02:02.130 (client receives reply)
- RTT = 460ms, so the one-way delay is [approx] 230ms
- Estimate correct time as (08:02:04.325 + 230ms) = 08:02:04.555
- Client gradually adjusts its local clock to gain 2.425 seconds
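The example's arithmetic can be reproduced in a few lines. This is a sketch only: the function name is made up for illustration, and the calendar date is an arbitrary assumption, since the slide gives only times of day.

```python
from datetime import datetime, timedelta


def cristian_estimate(t0, t1, ts):
    """Estimate the server's current time, assuming symmetric network delay."""
    rtt = t1 - t0              # round-trip time measured on the client's clock
    return ts + rtt / 2, rtt   # server timestamp plus the estimated one-way delay


# The date below is arbitrary; only the times of day come from the example.
t0 = datetime(2020, 1, 1, 8, 2, 1, 670_000)   # client sends request
t1 = datetime(2020, 1, 1, 8, 2, 2, 130_000)   # client receives reply
ts = datetime(2020, 1, 1, 8, 2, 4, 325_000)   # timestamp inserted by the server

estimate, rtt = cristian_estimate(t0, t1, ts)
correction = estimate - t1   # how much the client must gain
```

Here `rtt` is 460 ms, `estimate` is 08:02:04.555 and `correction` is 2.425 s, matching the figures above.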

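Berkeley Algorithm (1989)
- Don't assume we have an accurate time server
- Try to synchronize a set of clocks to the average
  - One machine, M, is designated the master
  - M periodically polls all other machines for their time (can use Cristian's technique to account for delays)
  - Master computes the average (including itself, but ignoring outliers), and sends an adjustment to each machine
- Example: M reads 08:01:17 and the other clocks read 08:01:12 and 08:02:01
  - Avg = (01:17 + 01:12 + 02:01)/3 = (04:30/3) = 01:30, i.e. 08:01:30
  - Adjustments: +00:00:13 for M, -00:00:31 for the clock at 08:02:01 and, by the same arithmetic, +00:00:18 for the clock at 08:01:12
[Figure: master M polling machines A, B and C, then sending each its adjustment]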
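A rough sketch of the master's averaging step follows. The slide does not say how outliers are identified, so the median-based cutoff here is purely an assumption, as are the function name and the machine labels A and B. Readings are given in seconds past 08:00:00.

```python
import statistics


def berkeley_adjustments(readings, max_deviation=None):
    """Map each machine name to the adjustment (in seconds) the master would send.

    readings: dict of machine name -> clock reading in seconds.
    max_deviation: optional cutoff (an assumed outlier rule, not from the slides);
    readings further than this from the median are left out of the average.
    """
    values = list(readings.values())
    if max_deviation is not None:
        median = statistics.median(values)
        kept = [v for v in values if abs(v - median) <= max_deviation]
        values = kept or values   # fall back to all readings if none survive
    average = sum(values) / len(values)
    # Every machine, outlier or not, is told how far it is from the average.
    return {name: average - value for name, value in readings.items()}


# 08:01:17, 08:01:12 and 08:02:01 as seconds past 08:00:00.
adjustments = berkeley_adjustments({"M": 77, "A": 72, "B": 121})
```

For these readings the average is 90 s (08:01:30), giving adjustments of +13 s, +18 s and -31 s.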

Network Time Protocol (NTP)
- Previous schemes were designed for LANs; in practice today's systems use NTP:
  - Global service designed to enable clients to stay within (hopefully) a few ms of UTC
  - Hierarchy of clocks arranged into strata
    - Stratum 0 = atomic clocks (or maybe GPS, GOES)
    - Stratum 1 = servers directly attached to a stratum 0 clock
    - Stratum 2 = servers that synchronize with stratum 1
    - ... and so on
- Timestamps are made up of seconds and a fraction
  - e.g. 32-bit seconds-since-epoch; 32-bit fraction of a second (resolution of roughly 233 picoseconds)
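To make the seconds-plus-fraction layout concrete, here is a hedged sketch that packs a Unix time into a 64-bit value with 32 bits of seconds and a 32-bit binary fraction. The function names are illustrative. It assumes non-negative Unix times and ignores the rollover of the 32-bit seconds field in 2036.

```python
import math

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2_208_988_800


def unix_to_ntp(unix_seconds):
    """Pack a non-negative Unix time into 32 bits of seconds and 32 bits of fraction."""
    whole = math.floor(unix_seconds)
    fraction = int((unix_seconds - whole) * 2**32)   # units of 2**-32 seconds
    return (((whole + NTP_UNIX_DELTA) & 0xFFFFFFFF) << 32) | fraction


def ntp_to_unix(ntp_timestamp):
    """Inverse of unix_to_ntp (same assumptions: no rollover handling)."""
    seconds = ntp_timestamp >> 32
    fraction = ntp_timestamp & 0xFFFFFFFF
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


resolution_seconds = 1 / 2**32   # about 2.33e-10 s, i.e. roughly 233 picoseconds
```

The fraction field counts units of 2^-32 seconds, not picoseconds, which is why its resolution works out to roughly 233 ps.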

NTP algorithm
[Figure: the client sends a request at T0; the server receives it at T1 and sends its reply at T2; the client receives the reply at T3]
- UDP/IP messages with slots for four timestamps
  - Systems insert timestamps at the earliest/latest opportunity
- Client computes:
  - Offset O = ((T1 - T0) + (T2 - T3)) / 2
    - Measured difference in average timestamps: (T1 + T2)/2 - (T0 + T3)/2
  - Delay D = (T3 - T0) - (T2 - T1)
    - Estimated two-way communication delay minus processing time
- Relies on symmetric messaging delays to be correct (but now excludes variable processing delay at the server)

NTP example
[Figure: client timeline with ticks 02 to 13 and server timeline with ticks 35 to 46, showing two request/reply pairs]
- First request/reply pair (T0 = 3, T1 = 37, T2 = 38, T3 = 6):
  - Total message delay is ((6 - 3) - (38 - 37)) = 2
  - Offset is ((37 - 3) + (38 - 6)) / 2 = 33
- Second request/reply pair (T0 = 8, T1 = 42, T2 = 45, T3 = 13):
  - Total message delay is ((13 - 8) - (45 - 42)) = 2
  - Offset is ((42 - 8) + (45 - 13)) / 2 = 33
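The two calculations can be checked with a direct transcription of the offset and delay formulas. The function name is an assumption; T0 and T3 are read on the client's clock, T1 and T2 on the server's.

```python
def ntp_offset_delay(t0, t1, t2, t3):
    """Return (offset, delay) for one request/reply exchange.

    t0: client send time, t1: server receive time,
    t2: server send time, t3: client receive time.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2   # how far the server is ahead of the client
    delay = (t3 - t0) - (t2 - t1)          # round trip minus server processing time
    return offset, delay


first = ntp_offset_delay(3, 37, 38, 6)
second = ntp_offset_delay(8, 42, 45, 13)
```

Both exchanges give an offset of 33 and a delay of 2, even though the server took longer to process the second request.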

NTP: additional details (1)
- NTP uses multiple requests per server
  - Remember <offset, delay> in each case
  - Calculate the filter dispersion of the offsets & discard outliers
  - Chooses the remaining candidate with the smallest delay
- NTP can also use multiple servers
  - Servers report synchronization dispersion = estimate of their quality relative to the root (stratum 0)
  - Combined procedure to select the best samples from the best servers (see RFC 5905 for the gory details)
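The per-server filtering idea can be illustrated with a deliberately simplified sketch. This is not the RFC 5905 clock filter: the median-based outlier test, the `max_offset_spread` parameter and the sample values are all assumptions chosen to show the principle of preferring low-delay samples.

```python
import statistics


def pick_sample(samples, max_offset_spread):
    """Choose one (offset, delay) pair from several exchanges with the same server.

    Samples whose offset is further than max_offset_spread from the median offset
    are treated as outliers (an assumed rule); among the rest, the sample with the
    smallest delay wins, since a short round trip leaves less room for asymmetry.
    """
    median = statistics.median(offset for offset, _ in samples)
    kept = [s for s in samples if abs(s[0] - median) <= max_offset_spread]
    return min(kept or samples, key=lambda s: s[1])


# Hypothetical samples: the third has the smallest delay but an implausible offset.
samples = [(33, 2), (34, 5), (90, 1), (32, 3)]
best = pick_sample(samples, max_offset_spread=5)
```

Here the (90, 1) sample is discarded despite its low delay, and (33, 2) is chosen.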

NTP: additional details (2)
- Various operating modes:
  - Broadcast ("multicast"): server advertises the current time
  - Client-server ("procedure call"): as described on the previous slides
  - Symmetric: between a set of NTP servers
- Security is supported
  - Authenticate server, prevent replays
  - Cryptographic cost is compensated for

Physical clocks: summary
- Physical devices exhibit clock drift
  - Even if initially correct, they tick too fast or too slow, and hence time ends up being wrong
  - Drift rates depend on the specific device, and can vary with time, temperature, acceleration, ...
- Instantaneous difference between clocks is clock skew
- Clock synchronization algorithms attempt to minimize the skew between a set of clocks
  - Decide upon a target correct time (atomic, or average)
  - Communicate to agree, compensating for delays
  - In reality, will still have 1-10ms skew after sync ;-(

Ordering
- One use of time is to provide ordering
  - If I withdrew 100 in cash at 23:59.44
  - And the bank computes interest at 00:00.00
  - Then the interest calculation shouldn't include the 100
- But in distributed systems we can't perfectly synchronize time => cannot use this for ordering
  - Clock skew can be large, and clocks may not be trusted
  - And over large distances, relativistic effects mean that ordering depends on the observer (similar effect due to the finite speed of the Internet ;-)

The happens-before relation
- Often don't need to know when event a occurred
  - Just need to know if a occurred before or after b
- Define the happens-before relation, a → b
  - If events a and b are within the same process, then a → b if a occurs with an earlier local timestamp
  - Messages between processes are ordered causally, i.e. the event send(m) → the event receive(m)
  - Transitivity: i.e. if a → b and b → c, then a → c
- Note that this only provides a partial order:
  - Possible for neither a → b nor b → a to hold
  - We say that a and b are concurrent and write a ~ b

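Example
[Figure: processes P1, P2 and P3 plotted against physical time. Events a, b occur on P1; c, d on P2; e, f on P3. Message m1 goes from b to c, and message m2 from d to f.]
- Three processes (each with 2 events), and 2 messages
  - Due to process order, we know a → b, c → d and e → f
  - Causal order tells us b → c and d → f
  - And by transitivity a → c, a → d, a → f, b → d, b → f, c → f
- However event e is concurrent with a, b, c and d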
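The derivations in this example can be checked mechanically by taking the transitive closure of the direct orderings. This is a small sketch with illustrative names; the naive closure loop is fine for six events but is not meant to scale.

```python
from itertools import product


def happens_before(direct_orderings):
    """Return the transitive closure of a set of (earlier, later) pairs."""
    hb = set(direct_orderings)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that hb can grow inside the loop.
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


def concurrent(hb, x, y):
    """Two distinct events are concurrent if neither happens before the other."""
    return x != y and (x, y) not in hb and (y, x) not in hb


process_order = {("a", "b"), ("c", "d"), ("e", "f")}
message_order = {("b", "c"), ("d", "f")}   # send(m1) -> receive(m1), send(m2) -> receive(m2)
hb = happens_before(process_order | message_order)
```

The closure contains the eleven orderings listed above, and `concurrent(hb, "e", x)` holds for each of a, b, c and d.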

Implementing Happens-Before
- One early scheme due to Lamport [1978]
  - Each process P_i has a logical clock L_i
  - L_i can simply be an integer, initialized to 0
  - L_i is incremented on every local event e
  - We write L_i(e) or L(e) as the timestamp of e
  - When P_i sends a message, it increments L_i and copies the value into the packet
  - When P_i receives a message from P_j, it extracts L_j and sets L_i := max(L_i, L_j), and then increments L_i
- Guarantees that if a → b, then L(a) < L(b)
- However if L(x) < L(y), this doesn't imply x → y!

Lamport Clocks: Example
[Figure: the same three-process diagram with Lamport timestamps: a = 1, b = 2 on P1; c = 3, d = 4 on P2; e = 1, f = 5 on P3. m1 goes from b to c, and m2 from d to f.]
- When P2 receives m1, it extracts timestamp 2 and sets its clock to max(0, 2) before the increment
- Possible for events to have duplicate timestamps
  - e.g. event e has the same timestamp as event a
- If desired can break ties by looking at pids, IP addresses, ...
  - This gives a total order, but doesn't imply happens-before!
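The timestamps in this example can be reproduced with a minimal Lamport clock. The class and method names are illustrative assumptions; the rules are the ones stated for sending and receiving, with b and d taken to be the send events for m1 and m2.

```python
class LamportClock:
    """Integer logical clock following Lamport's rules."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event: increment, then copy the value into the message.
        self.time += 1
        return self.time

    def receive(self, message_time):
        # Take the maximum of the local and received clocks, then increment.
        self.time = max(self.time, message_time) + 1
        return self.time


p1, p2, p3 = LamportClock(), LamportClock(), LamportClock()
a = p1.local_event()
b = m1 = p1.send()        # b is the send of m1
c = p2.receive(m1)
d = m2 = p2.send()        # d is the send of m2
e = p3.local_event()
f = p3.receive(m2)
```

This yields a = 1, b = 2, c = 3, d = 4, e = 1 and f = 5. Events a and e share a timestamp even though they are concurrent.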

Vector clocks
- With Lamport clocks, given L(a) and L(b), we can't tell if a → b or b → a or a ~ b
- One solution is vector clocks:
  - An ordered list of logical clocks, one per process
  - Each process P_i maintains V_i[], initially all zeroes
  - On a local event e, P_i increments V_i[i]
    - If the event is a message send, the new V_i[] is copied into the packet
  - If P_i receives a message from P_j then, for all k = 0, 1, ..., it sets V_i[k] := max(V_j[k], V_i[k]), and increments V_i[i]
- Intuitively V_i[k] captures the number of events at process P_k that have been observed by P_i

Vector clocks: example
[Figure: the same three-process diagram with vector timestamps: a = (1,0,0), b = (2,0,0) on P1; c = (2,1,0), d = (2,2,0) on P2; e = (0,0,1), f = (2,2,2) on P3. m1 goes from b to c, and m2 from d to f.]
- When P2 receives m1, it merges the entries from P1's clock
  - Choose the maximum value in each position
- Similarly when P3 receives m2, it merges in P2's clock
  - This incorporates the changes from P1 that P2 already saw
- Vector clocks explicitly track the transitive causal order: f's timestamp captures the history of a, b, c & d
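A minimal vector clock that replays this example is sketched below. The class and method names are assumptions for illustration, and processes P1, P2, P3 are mapped to vector positions 0, 1, 2.

```python
class VectorClock:
    """Vector clock for one process in a fixed group of n processes."""

    def __init__(self, index, n):
        self.index = index      # this process's position in the vector
        self.v = [0] * n

    def local_event(self):
        self.v[self.index] += 1
        return tuple(self.v)

    def send(self):
        # A send is a local event; the new vector is copied into the message.
        return self.local_event()

    def receive(self, message_vector):
        # Merge by taking the maximum in each position, then count the receive.
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, message_vector)]
        self.v[self.index] += 1
        return tuple(self.v)


p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p1.local_event()
b = m1 = p1.send()
c = p2.receive(m1)
d = m2 = p2.send()
e = p3.local_event()
f = p3.receive(m2)
```

The resulting timestamps match the figure; in particular f = (2, 2, 2) records both events on P1 even though P3 never heard from P1 directly.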

Using vector clocks for ordering
- Can compare vector clocks piecewise:
  - V_i = V_j iff V_i[k] = V_j[k] for k = 0, 1, 2, ...
  - V_i ≤ V_j iff V_i[k] ≤ V_j[k] for k = 0, 1, 2, ...
  - V_i < V_j iff V_i ≤ V_j and V_i ≠ V_j
  - V_i ~ V_j otherwise
- For any two event timestamps T(a) and T(b):
  - if a → b then T(a) < T(b); and
  - if T(a) < T(b) then a → b
- Hence can use timestamps to determine if there is a causal ordering between any two events
  - i.e. determine whether a → b, b → a or a ~ b
  - e.g. [2,0,0] versus [0,0,1]: neither is ≤ the other, so the events are concurrent
- Does this seem familiar? Recall Time-Stamp Ordering and Optimistic Concurrency Control for transactions last term.
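The comparison rules translate almost directly into code. This is a sketch: the function name and the string results are illustrative choices, and both vectors are assumed to have the same length.

```python
def compare(u, v):
    """Classify two vector timestamps of equal length.

    Returns "equal", "before" (u < v), "after" (v < u) or "concurrent".
    """
    if tuple(u) == tuple(v):
        return "equal"
    if all(x <= y for x, y in zip(u, v)):
        return "before"       # u <= v and u != v, so u < v
    if all(x >= y for x, y in zip(u, v)):
        return "after"        # v < u
    return "concurrent"       # some entries larger, some smaller


result = compare([2, 0, 0], [0, 0, 1])
```

For [2,0,0] versus [0,0,1] the result is "concurrent", whereas `compare([2, 0, 0], [2, 2, 2])` gives "before".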

Summary + next time (ironically)
- Summary:
  - The clock synchronisation problem
  - Cristian's Algorithm, Berkeley Algorithm, NTP
  - Logical time via the happens-before relation
  - Vector clocks
- Next time:
  - More on vector clocks
  - Consistent cuts
  - Group communication
  - Enforcing ordering vs. asynchrony
  - Distributed mutual exclusion