A subtle problem. An obvious problem. An obvious problem. An obvious problem. No!

A subtle problem An obvious problem when LC = t do S doesn t make sense for Lamport clocks! there is no guarantee that LC will ever be S is anyway executed after LC = t Fixes: if e is internal/send and LC = t 2 execute e and then S if e = receive(m) (TS(m) t) (LC t 1) put message back in channel re-enable e ; set LC = t 1; execute S t t ss No! Choose Ω large enough that it cannot be reached by applying the update rules of logical clocks An obvious problem An obvious problem t ss No! t ss No! Choose Ω large enough that it cannot be reached by applying the update rules of logical clocks mmmmhhhh... Choose Ω large enough that it cannot be reached by applying the update rules of logical clocks mmmmhhhh... Doing so assumes upper bound on message delivery time upper bound relative process speeds We better relax it...

Snapshot II Relaxing synchrony processor selects Ω sends take a snapshot at Ω to all processes; it waits for all of them to reply and then sets its logical clock to Ω p i when clock of reads Ω then records its local state σ i sends an empty message along its outgoing channels starts recording messages received on each incoming channel stops recording a channel when receives first message with timestamp greater than or equal to Ω p i p i take a snapshot at Ω Process does nothing for the protocol during this time! empty message: TS(m) Ω records local state σ i sends empty message: TS(m) Ω Use empty message to announce snapshot! monitors channels Snapshot III Snapshots: a perspective processor sends itself take a snapshot when receives take a snapshot for the first time from : p i records its local state sends take a snapshot along its outgoing channels sets channel from p j σ i to empty starts recording messages received over each of its other incoming channels p j The global state saved by the snapshot protocol is a consistent global state when receives take a snapshot beyond the first time from : p i stops recording channel from p i p k when has received take a snapshot on all channels, it sends collected state to and stops. p k

Snapshots: a perspective Snapshots: a perspective The global state saved by the snapshot protocol is a consistent global state But did it ever occur during the computation? a distributed computation provides only a partial order of events many total orders (runs) are compatible with that partial order all we know is that could have occurred The global state saved by the snapshot protocol is a consistent global state But did it ever occur during the computation? a distributed computation provides only a partial order of events many total orders (runs) are compatible with that partial order all we know is that could have occurred We are evaluating predicates on states that may have never occurred!

Σ 10 Σ 11 Σ 11

Σ 12 Σ 21 Σ 12 Σ 21 Σ 22 Σ 12

Σ 21 Σ 22 Σ 12 Σ 21 Σ 22 Σ 12 Σ 42 Σ 42 Σ 21 Σ 22 Σ 12 Σ 41 Σ 63 Σ 31 Σ 42 53 Σ Σ 64 Σ 21 Σ 22 Σ 43 Σ 33 Σ 54 Σ 65 Σ 12 Σ 13 Σ 23 Σ 24 Σ 34 Σ 44 Σ 35 Σ 55 Σ 45 Σ 03 Σ 04 Σ 14

Reachability Reachability is reachable from there is a path from in the lattice if to Σ 41 Σ 63 Σ 31 Σ 42 53 Σ Σ 64 Σ 21 Σ 12 Σ 22 Σ 13 Σ 43 Σ 33 Σ 23 Σ 24 Σ 34 Σ 44 Σ 35 Σ 54 Σ 45 Σ 55 Σ 65 Σ 03 Σ 04 Σ 14 is reachable from there is a path from in the lattice if to Σ 41 Σ 63 Σ 31 Σ 42 53 Σ Σ 64 Σ 21 Σ 12 Σ 22 Σ 13 Σ 43 Σ 33 Σ 23 Σ 24 Σ 34 Σ 44 Σ 35 Σ 54 Σ 45 Σ 55 Σ 65 Σ 03 Σ 04 Σ 14 Reachability Reachability is reachable from there is a path from in the lattice if to Σ 41 Σ 63 Σ 31 Σ 42 53 Σ Σ 64 Σ 21 Σ 12 Σ 22 Σ 13 Σ 43 Σ 33 Σ 54 Σ 55 Σ 65 Σ 23 Σ 24 Σ 34 Σ 44 Σ 35 Σ 45 Σ 03 Σ 04 Σ 14 is reachable from there is a path from in the lattice if to Σ 41 Σ 63 Σ 31 Σ 42 53 Σ Σ 64 Σ 21 Σ 12 Σ 22 Σ 13 Σ 43 Σ 33 Σ 54 Σ 55 Σ 65 Σ 23 Σ 24 Σ 34 Σ 44 Σ 35 Σ 45 Σ 03 Σ 04 Σ 14

So, why do we care about again? So, why do we care about again? Deadlock is a stable property Deadlock Deadlock If a run R of the snapshot protocol starts in Σ i and terminates in Σ f, then Σ i R Σ f Deadlock is a stable property Deadlock Deadlock If a run R of the snapshot protocol starts in Σ i and terminates in Σ f, then Σ i R Σ f Deadlock in implies deadlock in Σ f So, why do we care about again? Same problem, different approach Deadlock is a stable property Deadlock Deadlock If a run R of the snapshot protocol starts in Σ i and terminates in Σ f, then Σ i R Σ f Monitor process does not query explicitly Instead, it passively collects information and uses it to build an observation. (reactive architectures, Harel and Pnueli [1985]) Deadlock in No deadlock in implies deadlock in Σ f implies no deadlock in Σ i An observation is an ordering of event of the distributed computation based on the order in which the receiver is notified of the events.

Observations: a few observations An observation puts no constraint on the order in which the monitor receives notifications Observations: a few observations An observation puts no constraint on the order in which the monitor receives notifications e 1 1 e 1 1 e 2 1 Observations: a few observations An observation puts no constraint on the order in which the monitor receives notifications Observations: a few observations An observation puts no constraint on the order in which the monitor receives notifications e 1 1 e 2 1 e 1 1 e 2 1 To obtain a run, messages must be delivered to the monitor in FIFO order

Observations: a few observations An observation puts no constraint on the order in which the monitor receives notifications Causal delivery FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) e 1 1 e 2 1 To obtain a run, messages must be delivered to the monitor in FIFO order What about consistent runs? Causal delivery Causal delivery FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) m send event receive event deliver event p 3

Causal delivery Causal delivery FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) m send event receive event m send event receive event deliver event deliver event m p 3 p 3 Causal delivery Causal delivery FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) FIFO delivery guarantees: send i (m) send i (m ) deliver j (m) deliver j (m ) Causal delivery generalizes FIFO: send i (m) send k (m ) deliver j (m) deliver j (m ) m send event receive event m send event receive event deliver event deliver event m m p 3 1 p 3 1 2

Causal Delivery in Synchronous Systems Causal Delivery in Synchronous Systems We use the upper bound message delivery time on We use the upper bound message delivery time on DR1: At time t, delivers all messages it received with timestamp up to t in increasing timestamp order Causal Delivery with Lamport Clocks DR1.1: Deliver all received messages in increasing (logical clock) timestamp order. Causal Delivery with Lamport Clocks DR1.1: Deliver all received messages in increasing (logical clock) timestamp order. 1

Causal Delivery with Lamport Clocks DR1.1: Deliver all received messages in increasing (logical clock) timestamp order. Causal Delivery with Lamport Clocks DR1.1: Deliver all received messages in increasing (logical clock) timestamp order. 1 4 Should deliver? 1 4 Should deliver? Wouldn;t one see LC 2 and 3? If so, I should know when I can deliver... Or is it that there my be multiple 4s? Problem: Lamport Clocks don t provide gap detection Given two events e and e and their clock values LC(e) and LC(e ) where LC(e) < LC(e ) determine whether some event e exists s.t. LC(e) < LC(e ) < LC(e ) Stability Implementing Stability DR2: Deliver all received stable messages in increasing (logical clock) timestamp order. Real-time clocks wait for time units A message m received by p is stable at p if will never receive a future message m s.t. TS(m ) <TS(m) p

Implementing Stability Clocks and STRONG Clocks Real-time clocks wait for time units Lamport clocks implement the clock condition: e e LC(e) <LC(e ) Lamport clocks wait on each channel for m Design better clocks! s.t. TS(m) >LC(e) We want new clocks that implement the strong clock condition: e e SC(e) <SC(e ) Causal Histories Causal Histories The causal history of an event e in (H, ) is the set θ(e) ={e H e e} {e} The causal history of an event e in (H, ) is the set θ(e) ={e H e e} {e} e 1 1 e 2 1 e 3 1 e 4 1 e 5 1 p 3 e 1 3 e 2 3 e 3 3 e 4 3

Causal Histories How to build θ(e) The causal history of an event e in (H, ) is the set θ(e) ={e H e e} {e} p 3 e 1 1 e 2 1 e 3 1 e 4 1 e 5 1 e 1 3 e 2 3 e 3 3 e 4 3 Each process : initializes θ : if e k i e k i p i θ := is an internal or send event, then θ(e k i ):={e k i } θ(e k 1 i ) if is a receive event for message m, then θ(e k i ):={e k i } θ(e k 1 i ) θ(send(m)) e e θ(e) θ(e ) Pruning causal histories Prune segments of history that are known to all processes (Peterson, Bucholz and Schlichting) Use a more clever way to encode θ(e)