Clock Synchronization in the Presence of Omission and Performance Failures, and Processor Joins

Flaviu Cristian, Houtan Aghili and Ray Strong
IBM Research, Almaden Research Center

Abstract

This paper presents a simple practical protocol for synchronizing clocks in a distributed system. Synchronization consists of maintaining logical clocks which run at roughly the speed of a correct hardware clock and are within some known bound of each other. Synchronization is achieved by periodically computing adjustments to the hardware clocks present in the system. The protocol is tolerant of any number of omission failures (e.g. processor crashes, link crashes, occasional message losses) and performance failures (e.g. overloaded processors, slow links) that do not partition the communications network, and handles any number of simultaneous processor joins.

An earlier version of this paper was presented at the 16th IEEE Int. Symp. on Fault-Tolerant Computing Systems, Vienna, July 1-4, 1986. Flaviu Cristian is now with the University of California, San Diego. Houtan Aghili is now with IBM Research, T. J. Watson Research Center.

1 Introduction

Consider a set of processors interconnected in a distributed system to perform certain distributed computations, where each processor is equipped with a hardware clock. If one wants to measure the time elapsed between the occurrences of two events of a computation local to a processor p, one can instantaneously read the values of the local hardware clock register when these events occur and compute their difference. A different method must be devised if the intention is to measure the time elapsed between two events of a distributed computation. For instance, if an event e_p occurs on a processor p and another event e_q occurs on a different processor q, it is practically impossible for q to instantaneously read its hardware clock when the remote event e_p occurs. Indeed, the sending of a message from p to q that notifies the occurrence of e_p entails a random transmission delay that makes it impossible for q to know exactly the value displayed by q's hardware clock at the instant when e_p occurred in p. Hence, q (and by a similar argument p) cannot compute exactly the time elapsed between e_p and e_q by relying only on their own clocks.

The problem of measuring the time elapsed between occurrences of distributed events would be easily solved if the processor clocks could be exactly synchronized, that is, if at any instant of a postulated Newtonian time referential, all clocks read the same value. It would then be sufficient for p to send the value of its local clock reading when e_p occurs to q, and for q to subtract this from the value of its clock reading when e_q occurs. Unfortunately, the uncertainty in message transmission delays inherent in distributed systems makes exact clock synchronization impossible.

This paper presents a protocol for maintaining approximately synchronized clocks in a loosely coupled distributed system. The protocol maintains on each correctly functioning processor p a logical clock C_p which measures the passage of real time with an accuracy comparable to that of a hardware clock and which, at any instant t, displays a clock time C_p(t) that is within some known bound DMAX of the clock times displayed by all other logical clocks C_q running on other correct system processors q:

    ∀ t : |C_p(t) − C_q(t)| < DMAX

Such logical clocks allow one to measure with an a priori known accuracy the time that elapses between events which occur on distinct processors.
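To make the use of such a bound concrete, the following small sketch (not part of the original protocol; the function name and the DMAX value are illustrative assumptions) shows how an application could estimate the real-time separation of two events time-stamped on different processors whose logical clocks are synchronized to within DMAX:

    # Minimal sketch (assumed names): estimating the time elapsed between two
    # events time-stamped on different processors with DMAX-synchronized clocks.

    DMAX = 0.020  # assumed bound on |C_p(t) - C_q(t)|, in seconds

    def elapsed_between(stamp_p: float, stamp_q: float) -> tuple[float, float]:
        """Return (estimate, uncertainty) for the time from event e_p to e_q.

        stamp_p is C_p read when e_p occurred; stamp_q is C_q read when e_q
        occurred. Because the two clocks may disagree by up to DMAX, the true
        elapsed real time lies within +/- DMAX of the difference of the
        readings, up to clock drift over the measured interval.
        """
        estimate = stamp_q - stamp_p
        return estimate, DMAX

    if __name__ == "__main__":
        est, err = elapsed_between(stamp_p=12.004, stamp_q=12.250)
        print(f"e_q occurred {est:.3f} s after e_p, to within +/- {err:.3f} s")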

A logical clock is maintained by periodically computing an adjustment function for a hardware clock. We refer the reader to discussions of various methods for maintaining smooth logical clocks by amortizing adjustments in [DHSS] and [CS].

One of our main objectives was to design a protocol that is practical. Such a protocol should work in the presence of events that are reasonably likely to occur (e.g. processor crashes and joins, message losses or delays in communication or processing), yet be simple enough to be understandable, correctly implementable, and maintainable.

2 Failure Classification

Let P denote the set of processors of a distributed system and L the set of physical links between the processors in P. Processors and links undergo certain state transitions in response to service requests. For example, a link (p,q) ∈ L that joins processor p ∈ P to processor q ∈ P delivers a message to q if p so requests. Similarly, a processor p computes a given function if its user so requests. System components, such as processors and communication links, are correct if, to any incoming service request, they respond in a manner consistent with their specification. Service specifications prescribe the state transition that a component should undergo in response to a service request and the real time interval within which the transition should occur. A component failure occurs when a component does not deliver the service requested in the manner specified.

We distinguish among three general failure classes [CASD]. If the component never responds to a service request it suffers an omission failure. If the component delivers a requested service either too early or too late (i.e. outside the real time interval specified) it suffers a timing failure. We call late timing failures performance failures. If a component delivers a service different from the one requested or delivers unrequested "services" it suffers an arbitrary or Byzantine failure. Typical examples of omission failures are processor crashes, hardware clocks that stop running, link breakdowns, processors that occasionally do not relay messages they should, and links that sometimes lose messages. Examples of performance failures are occasional message delays caused by overloaded relaying processors, and hardware clocks running at a speed lower than (1+ρ)^(-1), where ρ is the maximum drift rate specified by the clock manufacturer. An example of an early timing failure is a hardware clock that runs at a speed that exceeds 1+ρ. Examples of Byzantine failures are an undetectable message corruption on a link, due to electro-magnetic noise or human sabotage, or a processor that sends two messages "the time is 10:00am" and "the time is 11:00am" when the correct time is midnight.
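As a concrete illustration of this taxonomy (not taken from the paper; the function, its parameters, and the returned labels are assumptions), a monitor that knows the specified response window for a request could classify an observed outcome as follows:

    # Hypothetical classifier for the failure classes described above.
    # spec_earliest/spec_latest bound the real-time window in which a correct
    # component must deliver the requested service.

    from typing import Optional

    def classify(delivered: bool, correct_content: bool,
                 delivery_time: Optional[float],
                 spec_earliest: float, spec_latest: float) -> str:
        if not delivered:
            return "omission failure"               # never responded
        if not correct_content:
            return "arbitrary (Byzantine) failure"  # wrong or unrequested service
        if delivery_time < spec_earliest:
            return "early timing failure"
        if delivery_time > spec_latest:
            return "performance failure"            # late timing failure
        return "correct"

    # Example: a relay forwards the right message but 2 s after its deadline.
    print(classify(True, True, delivery_time=12.0,
                   spec_earliest=9.0, spec_latest=10.0))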

The system components that suffer failures are eventually taken off the system by maintenance personnel for repair or replacement. After maintenance, such components join the system of active components by explicitly executing a join protocol.

3 Assumptions

A hardware clock consists of an oscillator, which generates a cyclic waveform at a uniform rate, and a counting register, which records the number of cycles elapsed since the start of the clock. We assume that the hardware clocks of the processors that compose the system are driven by quartz crystal controlled oscillators that are highly stable and accurate. The use of this type of clock is very common in modern computer architectures (e.g. the IBM 4300, 3080 and 3090 series). Typically, a correct quartz clock may drift from real time at a rate of at most ρ, where ρ is on the order of 10^(-6) seconds per second.

Quartz clocks are not only highly stable, but are also extremely reliable. Current experience indicates that the average quartz clock used in medium to high-end digital computers has a mean time between failures (MTBF) measured in years, and that good clocks, like those used in military applications, can have MTBFs expressed in hundreds of years [MIL]. Many of the clock failures likely to occur in practice can be detected by the error detecting circuitry incorporated in clock chips. For example, if the counting register that composes a clock is self-checking, the occurrence of a physical failure within it will generate (with high probability) a clock-error exception. If a detectable physical failure affects a hardware clock, any attempt at reading its value terminates with a clock-error exception [IBM370]. Given the very significant MTBFs observed for current quartz based clock chips, and the extensive error detecting circuitry built into such chips, in this paper we will assume that the likelihood of undetectable clock failure occurrences is negligible compared to other sources of system failures (for a precise interpretation of what "negligible" means, we refer to [Cr]).

Let HC(t) denote the value displayed by a hardware clock HC at some real time t. (As in [LM,DHSS], we write the variables and constants that range over real time in lower case, and the variables and constants ranging over clock time in upper case.) We can formulate our assumption concerning the high reliability of a hardware clock by saying that a hardware clock is within a linear envelope of real time (which runs by definition with speed 1):

(A1) After it is powered on, the hardware clock HC of a processor measures the passage of time between any two successive real time instants t_1, t_2 correctly:

    (1+ρ)^(-1)(t_2 − t_1) − G < HC(t_2) − HC(t_1) < (1+ρ)(t_2 − t_1) + G.

A clock that fails signals a clock-error exception whenever an attempt at reading its value is made. G is a constant depending on the granularity of the hardware clock. For simplicity, in what follows we will assume G = 0.

Given that by hypothesis (A1) correct clocks drift from real time by at most ρ, one can infer that the amount by which two correct clocks can drift apart from each other during t real time units is at most (1+ρ)t − (1+ρ)^(-1)t = ((2ρ+ρ²)/(1+ρ))t. We denote by dr ≡ (2ρ+ρ²)/(1+ρ) (the relative clock drift rate) the factor which, when multiplied by a real time interval length, gives the net amount by which hardware clocks could drift apart in the worst case during that time interval.

The next assumption is not necessary for the correct functioning of our algorithm, but it simplifies computing performance estimates.

(A2) Let N be the maximum number of processors participating in the protocol. Then the rate of drift ρ is sufficiently small that 3N(2ρ + ρ²) < 1.

The next assumption concerns the normal speed at which messages can be sent over correct links between two processes running on adjacent correct processors:

(A3) A message sent from a correct processor p to a correct processor q over a correct link (p,q) arrives at q and is processed at q in less than ldel (link delay) real time units.

If a message sent from p to a neighbor q needs more than ldel real time units to arrive at q, or never arrives, then at least one of the processors p, q or the link (p,q) has experienced a failure.
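The following short sketch (illustrative only; the parameter values are assumptions, not figures from the paper) evaluates the relative drift rate dr and checks assumption (A2) for a given drift bound ρ and system size N:

    # Worst-case relative drift between two correct clocks, and the (A2) check.
    # rho and N below are illustrative values, not figures from the paper.

    rho = 1e-6          # maximum hardware clock drift rate (seconds per second)
    N = 16              # maximum number of processors in the protocol

    dr = (2 * rho + rho**2) / (1 + rho)   # relative clock drift rate
    assert 3 * N * (2 * rho + rho**2) < 1, "assumption (A2) violated"

    # Two correct clocks can drift apart by at most dr * t during t real
    # seconds, e.g. over a 60 s resynchronization period:
    print(f"dr = {dr:.3e}; worst-case divergence over 60 s = {dr * 60:.2e} s")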

The fourth assumption states that, during clock synchronization, any two correct active processors in P are linked by at least one chain of correct links and correct intermediate processors. That is,

(A4) No partition of the system of correct processors and links occurs during clock synchronization.

If sufficiently many redundant physical communication paths exist between any two processors in a network, the likelihood of this hypothesis being violated can be made negligible. Assumptions (A3,A4) allow us to conclude that, if a synchronization message is sent by a correct processor p to a correct processor q, then there is a chain of correct links and intermediate processors over which the message can require no more than ndel = (N−1)·ldel (network delay) real time units between p and q, where N is the maximum number of processors in the system. (For a better, but more complex, upper bound on the network delay which would guarantee a closer synchronization of clocks, the interested reader is referred to [CASD].)

Our protocol is designed to tolerate omission and performance failures that do not partition the network of correct processors. We acknowledge the possibility of other types of failures (e.g. very fast clocks, or sabotaged processors). Such failures can in principle occur, and there are several more complex protocols that have been designed to handle them [LM,LL,DHSS]. We have chosen improved simplicity and performance at the expense of a chance of loss of synchronization in the presence of these rare failure types. Thus, rather than aiming at generality and power, the goal was to favor simplicity and practicality and to aim for those applications where the likelihood of early timing or Byzantine clock failures causing major damage is negligible. Our intention is to develop a protocol that can handle the overwhelming majority of failures that are likely to occur in practice, yet is simple to understand, prove correct, implement, and maintain.
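As a worked illustration of the network-delay bound (the numbers are assumed, continuing the earlier sketch), ndel follows directly from the link delay and the maximum number of processors:

    # Upper bound on the time for a synchronization message to reach any
    # correct processor over a chain of correct links (illustrative values).

    ldel = 0.005            # link delay bound: transmission + processing (s)
    N = 16                  # maximum number of processors in the system

    ndel = (N - 1) * ldel   # network delay bound from assumptions (A3,A4)
    print(f"ndel = {ndel:.3f} s")   # 0.075 s for these values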

4 Objectives

The goal of the clock synchronization protocol is to ensure the following three properties:

(C1) For any two correct joined processors p, q ∈ P, the clocks C_p, C_q indicating the current logical time should be synchronized within some a priori known bound DMAX (for maximum deviation):

    ∃ DMAX : ∀ t : |C_p(t) − C_q(t)| < DMAX

(C2) The clocks of joined correct processors should read times within a linear envelope of real time. That is, there should exist constants such that for any clock C_p and any real time t:

    X + (1+ρ)^(-1)·t < C_p(t) < X' + (1+ρ)·t,

where X, X' are constants which depend on the initial conditions of the clock synchronization algorithm execution.

(C3) A correct processor that joins a set of synchronized processors should have its clock synchronized with those of the other correct processors within some a priori known real time delay jdel (join delay). Also, in the absence of processor joins, it is required that each periodic clock synchronization terminate within some known real time delay sdel (synchronization delay).

A protocol that achieves (C1,C2) is said to achieve linear envelope clock synchronization.

5 Informal Algorithm Presentation

Our algorithm is based on information diffusion [CASD], [DHSS]. It is simpler than [DHSS] because we limit the class of failures to be tolerated to omission and performance failures. It also uses a simpler method for handling processor joins.

The protocol to be presented is based on the following consequence of assumptions (A3,A4): if at real time t a correct processor p ∈ P diffuses a message containing its clock time T to all other processors, and each correct processor q ∈ P sets its clock to T upon receipt of a message from p, then the clocks of all correct processors will indicate times within ndel(1+ρ) of each other by real time t + ndel.

By message diffusion we mean the following process: processor p sends a new synch message on all outgoing links, and any processor q that receives a new synch message on some link relays it on all other outgoing links. We call such a synchronization message diffusion a synch wave.
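To illustrate the diffusion (flooding) pattern just described, here is a small stand-alone sketch over an explicit neighbor graph; it is illustrative only and deliberately ignores clocks, sequence numbers, and failures (the topology and names are assumptions):

    # Minimal synch-wave flooding sketch over a static topology.

    from collections import deque

    neighbors = {                      # assumed topology: who is linked to whom
        "p": ["q", "r"],
        "q": ["p", "s"],
        "r": ["p", "s"],
        "s": ["q", "r"],
    }

    def diffuse(initiator: str, message: str) -> set:
        """Deliver `message` to every processor reachable from `initiator`:
        each processor that receives it for the first time relays it on all
        links other than the one it arrived on."""
        reached = {initiator}
        queue = deque([(initiator, None)])     # (processor, link it arrived on)
        while queue:
            proc, came_from = queue.popleft()
            for nbr in neighbors[proc]:
                if nbr != came_from and nbr not in reached:
                    reached.add(nbr)           # nbr processes the message...
                    queue.append((nbr, proc))  # ...and relays it onward
        return reached

    print(diffuse("p", "synch"))   # every connected correct processor is reached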

An informal (not quite accurate) picture of how our protocol works is provided by assuming that one clock is sufficiently faster than all others, so that its time messages diffuse in a synch wave causing each other correct clock to set its time ahead to match the time of the wave. In this section we make this assumption. In the following two sections we give a more formal and more accurate description and proof of correctness of our protocol.

Although immediately after a synch wave propagation the clocks of all correct processors are within NDEL = ndel·(1+ρ) of each other, as time passes the clocks will naturally tend to drift apart. For instance, t real time units after the end of a synch wave propagation, the correct clocks might be as far apart as ndel(1+ρ) + dr·t. If the intention is to keep the processor clocks close at all times, one has to periodically re-synchronize the clocks. If PER is the clock synchronization period length (in clock time units), then in the interval between two successive synchronization waves numbered s, s+1, the clocks might drift as far apart as D = ndel(1+ρ) + dr·(PER·(1+ρ) + ndel). It is the role of the (s+1)th synchronization wave to bring the clocks back within ndel(1+ρ).

In the absence of processor crashes or joins, one could use a predefined synchronizer processor to generate synch waves. If processor crashes are likely, and they certainly are, the existence of a unique synchronizer becomes a single point of failure. As observed in [DHSS], it is better to distribute the role of synchronizer among all processors. The idea is that any processor should be able to initiate a synch wave if it discovers that PER clock time units have elapsed since the last synchronization occurred. If (as we assume for this section) one clock is sufficiently fast, then its synch wave will happen before any others and make the others unnecessary.

Synch waves also have to be generated when new processors join a cluster of already synchronized processors, in order to synchronize the clocks of the new processors with the clocks of the old processors. In such a case a joiner p sends a special "new" message to all its neighbors, forcing them to initiate synch waves. The neighbors of these neighbors either propagate these waves, if their clocks are slower than the clocks of the wave initiators, or generate new synch waves, if their clocks are faster. After at most ndel real time units, a 'winning' synch wave is generated in this way by some processor with a fastest clock. When this propagates to all the other processors, including the ones that are joining, they will all synchronize their clocks within ndel(1+ρ). Thus, within at most 2·ndel real time units from the moment a join demand is made by a processor p ∈ P, a winning synch wave is reflected back to p. At that moment, p is joined. That is, its clock is at most ndel(1+ρ) apart from the clocks of previously joined correct processors.
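The following worked computation (with the same assumed parameter values as the earlier sketches) evaluates the worst-case divergence D between two successive synch waves:

    # Worst-case clock divergence D between two successive synch waves,
    # using the formula above with illustrative parameter values.

    rho = 1e-6
    ldel = 0.005
    N = 16
    PER = 60.0                              # resynchronization period (clock s)

    ndel = (N - 1) * ldel
    dr = (2 * rho + rho**2) / (1 + rho)

    D = ndel * (1 + rho) + dr * (PER * (1 + rho) + ndel)
    print(f"just after a wave:              {ndel * (1 + rho):.6f} s")
    print(f"worst case before the next one: D = {D:.6f} s")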

In the next section we give the protocol, and in the following section we discuss how this informal discussion must be modified to provide for the case when there is no winning synch wave.

6 Detailed Algorithm Description

A detailed description of the clock synchronization protocol is given in Figure 1. This description is made in terms of two abstract data types: Logical-Clock and Timer. Instances C and TP of these data types can be declared as shown in line 2 of Figure 1. Users of an instance C of the Logical-Clock data type can perform the following operations on it. An invocation of a C.initialize operation initializes the time displayed by C to 0. The operation C.adjust(L,T:Time) adjusts the local time L currently displayed by C so that after PER time units C will show the same time as a logical clock which currently shows time T (assuming that the clocks run at roughly the same speed). Such an adjustment can be implemented either by bumping the local clock to T, or by slightly increasing the speed of the local clock so as to catch up with the remote clock [C], [CS]. The operation C.read reads the current value displayed by C. The operation C.duration(T:Time), used to measure time intervals, reads the number of time units elapsed between a previous time T and the present time.

The Timer data type has a unique operation "set(T:Time)". If TP is a Timer instance, the meaning of invoking the operation TP.set(T) is "ignore all previous TP.set calls and signal a Timeout condition T clock time units from now." Thus, if after invoking TP.set(100) at time 200, a new invocation is made at time 250, there is no Timeout condition at time 300, but there might be one at time 350. If no other invocation of TP.set is made between 250 and 349, then a Timeout condition occurs at time 350. For convenience of presentation, we use two independent timers TP and TJ (although one is in principle sufficient). The former is used to measure the maximum time interval which can elapse between periodic resynchronizations. The latter is used to time the join process.

The protocol uses the following communication primitives: receive(m,l), which receives a message m on some link and returns the identity l of that link; forward(m,l), which sends a message m on all outgoing links except l; and send-all(m), which sends m on all outgoing links. We do not assume that the forward and send operations are atomic with respect to failure occurrences, i.e. a processor can fail after sending a given message on certain links and before sending it on the remaining links.
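To make the two abstract data types concrete, here is a minimal single-process sketch in Python (an illustrative reading of the interface above, not the paper's implementation); it realizes adjust by bumping the clock forward rather than by amortizing the adjustment over the next PER time units as discussed in [CS]:

    import time

    class LogicalClock:
        """Hardware clock (time.monotonic here) plus an additive correction."""
        def initialize(self) -> None:
            self._base = time.monotonic()
            self._corr = 0.0
        def read(self) -> float:
            return time.monotonic() - self._base + self._corr
        def duration(self, t: float) -> float:
            return self.read() - t       # time units elapsed since earlier reading t
        def adjust(self, l: float, t: float) -> None:
            if t > l:                    # clocks are never set back
                self._corr += t - l      # bump forward to the remote time

    class Timer:
        """set(T) cancels any earlier setting and arms a timeout T units from now."""
        def __init__(self, clock: LogicalClock) -> None:
            self._clock, self._deadline = clock, None
        def set(self, t: float) -> None:
            self._deadline = self._clock.read() + t
        def expired(self) -> bool:       # polled stand-in for the Timeout condition
            return self._deadline is not None and self._clock.read() >= self._deadline

    c = LogicalClock(); c.initialize()
    tp = Timer(c); tp.set(100.0)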

1  task Time-Manager ::=
2    var L,T: Time; C: Logical-Clock; TP,TJ: Timer;
3        s,s': Natural-Number; joined: Boolean; l: Link;
4    s := 0; C.initialize; joined := false;
5    send-all("new"); TJ.set(NDEL + TDEL);
6    cycle
7      select
8        receive("new",l) → s := s + 1; send-all(s, C.read);
9        []
10       receive(s',T,l) → L := C.read;
11         if
12           (s' < s) ∨ (s' = s & T ≤ L)  → loop;
13           []
14           (s' = s) & (T > L)           → C.adjust(L,T); forward((s,T),l);
15           []
16           (s' > s) & (T < L)           → s := s'; send-all(s,L);
17           []
18           (s' > s) & (T ≥ L)           → s := s'; C.adjust(L,T); forward((s,T),l);
19         fi;
20       []
21       Timeout TJ → joined := true;
22       []
23       Timeout TP → s := s + 1; L := C.read; C.adjust(L,L); send-all(s,L);
24     endselect;
25     TP.set(PER);
26   endcycle;

Figure 1.

At processor start, the local synch wave sequence number s and the current local clock time are initialized to 0 (line 4). Then a join phase that lasts for NDEL + TDEL time units begins with the sending of a special "new" message on all outgoing links (line 5). As before, NDEL = (1+ρ)·ndel. The constant TDEL is slightly larger than NDEL.
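The guarded alternatives of lines 11-18 can be read as a pure decision rule. The sketch below (illustrative Python with assumed names and action labels, not part of the protocol text) returns what a processor does with an incoming synch message (s',T) given its local state (s,L):

    # Decision rules of Figure 1, lines 11-18, as a side-effect-free function.

    def on_synch_message(s: int, L: float, s_prime: int, T: float):
        """Return (action, new_s) for a received synch message (s_prime, T)
        when the local state is sequence number s and local clock time L."""
        if s_prime < s or (s_prime == s and T <= L):
            return ("ignore", s)                    # line 12: old or not newer wave
        if s_prime == s and T > L:
            return ("adjust_and_forward", s)        # line 14: catch up, relay wave
        if s_prime > s and T < L:
            return ("initiate_own_wave", s_prime)   # line 16: our clock is faster
        return ("adjust_and_forward", s_prime)      # line 18: new wave, catch up

    # Example: a processor with s = 3, L = 101.2 receives wave (4, 100.0);
    # its clock is ahead, so it adopts the sequence number and starts its own wave.
    print(on_synch_message(3, 101.2, 4, 100.0))     # ('initiate_own_wave', 4)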

A real time duration tdel is defined in the next section, and TDEL = (1+ρ)·tdel. Under assumption A2 we may choose tdel = 2·ndel. This gives the particularly simple join delay jdel = 3·ndel. During the join phase, the "joined" Boolean variable is false, and nothing can be said about how close the local clock is to other clocks of the cluster being joined. At the end of the join phase (line 21), "joined" becomes true and measurements of delays elapsed between distributed event occurrences can begin.

The Time-Manager can be awakened by three kinds of events: a "new" message that arrives from a neighbor that joins (line 8), a message belonging to a synch wave numbered s' that announces that the time is T (line 10), and a Timeout condition generated by the timers TJ or TP (lines 21, 23). The reception of a "new" message results in an attempt to generate a 'winning' wave with a new sequence number s' = s+1 and local time L = C.read. The Boolean tests executed by a processor when a message (s',T) belonging to such a wave is received ensure that either the processor forwards the wave (s',T) to all its neighbors (if T ≥ L holds, see lines 14, 18), or that it will itself attempt to initiate a wave (if T < L is true, see line 16). In this way, a "new" message issued by a non-isolated processor causes the new sequence number s' to diffuse to all correct processors within real time ndel.

The Timeout condition can become true either at the end of the join phase (line 21) or if a period of more than PER time units (as measured on the local clock) has elapsed since the last join or periodic synchronization without receiving any "new" or significant "(s',T)" message (line 23). In the joined state, this event triggers the generation of a synch wave with a new sequence number s' = s+1 and local logical time L = C.read. If the new wave is winning (i.e. in all the processors reached by it the condition T ≥ L is true), by the end of its propagation any two correct processors will have their clocks within ndel(1+ρ) time units of each other. In general we will show that, in spite of concurrent synch waves none of which diffuses throughout the network, the clocks of correct processors will be synchronized to within 3·ndel(1+ρ) time units. Then they will drift apart for at most (1+ρ)·PER real time units before they are resynchronized by the protocol.
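Continuing the earlier numeric sketches (assumed values, not figures from the paper), the clock-time constants used by the join phase follow directly from ndel and ρ:

    # Join-phase constants for the parameter values used in the earlier sketches.

    rho = 1e-6
    ldel = 0.005
    N = 16

    ndel = (N - 1) * ldel           # real-time network delay bound
    tdel = 2 * ndel                 # choice permitted by assumption (A2)
    NDEL = (1 + rho) * ndel         # clock-time equivalents used by the protocol
    TDEL = (1 + rho) * tdel
    jdel = 3 * ndel                 # a joiner is synchronized within this real time

    print(f"TJ.set(NDEL + TDEL) arms a timeout of {NDEL + TDEL:.4f} clock-time units")
    print(f"join delay jdel = {jdel:.4f} s")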

7 Algorithm Analysis

To prove the correctness of our protocol we use a technique similar to that of [DHSS]. Since the objective of our synchronization protocol is to provide tolerance of only omission and performance failures, our simpler protocol provides stronger properties than those of [DHSS] (in the sense that the accuracy of the logical clocks is not worse - as is the case in [DHSS] - than that of a correct hardware clock, and that the maximum adjustment by which a logical clock can be set forward is a constant smaller than that of [DHSS]).

Theorem 1. The algorithm achieves linear envelope synchronization among all joined clocks.

We say that an execution of a communication protocol is a diffusion for proposition X if, when a processor first knows (because of some change of state such as the receipt of a message or a new reading of a clock) the information contained in X, it forwards that information to its neighbors. It is easy to see that the maximum real time required for information to diffuse throughout a maximal connected component of a network is bounded above by the product of the maximum time required for message transmission and processing between neighbors and the diameter of the component. Our upper bound on this real time is ndel. Let M(t) be the maximum time on any clock C_p at real time t with p joined and correct. The theorem will be proved using the following lemmas.

Lemma 1. There is a constant ndel such that if t is the first real time any processor has s = i, then some processor r initiates a diffusion by executing send-all(i,T) at t (where T is C_r(t)), and all correct processors have s ≥ i by real time t + ndel.

Proof: The lemma follows from the observation that our protocol diffuses the information that s ≥ i. □

Lemma 2. There is a constant tdel such that if any processor executes send-all(i,T) at real time t, then all correct processors have s ≥ i and C ≥ T by real time t + tdel.

Proof: If each execution of send-all(i,T) constituted a diffusion of the proposition s ≥ i and C ≥ T, then we could set tdel = ndel and be done. Unfortunately, if processor p already has s ≥ i when it receives (i,T), and if C_p ≥ T at the time of receipt, then

p ignores and does not forward the information. Thus, while the information s ≥ i diffuses in time ndel, the information C ≥ T may require a longer time to reach all processors. However, a correct processor p fails to forward the message (i,T) only if it has already sent or forwarded a message (i,T') with T' < T.

We now prove (by induction on d) that if correct processor p executes send-all(i,T) at real time t, and if correct processor q is separated from p by a chain of correct processors and links with no more than d links, then q has C ≥ T by real time t + (d/(1 − dβ))(ldel + β·ndel), where β abbreviates 2ρ + ρ². The case d = 1 is trivial because q receives and processes the message from p by real time t + ldel. Now assume that we have proved the result for correct r at distance d from p and consider a neighbor q of r at distance d+1 from p. If r sends message (i,T) to q then we are done, so assume that r sent a message (i,T') to q and does not adjust its clock again until after it has C = T. It will be helpful to distinguish the following real times:

e_1 is the time at which r sends (i,T').
e_2 is the time at which q processes (i,T') from r.
e_3 is the time at which p sends (i,T).
e_4 is the time at which r has C ≥ T.
e_5 is the time at which q has C ≥ T.

By induction hypothesis we have

    u = e_4 − e_3 ≤ (d/(1 − dβ))(ldel + β·ndel).

By Lemma 1,

    v = e_3 − e_1 ≤ ndel.

The duration e_5 − e_2 represents the same clock time T − T' on q that the duration e_4 − e_1 represents on r, so one can be no larger than (1+ρ)² = 1+β times the other:

    w = e_5 − e_2 ≤ (1 + β)(u + v).

Also,

    x = e_5 − e_1 ≤ w + ldel.

Thus

    y = e_5 − e_3 = x − v ≤ (1 + β)u + ldel + β·v.

Straightforward algebraic substitution gives

    y ≤ ((d+1)/(1 − dβ))(ldel + β·ndel) ≤ ((d+1)/(1 − (d+1)β))(ldel + β·ndel).

This completes the inductive proof. By assumption A2 we can then take tdel = 2·ndel. □

Lemma 3. There is a constant Δ and a sequence of real times {t_i} with 0 < t_{i+1} − t_i < (1+ρ)·PER, such that if [t_i, t_i + Δ] is a subinterval of the time interval in which processors p and q are both joined and correct, then C_q(t_i + Δ) ≥ C_p(t_i).

Proof: Let Δ = tdel + ndel. Let t_i be the first real time at which some correct joined processor sets s = i. At t_i some processor initiates a diffusion propagating the information s ≥ i and C ≥ T, where (i,T) is the contents of the initial message. Within ndel this information will have reached every other correct joined processor, including p. Let t be the real time at which p first sets s ≥ i. Then t_i ≤ t ≤ t_i + ndel. Also, C_p(t) ≥ C_p(t_i). If T ≥ C_p(t) then C_q(t_i + ndel) ≥ T ≥ C_p(t_i). If T < C_p(t) then C_q(t + tdel) ≥ C_p(t_i). In either case C_q(t_i + ndel + tdel) ≥ C_p(t_i). □

Lemma 4. If [u,v] is a subinterval of the interval in which processor p is joined and correct, then

    (1+ρ)^(-1)(v − u) ≤ C_p(v) − C_p(u) ≤ (1+ρ)(v − u) + M(u) − C_p(u).

Proof: Since clocks are never set back, we need only show that C_p(v) ≤ (1+ρ)(v − u) + M(u). Because we are only considering omission and performance failures, and because a message from a processor that has not violated the corresponding relationship cannot cause the recipient to violate it, the relationship C_p(v) ≤ (1+ρ)(v − u) + M(u) holds for each correct processor p. □

Proof of Theorem 1: Assume that processor p is joined and correct during the interval [t_i, t_{i+1} + Δ]. By Lemma 3, C_p(t_i + Δ) ≥ M(t_i). By Lemma 4, M(t_i + Δ) ≤ M(t_i) + (1+ρ)Δ. Thus M(t_i + Δ) − C_p(t_i + Δ) ≤ (1+ρ)Δ. Consider t in [t_i + Δ, t_{i+1} + Δ]:

    M(t) − C_p(t) ≤ (1+ρ)Δ + (1+ρ)(t − (t_i + Δ)) − (1+ρ)^(-1)(t − (t_i + Δ)).

Let DMAX = (1+ρ)Δ + (2ρ + ρ²)·PER. Then M(t) − C_p(t) ≤ DMAX. Thus

    (1+ρ)^(-1)(v − u) ≤ C_p(v) − C_p(u) ≤ (1+ρ)(v − u) + DMAX

when [u,v] is a subinterval of the interval in which p is joined and correct. Moreover, the maximum difference between the readings of correct joined clocks is bounded by DMAX. □
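Putting the pieces together numerically (with the same assumed parameter values as the earlier sketches), the maximum deviation DMAX guaranteed by Theorem 1 can be evaluated as follows:

    # DMAX = (1 + rho) * Delta + (2*rho + rho**2) * PER, with Delta = tdel + ndel
    # and tdel = 2 * ndel (illustrative parameter values).

    rho = 1e-6
    ldel = 0.005
    N = 16
    PER = 60.0

    ndel = (N - 1) * ldel
    tdel = 2 * ndel
    delta = tdel + ndel                 # the constant of Lemma 3 (= 3 * ndel)

    DMAX = (1 + rho) * delta + (2 * rho + rho**2) * PER
    print(f"Delta = {delta:.3f} s, DMAX = {DMAX:.6f} s")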

8 Conclusion

This paper has presented a new, simple solution to the problem of synchronizing the clocks of a distributed system in the presence of likely failures such as omission and performance failures, and in the presence of processor joins. Because of the simpler failure model considered, the protocol presented is considerably simpler than those presented in [LM,DHSS,LL], especially in the handling of processor joins. The engineering approach adopted is similar in spirit to the one adopted in [KO], where protocol simplicity is achieved by limiting the total number of failures that can be tolerated during a synchronization.

Synchronized clocks are useful for a number of reasons. They can be used to totally order the events of a distributed system [L] (e.g. merging the data base logs generated on distinct computers into a common unique log), and they can be used to measure the time that elapses between events that occur on different processors (e.g. to do performance evaluation of distributed systems). Another important application is in ensuring consistency among the knowledge states of the computers of a distributed system. An earlier paper [CASD] has presented protocols for ensuring the consistency of replicated data that depend on synchronized clocks.

References

[C] F. Cristian: Probabilistic Clock Synchronization, Distributed Computing, Vol. 3.

[Cr] F. Cristian: Understanding Fault-Tolerant Distributed Systems, Communications of the ACM, Vol. 34, No. 2, Feb. 1991, and erratum in CACM Vol. 34, No. 4, April 1991.

[CS] F. Cristian and F. Schmuck: Continuous Clock Amortization Need Not Affect the Precision of a Clock Synchronization Algorithm, IBM Research Report RJ 7290.

[CASD] F. Cristian, H. Aghili, R. Strong, D. Dolev: Fault-Tolerant Atomic Broadcast Protocols, Proc. 15th Int. Conf. on Fault-Tolerant Computing, Ann Arbor, Michigan, June 1985.

[DHSS] D. Dolev, J. Halpern, B. Simons, R. Strong: Fault-Tolerant Clock Synchronization, IBM Research Report RJ 4094.

[IBM370] IBM System/370: Principles of Operation, IBM publication GA22-7000.

[KO] H. Kopetz, W. Ochsenreiter: Internal Clock Synchronization with a VLSI Synchronization Unit, TR 1985/7, Technical Univ. Vienna, 1985.

[L] L. Lamport: Time, Clocks, and the Ordering of Events in a Distributed System, Comm. of the ACM, Vol. 21, No. 7, July 1978, pp. 558-565.

[LM] L. Lamport, M. Melliar-Smith: Synchronizing Clocks in the Presence of Faults, Journal of the ACM, Vol. 32, No. 1, January 1985, pp. 52-78.

[LL] J. Lundelius, N. Lynch: A New Fault-Tolerant Algorithm for Clock Synchronization, Proc. of the 3rd ACM Symposium on Principles of Distributed Computing, 1984.

[MIL] MIL Handbook 217D, Notice 1, 13 June 1983.


C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Consensus. One Reason: Impossibility of Consensus. Let s Consider This Recap: Finger Table Finding a using fingers Distributed Systems onsensus Steve Ko omputer Sciences and Engineering University at Buffalo N102 86 + 2 4 N86 20 + 2 6 N20 2 Let s onsider This

More information

CS505: Distributed Systems

CS505: Distributed Systems Cristina Nita-Rotaru CS505: Distributed Systems Ordering events. Lamport and vector clocks. Global states. Detecting failures. Required reading for this topic } Leslie Lamport,"Time, Clocks, and the Ordering

More information

Time is an important issue in DS

Time is an important issue in DS Chapter 0: Time and Global States Introduction Clocks,events and process states Synchronizing physical clocks Logical time and logical clocks Global states Distributed debugging Summary Time is an important

More information

Gradient Clock Synchronization in Dynamic Networks Fabian Kuhn, Thomas Locher, and Rotem Oshman

Gradient Clock Synchronization in Dynamic Networks Fabian Kuhn, Thomas Locher, and Rotem Oshman Computer Science and Artificial Intelligence Laboratory Technical Report MIT-CSAIL-TR-2009-022 May 29, 2009 Gradient Clock Synchronization in Dynamic Networks Fabian Kuhn, Thomas Locher, and Rotem Oshman

More information

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit

Finally the Weakest Failure Detector for Non-Blocking Atomic Commit Finally the Weakest Failure Detector for Non-Blocking Atomic Commit Rachid Guerraoui Petr Kouznetsov Distributed Programming Laboratory EPFL Abstract Recent papers [7, 9] define the weakest failure detector

More information

A posteriori agreement. for clock synchronization. on broadcast networks. INESC Technical Report RT/ L. Rodrigues, P. Verssimo.

A posteriori agreement. for clock synchronization. on broadcast networks. INESC Technical Report RT/ L. Rodrigues, P. Verssimo. A posteriori agreement for clock synchronization on broadcast networks INESC Technical Report RT/62-92 L. Rodrigues, P. Verssimo January 1992 LIMITED DISTRIBUTION NOTICE A shorter version of this report

More information

Unreliable Failure Detectors for Reliable Distributed Systems

Unreliable Failure Detectors for Reliable Distributed Systems Unreliable Failure Detectors for Reliable Distributed Systems A different approach Augment the asynchronous model with an unreliable failure detector for crash failures Define failure detectors in terms

More information

Bridging the Gap: Byzantine Faults and Self-stabilization

Bridging the Gap: Byzantine Faults and Self-stabilization Bridging the Gap: Byzantine Faults and Self-stabilization Thesis submitted for the degree of DOCTOR of PHILOSOPHY by Ezra N. Hoch Submitted to the Senate of The Hebrew University of Jerusalem September

More information

Resolving Message Complexity of Byzantine. Agreement and Beyond. 1 Introduction

Resolving Message Complexity of Byzantine. Agreement and Beyond. 1 Introduction Resolving Message Complexity of Byzantine Agreement and Beyond Zvi Galil Alain Mayer y Moti Yung z (extended summary) Abstract Byzantine Agreement among processors is a basic primitive in distributed computing.

More information

Impossibility of Distributed Consensus with One Faulty Process

Impossibility of Distributed Consensus with One Faulty Process Impossibility of Distributed Consensus with One Faulty Process Journal of the ACM 32(2):374-382, April 1985. MJ Fischer, NA Lynch, MS Peterson. Won the 2002 Dijkstra Award (for influential paper in distributed

More information

Methods for the specification and verification of business processes MPB (6 cfu, 295AA)

Methods for the specification and verification of business processes MPB (6 cfu, 295AA) Methods for the specification and verification of business processes MPB (6 cfu, 295AA) Roberto Bruni http://www.di.unipi.it/~bruni 17 - Diagnosis for WF nets 1 Object We study suitable diagnosis techniques

More information

Dynamic Fault-Tolerant Clock Synchronization

Dynamic Fault-Tolerant Clock Synchronization Dynamic Fault-Tolerant Clock Synchronization DANNY DOLEV, JOSEPH Y. HALPERN, BARBARA SIMONS, AND RAY STRONG IBM Almaden Research Cente~ San Jose, California Abstract. This paper gives two simple efficient

More information

Chapter 7 HYPOTHESIS-BASED INVESTIGATION OF DIGITAL TIMESTAMPS. 1. Introduction. Svein Willassen

Chapter 7 HYPOTHESIS-BASED INVESTIGATION OF DIGITAL TIMESTAMPS. 1. Introduction. Svein Willassen Chapter 7 HYPOTHESIS-BASED INVESTIGATION OF DIGITAL TIMESTAMPS Svein Willassen Abstract Timestamps stored on digital media play an important role in digital investigations. However, the evidentiary value

More information

Easy Consensus Algorithms for the Crash-Recovery Model

Easy Consensus Algorithms for the Crash-Recovery Model Reihe Informatik. TR-2008-002 Easy Consensus Algorithms for the Crash-Recovery Model Felix C. Freiling, Christian Lambertz, and Mila Majster-Cederbaum Department of Computer Science, University of Mannheim,

More information

with the ability to perform a restricted set of operations on quantum registers. These operations consist of state preparation, some unitary operation

with the ability to perform a restricted set of operations on quantum registers. These operations consist of state preparation, some unitary operation Conventions for Quantum Pseudocode LANL report LAUR-96-2724 E. Knill knill@lanl.gov, Mail Stop B265 Los Alamos National Laboratory Los Alamos, NM 87545 June 1996 Abstract A few conventions for thinking

More information

On-line Bin-Stretching. Yossi Azar y Oded Regev z. Abstract. We are given a sequence of items that can be packed into m unit size bins.

On-line Bin-Stretching. Yossi Azar y Oded Regev z. Abstract. We are given a sequence of items that can be packed into m unit size bins. On-line Bin-Stretching Yossi Azar y Oded Regev z Abstract We are given a sequence of items that can be packed into m unit size bins. In the classical bin packing problem we x the size of the bins and try

More information

RANDOM DISTRIBUTED ALGORITHMS FOR CLOCK SYNCHRONIZATION

RANDOM DISTRIBUTED ALGORITHMS FOR CLOCK SYNCHRONIZATION Engineering Journal of Qatar University, Vol. 5, 1992, p. 145-162 RANDOM DISTRIBUTED ALGORITHMS FOR CLOCK SYNCHRONIZATION Shoichiro Nakai*, Nasser Marafih** Shingo Fukui* and Satoshi Hasegawa* *C&C Systems

More information

Byzantine agreement with homonyms

Byzantine agreement with homonyms Distrib. Comput. (013) 6:31 340 DOI 10.1007/s00446-013-0190-3 Byzantine agreement with homonyms Carole Delporte-Gallet Hugues Fauconnier Rachid Guerraoui Anne-Marie Kermarrec Eric Ruppert Hung Tran-The

More information

On Stabilizing Departures in Overlay Networks

On Stabilizing Departures in Overlay Networks On Stabilizing Departures in Overlay Networks Dianne Foreback 1, Andreas Koutsopoulos 2, Mikhail Nesterenko 1, Christian Scheideler 2, and Thim Strothmann 2 1 Kent State University 2 University of Paderborn

More information

Dynamic Group Communication

Dynamic Group Communication Dynamic Group Communication André Schiper Ecole Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland e-mail: andre.schiper@epfl.ch Abstract Group communication is the basic infrastructure

More information

Impossibility Results for Universal Composability in Public-Key Models and with Fixed Inputs

Impossibility Results for Universal Composability in Public-Key Models and with Fixed Inputs Impossibility Results for Universal Composability in Public-Key Models and with Fixed Inputs Dafna Kidron Yehuda Lindell June 6, 2010 Abstract Universal composability and concurrent general composition

More information

1 Introduction A one-dimensional burst error of length t is a set of errors that are conned to t consecutive locations [14]. In this paper, we general

1 Introduction A one-dimensional burst error of length t is a set of errors that are conned to t consecutive locations [14]. In this paper, we general Interleaving Schemes for Multidimensional Cluster Errors Mario Blaum IBM Research Division 650 Harry Road San Jose, CA 9510, USA blaum@almaden.ibm.com Jehoshua Bruck California Institute of Technology

More information

Model Checking of Fault-Tolerant Distributed Algorithms

Model Checking of Fault-Tolerant Distributed Algorithms Model Checking of Fault-Tolerant Distributed Algorithms Part I: Fault-Tolerant Distributed Algorithms Annu Gmeiner Igor Konnov Ulrich Schmid Helmut Veith Josef Widder LOVE 2016 @ TU Wien Josef Widder (TU

More information