On the Quality of Service of Failure Detectors. Sam Toueg, Wei Chen, Marcos K. Aguilera (part of Wei Chen's PhD thesis)
1 On the Quality of Service of Failure Detectors. Sam Toueg, Wei Chen, Marcos K. Aguilera (part of Wei Chen's PhD thesis)
2 Abstract We study the quality of service (QoS) of failure detectors. By QoS, we mean a specification that quantifies (a) how fast the failure detector detects actual failures, and (b) how well it avoids false detections. We first propose a set of QoS metrics to specify failure detectors for systems with probabilistic behaviors, i.e., for systems where message delays and message losses follow some probability distributions. We then give a new failure detector algorithm and analyze its QoS in terms of the proposed metrics. We show that, among a large class of failure detectors, the new algorithm is optimal with respect to some of these QoS metrics. Given a set of failure detector QoS requirements, we show how to compute the parameters of our algorithm so that it satisfies these requirements, and we show how this can be done even if the probabilistic behavior of the system is not known. We then present some simulation results that show that the new failure detector algorithm provides a better QoS than an algorithm that is commonly used in practice. Finally, we briefly explain how to make our failure detector adaptive, so that it automatically reconfigures itself when there is a change in the probabilistic behavior of the network. Full paper available at:
3 What is the QoS of Failure Detectors? A specification that quantifies: - speed: how fast an FD detects a crash - accuracy: how well an FD avoids erroneous detections
4 Our Results: Metrics for the QoS specification; Design and analysis of a new failure detector algorithm - An optimality result - The analysis of the QoS metrics - Satisfying the QoS requirements given by applications; Dealing with unknown system behavior and unsynchronized clocks; Simulation results
5 About Failure Detectors [Figure: timeline of the FD output (trust / suspect) against the state of process p (up, then down).]
6 Part I: On the QoS Specification of Failure Detectors
7 Our Results on QoS Metrics: Identify 7 QoS metrics; Quantify the relations between these metrics; Propose 3 metrics as the primary metrics for the QoS specification
8 Metric 1: Detection Time $T_D$. Measures how fast a failure detector detects a process crash. [Figure: process p goes from up to down; the FD output goes from trust to suspect.]
9 What should be the accuracy metrics? Accuracy: how well an FD avoids erroneous suspicions. For accuracy metrics, consider runs in which process p does not crash. Determining a good set of accuracy metrics is not that simple.
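10 The First Accuracy Metric: Query Accuracy Probability $P_A$. $P_A$ is the probability that the failure detector output is correct at a random time. [Figure: an application queries the FD at a random time while process p is up.]
11 Query Accuracy Probability is not Sufficient [Figure: outputs of FD1 and FD2 while process p is up.]
12 Another Accuracy Metric: Average Mistake Rate $\lambda_M$. $\lambda_M$ measures the rate at which a failure detector makes mistakes. $\lambda_M$ itself is not sufficient either. [Figure: outputs of FD1 and FD2 while process p is up.]
13 $P_A$ and $\lambda_M$ together are not sufficient: FD1 is better than FD2 in both $P_A$ and $\lambda_M$, but FD2 is faster in correcting each mistake. [Figure: outputs of FD1 and FD2 while process p is up.]
14 More Accuracy Metrics: Mistake Duration $T_M$; Good Period Duration $T_G$; Mistake Recurrence Time $T_{MR}$ [Figure: FD output timeline marking $T_M$, $T_G$ and $T_{MR}$.]
15 Forward Good Period Duration $T_{FG}$: More relevant to short-lived applications. [Figure: an application sends a monitor request to the FD and receives an interrupt when the good period ends; $T_{FG}$ is measured from the request while process p is up.]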
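16 Relations among Accuracy Metrics: $T_{MR} = T_M + T_G$; $\lambda_M = 1 / E(T_{MR})$; $P_A = E(T_G) / E(T_{MR})$; $\Pr(T_{FG} \le x) = \frac{1}{E(T_G)} \int_0^x \Pr(T_G > y)\,dy$; $E(T_{FG}^k) = \frac{E(T_G^{k+1})}{(k+1)\,E(T_G)}$; $E(T_{FG}) = \frac{E(T_G)}{2}\left[1 + \frac{V(T_G)}{E(T_G)^2}\right]$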
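As a small worked illustration, the relations on this slide can be read as $T_{MR} = T_M + T_G$, $\lambda_M = 1/E(T_{MR})$, $P_A = E(T_G)/E(T_{MR})$ and $E(T_{FG}) = \frac{E(T_G)}{2}[1 + V(T_G)/E(T_G)^2]$. Under that reading, the derived metrics follow from the primary ones as sketched below. The function names and the dictionary keys are illustrative assumptions, not part of the slides.

```python
def derived_accuracy_metrics(e_tmr: float, e_tm: float) -> dict:
    """Derive E(T_G), lambda_M and P_A from E(T_MR) and E(T_M).

    Assumes 0 < E(T_MR) < infinity and 0 <= E(T_M) <= E(T_MR).
    """
    if not (0.0 < e_tmr < float("inf")):
        raise ValueError("E(T_MR) must be positive and finite")
    if not (0.0 <= e_tm <= e_tmr):
        raise ValueError("E(T_M) must lie in [0, E(T_MR)]")
    e_tg = e_tmr - e_tm              # T_MR = T_M + T_G, taken in expectation
    return {
        "E_TG": e_tg,
        "lambda_M": 1.0 / e_tmr,     # average mistake rate
        "P_A": e_tg / e_tmr,         # query accuracy probability
    }


def expected_forward_good_period(e_tg: float, v_tg: float) -> float:
    """E(T_FG) from the mean and variance of the good period duration T_G."""
    if e_tg <= 0.0:
        raise ValueError("E(T_G) must be positive")
    return (1.0 + v_tg / e_tg ** 2) * e_tg / 2.0
```

A quick sanity check on the last formula: if $T_G$ were exponentially distributed, then $V(T_G) = E(T_G)^2$ and the formula returns $E(T_{FG}) = E(T_G)$, as expected for a memoryless distribution.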
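17 Primary Metrics: Detection time $T_D$; Mistake recurrence time $T_{MR}$; Mistake duration $T_M$ [Figure: FD output and process p timelines marking $T_D$, $T_M$ and $T_{MR}$.]
18 Part II: The Design and Analysis of the New Failure Detector Algorithm
19 The Probabilistic Network Model: message loss probability $p_L$; message delay time $D$; we first assume clocks are synchronized, then only that clock drifts are negligible
20 A Simple FD Algorithm [Figure: process p sends a heartbeat to process q every $\eta$; the FD at q starts a timeout TO on each heartbeat it receives.] Timing out depends on two consecutive messages.
21 Large Detection Time: Depends on the delay of the last message sent by p. [Figure: process p crashes after sending its last heartbeat; the FD at q suspects p only once the timeout TO started by that heartbeat expires, so the worst case is the maximum message delay plus TO.]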
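22 New Algorithm (with synchronized clocks) [Figure: process p sends heartbeats $m_{i-1}, m_i, m_{i+1}, m_{i+2}$ to process q every $\eta$; the FD at q has freshness points $\tau_{i-1}, \tau_i, \tau_{i+1}, \tau_{i+2}$, each a shift by $\delta$ of the sending time of the corresponding heartbeat.] At any time $t \in [\tau_i, \tau_{i+1})$, the FD trusts p iff q has received heartbeat message $m_i$ or higher.
23 The Core Property: At any time $t \in [\tau_i, \tau_{i+1})$, the FD trusts p iff q has received heartbeat message $m_i$ or higher. [Figure: heartbeats $m_{i-1}, m_i, m_{i+1}, m_{i+2}$ from process p to process q, with three query times t placed between the freshness points $\tau_i, \tau_{i+1}, \tau_{i+2}$ of the FD at q.]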
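The core property can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: it assumes synchronized clocks, that heartbeat $m_i$ is sent at time $i\eta$ for $i \ge 1$, and that the freshness points are $\tau_i = i\eta + \delta$. The class and method names are hypothetical, and the behavior before $\tau_1$ (suspect until some heartbeat arrives) is an assumption of the sketch.

```python
import math


class FreshnessPointDetector:
    """Sketch of a freshness-point failure detector at process q.

    Assumptions (not from the slides): m_i is sent at time i * eta,
    tau_i = i * eta + delta, and on_heartbeat() / trusts() are called
    in nondecreasing time order.
    """

    def __init__(self, eta: float, delta: float) -> None:
        self.eta = eta
        self.delta = delta
        self.highest_seq = 0  # largest heartbeat sequence number received

    def on_heartbeat(self, seq: int) -> None:
        """Record the arrival of heartbeat m_seq (late or reordered is fine)."""
        self.highest_seq = max(self.highest_seq, seq)

    def current_index(self, t: float) -> int:
        """Return i such that tau_i <= t < tau_{i+1}."""
        return math.floor((t - self.delta) / self.eta)

    def trusts(self, t: float) -> bool:
        """Trust p at time t iff m_i or a higher heartbeat has been received."""
        i = self.current_index(t)
        if i < 1:
            # Before the first freshness point: trust only after some heartbeat.
            return self.highest_seq >= 1
        return self.highest_seq >= i
```

Note that the decision at time t depends only on the highest sequence number received and on the current freshness interval, not on when the previous heartbeat arrived.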
24 Detection Time [Figure: process p crashes after sending $m_i$; the FD at q suspects p from freshness point $\tau_{i+1}$ on, so the detection time is at most $\delta + \eta$.]
25 The QoS Analysis of the New Failure Detector Algorithm
26 An Optimality Result: Among all FD algorithms such that (1) the monitored process p sends a message every $\eta$ and (2) the detection time is always less than a given bound, our new algorithm provides the best query accuracy probability.
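27 Summary of the QoS Analysis: $P_A = 1 - \frac{1}{\eta} \int_0^{\eta} u(x)\,dx$ and $E(T_{MR}) = \eta / p_S$, where $u(x) = \prod_{j=0}^{\lceil \delta/\eta \rceil} [\,p_L + (1 - p_L)\Pr(D > \delta + x - j\eta)\,]$ and $p_S = (1 - p_L)\Pr(D < \delta + \eta)\,u(0)$. By decreasing $\eta$ or increasing $\delta$ linearly: - $P_A$ increases exponentially towards 1 - $E(T_{MR})$ increases exponentially. We also derived $E(T_M)$, $E(T_G)$ and $\lambda_M$ from $P_A$ and $E(T_{MR})$.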
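To make the analysis concrete, the sketch below evaluates these expressions numerically, reading the slide as $u(x) = \prod_{j=0}^{\lceil \delta/\eta \rceil} [p_L + (1-p_L)\Pr(D > \delta + x - j\eta)]$, $p_S = (1-p_L)\Pr(D < \delta+\eta)\,u(0)$, $P_A = 1 - \frac{1}{\eta}\int_0^\eta u(x)\,dx$ and $E(T_{MR}) = \eta/p_S$. The function names, the midpoint-rule integration and the exponential delay model are assumptions chosen for illustration.

```python
import math


def exp_tail(t: float, mean_delay: float) -> float:
    """Pr(D > t) for an exponentially distributed delay (illustrative model)."""
    return 1.0 if t <= 0.0 else math.exp(-t / mean_delay)


def suspect_prob(x, eta, delta, p_loss, tail):
    """u(x): probability that q suspects p at time tau_i + x, 0 <= x < eta."""
    k = math.ceil(delta / eta)
    prob = 1.0
    for j in range(k + 1):
        prob *= p_loss + (1.0 - p_loss) * tail(delta + x - j * eta)
    return prob


def qos_metrics(eta, delta, p_loss, tail, steps=2000):
    """Return P_A, E(T_MR) and E(T_M) for the given eta and delta.

    tail(t) must return Pr(D > t); Pr(D < t) is taken as 1 - tail(t),
    which assumes a continuous delay distribution.
    """
    q0 = (1.0 - p_loss) * (1.0 - tail(delta + eta))
    p_s = q0 * suspect_prob(0.0, eta, delta, p_loss, tail)
    if p_s <= 0.0:
        raise ValueError("p_S is zero: E(T_MR) is unbounded in this model")
    h = eta / steps
    # Midpoint rule for the integral of u(x) over [0, eta).
    integral = h * sum(
        suspect_prob((i + 0.5) * h, eta, delta, p_loss, tail)
        for i in range(steps)
    )
    return {
        "P_A": 1.0 - integral / eta,
        "E_TMR": eta / p_s,
        "E_TM": integral / p_s,
    }
```

Calling `qos_metrics(1.0, 1.0, 0.01, lambda t: exp_tail(t, 0.02))` and then repeating with `delta=2.0` shows the effect described on the slide: each extra $\eta$ of slack in $\delta$ multiplies $E(T_{MR})$ by roughly $1/p_L$ in this setting.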
28 Satisfying QoS Requirements: Given a set of QoS requirements as a tuple $(T_D^U, T_{MR}^L, T_M^U)$, find $\eta$ and $\delta$ such that $T_D \le T_D^U$, $E(T_{MR}) \ge T_{MR}^L$ and $E(T_M) \le T_M^U$.
29 Suppose $p_L$ and the distribution $\Pr(D \le t)$ are known. Problem to solve: maximize $\eta$ such that $\delta + \eta \le T_D^U$, $\eta / p_S \ge T_{MR}^L$ and $\frac{1}{p_S} \int_0^{\eta} u(x)\,dx \le T_M^U$.
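30 Configuration Procedure. Step 1: compute $q_0 = (1 - p_L)\Pr(D < T_D^U)$ and let $\eta_{max} = q_0\,T_M^U$. Step 2: let $f(\eta) = \eta \,\Big/\, \Big( q_0 \prod_{j=1}^{\lceil T_D^U/\eta \rceil - 1} [\,p_L + (1 - p_L)\Pr(D > T_D^U - j\eta)\,] \Big)$ and find the largest $\eta \le \eta_{max}$ that satisfies $f(\eta) \ge T_{MR}^L$. Step 3: set $\delta = T_D^U - \eta$.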
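The three steps can be sketched as follows, reading the slide as $q_0 = (1-p_L)\Pr(D < T_D^U)$, $\eta_{max} = q_0 T_M^U$, $f(\eta) = \eta / (q_0 \prod_{j=1}^{\lceil T_D^U/\eta\rceil - 1}[p_L + (1-p_L)\Pr(D > T_D^U - j\eta)])$ and $\delta = T_D^U - \eta$. The slides do not say how to search for the largest feasible $\eta$; the downward grid scan here is a simplifying assumption, as are the function names and the extra cap $\eta \le T_D^U$ that keeps $\delta$ nonnegative.

```python
import math


def exp_tail(t: float, mean_delay: float) -> float:
    """Pr(D > t) for an exponentially distributed delay (illustrative model)."""
    return 1.0 if t <= 0.0 else math.exp(-t / mean_delay)


def configure(td_u, tmr_l, tm_u, p_loss, tail, grid=1000):
    """Return (eta, delta) meeting the QoS requirements, or None if none found.

    td_u, tmr_l, tm_u: bounds T_D^U, T_MR^L, T_M^U.
    tail(t): Pr(D > t); Pr(D < t) is taken as 1 - tail(t).
    """
    # Step 1: upper bound on eta.
    q0 = (1.0 - p_loss) * (1.0 - tail(td_u))
    eta_max = min(q0 * tm_u, td_u)   # the cap at td_u is an added assumption
    if eta_max <= 0.0:
        return None                  # the requirements cannot be achieved

    # Step 2: f(eta) is the value of E(T_MR) obtained with delta = td_u - eta.
    def f(eta):
        prod = 1.0
        for j in range(1, math.ceil(td_u / eta)):
            prod *= p_loss + (1.0 - p_loss) * tail(td_u - j * eta)
        return math.inf if prod == 0.0 else eta / (q0 * prod)

    # Scan a grid from eta_max downward (an approximation of "largest eta").
    for i in range(grid, 0, -1):
        eta = eta_max * i / grid
        if f(eta) >= tmr_l:
            # Step 3: spend the rest of the detection-time budget on delta.
            return eta, td_u - eta
    return None
```

For example, `configure(2.0, 1000.0, 1.0, 0.01, lambda t: exp_tail(t, 0.02))` asks for detection within 2 time units, at most one mistake per 1000 time units on average, and mistakes corrected within 1 time unit on average. A larger $\eta$ is preferred because it means fewer heartbeat messages.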
31 [Figure: the QoS requirements $(T_D^U, T_{MR}^L, T_M^U)$ and the heartbeat probabilistic behavior ($p_L$, $\Pr(D \le x)$) are inputs to the Configurator, which outputs $\eta$ and $\delta$ to the Failure Detector.]
32 Part III: Dealing with Unknown System Behavior and Unsynchronized Clocks
33 Unknown System Behavior: Message loss probability $p_L$ is not known; Message delay distribution $\Pr(D \le x)$ is not known; Clocks are synchronized; Need to modify the configuration procedure to satisfy the given QoS requirements
34 Main Idea: Bound $\Pr(D > x)$ using $E(D)$ and $V(D)$; Modify the configuration procedure to use $E(D)$ and $V(D)$ instead of $\Pr(D \le x)$; Estimate $E(D)$, $V(D)$ and $p_L$ using heartbeats; Use the estimates to run the configuration procedure
35 [Figure: an Estimator of Heartbeat Probabilistic Behavior supplies $p_L$, $E(D)$ and $V(D)$ to the Configurator, which combines them with the QoS requirements $(T_D^U, T_{MR}^L, T_M^U)$ and outputs $\eta$ and $\delta$ to the Failure Detector.]
36 Details 1. Bound $\Pr(D > t)$ using the one-sided inequality: for all $t > E(D)$, $\Pr(D > t) \le \frac{V(D)}{V(D) + (t - E(D))^2}$
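The one-sided inequality $\Pr(D > t) \le V(D) / (V(D) + (t - E(D))^2)$ for $t > E(D)$ needs only the mean and variance of the delay. A minimal sketch follows; the function name is an assumption, and returning 1 for $t \le E(D)$ is simply the trivial bound where the inequality does not apply.

```python
def delay_tail_bound(t: float, mean_delay: float, var_delay: float) -> float:
    """Upper bound on Pr(D > t) from E(D) and V(D) only."""
    if t <= mean_delay:
        return 1.0  # the inequality only applies for t > E(D)
    gap = t - mean_delay
    return var_delay / (var_delay + gap * gap)
```

The bound is distribution-free and therefore conservative. For an exponential delay with $E(D) = 0.02$, for instance, it is noticeably larger than the exact tail $e^{-t/0.02}$ at $t = 0.1$, which is the price of not knowing the distribution.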
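37 Details (cont'd) 2. Bound the QoS accuracy metrics: $E(T_{MR}) \ge \eta / \beta$ and $E(T_M) \le \eta / \gamma$, where $\beta = \prod_{j=0}^{k_0} \frac{V(D) + p_L\,(\delta - E(D) - j\eta)^2}{V(D) + (\delta - E(D) - j\eta)^2}$, $k_0 = \lceil (\delta - E(D)) / \eta \rceil - 1$, and $\gamma = \frac{(1 - p_L)\,(\delta - E(D) + \eta)^2}{V(D) + (\delta - E(D) + \eta)^2}$
38 Details (cont'd) 3. Obtain the following configuration procedure. Step 1: compute $\gamma' = \frac{(1 - p_L)\,(T_D^U - E(D))^2}{V(D) + (T_D^U - E(D))^2}$ and let $\eta_{max} = \min(\gamma'\,T_M^U,\; T_D^U - E(D))$. Step 2: let $f(\eta) = \eta \prod_{j=1}^{\lceil (T_D^U - E(D))/\eta \rceil - 1} \frac{V(D) + (T_D^U - E(D) - j\eta)^2}{V(D) + p_L\,(T_D^U - E(D) - j\eta)^2}$ and find the largest $\eta \le \eta_{max}$ that satisfies $f(\eta) \ge T_{MR}^L$. Step 3: set $\delta = T_D^U - \eta$.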
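A sketch of this modified procedure, using only $p_L$, $E(D)$ and $V(D)$, is given below. It reads the slide as $\gamma' = (1-p_L)(T_D^U - E(D))^2 / (V(D) + (T_D^U - E(D))^2)$, $\eta_{max} = \min(\gamma' T_M^U, T_D^U - E(D))$, $f(\eta) = \eta \prod_{j=1}^{\lceil (T_D^U - E(D))/\eta \rceil - 1} \frac{V(D) + (T_D^U - E(D) - j\eta)^2}{V(D) + p_L (T_D^U - E(D) - j\eta)^2}$ and $\delta = T_D^U - \eta$. As before, the function name and the downward grid scan are illustrative assumptions rather than the authors' search method.

```python
import math


def configure_from_moments(td_u, tmr_l, tm_u, p_loss, mean_delay, var_delay,
                           grid=1000):
    """Return (eta, delta) from p_L, E(D), V(D) only, or None if none found."""
    slack = td_u - mean_delay        # T_D^U - E(D)
    if slack <= 0.0:
        return None                  # detection bound below the mean delay

    # Step 1: upper bound on eta.
    gamma = (1.0 - p_loss) * slack ** 2 / (var_delay + slack ** 2)
    eta_max = min(gamma * tm_u, slack)
    if eta_max <= 0.0:
        return None

    # Step 2: f(eta) is a lower bound on E(T_MR) when delta = td_u - eta.
    def f(eta):
        value = eta
        for j in range(1, math.ceil(slack / eta)):
            t = slack - j * eta
            value *= (var_delay + t * t) / (var_delay + p_loss * t * t)
        return value

    # Scan a grid from eta_max downward (an approximation of "largest eta").
    for i in range(grid, 0, -1):
        eta = eta_max * i / grid
        if f(eta) >= tmr_l:
            return eta, td_u - eta   # Step 3
    return None
```

Because the delay tail is bounded rather than known, the resulting $\eta$ is typically somewhat smaller (more heartbeats) than what the procedure with a known distribution would allow for the same requirements.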
39 Estimate $p_L$, $E(D)$ and $V(D)$. Estimating $p_L$: using the sequence numbers associated with heartbeat messages. Estimating $E(D)$ and $V(D)$: - p timestamps the heartbeats using the sending time $S_i$ - q records the receipt times $A_i$ of heartbeats - taking the average and the variance of $A_i - S_i$
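The estimation step might look like the sketch below. It assumes synchronized clocks (so that $A_i - S_i$ is a delay sample), estimates $p_L$ as the fraction of sequence numbers missing between the lowest and highest one seen, and uses the population variance. The function name, the tuple layout and these two estimator choices are assumptions for illustration.

```python
from statistics import mean, pvariance


def estimate_behavior(received):
    """Estimate (p_L, E(D), V(D)) from received heartbeats.

    received: list of (seq, send_time, arrival_time) tuples, one per
    heartbeat that actually arrived at q.
    """
    if not received:
        raise ValueError("need at least one received heartbeat")
    seqs = {seq for seq, _, _ in received}
    expected = max(seqs) - min(seqs) + 1      # heartbeats sent in that range
    p_loss = 1.0 - len(seqs) / expected       # share that never arrived
    delays = [arrival - sent for _, sent, arrival in received]
    return p_loss, mean(delays), pvariance(delays)
```

For instance, heartbeats 1, 2, 4 and 5 arriving with heartbeat 3 missing give a loss estimate of one in five, and the mean and variance are computed over the four observed delays.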
40 When Clocks are Not Synchronized. Problem: Freshness points cannot be set as shifts of the sending times of heartbeats. Solution: - shift the freshness points with respect to the expected arrival times $EA_i$ of heartbeats - estimate the expected arrival times
41 Algorithm with non-synchronized clocks [Figure: process p sends heartbeats $m_i, m_{i+1}, m_{i+2}$ to process q; each freshness point $\tau_i$ of the FD at q is placed $\alpha$ after the expected arrival time $EA_i$.] $EA_i$ is the expected arrival time of $m_i$. Parameter $\alpha$ accounts for the variation of message delay. $\delta = E(D) + \alpha$
42 The QoS analysis of the new algorithm remains the same with $\delta$ replaced by $E(D) + \alpha$. Given some QoS requirements, we can compute the FD parameters $\eta$ and $\alpha$ using only $p_L$ and $V(D)$.
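43 [Figure: the Estimator of Heartbeat Probabilistic Behavior supplies $p_L$ and $V(D)$ to the Configurator, which combines them with the QoS requirements $(T_D^U, T_{MR}^L, T_M^U)$ and outputs $\eta$ and $\alpha$ to the Failure Detector; the estimator also supplies the $EA_i$'s to the Failure Detector.]
44 Estimating Expected Arrival Times: Using the n most recently received heartbeat messages. With appropriate n, the estimates can be very accurate.
45 Estimating Expected Arrival Times [Figure: process q has received $m_1, m_2, m_3, \ldots, m_n$ with known arrival times $A_1, A_2, A_3, \ldots, A_n$; $EA_{n+1}$ is unknown.] A naive idea: compute the average interarrival time and add it to $A_n$ to get $EA_{n+1}$. This does not work: it depends too much on $A_n$.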
46 The Estimator: $EA_{n+1} \approx \frac{1}{n} \sum_{i=1}^{n} (A_i - i\eta) + (n+1)\eta$. "Normalize" each $A_i$ by shifting it backward in time by $i\eta$; compute the average of the normalized $A_i$'s; shift the computed average forward by $(n+1)\eta$.
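The estimator $EA_{n+1} \approx \frac{1}{n}\sum_{i=1}^{n}(A_i - i\eta) + (n+1)\eta$ and the naive alternative it replaces can be compared directly. In this sketch the function names are illustrative, heartbeats are given as (sequence number, arrival time) pairs, and the naive version assumes the pairs are in arrival order.

```python
def estimate_next_arrival(arrivals, eta):
    """Estimate the expected arrival time of the next heartbeat.

    arrivals: (seq, arrival_time) pairs for the n most recent heartbeats.
    Each arrival is shifted back by seq * eta, the shifted values are
    averaged, and the average is shifted forward to the next sequence number.
    """
    n = len(arrivals)
    if n == 0:
        raise ValueError("need at least one received heartbeat")
    normalized_mean = sum(a - seq * eta for seq, a in arrivals) / n
    next_seq = max(seq for seq, _ in arrivals) + 1
    return normalized_mean + next_seq * eta


def naive_next_arrival(arrivals):
    """Naive idea: last arrival plus the average interarrival time."""
    if len(arrivals) < 2:
        raise ValueError("need at least two received heartbeats")
    times = [a for _, a in arrivals]
    avg_gap = (times[-1] - times[0]) / (len(times) - 1)
    return times[-1] + avg_gap
```

With arrivals `[(1, 1.02), (2, 2.02), (3, 3.50)]` and `eta = 1.0`, the last heartbeat is unusually late. The naive estimate inherits the whole delay of that one message, while the normalized average spreads it over n samples, which is why a larger n makes the estimate steadier. In the algorithm with unsynchronized clocks, the next freshness point would then be this estimate plus $\alpha$.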
47 Spectrum of Algorithms [Figure: a spectrum indexed by n, starting at n = 1, that runs from the simple algorithm to the new algorithm with known expected arrival times.] n = number of messages used to estimate the expected arrival times
48 Part IV: Simulation Results
49 Algorithms We Simulated: New algorithm with synchronized clocks - matches the analytical results; New algorithm with unsynchronized clocks - matches the one with synchronized clocks; Simple algorithm - the new algorithm provides better QoS than the simple algorithm
50 How to Compare: Send heartbeat messages at the same rate; Satisfy the same bound on the worst-case detection time; Compare the average time between mistakes $E(T_{MR})$
51 Simulation Settings: intersending time $\eta = 1$; message loss probability $p_L = 0.01$; message delay $D$ has an exponential distribution with $E(D) = \sigma(D) = 0.02$
52 Comparing $E(T_{MR})$ [Figure: plot not recoverable from the transcript.]
53 Comparing $E(T_{MR})$ [Figure: plot not recoverable from the transcript.]
54 Summary: Proposed a set of QoS metrics and quantified the relations between them; Presented a new failure detector algorithm and analyzed its QoS; Showed how to compute the parameters of the new algorithm to satisfy some given QoS requirements; Showed how to use the algorithm when (a) the system behavior is not known, and (b) clocks are not synchronized; Presented simulation results showing that the new algorithm provides a better QoS than a simple algorithm
55 Related Work: Vogels [1996]; Gouda and McGuire [1998]; Van Renesse, Minsky and Hayden [1998]; Raynal and Tronel [1999]; Veríssimo and Raynal [2000]
More informationExercises Stochastic Performance Modelling. Hamilton Institute, Summer 2010
Exercises Stochastic Performance Modelling Hamilton Institute, Summer Instruction Exercise Let X be a non-negative random variable with E[X ]
More informationB. Maddah INDE 504 Discrete-Event Simulation. Output Analysis (1)
B. Maddah INDE 504 Discrete-Event Simulation Output Analysis (1) Introduction The basic, most serious disadvantage of simulation is that we don t get exact answers. Two different runs of the same model
More informationMarkov processes and queueing networks
Inria September 22, 2015 Outline Poisson processes Markov jump processes Some queueing networks The Poisson distribution (Siméon-Denis Poisson, 1781-1840) { } e λ λ n n! As prevalent as Gaussian distribution
More information5. Density evolution. Density evolution 5-1
5. Density evolution Density evolution 5-1 Probabilistic analysis of message passing algorithms variable nodes factor nodes x1 a x i x2 a(x i ; x j ; x k ) x3 b x4 consider factor graph model G = (V ;
More informationProving Safety Properties of the Steam Boiler Controller. Abstract
Formal Methods for Industrial Applications: A Case Study Gunter Leeb leeb@auto.tuwien.ac.at Vienna University of Technology Department for Automation Treitlstr. 3, A-1040 Vienna, Austria Abstract Nancy
More informationTTA and PALS: Formally Verified Design Patterns for Distributed Cyber-Physical
TTA and PALS: Formally Verified Design Patterns for Distributed Cyber-Physical DASC 2011, Oct/19 CoMMiCS Wilfried Steiner wilfried.steiner@tttech.com TTTech Computertechnik AG John Rushby rushby@csl.sri.com
More informationTime Synchronization
Massachusetts Institute of Technology Lecture 7 6.895: Advanced Distributed Algorithms March 6, 2006 Professor Nancy Lynch Time Synchronization Readings: Fan, Lynch. Gradient clock synchronization Attiya,
More informationRecap. CS514: Intermediate Course in Operating Systems. What time is it? This week. Reminder: Lamport s approach. But what does time mean?
CS514: Intermediate Course in Operating Systems Professor Ken Birman Vivek Vishnumurthy: TA Recap We ve started a process of isolating questions that arise in big systems Tease out an abstract issue Treat
More informationDistributed Consensus
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit or Abort in distributed transactions Reaching agreement
More information6.897: Selected Topics in Cryptography Lectures 7 and 8. Lecturer: Ran Canetti
6.897: Selected Topics in Cryptography Lectures 7 and 8 Lecturer: Ran Canetti Highlights of past lectures Presented a basic framework for analyzing the security of protocols for multi-party function evaluation.
More informationSolutions to Homework Discrete Stochastic Processes MIT, Spring 2011
Exercise 1 Solutions to Homework 6 6.262 Discrete Stochastic Processes MIT, Spring 2011 Let {Y n ; n 1} be a sequence of rv s and assume that lim n E[ Y n ] = 0. Show that {Y n ; n 1} converges to 0 in
More informationRelaying Information Streams
Relaying Information Streams Anant Sahai UC Berkeley EECS sahai@eecs.berkeley.edu Originally given: Oct 2, 2002 This was a talk only. I was never able to rigorously formalize the back-of-the-envelope reasoning
More information1 Modelling and Simulation
1 Modelling and Simulation 1.1 Introduction This course teaches various aspects of computer-aided modelling for the performance evaluation of computer systems and communication networks. The performance
More informationA Starvation-free Algorithm For Achieving 100% Throughput in an Input- Queued Switch
A Starvation-free Algorithm For Achieving 00% Throughput in an Input- Queued Switch Abstract Adisak ekkittikul ick ckeown Department of Electrical Engineering Stanford University Stanford CA 9405-400 Tel
More information