Distributed Algorithms (CAS 769) Dr. Borzoo Bonakdarpour
1 Distributed Algorithms (CAS 769) Week 1: Introduction, Logical clocks, Snapshots Dr. Borzoo Bonakdarpour Department of Computing and Software McMaster University Dr. Borzoo Bonakdarpour Distributed Algorithms (CAS 769) - McMaster University 1/44
2 Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)
3 Acknowledgments
Most of the contents of these slides are obtained from the following books:
- Distributed Algorithms: An Intuitive Approach, Wan Fokkink
- Elements of Distributed Computing, Vijay K. Garg
4 Distributed Systems: Some Definitions
There is no universally accepted definition of a distributed system. What makes a system distributed?
"One man's constant is another man's variable." - Alan Perlis
"A distributed system is a system where I can't get my work done because a computer has failed that I've never even heard of."
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." - Leslie Lamport
5 Distributed Systems: Some Definitions
A distributed system is one that
- has multiple machines,
- is connected by a network, and
- is cooperating on some task.
Communication in Distributed Systems
- Message passing
- Shared memory
6 Distributed Systems
We begin with message passing systems.
7 Preliminaries: Message Passing Framework
In a message passing framework, a distributed system consists of a finite graph of N processes (a process is a running program, and each process has its own local state).
- Each process carries a unique ID.
- Processes communicate through FIFO channels.
Characteristics of Communication
- Communication is asynchronous; i.e., sending a message and receiving it are distinct events.
- Delay in channels is arbitrary but finite.
- There are no garbled, duplicated, or lost messages.
8 Preliminaries: Other Assumptions
- Absence of a shared clock
- Absence of shared memory
- Absence of accurate failure detection
9 Example
{x1=0}
Process P1() {
  e_1^0: send(p2,m1);
  e_1^1: x1=5;
  e_1^2: x1=10;
  e_1^3: recv(m2);
}
{x2=0}
Process P2() {
  e_2^0: recv(m1);
  e_2^1: x2=15;
  e_2^2: x2=20;
  e_2^3: send(p1,m2);
}
10 Preliminaries: Transition Systems
The behavior of a distributed algorithm that runs on a distributed system is often captured by a transition system, which consists of:
- A set $C$ of configurations (i.e., the composition of the local states of its processes plus the messages in transit)
- A binary transition relation $\to$ on $C$
- A set $I \subseteq C$ of initial configurations
A configuration $\gamma$ is terminal if there does not exist $\gamma' \in C$ such that $\gamma \to \gamma'$.
An execution of the distributed system is a sequence $\gamma = \gamma_0 \to \gamma_1 \to \gamma_2 \to \cdots$ such that:
- $\gamma_0 \in I$
- for all $i \ge 0$, we have $\gamma_i \to \gamma_{i+1}$
A configuration $\delta$ is reachable if there is a $\gamma_0 \in I$ and a finite execution $\gamma_0 \to \gamma_1 \to \cdots \to \gamma_k$ such that $\gamma_k = \delta$.
11 Example
For example, in the distributed algorithm on Slide 9:
- Configuration (x1 = 0, x2 = 0) is the only initial configuration.
- Configuration (x1 = 10, x2 = 20) is the only terminal configuration.
- (x1 = 0, x2 = 0) $\to$ (x1 = 5, x2 = 0) $\to$ (x1 = 10, x2 = 0) $\to$ (x1 = 10, x2 = 15) $\to$ (x1 = 10, x2 = 20) is a valid execution.
- And so is (x1 = 0, x2 = 0) $\to$ (x1 = 5, x2 = 0) $\to$ (x1 = 5, x2 = 15) $\to$ (x1 = 10, x2 = 15) $\to$ (x1 = 10, x2 = 20).
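The claims on this slide can be checked mechanically. The following is a minimal sketch, assuming a configuration is encoded as the two program counters, the two variables, and the contents of the two channels; the names `PROGS`, `successors`, `reachable`, `executions`, and `project` are illustrative and do not come from the slides.

```python
from collections import deque

# Programs from Slide 9, one (kind, argument) step per event.
P1 = [("send", "m1"), ("set", 5), ("set", 10), ("recv", "m2")]
P2 = [("recv", "m1"), ("set", 15), ("set", 20), ("send", "m2")]
PROGS = (P1, P2)

# Configuration: (pc1, pc2, x1, x2, channel P1->P2, channel P2->P1).
INITIAL = (0, 0, 0, 0, (), ())

def successors(cfg):
    """Yield every configuration reachable from cfg by one event."""
    pcs, xs, chans = list(cfg[0:2]), list(cfg[2:4]), list(cfg[4:6])
    for i in (0, 1):
        if pcs[i] == len(PROGS[i]):
            continue                      # process i has terminated
        kind, arg = PROGS[i][pcs[i]]
        new_pcs, new_xs, new_ch = pcs[:], xs[:], chans[:]
        if kind == "set":                 # internal event
            new_xs[i] = arg
        elif kind == "send":              # chans[i] holds messages sent by process i
            new_ch[i] = chans[i] + (arg,)
        else:                             # receive from the other process
            inbox = chans[1 - i]
            if not inbox or inbox[0] != arg:
                continue                  # receive event is not enabled
            new_ch[1 - i] = inbox[1:]
        new_pcs[i] += 1
        yield (*new_pcs, *new_xs, *new_ch)

def reachable(initial):
    """Breadth-first search over the transition relation."""
    seen, queue = {initial}, deque([initial])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def executions(cfg):
    """All maximal executions from cfg, each as a list of configurations."""
    nxts = list(successors(cfg))
    if not nxts:
        return [[cfg]]
    return [[cfg] + rest for n in nxts for rest in executions(n)]

def project(execution):
    """Keep only (x1, x2) and drop steps that leave both unchanged."""
    out = []
    for cfg in execution:
        if not out or out[-1] != cfg[2:4]:
            out.append(cfg[2:4])
    return out

REACHABLE = reachable(INITIAL)
TERMINAL = [c for c in REACHABLE if not list(successors(c))]
```

Under this encoding, `TERMINAL` contains a single configuration with x1 = 10 and x2 = 20, and both executions listed on the slide appear among `[project(e) for e in executions(INITIAL)]`. Because this example has finitely many configurations, the search terminates; the sketch says nothing about reachability in general.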
12 Preliminaries
Question: Is configuration reachability decidable?
13 Preliminaries
A transition between two configurations is associated with an event.
A process can perform an internal event (i.e., a change of the local state of the process), a send event, or a receive event.
A process is called an initiator if its first event is either internal or send.
An assertion is a predicate on the configurations of an algorithm (e.g., $x \le y + 1$). We use assertions to define safety properties.
An assertion $P$ is an invariant if:
- $P(\gamma)$ for all $\gamma \in I$, and
- if $\gamma \to \gamma'$ and $P(\gamma)$, then $P(\gamma')$.
14 Example
For example, in the distributed algorithm on Slide 9:
- Instruction x1 = 5 is an internal event.
- Process P1 is an initiator.
- $(x_1 \le 100 \wedge x_2 \le 50)$ is an invariant.
15 Preliminaries: Properties
A property is a set of executions.
Safety Properties
A safety property typically expresses that something bad will never happen. For example:
- The temperature of a boiler never reaches 100 degrees.
- If an interrupt occurs, a message will be printed within one second.
Formally, a safety property is a set $S$ of infinite executions where
$$\forall \gamma \notin S : \exists \alpha \sqsubseteq \gamma : \forall \gamma' : \alpha \sqsubseteq \gamma' \Rightarrow \gamma' \notin S$$
where $\alpha \sqsubseteq \gamma$ denotes the fact that $\alpha$ is a prefix of $\gamma$.
16 Preliminaries: Liveness Properties
A liveness property typically expresses that something good will eventually happen.
Formally, if $L$ is a liveness property, then the following holds:
$$\forall \alpha : \exists \gamma : \alpha\gamma \in L$$
where $\alpha$ is a finite execution and $\gamma$ is an infinite execution.
Examples of liveness properties:
- Non-starvation.
- If an interrupt occurs, a message will be printed.
17 Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)
18 Causal Order
In an asynchronous distributed system, in each configuration, different events can occur in different processes. Such occurrences of events are independent.
The causal order is a binary relation $\prec$ on the events of an execution, such that $a \prec b$ iff event $a$ happened before event $b$. That is, the events of the execution cannot be reordered so that $a$ happens after $b$.
19 Causal Order: Causal Order (Happened Before)
Formally, the causal order $\prec$ (also called happened before) is the smallest binary relation where
- if $a$ and $b$ are events at the same process and $a$ occurs before $b$, then $a \prec b$,
- if $a$ is a send event and $b$ the corresponding receive event, then $a \prec b$, and
- if $a \prec b$ and $b \prec c$, then $a \prec c$.
Notice that the happened before relation is a partial order.
We write $a \preceq b$ if either $a \prec b$ or $a = b$.
If $a \nprec b$ and $b \nprec a$, then we say $a$ and $b$ are concurrent events.
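The three rules above can be turned directly into a small computation. The sketch below applies them to the program on Slide 9, assuming events are named by hypothetical pairs (process, index) and messages are given as (send event, receive event) pairs; the closure loop is a naive fixpoint meant only for small examples.

```python
from itertools import product

# Events of the Slide 9 program, named (process, index).
EVENTS = [(p, k) for p in (1, 2) for k in range(4)]
# Message edges: (send event, corresponding receive event).
MESSAGES = [((1, 0), (2, 0)),   # m1: sent by P1, received by P2
            ((2, 3), (1, 3))]   # m2: sent by P2, received by P1

def happened_before(events, messages):
    """Return the set of pairs (a, b) such that a happened before b."""
    # Rule 1: order of events at the same process.
    hb = {(a, b) for a, b in product(events, repeat=2)
          if a[0] == b[0] and a[1] < b[1]}
    # Rule 2: a send event precedes the corresponding receive event.
    hb |= set(messages)
    # Rule 3: transitivity, computed as a naive fixpoint.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb

def concurrent(a, b, hb):
    """Two distinct events are concurrent if neither happened before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb

HB = happened_before(EVENTS, MESSAGES)
```

For instance, `concurrent((1, 1), (2, 1), HB)` holds: the assignments x1 = 5 and x2 = 15 are not ordered by any chain of local steps and messages.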
20 Computation
A permutation of concurrent events in an execution does not affect the result of the execution.
[Figure: space-time diagram of processes P_1 and P_2 with events e_1^0, ..., e_1^3 and e_2^0, ..., e_2^3]
21 Computation
The set of all permutations forms the computation lattice.
[Figure: computation lattice; each node is labeled with the latest event of each process, from {} at the bottom to (e_1^3, e_2^3) at the top]
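The nodes of the lattice can be enumerated as the cuts that never contain a receive event without its send event. The sketch below does this for the program on Slide 9, assuming a cut is encoded as the number of events taken from each process; the names `N_EVENTS`, `MESSAGES`, and `is_consistent` are illustrative.

```python
from itertools import product

N_EVENTS = (4, 4)   # number of events at P1 and at P2
# (sender, index of send event, receiver, index of receive event), 0-based.
MESSAGES = [(0, 0, 1, 0),   # m1: first event of P1 to first event of P2
            (1, 3, 0, 3)]   # m2: last event of P2 to last event of P1

def is_consistent(cut):
    """cut[i] is the number of events of process i included in the cut."""
    for sender, s_idx, receiver, r_idx in MESSAGES:
        received = cut[receiver] > r_idx
        sent = cut[sender] > s_idx
        if received and not sent:
            return False    # a receive without its send cannot be a lattice node
    return True

# All nodes of the lattice, from the empty cut (0, 0) to the full cut (4, 4).
CUTS = [c for c in product(*(range(n + 1) for n in N_EVENTS))
        if is_consistent(c)]
```

Under these assumptions `CUTS` has 17 elements, which matches the number of node labels that survive from the figure on this slide.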
22 Happened Before vs. Physical Time
Question: If a safety property holds in the happened before relation, does it hold in physical time as well?
23 Logical Clocks
Since a physical shared clock does not exist in a distributed system, we use logical clocks.
A logical clock $C$ maps occurrences of events in a computation to a partially ordered set such that
$$a \prec b \Rightarrow C(a) < C(b)$$
Lamport's clock $LC$ assigns to each event $a$ the length $k$ of a longest causality chain $a_1 \prec \cdots \prec a_k = a$ in the computation. Hence,
$$a \prec b \Rightarrow LC(a) < LC(b)$$
24 Logical Clocks: Algorithm for Handling Lamport's Clocks
Consider an event $a$, and let $k$ be the clock value of the previous event at the same process ($k = 0$ if there is no such previous event).
- If $a$ is an internal or send event, then $LC(a) = k + 1$.
- If $a$ is a receive event and $b$ the corresponding send event, then $LC(a) = \max\{k, LC(b)\} + 1$.
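The two rules can be applied event by event along any valid execution. The following sketch does so for the program on Slide 9, assuming events are named by hypothetical pairs (process, index) and that the mapping from each receive event to its send event is given explicitly.

```python
def lamport_clocks(schedule, recv_to_send):
    """Assign Lamport clock values to the events of one execution.

    schedule: events (process, index) in the order they occur.
    recv_to_send: maps each receive event to the corresponding send event.
    """
    lc = {}
    last = {}                       # clock value of the previous event per process
    for ev in schedule:
        proc = ev[0]
        k = last.get(proc, 0)       # k = 0 if there is no previous event
        if ev in recv_to_send:      # receive event
            lc[ev] = max(k, lc[recv_to_send[ev]]) + 1
        else:                       # internal or send event
            lc[ev] = k + 1
        last[proc] = lc[ev]
    return lc

# One execution of the Slide 9 program: m1 goes P1 -> P2, m2 goes P2 -> P1.
SCHEDULE = [(1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2), (2, 3), (1, 3)]
RECV_TO_SEND = {(2, 0): (1, 0), (1, 3): (2, 3)}
LC = lamport_clocks(SCHEDULE, RECV_TO_SEND)
```

In this run the assignments x1 = 10 and x2 = 20 receive the values 3 and 4 even though they are concurrent, which illustrates that $LC(a) < LC(b)$ does not imply $a \prec b$.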
25 Vector Clocks
The vector clock $VC$ has the property
$$a \prec b \Leftrightarrow VC(a) < VC(b)$$
Let a distributed system consist of processes $p_0, \ldots, p_{N-1}$. The vector clock assigns to events in a computation values in $\mathbb{N}^N$, whereby this set is provided with a partial order defined by:
$$(k_0, \ldots, k_{N-1}) \le (l_0, \ldots, l_{N-1}) \Leftrightarrow k_i \le l_i \text{ for all } i \in \{0, \ldots, N-1\}$$
The vector clock is defined as follows: $VC(a) = (k_0, \ldots, k_{N-1})$, where $k_i$ is the length of a longest causality chain $a_1^i \prec \cdots \prec a_{k_i}^i$ of events at process $p_i$ with $a_{k_i}^i \preceq a$.
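The slide defines the vector clock by chain lengths and does not give an update rule. The sketch below uses the commonly taught rule (copy the previous local vector, take the componentwise maximum with the sender's vector on a receive, then increment the own component), which is an assumption here rather than something stated on the slide. It is applied to the program on Slide 9 with processes numbered 0 and 1.

```python
def vector_clocks(schedule, recv_to_send, n):
    """Assign vector clock values to the events of one execution.

    schedule: events (process, index) in the order they occur, processes 0..n-1.
    recv_to_send: maps each receive event to the corresponding send event.
    """
    vc = {}
    last = {}                                   # latest vector per process
    for ev in schedule:
        proc = ev[0]
        v = list(last.get(proc, [0] * n))
        if ev in recv_to_send:                  # merge the sender's knowledge
            sender_vec = vc[recv_to_send[ev]]
            v = [max(a, b) for a, b in zip(v, sender_vec)]
        v[proc] += 1                            # count this event itself
        vc[ev] = tuple(v)
        last[proc] = vc[ev]
    return vc

def leq(u, v):
    """Componentwise order on vectors, as defined on the slide."""
    return all(a <= b for a, b in zip(u, v))

def less(u, v):
    return leq(u, v) and u != v

# Slide 9 program with P1 as process 0 and P2 as process 1.
SCHEDULE = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (1, 3), (0, 3)]
RECV_TO_SEND = {(1, 0): (0, 0), (0, 3): (1, 3)}
VC = vector_clocks(SCHEDULE, RECV_TO_SEND, 2)
```

Here the concurrent events x1 = 5 and x2 = 15 get the incomparable vectors (2, 0) and (1, 2), so neither `less` test holds, in line with the "if and only if" property.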
26 Example
Demonstrate the evolution of the vector clock for this computation:
[Figure: space-time diagram of processes P_1 and P_2 with events e_1^0, ..., e_1^3 and e_2^0, ..., e_2^3]
27 Presentation outline
- Introduction
- Logical Clocks
- Snapshots (Global States)
28 Snapshot (Global State): Definitions
A snapshot cannot be defined based on physical time (e.g., as the composition of all local states at the same time instant). We use the happened before relation to compute concurrent local states and, hence, snapshots.
A (global) snapshot of an execution of a distributed algorithm is a configuration of this execution, consisting of the local states of the processes and the messages in transit.
Intuitively, a snapshot is consistent if it represents a configuration of the current execution or a configuration of an execution in the same computation.
Snapshots are useful to determine stable properties of a distributed system (i.e., properties that, once they become true, remain true), e.g., deadlock, termination, or loss of a token.
29 Snapshot: The Challenge
Why is it difficult to compute a snapshot of a distributed system at run time?
Taking a global snapshot is like taking a picture of the sky: the scene is so big that it cannot be captured by a single photograph.
The challenge is that taking multiple photographs at exactly the same time is not possible.
30 Snapshot: Terminology
Suppose we design an algorithm that takes a snapshot of another distributed algorithm. We call the messages of the underlying algorithm basic messages and the messages of the snapshot algorithm control messages.
An event is called presnapshot if it occurs at a process before the local snapshot at this process is taken. Otherwise it is called postsnapshot.
Consistent Snapshot
A snapshot is consistent if
- for each presnapshot event $a$, all events that are causally before $a$ are also presnapshot, and
- a basic message is included in a channel state iff the corresponding send event is presnapshot while the corresponding receive event is postsnapshot.
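Both conditions can be checked for a given cut once the send and receive events of every basic message are known. The sketch below does this for the program on Slide 9, assuming a cut is given as the number of presnapshot events per process; `MESSAGES` and `check_snapshot` are illustrative names. Since the presnapshot events of each process form a prefix of its local events, the first condition reduces to checking that no message is received presnapshot but sent postsnapshot.

```python
# Basic messages of the Slide 9 program:
# (name, sender, index of send event, receiver, index of receive event).
MESSAGES = [("m1", "P1", 0, "P2", 0),
            ("m2", "P2", 3, "P1", 3)]

def check_snapshot(cut, messages=MESSAGES):
    """cut[p] is the number of presnapshot events at process p.

    Return (consistent, channel_state), where channel_state maps a channel
    (sender, receiver) to the names of the messages in transit.
    """
    channel_state = {}
    for name, sender, s_idx, receiver, r_idx in messages:
        send_pre = s_idx < cut[sender]
        recv_pre = r_idx < cut[receiver]
        if recv_pre and not send_pre:
            # The receive is presnapshot but a causally earlier send is not.
            return False, {}
        if send_pre and not recv_pre:
            # Sent before the snapshot, received after it: in transit.
            channel_state.setdefault((sender, receiver), []).append(name)
    return True, channel_state
```

For example, `check_snapshot({"P1": 1, "P2": 0})` reports a consistent snapshot with m1 in the state of the channel from P1 to P2, while `check_snapshot({"P1": 0, "P2": 1})` is rejected.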
31 Example
[Figure: processes P_1, P_2, P_3 exchanging messages m_1, m_2, m_3, with two cuts G_1 and G_2]
32 Example
[Figure: processes P_1, P_2, P_3 exchanging messages m_1, m_2, m_3, with two cuts G_1 and G_2]
G_1 is not a consistent snapshot, but G_2 is.
33 Chandy-Lamport Algorithm
Assumption
- All channels are FIFO.
Challenges
- All recorded local states are mutually concurrent.
- The state of every channel is captured correctly.
34 Chandy-Lamport Algorithm: Solution
We associate with each process a variable called color that is either red or white. All processes are initially white.
Intuitively, the computed global snapshot corresponds to the state of the system just before the processes turn red.
The algorithm relies on special control messages called markers.
- Once a process turns red, it sends a marker along all its outgoing channels before it sends any other message.
- A process turns red on receiving a marker if it has not already done so.
No white process receives a basic message sent by a red process. Why?
This guarantees that local states are mutually concurrent. Why?
35 Chandy-Lamport Algorithm
36 Chandy-Lamport Algorithm: Classification of Basic Messages
- (ww messages) These messages are sent by a white process to a white process. They correspond to the messages sent and received before the global snapshot.
- (rr messages) These messages correspond to the messages sent and received after the global snapshot.
- (rw messages) These messages cross the global snapshot in the backward direction. Such a message would make the snapshot inconsistent. It is not possible to have such messages if markers are used. Why?
- (wr messages) These messages cross the global snapshot in the forward direction and are part of the channel state in the snapshot, because they are in transit when the snapshot is taken.
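The marker rules and the treatment of wr messages can be exercised in a small simulation. The sketch below is not the slides' own code: it assumes each process holds an integer balance, basic messages transfer amounts, and messages are delivered by explicit calls; the names `Node`, `Network`, and `run_example` are hypothetical. The scenario is loosely modeled on an A, B, C setting in which A initiates, and the exact order of events is an assumption.

```python
from collections import deque

MARKER = "marker"   # control message; any other payload is a basic message

class Node:
    def __init__(self, balance):
        self.balance = balance
        self.red = False        # white until the local snapshot is taken
        self.recorded = None    # recorded local state
        self.chan_state = {}    # incoming channel source -> recorded messages
        self.open = set()       # incoming channels still being recorded

class Network:
    def __init__(self, balances):
        self.nodes = {n: Node(b) for n, b in balances.items()}
        # One FIFO channel per ordered pair of distinct processes.
        self.chan = {(p, q): deque() for p in balances for q in balances if p != q}

    def send(self, p, q, amount):
        """Basic message: p transfers amount to q."""
        self.nodes[p].balance -= amount
        self.chan[(p, q)].append(amount)

    def turn_red(self, p):
        """Record the local state of p and send a marker on every outgoing channel."""
        node = self.nodes[p]
        node.red, node.recorded = True, node.balance
        node.open = {src for (src, dst) in self.chan if dst == p}
        node.chan_state = {src: [] for src in node.open}
        for (src, _dst), ch in self.chan.items():
            if src == p:
                ch.append(MARKER)

    def deliver(self, p, q):
        """Deliver the oldest message on channel p -> q."""
        msg = self.chan[(p, q)].popleft()
        node = self.nodes[q]
        if msg == MARKER:
            if not node.red:
                self.turn_red(q)
            node.open.discard(p)                 # channel p -> q is fully recorded
        else:
            if node.red and p in node.open:
                node.chan_state[p].append(msg)   # wr message: in transit
            node.balance += msg

def run_example():
    net = Network({"A": 10, "B": 10, "C": 10})
    net.send("A", "C", 3)       # m1
    net.turn_red("A")           # A initiates the snapshot
    net.deliver("A", "B")       # B gets the marker, records channel A -> B as []
    net.send("C", "B", 2)       # m2, sent while C is still white
    net.deliver("A", "C")       # C receives m1
    net.deliver("A", "C")       # C gets the marker, records channel A -> C as []
    net.deliver("C", "B")       # B receives m2 after turning red: recorded
    for ch in [("C", "B"), ("B", "A"), ("B", "C"), ("C", "A")]:
        net.deliver(*ch)        # remaining markers
    return net
```

In this run the recorded balances plus the one recorded in-transit amount add up to the initial total of 30, which is the kind of sanity check a consistent snapshot should pass.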
37 Chandy-Lamport Algorithm
[Figure: space-time diagram of processes P_1 and P_2 showing a ww, an rw, a wr, and an rr message]
38 Chandy-Lamport Algorithm (Example)
[Figure: three processes A, B, and C]
39 Chandy-Lamport Algorithm (Example)
[Figure: processes A, B, and C, with m_1 and markers (mkr) in transit]
40 Chandy-Lamport Algorithm (Example)
[Figure: processes A, B, and C, with m_1, m_2, and markers (mkr) in transit]
B computes the state of channel AB as {}.
41 Chandy-Lamport Algorithm (Example)
[Figure: processes A, B, and C, with m_1, m_2, and a marker (mkr) shown]
C computes the state of channel AC as {}.
42 Chandy-Lamport Algorithm (Example)
[Figure: processes A, B, and C, with labels m_1 and {m_2}]
B computes the state of channel CB as {m_2}.
43 Chandy-Lamport Algorithm (Example)
Question: Is the computed snapshot a configuration of the actual execution?
44 Lai-Yang Algorithm: Assumptions
This algorithm does not assume FIFO channels, but it assumes message piggybacking.
45 Lai-Yang Algorithm: The Algorithm
- Any initiator can decide to take a local snapshot.
- As long as a process has not taken a local snapshot, it appends false to its outgoing basic messages.
- When a process has taken its local snapshot, it appends true to each outgoing basic message.
- When a process that hasn't yet taken a snapshot receives a message with true or a control message (see next slide) for the first time, it takes a local snapshot of its state before reception of this message.
- A process q computes as the channel state of pq the basic messages without the tag true that it receives via pq after its local snapshot.
46 Lai-Yang Algorithm: The Algorithm
Question: How does q know when it can determine the channel state of pq?
p sends a control message to q, informing q how many basic messages without the tag true p sent into pq.
These control messages also ensure that all processes eventually take a local snapshot.
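The tagging and counting rules can be sketched for a single pair of processes. This is an illustrative sketch under assumptions: local state is an integer balance, messages are plain tuples handed over by explicit calls (so they can be delivered out of order), and q treats the channel pq as complete once the number of false-tagged messages it has received from p equals the count announced in p's control message. The names `Node` and `run_example` are hypothetical.

```python
class Node:
    def __init__(self, name, balance, peers):
        self.name, self.balance = name, balance
        self.taken = False                        # local snapshot taken yet?
        self.recorded = None                      # recorded local state
        self.sent_false = {q: 0 for q in peers}   # false-tagged messages sent
        self.recv_false = {p: 0 for p in peers}   # false-tagged messages received
        self.chan_state = {p: [] for p in peers}  # recorded state of channel p -> self
        self.expected = {}                        # counts announced by control messages

    def take_snapshot(self):
        """Record the local state; return one control message per outgoing channel."""
        self.taken, self.recorded = True, self.balance
        return [("control", self.name, q, n) for q, n in self.sent_false.items()]

    def send_basic(self, q, amount):
        """Return a basic message to q with the tag piggybacked on it."""
        self.balance -= amount
        if not self.taken:
            self.sent_false[q] += 1
        return ("basic", self.name, q, amount, self.taken)

    def receive(self, msg):
        """Handle one message; return the control messages it triggers."""
        out = []
        if msg[0] == "control":
            _, p, _, count = msg
            if not self.taken:
                out = self.take_snapshot()
            self.expected[p] = count
            return out
        _, p, _, amount, tag = msg
        if tag and not self.taken:
            out = self.take_snapshot()            # state before reception of msg
        if not tag:
            self.recv_false[p] += 1
            if self.taken:
                self.chan_state[p].append(amount) # in transit at snapshot time
        self.balance += amount
        return out

    def channel_done(self, p):
        """True once every false-tagged message announced by p has arrived."""
        return self.expected.get(p) == self.recv_false[p]

def run_example():
    p, q = Node("p", 10, ["q"]), Node("q", 10, ["p"])
    m = p.send_basic("q", 5)            # tagged false: p has no snapshot yet
    to_q = p.take_snapshot()            # p initiates; announces 1 message sent
    to_p = q.receive(to_q[0])           # non-FIFO: the control message overtakes m
    pending = not q.channel_done("p")   # q still waits for one message
    q.receive(m)                        # m arrives after q's snapshot
    p.receive(to_p[0])
    return p, q, pending
```

In this run q records the channel state of pq as the single transferred amount, even though the control message overtook the basic message, and the recorded balances plus that amount equal the initial total of 20.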
More informationThe algorithmic analysis of hybrid system
The algorithmic analysis of hybrid system Authors: R.Alur, C. Courcoubetis etc. Course teacher: Prof. Ugo Buy Xin Li, Huiyong Xiao Nov. 13, 2002 Summary What s a hybrid system? Definition of Hybrid Automaton
More informationDistributed Systems Time and Global State
Distributed Systems Time and Global State Allan Clark School of Informatics University of Edinburgh http://www.inf.ed.ac.uk/teaching/courses/ds Autumn Term 2012 Distributed Systems Time and Global State
More informationAsynchronous Models For Consensus
Distributed Systems 600.437 Asynchronous Models for Consensus Department of Computer Science The Johns Hopkins University 1 Asynchronous Models For Consensus Lecture 5 Further reading: Distributed Algorithms
More informationTECHNICAL REPORT YL DISSECTING ZAB
TECHNICAL REPORT YL-2010-0007 DISSECTING ZAB Flavio Junqueira, Benjamin Reed, and Marco Serafini Yahoo! Labs 701 First Ave Sunnyvale, CA 94089 {fpj,breed,serafini@yahoo-inc.com} Bangalore Barcelona Haifa
More informationAutomata-Theoretic Model Checking of Reactive Systems
Automata-Theoretic Model Checking of Reactive Systems Radu Iosif Verimag/CNRS (Grenoble, France) Thanks to Tom Henzinger (IST, Austria), Barbara Jobstmann (CNRS, Grenoble) and Doron Peled (Bar-Ilan University,
More informationTime, Clocks, and the Ordering of Events in a Distributed System
Time, Clocks, and the Ordering of Events in a Distributed System Motivating example: a distributed compilation service FTP server storing source files, object files, executable file stored files have timestamps,
More informationAnalysis and Optimization of Discrete Event Systems using Petri Nets
Volume 113 No. 11 2017, 1 10 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Analysis and Optimization of Discrete Event Systems using Petri Nets
More informationRecap. CS514: Intermediate Course in Operating Systems. What time is it? This week. Reminder: Lamport s approach. But what does time mean?
CS514: Intermediate Course in Operating Systems Professor Ken Birman Vivek Vishnumurthy: TA Recap We ve started a process of isolating questions that arise in big systems Tease out an abstract issue Treat
More informationAbstractions and Decision Procedures for Effective Software Model Checking
Abstractions and Decision Procedures for Effective Software Model Checking Prof. Natasha Sharygina The University of Lugano, Carnegie Mellon University Microsoft Summer School, Moscow, July 2011 Lecture
More informationThe Underlying Semantics of Transition Systems
The Underlying Semantics of Transition Systems J. M. Crawford D. M. Goldschlag Technical Report 17 December 1987 Computational Logic Inc. 1717 W. 6th St. Suite 290 Austin, Texas 78703 (512) 322-9951 1
More informationVerification of Probabilistic Systems with Faulty Communication
Verification of Probabilistic Systems with Faulty Communication P. A. Abdulla 1, N. Bertrand 2, A. Rabinovich 3, and Ph. Schnoebelen 2 1 Uppsala University, Sweden 2 LSV, ENS de Cachan, France 3 Tel Aviv
More information6.852: Distributed Algorithms Fall, Class 24
6.852: Distributed Algorithms Fall, 2009 Class 24 Today s plan Self-stabilization Self-stabilizing algorithms: Breadth-first spanning tree Mutual exclusion Composing self-stabilizing algorithms Making
More informationarxiv: v2 [cs.dc] 18 Feb 2015
Consensus using Asynchronous Failure Detectors Nancy Lynch CSAIL, MIT Srikanth Sastry CSAIL, MIT arxiv:1502.02538v2 [cs.dc] 18 Feb 2015 Abstract The FLP result shows that crash-tolerant consensus is impossible
More informationAxiomatic Semantics: Verification Conditions. Review of Soundness and Completeness of Axiomatic Semantics. Announcements
Axiomatic Semantics: Verification Conditions Meeting 12, CSCI 5535, Spring 2009 Announcements Homework 4 is due tonight Wed forum: papers on automated testing using symbolic execution 2 Questions? Review
More informationChapter 7 HYPOTHESIS-BASED INVESTIGATION OF DIGITAL TIMESTAMPS. 1. Introduction. Svein Willassen
Chapter 7 HYPOTHESIS-BASED INVESTIGATION OF DIGITAL TIMESTAMPS Svein Willassen Abstract Timestamps stored on digital media play an important role in digital investigations. However, the evidentiary value
More informationAnalyzing Isochronic Forks with Potential Causality
Analyzing Isochronic Forks with Potential Causality Rajit Manohar Cornell NYC Tech New York, NY 10011, USA rajit@csl.cornell.edu Yoram Moses Technion-Israel Institute of Technology Haifa 32000, Israel
More informationSection 6 Fault-Tolerant Consensus
Section 6 Fault-Tolerant Consensus CS586 - Panagiota Fatourou 1 Description of the Problem Consensus Each process starts with an individual input from a particular value set V. Processes may fail by crashing.
More informationLecture 4 Event Systems
Lecture 4 Event Systems This lecture is based on work done with Mark Bickford. Marktoberdorf Summer School, 2003 Formal Methods One of the major research challenges faced by computer science is providing
More informationAlgorithmic verification
Algorithmic verification Ahmed Rezine IDA, Linköpings Universitet Hösttermin 2018 Outline Overview Model checking Symbolic execution Outline Overview Model checking Symbolic execution Program verification
More informationMethods for the specification and verification of business processes MPB (6 cfu, 295AA)
Methods for the specification and verification of business processes MPB (6 cfu, 295AA) Roberto Bruni http://www.di.unipi.it/~bruni 17 - Diagnosis for WF nets 1 Object We study suitable diagnosis techniques
More informationPetri nets. s 1 s 2. s 3 s 4. directed arcs.
Petri nets Petri nets Petri nets are a basic model of parallel and distributed systems (named after Carl Adam Petri). The basic idea is to describe state changes in a system with transitions. @ @R s 1
More informationAutomatic Synthesis of Distributed Protocols
Automatic Synthesis of Distributed Protocols Rajeev Alur Stavros Tripakis 1 Introduction Protocols for coordination among concurrent processes are an essential component of modern multiprocessor and distributed
More informationTermination Detection in an Asynchronous Distributed System with Crash-Recovery Failures
Termination Detection in an Asynchronous Distributed System with Crash-Recovery Failures Technical Report Department for Mathematics and Computer Science University of Mannheim TR-2006-008 Felix C. Freiling
More informationC 1. Recap: Finger Table. CSE 486/586 Distributed Systems Consensus. One Reason: Impossibility of Consensus. Let s Consider This
Recap: Finger Table Finding a using fingers Distributed Systems onsensus Steve Ko omputer Sciences and Engineering University at Buffalo N102 86 + 2 4 N86 20 + 2 6 N20 2 Let s onsider This
More informationWarm-Up Problem. Please fill out your Teaching Evaluation Survey! Please comment on the warm-up problems if you haven t filled in your survey yet.
Warm-Up Problem Please fill out your Teaching Evaluation Survey! Please comment on the warm-up problems if you haven t filled in your survey yet Warm up: Given a program that accepts input, is there an
More informationDecentralized Control of Discrete Event Systems with Bounded or Unbounded Delay Communication
Decentralized Control of Discrete Event Systems with Bounded or Unbounded Delay Communication Stavros Tripakis Abstract We introduce problems of decentralized control with communication, where we explicitly
More informationDistributed Mutual Exclusion Based on Causal Ordering
Journal of Computer Science 5 (5): 398-404, 2009 ISSN 1549-3636 2009 Science Publications Distributed Mutual Exclusion Based on Causal Ordering Mohamed Naimi and Ousmane Thiare Department of Computer Science,
More informationReal Time Operating Systems
Real Time Operating ystems Luca Abeni luca.abeni@unitn.it Interacting Tasks Until now, only independent tasks... A job never blocks or suspends A task only blocks on job termination In real world, jobs
More informationTemporal logics and explicit-state model checking. Pierre Wolper Université de Liège
Temporal logics and explicit-state model checking Pierre Wolper Université de Liège 1 Topics to be covered Introducing explicit-state model checking Finite automata on infinite words Temporal Logics and
More informationComplexity Results in Revising UNITY Programs
Complexity Results in Revising UNITY Programs BORZOO BONAKDARPOUR Michigan State University ALI EBNENASIR Michigan Technological University SANDEEP S. KULKARNI Michigan State University We concentrate
More informationAGREEMENT PROBLEMS (1) Agreement problems arise in many practical applications:
AGREEMENT PROBLEMS (1) AGREEMENT PROBLEMS Agreement problems arise in many practical applications: agreement on whether to commit or abort the results of a distributed atomic action (e.g. database transaction)
More informationCIS 842: Specification and Verification of Reactive Systems. Lecture Specifications: Specification Patterns
CIS 842: Specification and Verification of Reactive Systems Lecture Specifications: Specification Patterns Copyright 2001-2002, Matt Dwyer, John Hatcliff, Robby. The syllabus and all lectures for this
More informationVariations on Itai-Rodeh Leader Election for Anonymous Rings and their Analysis in PRISM
Variations on Itai-Rodeh Leader Election for Anonymous Rings and their Analysis in PRISM Wan Fokkink (Vrije Universiteit, Section Theoretical Computer Science CWI, Embedded Systems Group Amsterdam, The
More information6.852: Distributed Algorithms Fall, Class 10
6.852: Distributed Algorithms Fall, 2009 Class 10 Today s plan Simulating synchronous algorithms in asynchronous networks Synchronizers Lower bound for global synchronization Reading: Chapter 16 Next:
More informationDo we have a quorum?
Do we have a quorum? Quorum Systems Given a set U of servers, U = n: A quorum system is a set Q 2 U such that Q 1, Q 2 Q : Q 1 Q 2 Each Q in Q is a quorum How quorum systems work: A read/write shared register
More informationA Brief Introduction to Model Checking
A Brief Introduction to Model Checking Jan. 18, LIX Page 1 Model Checking A technique for verifying finite state concurrent systems; a benefit on this restriction: largely automatic; a problem to fight:
More informationA Polynomial-Time Algorithm for Checking Consistency of Free-Choice Signal Transition Graphs
Fundamenta Informaticae XX (2004) 1 23 1 IOS Press A Polynomial-Time Algorithm for Checking Consistency of Free-Choice Signal Transition Graphs Javier Esparza Institute for Formal Methods in Computer Science
More information