Distributed Systems Part II Solution to Exercise Sheet 10

Similar documents
Insert Sorted List Insert as the Last element (the First element?) Delete Chaining. 2 Slide courtesy of Dr. Sang-Eon Park

CS-206 Concurrency. Lecture 8 Concurrent. Data structures. a b c d. Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/ remove(c) remove(b)

INF 4140: Models of Concurrency Series 3

Math 31 Lesson Plan. Day 2: Sets; Binary Operations. Elizabeth Gillaspy. September 23, 2011

Quadratic Equations Part I

Clojure Concurrency Constructs, Part Two. CSCI 5828: Foundations of Software Engineering Lecture 13 10/07/2014

Distributed Systems Principles and Paradigms. Chapter 06: Synchronization

Distributed Systems Principles and Paradigms

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

Embedded Systems - FS 2018

An analogy from Calculus: limits

Lists, Stacks, and Queues (plus Priority Queues)

Introduction to Karnaugh Maps

1 Lamport s Bakery Algorithm

Central Algorithmic Techniques. Iterative Algorithms

Topic Contents. Factoring Methods. Unit 3: Factoring Methods. Finding the square root of a number

Decision Trees. Introduction. Some facts about decision trees: They represent data-classification models.

Basics of Proofs. 1 The Basics. 2 Proof Strategies. 2.1 Understand What s Going On

Lock using Bakery Algorithm

VLSI Physical Design Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

UNIVERSITY OF CALIFORNIA

Expected Value II. 1 The Expected Number of Events that Happen

Fundamental Algorithms

CS 425 / ECE 428 Distributed Systems Fall Indranil Gupta (Indy) Oct. 5, 2017 Lecture 12: Time and Ordering All slides IG

Admin. ! Will post sample questions soon. 1. Choose a bit-length k. 2. Choose two primes p and q which can be represented with at most k bits

1. In Activity 1-1, part 3, how do you think graph a will differ from graph b? 3. Draw your graph for Prediction 2-1 below:

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis

where Q is a finite set of states

Discrete Event Systems Exam

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

Random Number Generation. Stephen Booth David Henty

CIS 4930/6930: Principles of Cyber-Physical Systems

More Asymptotic Analysis Spring 2018 Discussion 8: March 6, 2018

15-451/651: Design & Analysis of Algorithms September 13, 2018 Lecture #6: Streaming Algorithms last changed: August 30, 2018

Mathematical Induction

Lab Exercise 03: Gauss Law

HW8. Due: November 1, 2018

ASSIGNMENT 1 SOLUTIONS

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 1

1 Sequences and Summation

Saturday, April 23, Dependence Analysis

1 Maintaining a Dictionary

Different encodings generate different circuits

Alpha-Beta Pruning: Algorithm and Analysis

Solution suggestions for examination of Logic, Algorithms and Data Structures,

Module 5: CPU Scheduling

CSC236 Week 2. Larry Zhang

Correctness of Concurrent Programs

Chapter 6: CPU Scheduling

Turing Machines and Time Complexity

Dynamic Programming. Cormen et. al. IV 15

1 Introduction. 1.1 The Problem Domain. Self-Stablization UC Davis Earl Barr. Lecture 1 Introduction Winter 2007

Probability Year 9. Terminology

Department of Electrical and Computer Engineering The University of Texas at Austin

CS154 Final Examination

Algorithms. Algorithms 2.4 PRIORITY QUEUES. Pro tip: Sit somewhere where you can work in a group of 2 or 3

TDDB68 Concurrent programming and operating systems. Lecture: CPU Scheduling II

Shannon-Fano-Elias coding

Software Testing Lecture 5. Justin Pearson

CS162 Operating Systems and Systems Programming Lecture 7 Semaphores, Conditional Variables, Deadlocks"

Algorithms and Data Structures

Probability Year 10. Terminology

One-to-one functions and onto functions

MI 4 Mathematical Induction Name. Mathematical Induction

COE 202: Digital Logic Design Sequential Circuits Part 4. Dr. Ahmad Almulhem ahmadsm AT kfupm Phone: Office:

Clock-driven scheduling

Searching, mainly via Hash tables

MITOCW watch?v=vjzv6wjttnc

CSCB63 Winter Week10 - Lecture 2 - Hashing. Anna Bretscher. March 21, / 30

Overview. Discrete Event Systems Verification of Finite Automata. What can finite automata be used for? What can finite automata be used for?

Queueing Theory. VK Room: M Last updated: October 17, 2013.

Lecture 2: Divide and conquer and Dynamic programming

Algorithms Exam TIN093 /DIT602

Session-Based Queueing Systems

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

I/O Devices. Device. Lecture Notes Week 8

Please bring the task to your first physics lesson and hand it to the teacher.

CSE 4201, Ch. 6. Storage Systems. Hennessy and Patterson

2 GRAPH AND NETWORK OPTIMIZATION. E. Amaldi Introduction to Operations Research Politecnico Milano 1

Learning Objectives:

Section 2.7 Solving Linear Inequalities

Computational Complexity. This lecture. Notes. Lecture 02 - Basic Complexity Analysis. Tom Kelsey & Susmit Sarkar. Notes

Analysis of Algorithms I: Perfect Hashing

Lecture 6: The Pigeonhole Principle and Probability Spaces

E23: Hotel Management System Wen Yunlu Hu Xing Chen Ke Tang Haoyuan Module: EEE 101

Boyle s Law and Charles Law Activity

CS173 Lecture B, November 3, 2015

Manual of Logical Style (fresh version 2018)

SOBER Cryptanalysis. Daniel Bleichenbacher and Sarvar Patel Bell Laboratories Lucent Technologies

Chapter 3 Deterministic planning

Bayes Nets III: Inference

Algorithms and Data Structures 2016 Week 5 solutions (Tues 9th - Fri 12th February)

Alpha-Beta Pruning: Algorithm and Analysis

Designing geared 3D twisty puzzles (on the example of Gear Pyraminx) Timur Evbatyrov, Jan 2012

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Improper Nesting Example

Discrete Mathematics for CS Spring 2008 David Wagner Note 4

Mathematica Project 3

Midterm 1. Name: TA: U.C. Berkeley CS70 : Algorithms Midterm 1 Lecturers: Anant Sahai & Christos Papadimitriou October 15, 2008

Data Structures and Algorithms Winter Semester

Transcription:

Distributed Computing HS 2012 Prof. R. Wattenhofer / C. Decker Distributed Systems Part II Solution to Exercise Sheet 10 1 Spin Locks A read-write lock is a lock that allows either multiple processes to read some resource, or one process to write some resource. a) Write a simple read-write lock using only spinning, one shared integer and the CAS operation. Do not use local variables (it is ok to have variable within a method, but not outside). b) What is the problem with your lock? Hint: what happens if a lot of processes access the lock repeatedly? We now build a queue lock using only spinning, one shared integer, one local integer per process and the CAS operation. c) To prepare for this task, answer the following questions: i) Head and tail of the queue have to be stored in the shared integer. What is the head and the tail, and how can they be stored in one integer? Hint: could the head be a process id? Or is there a much easier solution? ii) How could a process add itself to the queue? Hint: you need the local integer of the process for this operation. iii) When has a process acquired the lock? iv) How does a process release the lock? d) Write down the lock using pseudo-code. Do not forget to initialize all variables.

Solution a) We use the shared integer state to indicate the state of the lock. The lock is free if state is 0. The lock is in write mode if state is -1. And it is in read-mode if state is n, with n > 0. // the shared i n t e g e r i n t s t a t e = 0 ; // a c q u i r e the l o c k f o r a read o p e r a t i o n r e a d l o c k ( ) { i n t value = s t a t e. read ( ) ; i f ( value >= 0 ){ i f ( s t a t e.cas( value, value+1 ) == value ){ // l o c k acquired // r e l e a s e the l o c k read unlock ( ) { while ( s t a t e.cas( s t a t e, s t a t e 1 )!= s t a t e ) ; // a c q u i r e the l o c k f o r a w r i t e o p e r a t i o n w r i t e l o c k ( ) { i n t value = s t a t e ; i f ( s t a t e == 0 ){ i f ( s t a t e.cas( 0, 1 ) == 0 ){ // l o c k acquired // r e l e a s e the l o c k w r i t e u n l o c k ( ) { // no need to t e s t, no other p r o c e s s can c a l l t h i s at // the same time. s t a t e.cas( 1, 0 ) ; b) Starvation is a problem. Example: if many processes constantly acquire and release the read-lock, then the state variable always remains bigger than 0. If one process wants to acquire the write-lock, it will never get the chance. c) The basic idea behind this lock is a ticketing service as can be found in swiss post offices. i) The tail is the ticket which can be drawn by the next process. The head denotes the ticket which can acquire the lock. If we assume an integer consists of 32 bits, then we can use the first 16 bits for the head, and the last 16 bits for the tail. 2

ii) The process reads the value of the tail, and then increments the tail. This should of course happen in a secure way, i.e. no two processes have the same ticket. iii) When its ticket equals the head. iv) The process increments the head by one. d) // the shared i n t e g e r c o n t a i n i n g head t a i l shared i n t queue = 0 ; // the t i c k e t o f t h i s p r o c e s s i n t l o c a l = 0 ; // a c q u i r e the l o c k l o c k ( ) { // 1. add t h i s p r o c e s s to the queue l o c a l = add ( ) ; // 2. wait u n t i l the l o c k i s acquired while ( head ( )!= l o c a l ( ) ) ; // add t h i s p r o c e s s to the queue i n t add ( ) { i n t value = queue. read ( ) ; i f ( queue.cas( value, value+1 ) == value ){ r eturn value & 0xFF ; // r e t u r n s the c u r r e n t head o f the queue i n t head ( ) { i n t value = s t a t e. read ( ) ; r eturn ( value >>> 16) & 0xFF ; // r e l e a s e s the l o c k unlock ( ) { i n t value = queue. read ( ) ; i n t head = ( value >>> 16) & 0xFF i n t t a i l = value & 0xFF i n t next = ( head+1) << 16 t a i l ; i f ( queue.cas( value, next ) == value ){ 2 ALock2 Have a look at the source code below. It is a modified version of the ALock (slides 3/46 ff). p u b l i c c l a s s ALock2 implements Lock { i n t [ ] f l a g s = { true, true, f a l s e,..., f a l s e ; AtomicInteger next = new AtomicInteger ( 0 ) ; 3

ThreadLocal<Integer > myslot ; p u b l i c void l o c k ( ) { myslot = next. getandincrement ( ) ; while (! f l a g s [ myslot % n ] ) { f l a g s [ myslot % n ] = f a l s e ; p u b l i c void unlock ( ) { f l a g s [ ( myslot+2) % n ] = true ; a) What was the intention of the author of ALock2? b) Will ALock2 work properly? Why (not)? c) Give an idea how to repair ALock2. Hint: don t bother about performance. Solution a) The author wants that two processes can acquire the lock simultaneously. b) The lock is seriously flawed. An example shows how the lock will fail: Assume there are n processes, all processes try to acquire the lock. The first two processes (p 1, p 2 ) get the lock, the others have to wait. Process p 1 keeps the lock a very long time, while p 2 releases the lock almost immediately. Afterwards every second process (p 4, p 6,...) acquires and releases the lock. One half of all process are waiting on the lock (p 3, p 5,...), the others continues to work (p 4, p 6,...). If the working process now start to acquire the lock again, then they wait in slots that are already in use. c) A solution would be to increase the size of the array to at least 2 n and further block the lock() method if a process holds the other lock for a (too) long time. This way, wraparounds are handled correctly. Unfortunately FIFO (first in, first out) is still not guaranteed. In a second step one could make the unlock method more intelligent: instead of jumping two slots, the method searches for the oldest slot waiting for a lock. To simplify this search, the boolean array is replaced by an enum (or integer) array holding four states: unused, lockable, working, and finished. We can use a CAS operation to protect the unlock method against race conditions (two process may invoke the method concurrently). p u b l i c c l a s s ALock2 implements Lock{ p u b l i c enum State {UNUSED, LOCKABLE, WORKING, FINISHED ; // >= 2n elements : 2 l o c k a b l e, >= (n 2) unused, n f i n i s h e d State [ ] f l a g s = {LOCKABLE, LOCKABLE, UNUSED,..., UNUSED, FINISHED,... ; AtomicInteger next = new AtomicInteger ( 0 ) ; ThreadLocal<I n t e g e r > myslot ; p u b l i c void l o c k ( ) { // s p i n u n t i l s l o t becomes l o c k a b l e myslot = next. getandincrement ( ) ; while ( f l a g s [ myslot%n ]!= LOCKABLE){ 4

// mark as working f l a g s [ myslot%n ] = WORKING; // wait f o r other l o c k i f i t s p r o c e s s i s too slow ( l a r g e r array h e l p s here ) // & mark the s l o t as unused to support wrap arounds while ( f l a g s [ ( myslot f l a g s. s i z e ( ) + n)%n ]!= FINISHED){ f l a g s [ ( myslot f l a g s. s i z e ( ) + n)%n ] = UNUSED; p u b l i c void unlock ( ) { f l a g s [ myslot ] = FINISHED ; // mark my s l o t as f i n i s h e d... // s e t next unused s l o t to l o c k a b l e i n t index = myslot + 1 ; while ( t r u e ){ i f ( f l a g s [ index%n ]!= UNUSED) index++; e l s e i f ( f l a g s [ index%n ]. CAS(UNUSED, LOCKABLE) == UNUSED) break ; // next unused s l o t became l o c k a b l e 3 MCS Queue Lock See slides 3/65 ff. a) A developer suggests to add an abort flag to each node: if a process no longer wants to wait it sets this abort flag to true. If a process unlocks the lock, it may see the abort flag of the next node, jump over the aborted node, and check the successor s successor node. Modify the basic algorithm to support aborts. Optional: sketch a proof for your answer. Hint: Be aware of race-conditions! b) Assuming many processes may abort concurrently, does your answer from a) still work? Explain why. If it does not work: modify your algorithm to allow concurrent aborts. Optional: sketch a proof for your answer. c) Instead of a locked and an aborted flag one could use an integer, and modify the integer with the CAS operation. What do you think about this idea? How is the algorithm affected? How is performance affected? d) The CLH lock (slide 3/56) is basically the same as an MCS lock. Conceptually the only difference is, that a process spins on the locked field of the predecessor node, not on its own node. What could be an advantage of CLH over MCS and what could be a disadvantage? Solution a) There is more than one solution, but we can solve this problem without using RMW registers or other locks. It is important to set and read the flags in the right order: The unlock method first sets locked, then reads aborted. The abort method on the other hand first sets aborted, then reads locked. This way if unlock and abort run in parallel, one of them must already have written its flag before the other can read it. In the worst case unlock is called twice for some process, but that is not a problem. Unlocking an already unlocked lock results in no action. 5

p u b l i c void unlock ( ) { i f (... missing s u c c e s s o r... )... wait f o r missing s u c c e s s o r qnode. next. locked = f a l s e ; i f ( qnode. next. aborted ){ i f (... qnode. next misses s u c c e s s o r... ){ i f (... r e a l l y no s u c c e s s o r... ) e l s e {... wait f o r missing s u c c e s s o r... qnode. next. next. locked = f a l s e ; p u b l i c void abort ( ) { qnode. aborted = true ; i f (! qnode. locked ){ unlock ( ) ; b) The solution of a) does not yet work for concurrent aborts. Making the unlock method recursive will help. p u b l i c void unlock ( ) { unlock ( qnode ) ; p r i v a t e void unlock ( QNode qnode ){ // as b e f o r e... i f (... missing s u c c e s s o r... )... wait f o r missing s u c c e s s o r qnode. next. locked = f a l s e ; i f ( qnode. next. aborted ){ // wait f o r s u c c e s s o r o f qnode. next i f (... ){... e l s e {... unlock ( qnode. next ) ; c) There are four combinations of values the locked and aborted flag can have. We can easily encode these combinations in an integer. We would not need too worry about the order in which we read and write to the flags, as we could do this atomically. So the algorithm would get easier. We could also ensure that unlock is called only once. Depending on the benchmark this could increase the performance. On the other hand, a CAS operation is quite expensive and could decrease performance. d) - There could be problems with caches: spinning on a value that belongs to another process can introduce additional load on the bus, and thus slow down the entire application. 6

+ The implementation is much easier: when releasing the lock one has only to set its own locked flag to false. + Also aborting is easier: a blocked process could read the state of its predecessor. If the predecessor is aborted, then the successor can just remove the node from the queue, and continue reading values from its predecessor s predecessor. 7