CS-206 Concurrency. Lecture 10. Scheduling & Work Distribution. Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/
1 CS-206 Concurrency, Lecture 10: Scheduling & Work Distribution. Spring 2015, Prof. Babak Falsafi. parsa.epfl.ch/courses/cs206/ Adapted from slides originally developed by Maurice Herlihy and Nir Shavit for The Art of Multiprocessor Programming, and by Babak Falsafi, EPFL. Copyright 2015 EPFL. EPFL CS-206 Spring 2015 Lec.9-1
2 Where are We?
(Semester calendar grid, 16-Feb through 29-May, omitted.)
- Work scheduling
  - Threads vs. Futures
  - Dependencies
- Work distribution
  - Work stealing
  - Work balancing
- Next week: GPUs
3 Evaluations: Sample feedback
- Assigning graded homework to students and not returning it as the end of the semester approaches is painful and demotivating.
- I find (it) strange to forbid (the) use of laptops in a CS course in the 21st century.
- ...the midterm was only about learning the slides and did not require any reflection; it's not what we expect of an engineer.
- More explanations on slides would be a plus... some notes would be useful for the exam.
- The class is alarmingly average. But that's not necessarily a bad thing.
- The course is dynamic and interesting, well structured. The lab sessions are well adapted. I enjoy going to class.
- Good course, Babak is cool, but the slides are poor.
- The slides are hard to read and understand.
- What do Bob & Alice have to do with concurrency?
- The professor does not put much effort into making the course pedagogical. Good slideshow (fr.wikipedia.org/wiki/diaporama).
- Excellent professor. Really good assignments and enough time to do them.
4 Matrix Add: C = A + B
5 Matrix Add

    class Worker extends Thread {            // a thread per matrix entry
      int row, col;                          // which matrix entry to compute
      Worker(int row, int col) {
        this.row = row; this.col = col;
      }
      public void run() {                    // actual computation
        C[row][col] = A[row][col] + B[row][col];
      }
    }
9 Matrix Add

    void add() {
      Worker[][] worker = new Worker[n][n];
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col] = new Worker(row, col);  // create n x n threads
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col].start();                 // start them
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col].join();                  // wait for them to finish
    }

What's wrong with this picture?
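The elided loop headers above complete in the obvious way into a small runnable program. This is a sketch for experimenting; the class name MatrixAdd, the n = 4 test data, and main are mine, not the course's:

```java
// One-thread-per-element matrix add, as on the slides.
public class MatrixAdd {
    static final int n = 4;
    static double[][] A = new double[n][n], B = new double[n][n], C = new double[n][n];

    static class Worker extends Thread {
        int row, col;                        // which entry to compute
        Worker(int row, int col) { this.row = row; this.col = col; }
        public void run() { C[row][col] = A[row][col] + B[row][col]; }
    }

    static void add() throws InterruptedException {
        Worker[][] worker = new Worker[n][n];
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col] = new Worker(row, col);   // create n x n threads
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col].start();                  // start them
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col].join();                   // wait for them to finish
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { A[i][j] = i; B[i][j] = j; }
        add();
        System.out.println(C[2][3]);         // 2 + 3 = 5.0
    }
}
```

Running it makes the next slide's point concrete: 16 threads are created, started, and torn down to perform 16 additions.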
14 Thread Overhead
- Threads require resources
  - Memory for stacks
  - Setup, teardown
  - Scheduler overhead
- Short-lived threads
  - Bad ratio of useful work to overhead
15 Thread Pools
- More sensible to keep a pool of long-lived threads
- Threads are assigned short-lived tasks
  - Run the task
  - Rejoin the pool
  - Wait for the next assignment
16 Thread Pool = Abstraction
- Insulates the programmer from the platform
  - Big machine, big pool
  - Small machine, small pool
- Portable code
  - Works across platforms
  - Worry about the algorithm, not the platform
17 ExecutorService Interface
- In java.util.concurrent
- Task = Runnable object
  - If no result value is expected
  - The pool calls its run() method
- Task = Callable<T> object
  - If a result value of type T is expected
  - The pool calls its T call() method
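The two task flavors on this slide can be shown side by side. A minimal sketch; the pool size and the 6 * 7 payload are arbitrary choices of mine:

```java
import java.util.concurrent.*;

// Runnable: no result, the pool calls run().
// Callable<T>: result of type T, the pool calls T call().
public class ExecutorDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(2);

        Runnable noResult = () -> System.out.println("running");
        exec.submit(noResult);                 // fire and forget

        Callable<Integer> withResult = () -> 6 * 7;
        Future<Integer> f = exec.submit(withResult);
        System.out.println(f.get());           // blocks until the value is ready

        exec.shutdown();
    }
}
```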
18 Future<T>

    Callable<T> task = ...;
    Future<T> future = executor.submit(task);
    T value = future.get();

Submitting a Callable<T> task returns a Future<T> object; the Future's get() method blocks until the value is available.

19 Example Overhead: Threads vs. Futures
- Creating a thread: ~200μs
- Submitting a task to an executor: ~10μs
- Each task does a certain amount of work
- What fraction of overall time goes to task submission?
  - For tasks with 1μs of work?
  - For tasks with 1ms of work?
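Working slide 19's question through with its own numbers (~10μs submission overhead per future) gives the answer directly; the helper below is only for the arithmetic:

```java
// Fraction of total time spent on task submission,
// given the slide's ~10 microsecond submission cost.
public class OverheadFraction {
    static double fraction(double overheadUs, double workUs) {
        return overheadUs / (overheadUs + workUs);
    }
    public static void main(String[] args) {
        System.out.printf("1us task: %.0f%%%n", 100 * fraction(10, 1));     // ~91%
        System.out.printf("1ms task: %.1f%%%n", 100 * fraction(10, 1000));  // ~1.0%
    }
}
```

So with 1μs tasks, submission dominates (about 91% of the time); with 1ms tasks it is about 1%, which is why futures only pay off when tasks carry enough work.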
22 Note
- Executor service submissions are purely advisory in nature
- The executor is free to ignore any such advice
  - And could execute tasks sequentially
23 Matrix Addition (4 elements)
(Figure: the 2x2 block sum C = A + B, computed as four parallel quadrant additions.)
- How do we add larger matrices?
  - Hint: for now, assume dimensions are a power of 2
26 Matrix Addition Task

    public class Matrix {
      int dim;
      double[][] data;                 // reference to the data matrix
      int rowDisplace, colDisplace;    // displacement variables
      public double get(int row, int col) {            // access data through
        return data[row+rowDisplace][col+colDisplace]; // the displacements...
      }
      public void set(int row, int col, double value) {
        data[row+rowDisplace][col+colDisplace] = value;
      }
      Matrix[][] split();              // ...so we can split the matrix
    }                                  // in constant time
30 Matrix Addition Task

    Matrix[][] split() {               // split into 4 matrices...
      Matrix[][] result = new Matrix[2][2];
      int newDim = dim / 2;            // ...with half the dimensions
      result[0][0] =
        new Matrix(data, rowDisplace, colDisplace, newDim);
      result[0][1] =
        new Matrix(data, rowDisplace, colDisplace + newDim, newDim);
      result[1][0] =
        new Matrix(data, rowDisplace + newDim, colDisplace, newDim);
      result[1][1] =
        new Matrix(data, rowDisplace + newDim, colDisplace + newDim, newDim);
      return result;                   // only the displacement variables
    }                                  // change: no data is copied
34 Matrix Addition Task

    class AddTask implements Runnable {
      Matrix a, b, c;                  // computes c = a + b
      public void run() {
        if (a.dim == 1) {
          c.set(0, 0, a.get(0,0) + b.get(0,0));   // base case: add directly
        } else {
          Matrix[][] splitA = a.split();          // constant-time operation
          Matrix[][] splitB = b.split();
          Matrix[][] splitC = c.split();
          Future<?>[][] f = new Future<?>[2][2];
          for (int i = 0; i < 2; i++)             // submit 4 recursive tasks
            for (int j = 0; j < 2; j++)
              f[i][j] = exec.submit(
                new AddTask(splitA[i][j], splitB[i][j], splitC[i][j]));
          for (int i = 0; i < 2; i++)             // let them finish
            for (int j = 0; j < 2; j++)
              f[i][j].get();
        }
      }
    }
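The slides elide the constructors and the driver; a runnable completion is sketched below. The class name RecursiveAdd, the test data, and main are mine. One deliberate choice worth noting: parent tasks block in get(), so a bounded pool could deadlock; a cached pool avoids that here:

```java
import java.util.concurrent.*;

public class RecursiveAdd {
    static ExecutorService exec = Executors.newCachedThreadPool();

    static class Matrix {
        int dim;
        double[][] data;
        int rowDisplace, colDisplace;
        Matrix(double[][] d, int r, int c, int dim) {
            data = d; rowDisplace = r; colDisplace = c; this.dim = dim;
        }
        double get(int r, int c) { return data[r + rowDisplace][c + colDisplace]; }
        void set(int r, int c, double v) { data[r + rowDisplace][c + colDisplace] = v; }
        Matrix[][] split() {                        // constant time: no data copied
            Matrix[][] result = new Matrix[2][2];
            int newDim = dim / 2;
            result[0][0] = new Matrix(data, rowDisplace, colDisplace, newDim);
            result[0][1] = new Matrix(data, rowDisplace, colDisplace + newDim, newDim);
            result[1][0] = new Matrix(data, rowDisplace + newDim, colDisplace, newDim);
            result[1][1] = new Matrix(data, rowDisplace + newDim, colDisplace + newDim, newDim);
            return result;
        }
    }

    static class AddTask implements Runnable {
        Matrix a, b, c;
        AddTask(Matrix a, Matrix b, Matrix c) { this.a = a; this.b = b; this.c = c; }
        public void run() {
            try {
                if (a.dim == 1) {
                    c.set(0, 0, a.get(0, 0) + b.get(0, 0));   // base case
                } else {
                    Matrix[][] sa = a.split(), sb = b.split(), sc = c.split();
                    Future<?>[][] f = new Future<?>[2][2];
                    for (int i = 0; i < 2; i++)               // submit 4 recursive tasks
                        for (int j = 0; j < 2; j++)
                            f[i][j] = exec.submit(new AddTask(sa[i][j], sb[i][j], sc[i][j]));
                    for (int i = 0; i < 2; i++)               // let them finish
                        for (int j = 0; j < 2; j++)
                            f[i][j].get();
                }
            } catch (Exception e) { throw new RuntimeException(e); }
        }
    }

    public static void main(String[] args) {
        int n = 4;
        double[][] a = new double[n][n], b = new double[n][n], c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { a[i][j] = i; b[i][j] = j; }
        new AddTask(new Matrix(a, 0, 0, n), new Matrix(b, 0, 0, n),
                    new Matrix(c, 0, 0, n)).run();
        System.out.println(c[3][1]);              // 3 + 1 = 4.0
        exec.shutdown();
    }
}
```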
39 Matrix Addition DAG
- With dim = 2: the root Add(2) calls (submits) four Add(1) tasks, then gets their results.
(Figure: Add(2) with call edges down to four Add(1) nodes and get edges back up.)
42 Dependencies
- The matrix example is not typical
- Individual tasks are independent
  - Don't need the results of one another
  - Only need to split recursively
- Often, task results are not independent
43 Fibonacci

    F(n) = 1                 if n = 0 or 1
           F(n-1) + F(n-2)   otherwise

- Note:
  - Potential parallelism
  - Dependencies
44 Disclaimer
- This Fibonacci implementation is
  - Egregiously inefficient
  - So don't try this at home or at work!
  - But it illustrates our point: how to deal with dependencies
- Exercise: make this implementation efficient!
45 Multithreaded Fibonacci

    class FibTask implements Callable<Integer> {
      static ExecutorService exec =
        Executors.newCachedThreadPool();
      int arg;
      public FibTask(int n) {
        arg = n;
      }
      public Integer call() {
        if (arg > 2) {                 // parallel calls
          Future<Integer> left = exec.submit(new FibTask(arg-1));
          Future<Integer> right = exec.submit(new FibTask(arg-2));
          return left.get() + right.get();   // pick up & combine results
        } else {
          return 1;
        }
      }
    }
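One answer to the disclaimer slide's exercise ("make this implementation efficient"): once each F(i) is computed only once, the exponential task tree collapses to a linear pass, and the futures are no longer worth their overhead. A sketch, using the same convention as FibTask (F(1) = F(2) = 1):

```java
// Iterative Fibonacci: O(n) work, no task-submission overhead.
public class FibFast {
    static long fib(int n) {
        long a = 1, b = 1;             // F(1), F(2)
        for (int i = 3; i <= n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return b;
    }
    public static void main(String[] args) {
        System.out.println(fib(10));   // 55
    }
}
```

The contrast is the lecture's point: FibTask's value is pedagogical (it shows dependencies between tasks), not performance.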
48 The Blumofe-Leiserson DAG Model
- A multithreaded program is
  - A directed acyclic graph (DAG)
  - That unfolds dynamically
- Each node is a single unit of work
49 Fibonacci DAG
- Unfolding fib(4): fib(4) calls fib(3) and fib(2); fib(3) calls fib(2) and fib(1); each fib(2) calls two fib(1)s; get edges carry the results back up.
(Figure: the fib(4) DAG, with call edges downward and get edges upward.)
54 How Parallel is That?
- Define work:
  - Total time on one processor
- Define critical-path length:
  - Longest dependency path
  - Can't beat that!
55 Notation Watch
- T_P = time on P processors
- T_1 = work (time on 1 processor)
- T_∞ = critical-path length (time on infinitely many processors)
- Parallelism = T_1 / T_∞
56 Unfolded DAG
- With dim = 2 (figure: the unfolded add DAG)
- Work? T_1 = 6
- Critical path? T_∞ = 3
59 Unfolded DAG
- With dim = 4 (figure: the unfolded add DAG)
- Work? T_1 = 26
- Critical path? T_∞ = 5
62 Question
- Assume the dimension is a power of 2
- For dim = 2^n, what are T_1 and T_∞?
- For T_1: each internal task contributes a call node and a get node; there are 2^{2i} tasks at depth i, plus 2^{2n} leaf additions:
  T_1 = 2 * Σ_{i=0}^{n-1} 2^{2i} + 2^{2n}
- For T_∞: the longest path runs down the call chain and back up the get chain, each of length about log:
  T_∞ = (n + 1) + n = 2n + 1
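The two closed forms can be checked against the DAGs counted on the previous slides (dim = 2 gives T_1 = 6, T_∞ = 3; dim = 4 gives T_1 = 26, T_∞ = 5). A small verification harness, names mine:

```java
// Evaluate T1(2^n) = 2 * sum_{i=0}^{n-1} 2^{2i} + 2^{2n}
// and Tinf(2^n) = 2n + 1 for the two cases on the slides.
public class DagCost {
    static long t1(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += 1L << (2 * i);   // 2^{2i}
        return 2 * sum + (1L << (2 * n));                   // + 2^{2n} leaves
    }
    static long tInf(int n) { return 2L * n + 1; }

    public static void main(String[] args) {
        System.out.println(t1(1) + " " + tInf(1));   // dim = 2: 6 3
        System.out.println(t1(2) + " " + tInf(2));   // dim = 4: 26 5
    }
}
```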
64 Unfolded DAG
(Figure: an 18-node DAG, used for the next few slides.)

65 Parallelism?
- Serial fraction = 3/18 = 1/6
- Amdahl's Law says the speedup cannot exceed 6

66 Work?
- T_1: time needed on one processor
- Just count the nodes: T_1 = 18

67 Critical Path?
- T_∞: time needed on as many processors as you like
- Longest path: T_∞ = 9
69 Simple Laws
- Work Law: T_P ≥ T_1 / P
  - In one step, can't do more than P units of work
- Critical Path Law: T_P ≥ T_∞
  - Can't beat infinite resources
70 Performance Measures
- Speedup on P processors
  - The ratio T_1 / T_P
  - How much faster with P processors
- Linear speedup
  - T_1 / T_P ∝ P
- Max speedup (average parallelism)
  - T_1 / T_∞
71 Sequential Composition (A then B)
- Work: T_1(A) + T_1(B)
- Critical path: T_∞(A) + T_∞(B)

72 Parallel Composition (A alongside B)
- Work: T_1(A) + T_1(B)
- Critical path: max{T_∞(A), T_∞(B)}
73 Back to Matrix Addition
- How much work?
- What is the critical path?
74 Work Distribution
75 Work Dealing
(Figure: a busy thread pushes its extra tasks onto an idle one.)

76 The Problem with Work Dealing
(Figure: every thread tries to deal work to everyone else at once.)

77 Work Dealing is Fundamentally Flawed!
- Everyone passes extra work to others
- Threads spend time dealing rather than actually working
- There is contention in work distribution

78 Work Stealing
(Figure: an idle thread with no work takes a task from a busy one.)
79 Lock-Free Work Stealing
- Each thread has a pool of ready work
- Remove work without synchronizing
- If you run out of work, steal someone else's
- Choose the victim at random
80 Local Work Pools
- Each work pool is a double-ended queue (DEQueue)

81 Work DEQueue
- The owner pushes and pops work at the bottom: pushBottom, popBottom
82 Obtain Work
- Obtain work from the bottom of your own DEQueue: popBottom
- Run the task until it blocks or terminates

83 New Work
- When a node is spawned or unblocked, push it on the bottom: pushBottom

84 What to Do When the Well Runs Empty

85 Steal Work from Others
- Pick a random thread's DEQueue

86 Steal This Task!
- Take from the top: popTop
87 Task DEQueue
- Methods:
  - pushBottom, popBottom: owner only, so they never happen concurrently with each other
  - popTop: called by thieves
- pushBottom and popBottom are the most common: make them fast (minimize use of CAS)
89 Ideal Double-Ended Queue
- Wait-free
  - No locking or starvation
- Linearizable
  - All methods appear to execute instantaneously
  - Method executions can be put in a total order
  - We can reason about correctness
- Constant time
- But we cannot always have our cake and eat it too!
90 Compromise
- A popTop call may fail (return null) if
  - a concurrent popTop succeeds, or
  - a concurrent popBottom takes the last task
- Blame the victim!
91 What Could Go Wrong?
- A thief reads top and the task there, then stalls before its CAS
- Meanwhile, the owner pops everything and pushes new tasks, so top returns to the same index
- The stalled thief's CAS on top now succeeds, even though the task it read is long gone: the ABA problem

99 Fix
- Pair top with a stamp (a counter incremented on every update), so a stale CAS fails
100 Bounded DEQueue

    public class BDEQueue {
      AtomicStampedReference<Integer> top;   // index & stamp,
                                             // updated atomically
      volatile int bottom;    // index of the bottom task; no need to
                              // synchronize, but must be volatile (why?)
      Runnable[] tasks;       // array holding the tasks
    }
104 pushBottom()

    public class BDEQueue {
      void pushBottom(Runnable r) {
        tasks[bottom] = r;    // bottom is the index at which to store
        bottom++;             // the new task; then adjust it
      }
    }
107 Steal Work

    public Runnable popTop() {
      int[] stamp = new int[1];
      int oldTop = top.get(stamp), newTop = oldTop + 1;   // read top (value
      int oldStamp = stamp[0], newStamp = oldStamp + 1;   // & stamp), compute
                                                          // the new ones
      if (bottom <= oldTop)            // quit if the queue is empty
        return null;
      Runnable r = tasks[oldTop];
      if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
        return r;                      // try to steal the task
      return null;                     // give up if a conflict occurs
    }
113 Take Work

    Runnable popBottom() {
      if (bottom == 0)                 // make sure the queue is non-empty
        return null;
      bottom--;                        // prepare to grab the bottom task
      Runnable r = tasks[bottom];
      int[] stamp = new int[1];
      int oldTop = top.get(stamp), newTop = 0;           // read top, prepare
      int oldStamp = stamp[0], newStamp = oldStamp + 1;  // the new values
      if (bottom > oldTop)             // top & bottom one or more apart:
        return r;                      // no conflict possible
      if (bottom == oldTop) {          // at most one task left: try to take
        bottom = 0;                    // it; reset bottom, because the queue
                                       // will be empty even if we fail (why?)
        if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
          return r;                    // if I win the CAS, the task is mine;
      }                                // if I lose, the thief must have won
      top.set(newTop, newStamp);       // failed to get the last task (bottom
      bottom = 0;                      // could be less than top): must still
      return null;                     // reset top and bottom, deque is empty
    }
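The three methods assemble into one class; below is a sketch with a single-threaded smoke test (the capacity argument, constructor, and main are mine). One subtlety worth a comment: compareAndSet on an AtomicStampedReference compares references, which behaves like value equality here only because small autoboxed Integers are cached by the JVM.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class BDEQueue {
    // Small indices stay in the Integer cache (-128..127), so the reference
    // comparison inside compareAndSet acts like value equality here.
    AtomicStampedReference<Integer> top = new AtomicStampedReference<>(0, 0);
    volatile int bottom = 0;
    Runnable[] tasks;

    BDEQueue(int capacity) { tasks = new Runnable[capacity]; }

    void pushBottom(Runnable r) {      // owner only; no bounds check,
        tasks[bottom] = r;             // matching the slides
        bottom++;
    }

    Runnable popTop() {                // thieves; may fail and return null
        int[] stamp = new int[1];
        int oldTop = top.get(stamp), newTop = oldTop + 1;
        int oldStamp = stamp[0], newStamp = oldStamp + 1;
        if (bottom <= oldTop) return null;          // empty
        Runnable r = tasks[oldTop];
        if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
            return r;                               // stole it
        return null;                                // lost the race
    }

    Runnable popBottom() {             // owner only
        if (bottom == 0) return null;
        bottom--;
        Runnable r = tasks[bottom];
        int[] stamp = new int[1];
        int oldTop = top.get(stamp), newTop = 0;
        int oldStamp = stamp[0], newStamp = oldStamp + 1;
        if (bottom > oldTop) return r;              // no conflict possible
        if (bottom == oldTop) {                     // at most one task left
            bottom = 0;
            if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
                return r;                           // won the race for it
        }
        top.set(newTop, newStamp);                  // empty: reset both ends
        bottom = 0;
        return null;
    }

    public static void main(String[] args) {
        BDEQueue q = new BDEQueue(8);
        q.pushBottom(() -> System.out.println("task A"));
        q.pushBottom(() -> System.out.println("task B"));
        q.popTop().run();       // a thief takes the oldest task: A
        q.popBottom().run();    // the owner takes the newest remaining: B
        System.out.println(q.popBottom() == null);  // true: empty now
    }
}
```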
123 Old English Proverb
- "May as well be hanged for stealing a sheep as a goat"
- From which we conclude:
  - Stealing was punished severely
  - Sheep were worth more than goats
124 Variations
- Stealing is expensive
  - Pay for a CAS
  - Only one task is taken
- What if we
  - Moved more than one task
  - Randomly balanced the loads?
125 Work Balancing
(Figure: two queues, of sizes 2 and 5, exchange tasks to even out the load.)
126 Work-Balancing Thread

    public void run() {
      int me = ThreadID.get();
      while (true) {                   // keep running tasks
        Runnable task = queue[me].deq();
        if (task != null) task.run();
        int size = queue[me].size();
        if (random.nextInt(size + 1) == size) {   // with probability
                                                  // 1/(size+1)
          int victim = random.nextInt(queue.length);  // choose random victim
          int min = Math.min(victim, me);   // lock the two queues
          int max = Math.max(victim, me);   // in canonical order
          synchronized (queue[min]) {
            synchronized (queue[max]) {
              balance(queue[min], queue[max]);    // rebalance the queues
            }
          }
        }
      }
    }
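The slides invoke balance(queue[min], queue[max]) without showing it. A plausible sketch follows; this exact body is my assumption, not the course's code. A plain ArrayDeque stands in for the work queue, and the caller is assumed to hold both locks, as on the slide:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Move tasks from the larger queue to the smaller one
// until their sizes differ by at most one.
public class Balancer {
    static void balance(Deque<Runnable> q0, Deque<Runnable> q1) {
        Deque<Runnable> small = q0.size() <= q1.size() ? q0 : q1;
        Deque<Runnable> big   = (small == q0) ? q1 : q0;
        while (big.size() - small.size() > 1)
            small.addLast(big.removeFirst());   // shift surplus tasks over
    }

    public static void main(String[] args) {
        Deque<Runnable> a = new ArrayDeque<>(), b = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) a.add(() -> {});   // sizes 5 and 2,
        for (int i = 0; i < 2; i++) b.add(() -> {});   // as in the figure
        balance(a, b);
        System.out.println(a.size() + " " + b.size()); // 4 3
    }
}
```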
132 Summary
u Futures are lighter weight than threads
u Worry about the critical path and speedup
u Work distribution
w Work dealing is flawed
w Work stealing works, but is expensive
w Work balancing
EPFL CS-206 Spring 2015 Lec.9-132
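The rebalance trigger on the preceding slides fires with probability 1/(size+1): nextInt(size+1) has size+1 equally likely outcomes, and exactly one of them equals size, so a thread with a longer queue rebalances less often. A quick seeded check of that rate (a standalone illustration, not course code):

```java
import java.util.Random;

public class BalanceProbability {
    public static void main(String[] args) {
        Random random = new Random(42);  // fixed seed for reproducibility
        int size = 3;                    // pretend the local queue holds 3 tasks
        int trials = 100_000, hits = 0;
        for (int t = 0; t < trials; t++) {
            // same trigger as the work-balancing thread:
            // fires iff nextInt(size+1) lands on its largest value
            if (random.nextInt(size + 1) == size) hits++;
        }
        // expected hit rate: 1/(size+1) = 0.25
        System.out.println((double) hits / trials);
    }
}
```

This is why the scheme's rebalancing overhead is amortized: the longer a queue grows, the rarer the (lock-taking) balance attempt becomes.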