CS-206 Concurrency. Lecture 10. Scheduling & Work Distribution. Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/
1 CS-206 Concurrency, Lecture 10: Scheduling & Work Distribution. Spring 2015, Prof. Babak Falsafi. parsa.epfl.ch/courses/cs206/ Adapted from slides originally developed by Maurice Herlihy and Nir Shavit for The Art of Multiprocessor Programming, and by Babak Falsafi, EPFL. Copyright 2015 EPFL. EPFL CS-206 Spring 2015 Lec.9-1
2 Where are We?
(Semester calendar grid, 16-Feb through 29-May, omitted.)
- Work scheduling
  - Threads vs. Futures
  - Dependencies
- Work distribution
  - Work stealing
  - Work balancing
- Next week: GPUs
3 Evaluations: Sample feedback
- Assigning graded homework to students and not returning it as the end of the semester approaches is painful and demotivating.
- I find (it) strange to forbid (the) use of laptops in a CS course in the 21st century.
- ...the midterm was only about learning the slides and did not require any reflection; it's not what we expect of an engineer.
- More explanations on slides would be a plus... some notes would be useful for the exam.
- The class is alarmingly average. But that's not necessarily a bad thing.
- The course is dynamic and interesting, well structured. The lab sessions are well adapted. I enjoy going to class.
- Good course, Babak is cool, but the slides are poor.
- The slides are hard to read and understand.
- What do Bob & Alice have to do with concurrency?
- The professor does not put much effort into making the course pedagogical. Good slideshow (fr.wikipedia.org/wiki/diaporama).
- Excellent professor. Really good assignments and enough time to do them.
4 Matrix Add: C = A + B
5 Matrix Add

    class Worker extends Thread {            // a thread per matrix entry
      int row, col;                          // which matrix entry to compute
      Worker(int row, int col) {
        this.row = row; this.col = col;
      }
      public void run() {                    // actual computation
        C[row][col] = A[row][col] + B[row][col];
      }
    }
9 Matrix Add

    void add() {
      Worker[][] worker = new Worker[n][n];
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col] = new Worker(row, col);  // create n x n threads
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col].start();                 // start them
      for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
          worker[row][col].join();                  // wait for them to finish
    }

What's wrong with this picture?
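The elided loop headers above complete in the obvious way into a small runnable program. This is a sketch for experimenting; the class name MatrixAdd, the n = 4 test data, and main are mine, not the course's:

```java
// One-thread-per-element matrix add, as on the slides.
public class MatrixAdd {
    static final int n = 4;
    static double[][] A = new double[n][n], B = new double[n][n], C = new double[n][n];

    static class Worker extends Thread {
        int row, col;                        // which entry to compute
        Worker(int row, int col) { this.row = row; this.col = col; }
        public void run() { C[row][col] = A[row][col] + B[row][col]; }
    }

    static void add() throws InterruptedException {
        Worker[][] worker = new Worker[n][n];
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col] = new Worker(row, col);   // create n x n threads
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col].start();                  // start them
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                worker[row][col].join();                   // wait for them to finish
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { A[i][j] = i; B[i][j] = j; }
        add();
        System.out.println(C[2][3]);         // 2 + 3 = 5.0
    }
}
```

Running it makes the next slide's point concrete: 16 threads are created, started, and torn down to perform 16 additions.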
14 Thread Overhead
- Threads require resources
  - Memory for stacks
  - Setup, teardown
  - Scheduler overhead
- Short-lived threads
  - Bad ratio of useful work to overhead
15 Thread Pools
- More sensible to keep a pool of long-lived threads
- Threads are assigned short-lived tasks
  - Run the task
  - Rejoin the pool
  - Wait for the next assignment
16 Thread Pool = Abstraction
- Insulates the programmer from the platform
  - Big machine, big pool
  - Small machine, small pool
- Portable code
  - Works across platforms
  - Worry about the algorithm, not the platform
17 ExecutorService Interface
- In java.util.concurrent
- Task = Runnable object
  - If no result value is expected
  - The pool calls its run() method
- Task = Callable<T> object
  - If a result value of type T is expected
  - The pool calls its T call() method
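The two task flavors on this slide can be shown side by side. A minimal sketch; the pool size and the 6 * 7 payload are arbitrary choices of mine:

```java
import java.util.concurrent.*;

// Runnable: no result, the pool calls run().
// Callable<T>: result of type T, the pool calls T call().
public class ExecutorDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(2);

        Runnable noResult = () -> System.out.println("running");
        exec.submit(noResult);                 // fire and forget

        Callable<Integer> withResult = () -> 6 * 7;
        Future<Integer> f = exec.submit(withResult);
        System.out.println(f.get());           // blocks until the value is ready

        exec.shutdown();
    }
}
```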
18 Future<T>

    Callable<T> task = ...;
    Future<T> future = executor.submit(task);
    T value = future.get();

Submitting a Callable<T> task returns a Future<T> object; the Future's get() method blocks until the value is available.

19 Example Overhead: Threads vs. Futures
- Creating a thread: ~200μs
- Submitting a task to an executor: ~10μs
- Each task does a certain amount of work
- What fraction of overall time goes to task submission?
  - For tasks with 1μs of work?
  - For tasks with 1ms of work?
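Working slide 19's question through with its own numbers (~10μs submission overhead per future) gives the answer directly; the helper below is only for the arithmetic:

```java
// Fraction of total time spent on task submission,
// given the slide's ~10 microsecond submission cost.
public class OverheadFraction {
    static double fraction(double overheadUs, double workUs) {
        return overheadUs / (overheadUs + workUs);
    }
    public static void main(String[] args) {
        System.out.printf("1us task: %.0f%%%n", 100 * fraction(10, 1));     // ~91%
        System.out.printf("1ms task: %.1f%%%n", 100 * fraction(10, 1000));  // ~1.0%
    }
}
```

So with 1μs tasks, submission dominates (about 91% of the time); with 1ms tasks it is about 1%, which is why futures only pay off when tasks carry enough work.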
22 Note
- Executor service submissions are purely advisory in nature
- The executor is free to ignore any such advice
  - And could execute tasks sequentially
23 Matrix Addition (4 elements)
(Figure: the 2x2 block sum C = A + B, computed as four parallel quadrant additions.)
- How do we add larger matrices?
  - Hint: for now, assume dimensions are a power of 2
26 Matrix Addition Task

    public class Matrix {
      int dim;
      double[][] data;                 // reference to the data matrix
      int rowDisplace, colDisplace;    // displacement variables
      public double get(int row, int col) {            // access data through
        return data[row+rowDisplace][col+colDisplace]; // the displacements...
      }
      public void set(int row, int col, double value) {
        data[row+rowDisplace][col+colDisplace] = value;
      }
      Matrix[][] split();              // ...so we can split the matrix
    }                                  // in constant time
30 Matrix Addition Task

    Matrix[][] split() {               // split into 4 matrices...
      Matrix[][] result = new Matrix[2][2];
      int newDim = dim / 2;            // ...with half the dimensions
      result[0][0] =
        new Matrix(data, rowDisplace, colDisplace, newDim);
      result[0][1] =
        new Matrix(data, rowDisplace, colDisplace + newDim, newDim);
      result[1][0] =
        new Matrix(data, rowDisplace + newDim, colDisplace, newDim);
      result[1][1] =
        new Matrix(data, rowDisplace + newDim, colDisplace + newDim, newDim);
      return result;                   // only the displacement variables
    }                                  // change: no data is copied
34 Matrix Addition Task

    class AddTask implements Runnable {
      Matrix a, b, c;                  // computes c = a + b
      public void run() {
        if (a.dim == 1) {
          c.set(0, 0, a.get(0,0) + b.get(0,0));   // base case: add directly
        } else {
          Matrix[][] splitA = a.split();          // constant-time operation
          Matrix[][] splitB = b.split();
          Matrix[][] splitC = c.split();
          Future<?>[][] f = new Future<?>[2][2];
          for (int i = 0; i < 2; i++)             // submit 4 recursive tasks
            for (int j = 0; j < 2; j++)
              f[i][j] = exec.submit(
                new AddTask(splitA[i][j], splitB[i][j], splitC[i][j]));
          for (int i = 0; i < 2; i++)             // let them finish
            for (int j = 0; j < 2; j++)
              f[i][j].get();
        }
      }
    }
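The slides elide the constructors and the driver; a runnable completion is sketched below. The class name RecursiveAdd, the test data, and main are mine. One deliberate choice worth noting: parent tasks block in get(), so a bounded pool could deadlock; a cached pool avoids that here:

```java
import java.util.concurrent.*;

public class RecursiveAdd {
    static ExecutorService exec = Executors.newCachedThreadPool();

    static class Matrix {
        int dim;
        double[][] data;
        int rowDisplace, colDisplace;
        Matrix(double[][] d, int r, int c, int dim) {
            data = d; rowDisplace = r; colDisplace = c; this.dim = dim;
        }
        double get(int r, int c) { return data[r + rowDisplace][c + colDisplace]; }
        void set(int r, int c, double v) { data[r + rowDisplace][c + colDisplace] = v; }
        Matrix[][] split() {                        // constant time: no data copied
            Matrix[][] result = new Matrix[2][2];
            int newDim = dim / 2;
            result[0][0] = new Matrix(data, rowDisplace, colDisplace, newDim);
            result[0][1] = new Matrix(data, rowDisplace, colDisplace + newDim, newDim);
            result[1][0] = new Matrix(data, rowDisplace + newDim, colDisplace, newDim);
            result[1][1] = new Matrix(data, rowDisplace + newDim, colDisplace + newDim, newDim);
            return result;
        }
    }

    static class AddTask implements Runnable {
        Matrix a, b, c;
        AddTask(Matrix a, Matrix b, Matrix c) { this.a = a; this.b = b; this.c = c; }
        public void run() {
            try {
                if (a.dim == 1) {
                    c.set(0, 0, a.get(0, 0) + b.get(0, 0));   // base case
                } else {
                    Matrix[][] sa = a.split(), sb = b.split(), sc = c.split();
                    Future<?>[][] f = new Future<?>[2][2];
                    for (int i = 0; i < 2; i++)               // submit 4 recursive tasks
                        for (int j = 0; j < 2; j++)
                            f[i][j] = exec.submit(new AddTask(sa[i][j], sb[i][j], sc[i][j]));
                    for (int i = 0; i < 2; i++)               // let them finish
                        for (int j = 0; j < 2; j++)
                            f[i][j].get();
                }
            } catch (Exception e) { throw new RuntimeException(e); }
        }
    }

    public static void main(String[] args) {
        int n = 4;
        double[][] a = new double[n][n], b = new double[n][n], c = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) { a[i][j] = i; b[i][j] = j; }
        new AddTask(new Matrix(a, 0, 0, n), new Matrix(b, 0, 0, n),
                    new Matrix(c, 0, 0, n)).run();
        System.out.println(c[3][1]);              // 3 + 1 = 4.0
        exec.shutdown();
    }
}
```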
39 Matrix Addition DAG
- With dim = 2: the root Add(2) calls (submits) four Add(1) tasks, then gets their results.
(Figure: Add(2) with call edges down to four Add(1) nodes and get edges back up.)
42 Dependencies
- The matrix example is not typical
- Individual tasks are independent
  - Don't need the results of one another
  - Only need to split recursively
- Often, task results are not independent
43 Fibonacci

    F(n) = 1                 if n = 0 or 1
           F(n-1) + F(n-2)   otherwise

- Note:
  - Potential parallelism
  - Dependencies
44 Disclaimer
- This Fibonacci implementation is
  - Egregiously inefficient
  - So don't try this at home or at work!
  - But it illustrates our point: how to deal with dependencies
- Exercise: make this implementation efficient!
45 Multithreaded Fibonacci

    class FibTask implements Callable<Integer> {
      static ExecutorService exec =
        Executors.newCachedThreadPool();
      int arg;
      public FibTask(int n) {
        arg = n;
      }
      public Integer call() {
        if (arg > 2) {                 // parallel calls
          Future<Integer> left = exec.submit(new FibTask(arg-1));
          Future<Integer> right = exec.submit(new FibTask(arg-2));
          return left.get() + right.get();   // pick up & combine results
        } else {
          return 1;
        }
      }
    }
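One answer to the disclaimer slide's exercise ("make this implementation efficient"): once each F(i) is computed only once, the exponential task tree collapses to a linear pass, and the futures are no longer worth their overhead. A sketch, using the same convention as FibTask (F(1) = F(2) = 1):

```java
// Iterative Fibonacci: O(n) work, no task-submission overhead.
public class FibFast {
    static long fib(int n) {
        long a = 1, b = 1;             // F(1), F(2)
        for (int i = 3; i <= n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return b;
    }
    public static void main(String[] args) {
        System.out.println(fib(10));   // 55
    }
}
```

The contrast is the lecture's point: FibTask's value is pedagogical (it shows dependencies between tasks), not performance.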
48 The Blumofe-Leiserson DAG Model
- A multithreaded program is
  - A directed acyclic graph (DAG)
  - That unfolds dynamically
- Each node is a single unit of work
49 Fibonacci DAG
- Unfolding fib(4): fib(4) calls fib(3) and fib(2); fib(3) calls fib(2) and fib(1); each fib(2) calls two fib(1)s; get edges carry the results back up.
(Figure: the fib(4) DAG, with call edges downward and get edges upward.)
54 How Parallel is That?
- Define work:
  - Total time on one processor
- Define critical-path length:
  - Longest dependency path
  - Can't beat that!
55 Notation Watch
- T_P = time on P processors
- T_1 = work (time on 1 processor)
- T_∞ = critical-path length (time on infinitely many processors)
- Parallelism = T_1 / T_∞
56 Unfolded DAG
- With dim = 2 (figure: the unfolded add DAG)
- Work? T_1 = 6
- Critical path? T_∞ = 3
59 Unfolded DAG
- With dim = 4 (figure: the unfolded add DAG)
- Work? T_1 = 26
- Critical path? T_∞ = 5
62 Question
- Assume the dimension is a power of 2
- For dim = 2^n, what are T_1 and T_∞?
- For T_1: each internal task contributes a call node and a get node; there are 2^{2i} tasks at depth i, plus 2^{2n} leaf additions:
  T_1 = 2 * Σ_{i=0}^{n-1} 2^{2i} + 2^{2n}
- For T_∞: the longest path runs down the call chain and back up the get chain, each of length about log:
  T_∞ = (n + 1) + n = 2n + 1
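The two closed forms can be checked against the DAGs counted on the previous slides (dim = 2 gives T_1 = 6, T_∞ = 3; dim = 4 gives T_1 = 26, T_∞ = 5). A small verification harness, names mine:

```java
// Evaluate T1(2^n) = 2 * sum_{i=0}^{n-1} 2^{2i} + 2^{2n}
// and Tinf(2^n) = 2n + 1 for the two cases on the slides.
public class DagCost {
    static long t1(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += 1L << (2 * i);   // 2^{2i}
        return 2 * sum + (1L << (2 * n));                   // + 2^{2n} leaves
    }
    static long tInf(int n) { return 2L * n + 1; }

    public static void main(String[] args) {
        System.out.println(t1(1) + " " + tInf(1));   // dim = 2: 6 3
        System.out.println(t1(2) + " " + tInf(2));   // dim = 4: 26 5
    }
}
```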
64 Unfolded DAG
(Figure: an 18-node DAG, used for the next few slides.)

65 Parallelism?
- Serial fraction = 3/18 = 1/6
- Amdahl's Law says the speedup cannot exceed 6

66 Work?
- T_1: time needed on one processor
- Just count the nodes: T_1 = 18

67 Critical Path?
- T_∞: time needed on as many processors as you like
- Longest path: T_∞ = 9
69 Simple Laws
- Work Law: T_P ≥ T_1 / P
  - In one step, can't do more than P units of work
- Critical Path Law: T_P ≥ T_∞
  - Can't beat infinite resources
70 Performance Measures
- Speedup on P processors
  - The ratio T_1 / T_P
  - How much faster with P processors
- Linear speedup
  - T_1 / T_P ∝ P
- Max speedup (average parallelism)
  - T_1 / T_∞
71 Sequential Composition (A then B)
- Work: T_1(A) + T_1(B)
- Critical path: T_∞(A) + T_∞(B)

72 Parallel Composition (A alongside B)
- Work: T_1(A) + T_1(B)
- Critical path: max{T_∞(A), T_∞(B)}
73 Back to Matrix Addition
- How much work?
- What is the critical path?
74 Work Distribution
75 Work Dealing
(Figure: a busy thread pushes its extra tasks onto an idle one.)

76 The Problem with Work Dealing
(Figure: every thread tries to deal work to everyone else at once.)

77 Work Dealing is Fundamentally Flawed!
- Everyone passes extra work to others
- Threads spend time dealing rather than actually working
- There is contention in work distribution

78 Work Stealing
(Figure: an idle thread with no work takes a task from a busy one.)
79 Lock-Free Work Stealing
- Each thread has a pool of ready work
- Remove work without synchronizing
- If you run out of work, steal someone else's
- Choose the victim at random
80 Local Work Pools
- Each work pool is a double-ended queue (DEQueue)

81 Work DEQueue
- The owner pushes and pops work at the bottom: pushBottom, popBottom
82 Obtain Work
- Obtain work from the bottom of your own DEQueue: popBottom
- Run the task until it blocks or terminates

83 New Work
- When a node is spawned or unblocked, push it on the bottom: pushBottom

84 What to Do When the Well Runs Empty

85 Steal Work from Others
- Pick a random thread's DEQueue

86 Steal This Task!
- Take from the top: popTop
87 Task DEQueue
- Methods:
  - pushBottom, popBottom: owner only, so they never happen concurrently with each other
  - popTop: called by thieves
- pushBottom and popBottom are the most common: make them fast (minimize use of CAS)
89 Ideal Double-Ended Queue
- Wait-free
  - No locking or starvation
- Linearizable
  - All methods appear to execute instantaneously
  - Method executions can be put in a total order
  - We can reason about correctness
- Constant time
- But we cannot always have our cake and eat it too!
90 Compromise
- A popTop call may fail (return null) if
  - a concurrent popTop succeeds, or
  - a concurrent popBottom takes the last task
- Blame the victim!
91 What Could Go Wrong?
- A thief reads top and the task there, then stalls before its CAS
- Meanwhile, the owner pops everything and pushes new tasks, so top returns to the same index
- The stalled thief's CAS on top now succeeds, even though the task it read is long gone: the ABA problem

99 Fix
- Pair top with a stamp (a counter incremented on every update), so a stale CAS fails
100 Bounded DEQueue

    public class BDEQueue {
      AtomicStampedReference<Integer> top;   // index & stamp,
                                             // updated atomically
      volatile int bottom;    // index of the bottom task; no need to
                              // synchronize, but must be volatile (why?)
      Runnable[] tasks;       // array holding the tasks
    }
104 pushBottom()

    public class BDEQueue {
      void pushBottom(Runnable r) {
        tasks[bottom] = r;    // bottom is the index at which to store
        bottom++;             // the new task; then adjust it
      }
    }
107 Steal Work

    public Runnable popTop() {
      int[] stamp = new int[1];
      int oldTop = top.get(stamp), newTop = oldTop + 1;   // read top (value
      int oldStamp = stamp[0], newStamp = oldStamp + 1;   // & stamp), compute
                                                          // the new ones
      if (bottom <= oldTop)            // quit if the queue is empty
        return null;
      Runnable r = tasks[oldTop];
      if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
        return r;                      // try to steal the task
      return null;                     // give up if a conflict occurs
    }
113 Take Work

    Runnable popBottom() {
      if (bottom == 0)                 // make sure the queue is non-empty
        return null;
      bottom--;                        // prepare to grab the bottom task
      Runnable r = tasks[bottom];
      int[] stamp = new int[1];
      int oldTop = top.get(stamp), newTop = 0;           // read top, prepare
      int oldStamp = stamp[0], newStamp = oldStamp + 1;  // the new values
      if (bottom > oldTop)             // top & bottom one or more apart:
        return r;                      // no conflict possible
      if (bottom == oldTop) {          // at most one task left: try to take
        bottom = 0;                    // it; reset bottom, because the queue
                                       // will be empty even if we fail (why?)
        if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
          return r;                    // if I win the CAS, the task is mine;
      }                                // if I lose, the thief must have won
      top.set(newTop, newStamp);       // failed to get the last task (bottom
      bottom = 0;                      // could be less than top): must still
      return null;                     // reset top and bottom, deque is empty
    }
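The three methods assemble into one class; below is a sketch with a single-threaded smoke test (the capacity argument, constructor, and main are mine). One subtlety worth a comment: compareAndSet on an AtomicStampedReference compares references, which behaves like value equality here only because small autoboxed Integers are cached by the JVM.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class BDEQueue {
    // Small indices stay in the Integer cache (-128..127), so the reference
    // comparison inside compareAndSet acts like value equality here.
    AtomicStampedReference<Integer> top = new AtomicStampedReference<>(0, 0);
    volatile int bottom = 0;
    Runnable[] tasks;

    BDEQueue(int capacity) { tasks = new Runnable[capacity]; }

    void pushBottom(Runnable r) {      // owner only; no bounds check,
        tasks[bottom] = r;             // matching the slides
        bottom++;
    }

    Runnable popTop() {                // thieves; may fail and return null
        int[] stamp = new int[1];
        int oldTop = top.get(stamp), newTop = oldTop + 1;
        int oldStamp = stamp[0], newStamp = oldStamp + 1;
        if (bottom <= oldTop) return null;          // empty
        Runnable r = tasks[oldTop];
        if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
            return r;                               // stole it
        return null;                                // lost the race
    }

    Runnable popBottom() {             // owner only
        if (bottom == 0) return null;
        bottom--;
        Runnable r = tasks[bottom];
        int[] stamp = new int[1];
        int oldTop = top.get(stamp), newTop = 0;
        int oldStamp = stamp[0], newStamp = oldStamp + 1;
        if (bottom > oldTop) return r;              // no conflict possible
        if (bottom == oldTop) {                     // at most one task left
            bottom = 0;
            if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
                return r;                           // won the race for it
        }
        top.set(newTop, newStamp);                  // empty: reset both ends
        bottom = 0;
        return null;
    }

    public static void main(String[] args) {
        BDEQueue q = new BDEQueue(8);
        q.pushBottom(() -> System.out.println("task A"));
        q.pushBottom(() -> System.out.println("task B"));
        q.popTop().run();       // a thief takes the oldest task: A
        q.popBottom().run();    // the owner takes the newest remaining: B
        System.out.println(q.popBottom() == null);  // true: empty now
    }
}
```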
123 Old English Proverb
- "May as well be hanged for stealing a sheep as a goat"
- From which we conclude:
  - Stealing was punished severely
  - Sheep were worth more than goats
124 Variations
- Stealing is expensive
  - Pay for a CAS
  - Only one task is taken
- What if we
  - Moved more than one task
  - Randomly balanced the loads?
125 Work Balancing
(Figure: two queues, of sizes 2 and 5, exchange tasks to even out the load.)
126 Work-Balancing Thread

    public void run() {
      int me = ThreadID.get();
      while (true) {                   // keep running tasks
        Runnable task = queue[me].deq();
        if (task != null) task.run();
        int size = queue[me].size();
        if (random.nextInt(size + 1) == size) {   // with probability
                                                  // 1/(size+1)
          int victim = random.nextInt(queue.length);  // choose random victim
          int min = Math.min(victim, me);   // lock the two queues
          int max = Math.max(victim, me);   // in canonical order
          synchronized (queue[min]) {
            synchronized (queue[max]) {
              balance(queue[min], queue[max]);    // rebalance the queues
            }
          }
        }
      }
    }
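The slides invoke balance(queue[min], queue[max]) without showing it. A plausible sketch follows; this exact body is my assumption, not the course's code. A plain ArrayDeque stands in for the work queue, and the caller is assumed to hold both locks, as on the slide:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Move tasks from the larger queue to the smaller one
// until their sizes differ by at most one.
public class Balancer {
    static void balance(Deque<Runnable> q0, Deque<Runnable> q1) {
        Deque<Runnable> small = q0.size() <= q1.size() ? q0 : q1;
        Deque<Runnable> big   = (small == q0) ? q1 : q0;
        while (big.size() - small.size() > 1)
            small.addLast(big.removeFirst());   // shift surplus tasks over
    }

    public static void main(String[] args) {
        Deque<Runnable> a = new ArrayDeque<>(), b = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) a.add(() -> {});   // sizes 5 and 2,
        for (int i = 0; i < 2; i++) b.add(() -> {});   // as in the figure
        balance(a, b);
        System.out.println(a.size() + " " + b.size()); // 4 3
    }
}
```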
132 Summary
u Futures are lighter weight than threads
u Worry about the critical path and speedup
u Work distribution
w Work dealing is flawed
w Work stealing works, but is expensive
w Work balancing
EPFL CS-206 Spring 2015 Lec.9-132
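The rebalance trigger on the preceding slides fires with probability 1/(size+1): nextInt(size+1) has size+1 equally likely outcomes, and exactly one of them equals size, so a thread with a longer queue rebalances less often. A quick seeded check of that rate (a standalone illustration, not course code):

```java
import java.util.Random;

public class BalanceProbability {
    public static void main(String[] args) {
        Random random = new Random(42);  // fixed seed for reproducibility
        int size = 3;                    // pretend the local queue holds 3 tasks
        int trials = 100_000, hits = 0;
        for (int t = 0; t < trials; t++) {
            // same trigger as the work-balancing thread:
            // fires iff nextInt(size+1) lands on its largest value
            if (random.nextInt(size + 1) == size) hits++;
        }
        // expected hit rate: 1/(size+1) = 0.25
        System.out.println((double) hits / trials);
    }
}
```

This is why the scheme's rebalancing overhead is amortized: the longer a queue grows, the rarer the (lock-taking) balance attempt becomes.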