CS-206 Concurrency. Lecture 10. Scheduling & Work Distribution. Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/


1 CS-206 Concurrency Lecture 10 Scheduling & Work Distribution Spring 2015 Prof. Babak Falsafi parsa.epfl.ch/courses/cs206/ Adapted from slides originally developed by Maurice Herlihy and Nir Shavit from The Art of Multiprocessor Programming, and Babak Falsafi EPFL Copyright 2015 EPFL CS-206 Spring 2015 Lec.9-1

2 Where are We? [figure: lecture & lab calendar, 16-Feb through 29-May]
- Work scheduling
  - Threads vs. Futures
  - Dependencies
- Work distribution
  - Work stealing
  - Work balancing
- Next week
  - GPUs
EPFL CS-206 Spring 2015 Lec.9-2

3 Evaluations: Sample feedback
- Making students do graded homework and not returning it as the end of the semester approaches is painful and demotivating.
- I find (it) strange to forbid (the) use of laptops in a CS course in the 21st century.
- ...the midterm was only about learning the slides and did not require any reflection; it's not what we expect of an engineer.
- More explanations on slides would be a plus... some notes would be useful for the exam.
- The class is alarmingly average. But that's not necessarily a bad thing.
- The course is dynamic and interesting, well structured. The lab sessions are adapted. I enjoy going to class.
- Good course, Babak is cool, but the slides are poor.
- The slides are hard to read and understand. What do Bob & Alice have to do with concurrency????
- The professor does not put much effort into making the course pedagogical. Nice slideshow (fr.wikipedia.org/wiki/diaporama).
- Excellent professor. Really good assignments and enough time to do them.
EPFL CS-206 Spring 2015 Lec.9-3

4 Matrix Add: C = A + B EPFL CS-206 Spring 2015 Lec.9-4

5 Matrix Add

class Worker extends Thread {
    int row, col;
    Worker(int row, int col) {
        this.row = row;
        this.col = col;
    }
    public void run() {
        C[row][col] = A[row][col] + B[row][col];
    }
}
EPFL CS-206 Spring 2015 Lec.9-5

6 Matrix Add (same code): each Worker is a thread EPFL CS-206 Spring 2015 Lec.9-6

7 Matrix Add (same code): row and col pick which matrix entry to compute EPFL CS-206 Spring 2015 Lec.9-7

8 Matrix Add (same code): run() does the actual computation EPFL CS-206 Spring 2015 Lec.9-8

9 Matrix Add

void add() {
    Worker[][] worker = new Worker[n][n];
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
            worker[row][col] = new Worker(row, col);
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
            worker[row][col].start();
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++)
            worker[row][col].join();   // join() throws InterruptedException; handling elided on the slides
}
EPFL CS-206 Spring 2015 Lec.9-9

10 Matrix Add (same code): create n x n threads EPFL CS-206 Spring 2015 Lec.9-10

11 Matrix Add (same code): start them EPFL CS-206 Spring 2015 Lec.9-11

12 Matrix Add (same code): wait for them to finish EPFL CS-206 Spring 2015 Lec.9-12

13 Matrix Add (same code): what's wrong with this picture? EPFL CS-206 Spring 2015 Lec.9-13

14 Thread Overhead
- Threads require resources
  - Memory for stacks
  - Setup, teardown
  - Scheduler overhead
- Short-lived threads
  - Ratio of useful work to overhead is bad
EPFL CS-206 Spring 2015 Lec.9-14

15 Thread Pools
- More sensible to keep a pool of long-lived threads
- Threads are assigned short-lived tasks
  - Run the task
  - Rejoin pool
  - Wait for next assignment
EPFL CS-206 Spring 2015 Lec.9-15

16 Thread Pool = Abstraction
- Insulate programmer from platform
  - Big machine, big pool
  - Small machine, small pool
- Portable code
  - Works across platforms
  - Worry about algorithm, not platform
EPFL CS-206 Spring 2015 Lec.9-16

17 ExecutorService Interface
- In java.util.concurrent
  - Task = Runnable object
    - If no result value is expected
    - Calls the run() method
  - Task = Callable<T> object
    - If a result value of type T is expected
    - Calls the T call() method
EPFL CS-206 Spring 2015 Lec.9-17
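As a concrete illustration (not on the slides), here is a minimal sketch of submitting both task kinds to an ExecutorService; the pool size and the task bodies are arbitrary choices for the example:

import java.util.concurrent.*;

class ExecutorDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(4);
        // Runnable: no result value expected; the pool calls run()
        exec.submit(() -> System.out.println("side effect only"));
        // Callable<Integer>: result of type Integer expected; the pool calls call()
        Future<Integer> f = exec.submit(() -> 6 * 7);
        System.out.println(f.get());   // blocks until the value is available
        exec.shutdown();
    }
}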

18 Future<T>

Callable<T> task = ...;
Future<T> future = executor.submit(task);
T value = future.get();
EPFL CS-206 Spring 2015 Lec.9-18

19 Example Overhead: Threads vs. Futures
- Creating threads: ~200 μs
- Submitting tasks to futures: ~10 μs
- Each task will do a certain amount of work
- What fraction of overall time is in task submission?
  - For tasks with 1 μs of work?
  - For tasks with 1 ms of work?
EPFL CS-206 Spring 2015 Lec.9-19
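Working the numbers from the slide (submission ~10 μs per task): with 1 μs of work per task, submission is 10/(10+1), about 91% of the total; with 1 ms (1000 μs) of work, it is 10/(10+1000), about 1%. With raw threads at ~200 μs, the same tasks give about 99.5% and about 17% overhead, which is why short tasks need the cheaper mechanism.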

20 Future<T> (same code): submitting a Callable<T> task returns a Future<T> object EPFL CS-206 Spring 2015 Lec.9-20

21 Future<T> (same code): the Future's get() method blocks until the value is available EPFL CS-206 Spring 2015 Lec.9-21

22 Note
- ExecutorService submissions are purely advisory in nature
- The executor
  - Is free to ignore any such advice
  - And could execute tasks sequentially
EPFL CS-206 Spring 2015 Lec.9-22

23 Matrix Addition (4 elements) [figure: 2 x 2 block matrices; each block of C is the sum of the corresponding blocks of A and B] EPFL CS-206 Spring 2015 Lec.9-23

24 Matrix Addition (4 elements) (same figure): 4 parallel additions EPFL CS-206 Spring 2015 Lec.9-24

25 Matrix Addition (4 elements) (same figure)
- How do we add larger matrices?
  - Hint: for now, assume dimensions are a power of 2
EPFL CS-206 Spring 2015 Lec.9-25
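Written out, the block decomposition the figure depicts (quadrants indexed as in the split() code below) is, in LaTeX notation:

\begin{pmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{pmatrix} = \begin{pmatrix} A_{00}+B_{00} & A_{01}+B_{01} \\ A_{10}+B_{10} & A_{11}+B_{11} \end{pmatrix}, \qquad C_{ij} = A_{ij} + B_{ij}

The four quadrant sums are mutually independent, so they can run in parallel; for larger power-of-2 matrices, each quadrant is split again recursively.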

26 Matrix Addition Task

public class Matrix {
    int dim;
    double[][] data;
    int rowDisplace, colDisplace;
    public double get(int row, int col) {
        return data[row + rowDisplace][col + colDisplace];
    }
    public void set(int row, int col, double value) {
        data[row + rowDisplace][col + colDisplace] = value;
    }
    Matrix[][] split();   // defined on slide 30
}
EPFL CS-206 Spring 2015 Lec.9-26

27 Matrix Addition Task (same code): data is a reference to the backing matrix EPFL CS-206 Spring 2015 Lec.9-27

28 Matrix Addition Task (same code): rowDisplace and colDisplace are displacement variables; all accesses go through the displacement EPFL CS-206 Spring 2015 Lec.9-28

29 Matrix Addition Task (same code): ...so we can split the matrix in constant time EPFL CS-206 Spring 2015 Lec.9-29

30 Matrix Addition Task

Matrix[][] split() {
    Matrix[][] result = new Matrix[2][2];
    int newDim = dim / 2;
    result[0][0] = new Matrix(data, rowDisplace,          colDisplace,          newDim);
    result[0][1] = new Matrix(data, rowDisplace,          colDisplace + newDim, newDim);
    result[1][0] = new Matrix(data, rowDisplace + newDim, colDisplace,          newDim);
    result[1][1] = new Matrix(data, rowDisplace + newDim, colDisplace + newDim, newDim);
    return result;
}
EPFL CS-206 Spring 2015 Lec.9-30

31 Matrix Addition Task (same code): split into 4 matrices... EPFL CS-206 Spring 2015 Lec.9-31

32 Matrix Addition Task (same code): ...with half the dimensions EPFL CS-206 Spring 2015 Lec.9-32

33 Matrix Addition Task (same code): only the displacement variables are reassigned; no data is copied EPFL CS-206 Spring 2015 Lec.9-33
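split() uses a four-argument Matrix constructor that the slides never show; a minimal sketch of what it presumably looks like, with the parameter order inferred from the call sites above:

// hypothetical constructor matching the call sites in split(); not shown on the slides
Matrix(double[][] data, int rowDisplace, int colDisplace, int dim) {
    this.data = data;                  // share the backing array: no copy
    this.rowDisplace = rowDisplace;    // offset of this submatrix's first row
    this.colDisplace = colDisplace;    // offset of this submatrix's first column
    this.dim = dim;                    // submatrix is dim x dim
}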

34 Matrix Addition Task

class AddTask implements Runnable {
    Matrix a, b, c;   // add this!
    public void run() {
        if (a.dim == 1) {
            c.set(0, 0, a.get(0, 0) + b.get(0, 0));   // base case
        } else {
            Matrix[][] splitA = a.split();
            Matrix[][] splitB = b.split();
            Matrix[][] splitC = c.split();
            Future<?>[][] f = new Future<?>[2][2];
            f[0][0] = exec.submit(new AddTask(splitA[0][0], splitB[0][0], splitC[0][0]));
            ...   // likewise for f[0][1] and f[1][0]
            f[1][1] = exec.submit(new AddTask(splitA[1][1], splitB[1][1], splitC[1][1]));
            f[0][0].get(); ...; f[1][1].get();   // exception handling elided
        }
    }
}
EPFL CS-206 Spring 2015 Lec.9-34

35 Matrix Addition Task (same code): split() is a constant-time operation EPFL CS-206 Spring 2015 Lec.9-35

36 Matrix Addition Task (same code): submit 4 recursive tasks EPFL CS-206 Spring 2015 Lec.9-36

37 Matrix Addition Task (same code): base case adds directly EPFL CS-206 Spring 2015 Lec.9-37

38 Matrix Addition Task (same code): let them finish EPFL CS-206 Spring 2015 Lec.9-38
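A minimal driver sketch (not on the slides) for kicking off the recursion, assuming a shared pool named exec as in the FibTask slide later:

static ExecutorService exec = Executors.newCachedThreadPool();

static void parallelAdd(Matrix a, Matrix b, Matrix c) throws Exception {
    // submit the root task and block until the whole DAG of subtasks completes;
    // the root's run() returns only after it has get() all of its children
    exec.submit(new AddTask(a, b, c)).get();
}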

39 Matrix Addition DAG: with dim = 2 [figure: the root Add(2) node] EPFL CS-206 Spring 2015 Lec.9-39

40 Matrix Addition DAG: with dim = 2 [figure: Add(2) forks four Add(1) children] EPFL CS-206 Spring 2015 Lec.9-40

41 Matrix Addition DAG: with dim = 2 [figure: call edges lead from Add(2) down to the four Add(1) nodes; get edges lead back up] EPFL CS-206 Spring 2015 Lec.9-41

42 Dependencies
- The matrix example is not typical
- Its individual tasks are independent
  - Don't need results of one task
  - Only need to split recursively
- Often task results are not independent
EPFL CS-206 Spring 2015 Lec.9-42

43 Fibonacci

F(n) = 1                  if n = 0 or 1
F(n) = F(n-1) + F(n-2)    otherwise

- Note
  - Potential parallelism
  - Dependencies
EPFL CS-206 Spring 2015 Lec.9-43

44 Disclaimer
- This Fibonacci implementation is
  - Egregiously inefficient
  - So don't try this at home or on the job!
  - But it illustrates our point: how to deal with dependencies
- Exercise: make this implementation efficient!
EPFL CS-206 Spring 2015 Lec.9-44

45 Multithreaded Fibonacci

class FibTask implements Callable<Integer> {
    static ExecutorService exec = Executors.newCachedThreadPool();
    int arg;
    public FibTask(int n) { arg = n; }
    public Integer call() {   // exception handling elided
        if (arg > 2) {
            Future<Integer> left = exec.submit(new FibTask(arg - 1));
            Future<Integer> right = exec.submit(new FibTask(arg - 2));
            return left.get() + right.get();
        } else {
            return 1;
        }
    }
}
EPFL CS-206 Spring 2015 Lec.9-45

46 Multithreaded Fibonacci (same code): the two submits are parallel calls EPFL CS-206 Spring 2015 Lec.9-46

47 Multithreaded Fibonacci (same code): the gets pick up & combine the results EPFL CS-206 Spring 2015 Lec.9-47
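A usage sketch (not on the slides): new FibTask(10).call() returns 55, since F(1) = F(2) = 1 here. One design note: blocking on get() inside pool threads can deadlock a bounded pool, because every running task waits for children that have no thread left to run on; the slide's newCachedThreadPool sidesteps this by creating threads on demand.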

48 The Blumofe-Leiserson DAG Model
- A multithreaded program is
  - A directed acyclic graph (DAG)
  - That unfolds dynamically
- Each node is
  - A single unit of work
EPFL CS-206 Spring 2015 Lec.9-48

49 Fibonacci DAG [figure: root node fib(4)] EPFL CS-206 Spring 2015 Lec.9-49

50 Fibonacci DAG [figure: fib(4) forks fib(3)] EPFL CS-206 Spring 2015 Lec.9-50

51 Fibonacci DAG [figure: fib(4) forks fib(3) and fib(2); fib(3) forks another fib(2)] EPFL CS-206 Spring 2015 Lec.9-51

52 Fibonacci DAG [figure: the DAG unfolds down to five fib(1) leaves] EPFL CS-206 Spring 2015 Lec.9-52

53 Fibonacci DAG [figure: same DAG with call edges pointing down and get edges pointing back up] EPFL CS-206 Spring 2015 Lec.9-53

54 How Parallel is That?
- Define work:
  - Total time on one processor
- Define critical-path length:
  - Longest dependency path
  - Can't beat that!
EPFL CS-206 Spring 2015 Lec.9-54

55 Notation Watch
- T_P = time on P processors
- T_1 = work (time on 1 processor)
- T_∞ = critical-path length (time on ∞ processors)
- Parallelism = T_1 / T_∞
EPFL CS-206 Spring 2015 Lec.9-55

56 Unfolded DAG: with dim = 2 [figure: the full task DAG] EPFL CS-206 Spring 2015 Lec.9-56

57 Work? With dim = 2: T_1 = 6 EPFL CS-206 Spring 2015 Lec.9-57

58 Critical path? With dim = 2: T_∞ = 3 EPFL CS-206 Spring 2015 Lec.9-58

59 Unfolded DAG: with dim = 4 [figure: the full task DAG] EPFL CS-206 Spring 2015 Lec.9-59

60 Work? With dim = 4: T_1 = 26 EPFL CS-206 Spring 2015 Lec.9-60

61 Critical path? With dim = 4: T_∞ = 5 EPFL CS-206 Spring 2015 Lec.9-61

62 Question
- Assume dimension is a power of 2
- For dim = 2^n, what are T_1 and T_∞?
- For T_1: each internal node contributes a call and a get, and each of the 2^(2n) leaves contributes one addition:
  T_1 = 2 * Σ_{i=0}^{n-1} 2^{2i} + 2^{2n}
- Check: n = 1 gives T_1 = 6 and n = 2 gives T_1 = 26, matching the DAGs above
EPFL CS-206 Spring 2015 Lec.9-62

63 Question
- For T_∞: the critical path walks the call edges down (n levels), does one leaf addition, and walks the get edges back up (n levels):
  T_∞ = (n + 1) + n = 2n + 1
- Check: n = 1 gives T_∞ = 3 and n = 2 gives T_∞ = 5
EPFL CS-206 Spring 2015 Lec.9-63

64 Unfolded DAG [figure: a general task DAG with 18 nodes] EPFL CS-206 Spring 2015 Lec.9-64

65 Parallelism? Serial fraction = 3/18 = 1/6. Amdahl's Law says speedup cannot exceed 6. EPFL CS-206 Spring 2015 Lec.9-65

66 Work? T_1: time needed on one processor. Just count the nodes: T_1 = 18. EPFL CS-206 Spring 2015 Lec.9-66

67 Critical Path? T_∞: time needed on as many processors as you like. EPFL CS-206 Spring 2015 Lec.9-67

68 Critical Path? T_∞: time needed on as many processors as you like. Longest path: T_∞ = 9. EPFL CS-206 Spring 2015 Lec.9-68

69 Simple Laws
- Work Law: T_P ≥ T_1 / P
  - In one step, can't do more than P work
- Critical Path Law: T_P ≥ T_∞
  - Can't beat infinite resources
EPFL CS-206 Spring 2015 Lec.9-69

70 Performance Measures
- Speedup on P processors
  - Ratio T_1 / T_P
  - How much faster with P processors
- Linear speedup
  - T_1 / T_P ∝ P
- Max speedup (average parallelism)
  - T_1 / T_∞
EPFL CS-206 Spring 2015 Lec.9-70
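Plugging in the DAG from slides 64-68: T_1 = 18 and T_∞ = 9, so the maximum speedup (average parallelism) is T_1 / T_∞ = 18/9 = 2, no matter how many processors are available.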

71 Sequential Composition: A then B
- Work: T_1(A) + T_1(B)
- Critical path: T_∞(A) + T_∞(B)
EPFL CS-206 Spring 2015 Lec.9-71

72 Parallel Composition: A alongside B
- Work: T_1(A) + T_1(B)
- Critical path: max{T_∞(A), T_∞(B)}
EPFL CS-206 Spring 2015 Lec.9-72
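A worked example using the earlier matrix-addition numbers: composing the dim = 2 DAG (T_1 = 6, T_∞ = 3) with the dim = 4 DAG (T_1 = 26, T_∞ = 5) sequentially gives work 6 + 26 = 32 and critical path 3 + 5 = 8; composing them in parallel gives the same work 32 but critical path max{3, 5} = 5.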

73 Back to Matrix Addition
- How much work?
- What is the critical path?
EPFL CS-206 Spring 2015 Lec.9-73

74 Work Distribution EPFL CS-206 Spring 2015 Lec.9-74

75 Work Dealing [figure: a busy thread offers surplus tasks to an idle one, which answers "Yes!"] EPFL CS-206 Spring 2015 Lec.9-75

76 The Problem with Work Dealing [figure: several overloaded threads all try to deal work to each other at once: "D'oh!"] EPFL CS-206 Spring 2015 Lec.9-76

77 Work Dealing is Fundamentally Flawed!
- Work dealing
  - Everyone passes extra work to others
  - Time is spent dealing rather than actually working
  - There is contention in work distribution
EPFL CS-206 Spring 2015 Lec.9-77

78 Work Stealing [figure: a thread with no work steals a task from a busy one: "Yes!"] EPFL CS-206 Spring 2015 Lec.9-78

79 Lock-Free Work Stealing
- Each thread has a pool of ready work
- Remove work without synchronizing
- If you run out of work, steal someone else's
- Choose victim at random
EPFL CS-206 Spring 2015 Lec.9-79
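A minimal sketch of the per-thread loop this describes, pieced together as an assumption from the popBottom/popTop methods below and the work-balancing thread on slide 126 (ThreadID, random, and the queue array are borrowed from that slide):

public void run() {
    int me = ThreadID.get();
    while (true) {
        // drain the local pool without synchronization (owner end)
        Runnable task = queue[me].popBottom();
        while (task != null) {
            task.run();
            task = queue[me].popBottom();
        }
        // out of work: try to steal from a randomly chosen victim's top end
        int victim = random.nextInt(queue.length);
        if (victim != me)
            task = queue[victim].popTop();
        if (task != null)
            task.run();
    }
}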

80 Local Work Pools Each work pool is a Double-Ended Queue EPFL CS-206 Spring 2015 Lec.9-80

81 Work DEQueue (Double-Ended Queue) [figure: a deque of tasks; the owner pushes and pops at the bottom: pushBottom, popBottom] EPFL CS-206 Spring 2015 Lec.9-81

82 Obtain Work: popBottom the next task and run it until it blocks or terminates EPFL CS-206 Spring 2015 Lec.9-82

83 New Work: when a node is unblocked or spawned, pushBottom it onto the local deque EPFL CS-206 Spring 2015 Lec.9-83

84 What to do When the Well Runs Empty EPFL CS-206 Spring 2015 Lec.9-84

85 Steal Work from Others: pick a random thread's DEQueue EPFL CS-206 Spring 2015 Lec.9-85

86 Steal this Task! popTop from the victim's deque EPFL CS-206 Spring 2015 Lec.9-86

87 Task DEQueue
- Methods
  - pushBottom
  - popBottom
  - popTop
- pushBottom and popBottom are called only by the owner, so they never happen concurrently
EPFL CS-206 Spring 2015 Lec.9-87

88 Task DEQueue
- pushBottom and popBottom are the most common operations: make them fast (minimize use of CAS)
EPFL CS-206 Spring 2015 Lec.9-88

89 Ideal Double-Ended Queue
- Wait-free
  - No locking or starvation
- Linearizable
  - All methods appear to execute instantaneously
  - Method executions can be put in a total order
  - Can reason about correctness
- Constant time
But you cannot always have your cake and eat it too!
EPFL CS-206 Spring 2015 Lec.9-89

90 Compromise
- Method popTop may fail if
  - A concurrent popTop succeeds, or
  - A popBottom (concurrent with the popTop) takes the last task
Blame the victim!
EPFL CS-206 Spring 2015 Lec.9-90

91-98 What could go wrong? [figure sequence: a thief reads top and prepares a CAS; before the CAS lands, the owner pops tasks past top and pushes new ones; top returns to its old value, so the stale CAS succeeds ("Yes!" ... "Uh-Oh") and the thief takes the wrong task: the ABA problem] EPFL CS-206 Spring 2015 Lec.9-91 to Lec.9-98

99 Fix: tag top with a stamp (version number) that is incremented on every change [figure: the top field holds (stamp, index); bottom is a plain index] EPFL CS-206 Spring 2015 Lec.9-99

100 Bounded DEQueue

public class BDEQueue {
    AtomicStampedReference<Integer> top;
    volatile int bottom;
    Runnable[] tasks;
    ...
}
EPFL CS-206 Spring 2015 Lec.9-100

101 Bounded DEQueue (same code): top packs the index & stamp together (synchronized) EPFL CS-206 Spring 2015 Lec.9-101

102 Bounded DEQueue (same code): bottom is the index of the bottom task; no need to synchronize, but it must be volatile (why?) EPFL CS-206 Spring 2015 Lec.9-102

103 Bounded DEQueue (same code): tasks is the array holding the tasks EPFL CS-206 Spring 2015 Lec.9-103
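The slides never show how these fields are initialized; a minimal constructor sketch under the assumption that the capacity is fixed up front:

// hypothetical constructor; the capacity parameter is an assumption
BDEQueue(int capacity) {
    tasks = new Runnable[capacity];
    top = new AtomicStampedReference<>(0, 0);   // index 0, stamp 0
    bottom = 0;
}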

104 pushBottom()

public class BDEQueue {
    void pushBottom(Runnable r) {
        tasks[bottom] = r;
        bottom++;
    }
}
EPFL CS-206 Spring 2015 Lec.9-104

105 pushBottom() (same code): bottom is the index at which to store the new task in the array EPFL CS-206 Spring 2015 Lec.9-105

106 pushBottom() (same code): then adjust the bottom index EPFL CS-206 Spring 2015 Lec.9-106

107 Steal Work

public Runnable popTop() {
    int[] stamp = new int[1];
    int oldTop = top.get(stamp), newTop = oldTop + 1;
    int oldStamp = stamp[0], newStamp = oldStamp + 1;
    if (bottom <= oldTop)                   // quit if queue is empty
        return null;
    Runnable r = tasks[oldTop];
    if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
        return r;                           // stole the task
    return null;                            // give up if a conflict occurs
}
EPFL CS-206 Spring 2015 Lec.9-107

108 Steal Work (same code): read top (value & stamp) EPFL CS-206 Spring 2015 Lec.9-108

109 Steal Work (same code): compute the new value & stamp EPFL CS-206 Spring 2015 Lec.9-109

110 Steal Work (same code): quit if the queue is empty EPFL CS-206 Spring 2015 Lec.9-110

111 Steal Work (same code): try to steal the task by CASing top EPFL CS-206 Spring 2015 Lec.9-111

112 Steal Work (same code): give up if a conflict occurs EPFL CS-206 Spring 2015 Lec.9-112

113 Take Work

Runnable popBottom() {
    if (bottom == 0) return null;           // make sure queue is non-empty
    bottom--;                               // prepare to grab the bottom task
    Runnable r = tasks[bottom];
    int[] stamp = new int[1];
    int oldTop = top.get(stamp), newTop = 0;
    int oldStamp = stamp[0], newStamp = oldStamp + 1;
    if (bottom > oldTop)                    // top & bottom one or more apart: no conflict
        return r;
    if (bottom == oldTop) {                 // at most one task left
        bottom = 0;                         // deque will be empty even if the CAS fails
        if (top.compareAndSet(oldTop, newTop, oldStamp, newStamp))
            return r;                       // won the CAS: got the last task
    }
    // lost the CAS: the thief must have won, and bottom could be less than top;
    // must still reset top and bottom since the deque is empty
    top.set(newTop, newStamp);
    bottom = 0;
    return null;
}
EPFL CS-206 Spring 2015 Lec.9-113

114-122 Take Work (same code, annotated):
- Make sure the queue is non-empty
- Prepare to grab the bottom task
- Read top, and prepare the new values
- If top & bottom are one or more apart, there is no conflict
- At most one item left: try to take the last task with a CAS; reset bottom because the deque will be empty even if the CAS is unsuccessful (why?)
- If I win the CAS, the last task is mine
- If I lose the CAS, the thief must have won
- Failed to get the last task (bottom could be less than top): must still reset top and bottom since the deque is empty
EPFL CS-206 Spring 2015 Lec.9-114 to Lec.9-122

123 Old English Proverb
- "May as well be hanged for stealing a sheep as a goat"
- From which we conclude:
  - Stealing was punished severely
  - Sheep were worth more than goats
EPFL CS-206 Spring 2015 Lec.9-123

124 Variations
- Stealing is expensive
  - Pay a CAS
  - Only one task taken
- What if we
  - Move more than one task
  - Randomly balance loads?
EPFL CS-206 Spring 2015 Lec.9-124

125 Work Balancing [figure: two queues holding 2 and 5 tasks are rebalanced so their sizes differ by at most one] EPFL CS-206 Spring 2015 Lec.9-125

126 Work-Balancing Thread

public void run() {
    int me = ThreadID.get();
    while (true) {
        Runnable task = queue[me].deq();
        if (task != null) task.run();
        int size = queue[me].size();
        if (random.nextInt(size + 1) == size) {
            int victim = random.nextInt(queue.length);
            int min = Math.min(victim, me), max = Math.max(victim, me);
            synchronized (queue[min]) {
                synchronized (queue[max]) {
                    balance(queue[min], queue[max]);
                }
            }
        }
    }
}
EPFL CS-206 Spring 2015 Lec.9-126

127-131 Work-Balancing Thread (same code, annotated):
- Keep running tasks
- With probability 1/(size+1), i.e., the emptier the queue, the more often, decide to rebalance
- Choose a random victim
- Lock the two queues in canonical order (smaller index first) to avoid deadlock
- Rebalance the queues
EPFL CS-206 Spring 2015 Lec.9-127 to Lec.9-131
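balance() itself is not shown on the slides; a minimal sketch of what it presumably does, assuming the queues expose enq()/deq()/size() as used above (TaskQueue is a hypothetical type name for the queue class):

// hypothetical helper: move tasks from the larger queue to the smaller
// until their sizes differ by at most one; the caller holds both locks
private static void balance(TaskQueue q0, TaskQueue q1) {
    TaskQueue qMin = (q0.size() < q1.size()) ? q0 : q1;
    TaskQueue qMax = (q0.size() < q1.size()) ? q1 : q0;
    while (qMax.size() > qMin.size() + 1)
        qMin.enq(qMax.deq());
}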

132 Summary
- Futures are lighter weight than threads
- Worry about the critical path and speedup
- Work distribution:
  - Work dealing is flawed
  - Work stealing works, but stealing is expensive
  - Work balancing moves tasks in bulk
EPFL CS-206 Spring 2015 Lec.9-132


More information

N/4 + N/2 + N = 2N 2.

N/4 + N/2 + N = 2N 2. CS61B Summer 2006 Instructor: Erin Korber Lecture 24, 7 Aug. 1 Amortized Analysis For some of the data structures we ve discussed (namely hash tables and splay trees), it was claimed that the average time

More information

CSE 241 Class 1. Jeremy Buhler. August 24,

CSE 241 Class 1. Jeremy Buhler. August 24, CSE 41 Class 1 Jeremy Buhler August 4, 015 Before class, write URL on board: http://classes.engineering.wustl.edu/cse41/. Also: Jeremy Buhler, Office: Jolley 506, 314-935-6180 1 Welcome and Introduction

More information

5 + 9(10) + 3(100) + 0(1000) + 2(10000) =

5 + 9(10) + 3(100) + 0(1000) + 2(10000) = Chapter 5 Analyzing Algorithms So far we have been proving statements about databases, mathematics and arithmetic, or sequences of numbers. Though these types of statements are common in computer science,

More information

Designing Information Devices and Systems I Spring 2017 Babak Ayazifar, Vladimir Stojanovic Homework 2

Designing Information Devices and Systems I Spring 2017 Babak Ayazifar, Vladimir Stojanovic Homework 2 EECS 16A Designing Information Devices and Systems I Spring 2017 Babak Ayazifar, Vladimir Stojanovic Homework 2 This homework is due February 6, 2017, at 23:59. Self-grades are due February 9, 2017, at

More information

Math 308 Midterm Answers and Comments July 18, Part A. Short answer questions

Math 308 Midterm Answers and Comments July 18, Part A. Short answer questions Math 308 Midterm Answers and Comments July 18, 2011 Part A. Short answer questions (1) Compute the determinant of the matrix a 3 3 1 1 2. 1 a 3 The determinant is 2a 2 12. Comments: Everyone seemed to

More information

APTAS for Bin Packing

APTAS for Bin Packing APTAS for Bin Packing Bin Packing has an asymptotic PTAS (APTAS) [de la Vega and Leuker, 1980] For every fixed ε > 0 algorithm outputs a solution of size (1+ε)OPT + 1 in time polynomial in n APTAS for

More information

CSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo

CSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo CSE 431/531: Analysis of Algorithms Dynamic Programming Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Paradigms for Designing Algorithms Greedy algorithm Make a

More information

Sequential Logic (3.1 and is a long difficult section you really should read!)

Sequential Logic (3.1 and is a long difficult section you really should read!) EECS 270, Fall 2014, Lecture 6 Page 1 of 8 Sequential Logic (3.1 and 3.2. 3.2 is a long difficult section you really should read!) One thing we have carefully avoided so far is feedback all of our signals

More information

Algorithms: COMP3121/3821/9101/9801

Algorithms: COMP3121/3821/9101/9801 Algorithms: COMP311/381/9101/9801 Aleks Ignjatović, ignjat@cse.unsw.edu.au office: 504 (CSE building); phone: 5-6659 Course Admin: Amin Malekpour, a.malekpour@unsw.edu.au School of Computer Science and

More information

Parallel Performance Theory

Parallel Performance Theory AMS 250: An Introduction to High Performance Computing Parallel Performance Theory Shawfeng Dong shaw@ucsc.edu (831) 502-7743 Applied Mathematics & Statistics University of California, Santa Cruz Outline

More information

6.042/18.062J Mathematics for Computer Science March 17, 2005 Srini Devadas and Eric Lehman. Recurrences

6.042/18.062J Mathematics for Computer Science March 17, 2005 Srini Devadas and Eric Lehman. Recurrences 6.04/8.06J Mathematics for Computer Science March 7, 00 Srini Devadas and Eric Lehman Lecture Notes Recurrences Recursion breaking an object down into smaller objects of the same type is a major theme

More information

CS 453 Operating Systems. Lecture 7 : Deadlock

CS 453 Operating Systems. Lecture 7 : Deadlock CS 453 Operating Systems Lecture 7 : Deadlock 1 What is Deadlock? Every New Yorker knows what a gridlock alert is - it s one of those days when there is so much traffic that nobody can move. Everything

More information

Exercise 1: Formal Tools and Techniques (10 points)

Exercise 1: Formal Tools and Techniques (10 points) EXAM System Validation (191401) 13:45-17:15 0-07-01 The exercises are worth a total of 100 points. The final grade for the course is (hw 1+hw )/+exam/10, provided that you obtain at least 50 points for

More information

Congratulations you're done with CS103 problem sets! Please evaluate this course on Axess!

Congratulations you're done with CS103 problem sets! Please evaluate this course on Axess! The Big Picture Announcements Problem Set 9 due right now. We'll release solutions right after lecture. Congratulations you're done with CS103 problem sets! Practice final exam is Monday, December 8 from

More information

Boyle s Law and Charles Law Activity

Boyle s Law and Charles Law Activity Boyle s Law and Charles Law Activity Introduction: This simulation helps you to help you fully understand 2 Gas Laws: Boyle s Law and Charles Law. These laws are very simple to understand, but are also

More information

1 Trees. Listing 1: Node with two child reference. public class ptwochildnode { protected Object data ; protected ptwochildnode l e f t, r i g h t ;

1 Trees. Listing 1: Node with two child reference. public class ptwochildnode { protected Object data ; protected ptwochildnode l e f t, r i g h t ; 1 Trees The next major set of data structures belongs to what s called Trees. They are called that, because if you try to visualize the structure, it kind of looks like a tree (root, branches, and leafs).

More information

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value A -Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value Shuhui Li, Miao Song, Peng-Jun Wan, Shangping Ren Department of Engineering Mechanics,

More information

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ).

(x 1 +x 2 )(x 1 x 2 )+(x 2 +x 3 )(x 2 x 3 )+(x 3 +x 1 )(x 3 x 1 ). CMPSCI611: Verifying Polynomial Identities Lecture 13 Here is a problem that has a polynomial-time randomized solution, but so far no poly-time deterministic solution. Let F be any field and let Q(x 1,...,

More information

State & Finite State Machines

State & Finite State Machines State & Finite State Machines Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University See P&H Appendix C.7. C.8, C.10, C.11 Big Picture: Building a Processor memory inst register file

More information

Advanced Structural Analysis Prof. Devdas Menon Department of Civil Engineering Indian Institute of Technology, Madras

Advanced Structural Analysis Prof. Devdas Menon Department of Civil Engineering Indian Institute of Technology, Madras Advanced Structural Analysis Prof. Devdas Menon Department of Civil Engineering Indian Institute of Technology, Madras Module No. # 5.4 Lecture No. # 30 Matrix Analysis of Beams and Grids (Refer Slide

More information

QR Decomposition in a Multicore Environment

QR Decomposition in a Multicore Environment QR Decomposition in a Multicore Environment Omar Ahsan University of Maryland-College Park Advised by Professor Howard Elman College Park, MD oha@cs.umd.edu ABSTRACT In this study we examine performance

More information