Scheduling algorithms for heterogeneous and failure-prone platforms


Scheduling algorithms for heterogeneous and failure-prone platforms. Yves Robert, Ecole Normale Supérieure de Lyon & Institut Universitaire de France. Joint work with Olivier Beaumont, Anne Benoit, Henri Casanova, Fanny Dufossé, Mathias Jacquelin, Arnaud Legrand, Loris Marchal, Paul Renaud-Goud, Arnold Rosenberg & Frédéric Vivien. HCW, Anchorage, May 16, 2011.

Evolution of parallel machines: from (good old) parallel architectures...

Evolution of parallel machines: ... to heterogeneous clusters...

Evolution of parallel machines: ... to large-scale grid platforms...

Evolution of parallel machines: ... everything hard-to-program... (courtesy Rajeev Thakur)

Evolution of parallel machines: ... and definitely prone to failure! (courtesy Al Geist)

New platforms, new problems, new solutions. Parallel algorithm design and scheduling were already difficult tasks with homogeneous machines. On heterogeneous, multicore, failure-prone platforms, it gets worse. Need to adapt algorithms and scheduling strategies: new objective functions, new models.

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Who needs a scheduler? Billions of (mostly idle) computers in the world, all interconnected by these (mostly empty) network pipes. Resources: abundant and cheap, not to say unlimited and free. Virtual machines. Greedy resource selection. Demand-driven execution: first-come first-served, round-robin. Nobody needs a scheduler!!! Let's prove this wrong?!

(A) Independent tasks, no communication. B independent equal-size tasks; p processors P_1, P_2, ..., P_p; w_i = time for P_i to process one task. Intuition: the load of P_i should be proportional to its speed 1/w_i. Assign n_i tasks to P_i, with Σ_i n_i = B. Objective: minimize T_exe = max_{1 ≤ i ≤ p} n_i·w_i.

Dynamic programming. With 3 processors: w_1 = 3, w_2 = 5, and w_3 = 8. (The slides build a table task after task, with columns n_1, n_2, n_3, T_exe and the processor selected for each additional task.)
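
A minimal Python sketch (not from the slides) of the rule this table illustrates: each additional task goes to the processor that would complete it earliest, which recovers the optimal assignment for the instance above. The function name and the final instance are illustrative.

```python
def allocate(B, w):
    """Assign B equal-size tasks to processors with cycle times w[i],
    one task at a time, always picking the processor that would finish
    its new task earliest (the rule built by the table above)."""
    p = len(w)
    n = [0] * p                                   # n[i] = tasks given to P_i so far
    for _ in range(B):
        i = min(range(p), key=lambda i: (n[i] + 1) * w[i])
        n[i] += 1
    t_exe = max(n[i] * w[i] for i in range(p))    # T_exe = max_i n_i * w_i
    return n, t_exe

# The 3-processor instance of the slides: w = (3, 5, 8), here with B = 10 tasks
print(allocate(10, [3, 5, 8]))                    # -> ([5, 3, 2], 16)
```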

Static versus dynamic. Greedy (demand-driven) would have done a perfect job. It would even be better (possible variations in processor speeds). Static assignment required useless thinking.

(B) With dependencies, still no communication. A simple finite-difference problem. Iteration space: a 2D rectangle of N_1 × N_2 tiles (the figure highlights tile T_{2,3}). Dependences between neighboring tiles in both dimensions.

Allocation strategy (1/3). Use column-wise allocation to enhance locality. Stepwise execution.

Allocation strategy (2/3). With column-wise allocation, T_opt ≈ N_1·N_2 / (Σ_{i=1}^p 1/w_i). Greedy (demand-driven) allocation: slowdown?! Execution progresses at the pace of the slowest processor.

Allocation strategy (3/3). With 3 processors, w_1 = 3, w_2 = 5, and w_3 = 8: the greedy column-wise allocation gives T_exe ≈ (8/3)·N_1·N_2 ≈ 2.67·N_1·N_2, while T_opt ≈ 1.52·N_1·N_2.

Periodic static allocation (1/2). With 3 processors, w_1 = 3, w_2 = 5, and w_3 = 8: assigning blocks of B = 10 columns, T_exe ≈ 1.6·N_1·N_2.

Periodic static allocation (2/2). Let L = lcm(w_1, w_2, ..., w_p); example: L = lcm(3, 5, 8) = 120. P_1 receives the first n_1 = L/w_1 columns, P_2 the next n_2 = L/w_2 columns, and so on. Period: a block of B = n_1 + n_2 + ... + n_p contiguous columns; example: B = n_1 + n_2 + n_3 = 40 + 24 + 15 = 79. Change the schedule: sort the processors so that n_1·w_1 ≥ n_2·w_2 ≥ ... ≥ n_p·w_p (here all products equal L), and process horizontally within blocks. Optimal.
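
A small sketch of this allocation; the function name is chosen here for illustration (math.lcm with several arguments needs Python 3.9 or later).

```python
from math import lcm

def periodic_allocation(w):
    """Periodic column-wise allocation: P_i receives n_i = L / w_i contiguous
    columns inside every block of B = sum(n_i) columns."""
    L = lcm(*w)
    n = [L // wi for wi in w]
    return L, n, sum(n)

print(periodic_allocation([3, 5, 8]))   # -> (120, [40, 24, 15], 79)
```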

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Traditional scheduling: framework. Application: a DAG G = (T, E, w), where T is the set of tasks, E the dependence constraints, w(T) the computational cost of task T (its execution time), and c(T, T') the communication cost (data sent from T to T'). Platform: a set of p identical processors. Schedule: σ(T) = date at which the execution of task T starts, alloc(T) = processor assigned to T. Objective: the makespan, or total execution time, MS(σ) = max over all tasks T of (σ(T) + w(T)).

Traditional scheduling: constraints. Data dependences: if (T, T') ∈ E, then σ(T) + w(T) ≤ σ(T') when alloc(T) = alloc(T'), and σ(T) + w(T) + c(T, T') ≤ σ(T') when alloc(T) ≠ alloc(T'). Resource constraints: if alloc(T) = alloc(T'), then (σ(T) + w(T) ≤ σ(T')) or (σ(T') + w(T') ≤ σ(T)).
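
As an illustration (not part of the talk), a small checker for these two sets of constraints; the dictionaries and task names below are hypothetical.

```python
def is_valid_schedule(edges, w, c, sigma, alloc):
    """Check the data-dependence and resource constraints above.
    edges = list of (T, Tp) pairs, w[T] = execution time,
    c[(T, Tp)] = communication cost, sigma[T] = start date, alloc[T] = processor."""
    for (t, tp) in edges:                       # data dependences
        delay = 0 if alloc[t] == alloc[tp] else c[(t, tp)]
        if sigma[t] + w[t] + delay > sigma[tp]:
            return False
    tasks = list(w)
    for a in tasks:                             # no overlap on a same processor
        for b in tasks:
            if a != b and alloc[a] == alloc[b]:
                if not (sigma[a] + w[a] <= sigma[b] or sigma[b] + w[b] <= sigma[a]):
                    return False
    return True

print(is_valid_schedule([("T1", "T2")], {"T1": 2, "T2": 1}, {("T1", "T2"): 3},
                        {"T1": 0, "T2": 5}, {"T1": 0, "T2": 1}))   # True
```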

Traditional scheduling: about the model. Simple, but OK for computational resources: no CPU sharing, even in models with preemption; at most one task running per processor at any time-step. Very crude for network resources: unlimited number of simultaneous sends/receives per processor; no contention, i.e. unbounded bandwidth on any link; fully connected interconnection graph (clique). In fact, the model assumes infinite network capacity.

Makespan minimization. NP-hardness: Pb(p) is NP-complete even for independent tasks with no communications (E = ∅, p = 2 and c = 0); Pb(p) is NP-complete for UET-UCT graphs (w = c = 1). Approximation algorithms: without communications, list scheduling is a (2 − 1/p)-approximation; with communications, the result extends to coarse-grain graphs; with communications, there is no λ-approximation in general.

List scheduling, without communications. Initialization: compute the priority level of all tasks; the priority queue is the list of free tasks (tasks without predecessors), sorted by priority. While there remain tasks to execute: add new free tasks, if any, to the queue; if there are q available processors and r tasks in the queue, remove the first min(q, r) tasks from the queue and execute them. Priority level: use the critical path, the longest path from the task to an exit node, computed recursively by a bottom-up traversal of the graph.
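
A minimal sketch of such a list scheduler in Python, assuming no communication costs; the event handling is simplified (one task started per loop iteration) and the task graph at the end is only an example.

```python
import heapq
from functools import lru_cache

def list_schedule(succ, w, p):
    """List scheduling without communication costs.
    succ[t] = successors of task t, w[t] = execution time, p = identical processors.
    Priority = critical path (longest path from t to an exit node)."""
    @lru_cache(maxsize=None)
    def priority(t):
        return w[t] + max((priority(s) for s in succ[t]), default=0)

    npred = {t: 0 for t in succ}
    for t in succ:
        for s in succ[t]:
            npred[s] += 1
    release = {t: 0.0 for t in succ}        # earliest start allowed by dependences
    ready = [(-priority(t), t) for t in succ if npred[t] == 0]
    heapq.heapify(ready)
    proc_free = [0.0] * p                   # time at which each processor becomes idle
    makespan = 0.0
    while ready:
        _, t = heapq.heappop(ready)         # highest-priority free task
        q = min(range(p), key=lambda i: proc_free[i])
        end = max(proc_free[q], release[t]) + w[t]
        proc_free[q] = end
        makespan = max(makespan, end)
        for s in succ[t]:
            release[s] = max(release[s], end)
            npred[s] -= 1
            if npred[s] == 0:
                heapq.heappush(ready, (-priority(s), s))
    return makespan

# Tiny fork-join example: a -> b, a -> c, b -> d, c -> d, on 2 processors
print(list_schedule({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []},
                    {"a": 1, "b": 2, "c": 3, "d": 1}, 2))   # 5
```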

List scheduling, with communications. Priority level: use the pessimistic critical path, which includes all edge costs in the weight; computed recursively by a bottom-up traversal of the graph. MCP (Modified Critical Path): assign the free task of highest priority to the best processor, where the best processor is the one that finishes the execution first, given the scheduling decisions already taken. Free tasks may not be ready for execution (communication delays). May explore inserting the task into empty slots of the schedule. Complexity: O(|V| log |V| + (|E| + |V|)·p).

Extending the model to heterogeneous resources. Task graph with n tasks T_1, ..., T_n; platform with p heterogeneous processors P_1, ..., P_p. Computation costs: w_iq = execution time of T_i on P_q; w_i = (Σ_{q=1}^p w_iq) / p, the average execution time of T_i; particular case of consistent tasks: w_iq = w_i × γ_q. Communication costs: data(i, j) = data volume of edge e_ij: T_i → T_j; v_qr = communication time for a unit-size message from P_q to P_r (zero if q = r); com(i, j, q, r) = data(i, j) × v_qr, the communication time from T_i executed on P_q to T_j executed on P_r; com_ij = data(i, j) × (Σ_{1 ≤ q, r ≤ p, q ≠ r} v_qr) / (p(p − 1)), the average communication cost of edge e_ij.

HEFT: Heterogeneous Earliest Finish Time. Priority level: rank(T_i) = w_i + max_{T_j ∈ Succ(T_i)} (com_ij + rank(T_j)), where Succ(T) is the set of successors of T; recursive computation by a bottom-up traversal of the graph. Allocation: for the current task T_i, determine the best processor P_q, i.e. minimize σ(T_i) + w_iq; enforce the constraints related to communication costs; insertion scheduling: look for t = σ(T_i) such that P_q is available during the interval [t, t + w_iq[. Complexity: same as MCP, without/with insertion.
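
The upward ranks are easy to compute once the average costs w_i and com_ij are known; a short sketch in which the data structures are assumptions, not from the talk.

```python
from functools import lru_cache

def heft_ranks(succ, w_avg, com_avg):
    """Upward ranks of HEFT: rank(T_i) = w_avg[i] + max over successors j
    of (com_avg[(i, j)] + rank(j)).  succ[i] = successors of task i."""
    @lru_cache(maxsize=None)
    def rank(i):
        return w_avg[i] + max((com_avg[(i, j)] + rank(j) for j in succ[i]),
                              default=0)
    return {i: rank(i) for i in succ}

# HEFT then considers tasks by non-increasing rank and maps each one to the
# processor minimizing its (insertion-based) finish time.
print(heft_ranks({"a": ["b"], "b": []}, {"a": 2.0, "b": 3.0}, {("a", "b"): 1.0}))
```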

What's wrong? Nothing (we still may need to map a DAG onto a platform!). But the communication model is absurd: complicated (many parameters to instantiate) and yet not realistic (clique, no contention). And the metric is wrong: we need to relax the makespan minimization objective.

Favorite communication models. One-port: incoming (and/or) outgoing communications are serialized; pessimistic but realistic; standard MPI. Bounded multi-port: several concurrent incoming and outgoing communications share the total available bandwidth; the bottleneck is the network card; multi-threaded systems.

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Volunteer computing applications. Monte Carlo methods; factoring large numbers; searching for Mersenne primes; particle detection at CERN; many others (BOINC).

Example. (Figure: a tree-shaped platform of processors.) A is the root of the tree; all tasks start at A. Edge labels give the time for sending one task (e.g. from A to B); node labels give the time for computing one task (e.g. in C).

Example (continued). (Gantt chart with one row per activity: A compute, A send, B receive, B compute, C receive, C compute, C send, D receive, D compute.) The schedule consists of a startup phase, a repeated pattern, and a clean-up phase. Steady state: 7 tasks every 6 time units.

Rule of the game. A master M is connected to workers P_1, P_2, ..., P_i, ..., P_p; the link to P_i has cost c_i and P_i has computation cost w_i. The master sends tasks sequentially, without preemption. Full computation/communication overlap for each worker. Worker P_i receives a task in c_i time units and processes a task in w_i time units.

Equations. Worker P_i executes α_i tasks per time unit. Computations: α_i·w_i ≤ 1. One-port communications: Σ_i α_i·c_i ≤ 1. Objective: maximize the throughput ρ = Σ_i α_i.
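
These constraints form a small linear program. A sketch using scipy (an assumption: scipy is available); the platform values are chosen to be consistent with the counts on the example slide further below.

```python
import numpy as np
from scipy.optimize import linprog

def steady_state_lp(c, w):
    """Maximize sum(alpha) subject to alpha_i * w_i <= 1 (computation)
    and sum_i alpha_i * c_i <= 1 (one-port communication), alpha >= 0."""
    p = len(c)
    A_ub = np.vstack([np.diag(w), np.array(c, dtype=float).reshape(1, p)])
    b_ub = np.ones(p + 1)
    res = linprog(-np.ones(p), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * p, method="highs")
    return res.x, -res.fun          # per-worker rates alpha_i, throughput rho

# Three workers with c = (1, 2, 3) and w = (3, 6, 1): throughput 11/18
print(steady_state_lp([1, 2, 3], [3, 6, 1]))
```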

Solution. Serve the faster-communicating workers first: c_1 ≤ c_2 ≤ ... Make full use of the first q workers, where q is the largest index such that Σ_{i=1}^q c_i/w_i ≤ 1. Make partial use of the next worker P_{q+1}. Discard the other workers. Bandwidth-centric strategy: delegate work to the fastest-communicating workers; it does not matter if these workers are computing slowly; slow workers will not contribute much to the overall throughput anyway.
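
The closed-form selection can also be written directly, without the linear program; a sketch in which the function name and the instance are illustrative (the instance is inferred from the counts on the next slide).

```python
def bandwidth_centric(c, w):
    """Sort workers by communication time c_i; enroll them fully while the
    master link is not saturated (sum of c_i/w_i <= 1), use the next worker
    partially, discard the rest.  Returns per-worker rates and throughput."""
    alpha = [0.0] * len(c)
    link = 0.0                                   # fraction of the master link used
    for i in sorted(range(len(c)), key=lambda i: c[i]):
        share = c[i] / w[i]                      # link share if P_i computes full time
        if link + share <= 1.0:
            alpha[i] = 1.0 / w[i]                # fully used worker
            link += share
        else:
            alpha[i] = (1.0 - link) / c[i]       # partially used worker
            break
    return alpha, sum(alpha)

print(bandwidth_centric([1, 2, 3], [3, 6, 1]))   # throughput 11/18, as on the next slide
```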

Example. Fully active workers versus discarded workers. 6 tasks to P_1: communication 6·c_1 = 6, computation 6·w_1 = 18. 3 tasks to P_2: 3·c_2 = 6, 3·w_2 = 18. 2 tasks to P_3: 2·c_3 = 6, 2·w_3 = 2. Altogether, 11 tasks every 18 time units (ρ = 11/18 ≈ 0.61).

Example (continued). Compare to a purely greedy (demand-driven) strategy: 5 tasks every 36 time units (ρ = 5/36 ≈ 0.14). Even if resources are cheap and abundant, resource selection is key to performance.

Extension to trees. Fully used nodes, partially used nodes, idle nodes. Resource selection is based on local information (the children of each node).

Does this really work? Can we deal with arbitrary platforms (including cycles)? Yes. Can we deal with return messages? Yes. In fact, can we deal with more complex applications (arbitrary collections of DAGs)? Yes, I mean, almost!

The LP formulation still works well... For every processor P_i and every edge e_mn: T_m → T_n, the conservation law reads Σ_j sent(P_j → P_i, e_mn) + executed(P_i, T_m) = executed(P_i, T_n) + Σ_k sent(P_i → P_k, e_mn). Computations: Σ_m executed(P_i, T_m) × flops(T_m) × w_i ≤ 1. Outgoing communications: Σ_{m,n} Σ_j sent(P_i → P_j, e_mn) × bytes(e_mn) × c_ij ≤ 1.

... but schedule reconstruction is harder. (Figure: the reconstructed cyclic pattern over processors P_1 to P_4 and the links between them, for applications A_1 to A_5.) The actual (cyclic) schedule is obtained in polynomial time. Asymptotic optimality. A couple of practical problems (large period, number of buffers). No local scheduling policy.

The beauty of steady-state scheduling. Rationale: maximize the throughput. Simplicity: relaxation of the makespan minimization problem; ignore initialization and clean-up; no need for a precise ordering of tasks/messages; characterize resource activity per time unit: the fraction of time spent computing for each application, and the fraction of time spent receiving from / sending to each neighbor. Efficiency: optimal throughput ⇒ optimal schedule (up to a constant number of tasks); periodic schedule, described in compact form: compiling a loop instead of a DAG!

Lesson learnt? Resource selection is mandatory.

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Scheduling workflows. Data streams: images, frames, matrices, etc. Structured applications: several stages to process each data set (original images, pre-processing, encoding, final result). Goal: efficiently use the computing resources.

Example. 4 processing stages S_1, ..., S_4, 3 processors P_1, P_2, P_3 at our disposal: where and how can we execute the application? First option: use all resources greedily; many communications to pay, not efficient at all! Second option: everything on the fastest processor; no communication, and the optimal execution time to process one single data set. Third option: optimal throughput, processing different data sets in parallel; resource selection: do not use the slowest processor.

Optimization criteria. Period P: time interval between the beginnings of execution of two consecutive data sets (inverse of the throughput). Latency L: maximal time elapsed between the beginning and the end of execution of a data set. Reliability: inverse of F, the probability of failure of the application (i.e., some data sets will not be processed). Energy E: total power dissipated by the platform ("the internet begins with coal").

A primer on energy consumption. Algorithmic techniques: shut down idle processors; dynamic speed scaling (processors can run at variable speed, e.g., Intel XScale, Intel SpeedStep, AMD PowerNow); VDD hopping (from discrete modes to continuous speeds). Dissipated power: the higher the speed, the higher the power consumption; Power ∝ f·V², and V (voltage) increases with f (frequency); at speed s, P(s) = s^α + P_static, with 2 ≤ α ≤ 3.
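
A tiny worked illustration of this model, assuming α = 3 and no static power; under that assumption, two stages running at speeds 8 and 3 would dissipate 8³ + 3³ = 539, which matches the value P = 539 appearing in the example below (an inference, not stated on the slides).

```python
def power(s, alpha=3.0, p_static=0.0):
    """Dissipated power P(s) = s**alpha + P_static (here alpha = 3, no static part)."""
    return s ** alpha + p_static

# Running twice as fast costs eight times the power when alpha = 3:
print(power(1.0), power(2.0))          # 1.0  8.0
print(power(8.0) + power(3.0))         # 539.0, cf. the example below
```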

Example. A first mapping: power P = 539, period T = 3, latency L = 8. A second mapping: P = 8 only, but T = 15 and L = 17.

Multi-criteria optimization. Performance-oriented objectives: minimize the period P (inverse of the throughput); minimize the latency L (time to process a data set); minimize the application failure probability. Environmental objectives: minimize the energy consumption E; minimize the platform cost. Which objective function? Minimize α·P + β·L + γ·E? Rather, minimize the period, given a response-time bound and an energy budget!

Research problems. Single workflow: performance (period/latency) vs. environment (energy, cost); robust mappings (replication for performance vs. replication for reliability). Several (concurrent) workflows: competition for CPU and network resources; fairness between applications: max-min throughput, min-max stretch (slowdown factor); sensitivity to application/platform parameter changes.

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Cycle-stealing scenario (1/2). Execute 4 jobs A, B, C, D during the week-end. Replicate them on 3 machines P_1, P_2 and P_3. The risk increases linearly with time. Machines are reclaimed at 8am on Monday with probability 1.

Cycle-stealing scenario (2/2). (Figure: the four jobs A, B, C, D replicated on the three machines P_1, P_2, P_3.)

Problem. n chunks (a divisible computational workload), p identical computers. Replicate the execution of each chunk p times; the chunks are partitioned into groups (Group 1, Group 2, Group 3 in the example). Unrecoverable (fail-stop) interruptions. A priori knowledge of the risk (failure probability). Goal: maximize the expected amount of work done.

Symmetric schedules (1/2). (Figure: the time-steps at which machines P_1, ..., P_4 execute the chunks of each group.)

Symmetric schedules (2/2). (Figure: time-steps of the group executions on P_1, ..., P_4.) G_ij = time-step of the i-th execution of group j.

Optimization problem (1/2). All four executions of a group must fail for its work to be lost, and this happens with probability proportional to the product of the corresponding time-steps. For group 1: the first execution fails with probability λωG_11, with G_11 = 1; the second with probability λωG_21, with G_21 = 6; the third with probability λωG_31, with G_31 = 9; the fourth with probability λωG_41, with G_41 = 12. Hence group 1 is entirely lost with probability λ⁴ω⁴ Π_{i=1}^4 G_i1.

Optimization problem (2/2). E(work done) = Σ_{j=1}^{n/p} p·ω·(1 − λ^p ω^p Π_{i=1}^p G_{i,j}). Hence, minimize K = Σ_{j=1}^{n/p} Π_{i=1}^p G_{i,j}, where the entries of G form a permutation of [1..n]. Lower bound (by the arithmetic-geometric mean inequality): K_min = (n/p) · (n!)^{p/n}.
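
A short sketch of the two quantities, with names chosen here for illustration; for the 12-chunk, 4-processor instance of the next slides, the bound evaluates to about 2347, which the slides round to 2348.

```python
from math import factorial, prod

def schedule_cost(G):
    """K = sum over the groups (columns of G) of the product of their time-steps."""
    return sum(prod(column) for column in zip(*G))

def lower_bound(n, p):
    """AM-GM bound: K >= (n/p) * (n!) ** (p/n)."""
    return (n / p) * factorial(n) ** (p / n)

print(lower_bound(12, 4))   # ~2347.3, rounded to 2348 on the slides
```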

Heuristics (1/3). (a) Cyclic: K = 3104. (b) Reverse: K = 2368.

Heuristics (2/3). (c) Mirror: K = 2572. (d) Snake: K = 2464.

Heuristics (3/3). (e) Greedy: K = 2368. (f) Optimal: K = 2352, to be compared with K_min = 3·(12!)^{1/3} ≈ 2348.

A nice little algorithmic challenge. (Matrix with rows P_1, ..., P_p and columns Group 1, ..., Group n/p.) Fill the matrix with a permutation of [1..n] minimizing the sum of the column products. OK, OK, Greedy is asymptotically optimal. Still, this is a frustrating open problem.

Many potential sources of heterogeneity. Processors of different speeds; links of different bandwidths; resources obeying different risk functions (different owner categories?). Trade-off between speed and reliability.

Mapping applications on volatile resources. Iterative applications: independent / tightly-coupled tasks within each iteration; synchronization (checkpoint) after each iteration. Master-worker paradigm: heterogeneous processors; limited available bandwidth from the master to the workers. Volatile desktop grids: resource availability UP / RECLAIMED / DOWN, modeled as a Markov process (with different transition probabilities). Goal: on-line policies for resource selection. Which resources to enroll: fast or reliable ones? When to change the configuration?

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Motivation. Framework: a very, very large number of processing elements (e.g., 2^20); a failure-prone platform (like any realistic platform); a large application to be executed ⇒ failures will indeed occur before completion! Questions: when to checkpoint the application? Should we always use all processors?

Notations. Overall size of the work: W. Checkpoint cost: C. Downtime: D (software rejuvenation via rebooting, or replacement by a spare). Recovery cost after a failure: R.

Makespan vs. NextFailure. Makespan: minimize the job's expected makespan. NextFailure: maximize the expected amount of work completed before the next failure. Rationale: NextFailure is an optimization on a failure-by-failure basis, hopefully a good approximation, at least for large job sizes W.

Makespan. T(0 | τ) = 0, and T(ω | τ) = ω_1 + C + T(ω − ω_1 | τ + ω_1 + C) if the processor does not fail during the next ω_1 + C units of time, T_wasted(ω_1 + C | τ) + T(ω | R) otherwise. Makespan: find ω_1 = f(ω | τ) ≤ ω that minimizes E(T(W | 0)).

NextFailure. W(0 | τ) = 0, and W(ω | τ) = ω_1 + W(ω − ω_1 | τ + ω_1 + C) if the processor does not fail during the next ω_1 + C units of time, 0 otherwise. NextFailure: find ω_1 = f(ω | τ) ≤ ω that maximizes E(W(W | 0)).
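
A discretized sketch of the NextFailure recursion, under the simplifying assumption of exponential failures (memoryless, so the elapsed time τ drops out of the expectation); the grid step, names, and toy instance are illustrative.

```python
import math
from functools import lru_cache

def next_failure_dp(W, C, lam, step=1.0):
    """Expected work completed before the next failure, maximizing over the
    size omega_1 of the next chunk on a grid of granularity `step`:
    E[W(w)] = max over omega_1 of exp(-lam*(omega_1 + C)) * (omega_1 + E[W(w - omega_1)])."""
    n = int(round(W / step))
    @lru_cache(maxsize=None)
    def best(k):                       # k = remaining work, in units of `step`
        if k == 0:
            return 0.0
        return max(math.exp(-lam * (j * step + C)) * (j * step + best(k - j))
                   for j in range(1, k + 1))
    return best(n)

# Toy instance: 100 units of work, checkpoint cost 1, failure rate 0.01 per time unit
print(next_failure_dp(100, 1, 0.01, step=5.0))
```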

Single-processor jobs. Makespan: for exponential failures, F(t) = 1 − e^{−λt}, the periodic strategy is optimal; for other laws (e.g. Weibull, F(t) = 1 − e^{−(t/λ)^k}), the problem is open. NextFailure: open for all failure laws; accurate dynamic programming approximations.

Parallel jobs. Models: embarrassingly parallel jobs, W(p) = W/p; Amdahl parallel jobs, W(p) = W/p + γW; numerical kernels, W(p) = W/p + γW^{2/3}/√p. Accurate dynamic programming approximations.
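
The three work models in a few lines, with names chosen here for illustration.

```python
def parallel_work(W, p, model="amdahl", gamma=0.0):
    """Work per processor W(p) for the three job models above."""
    if model == "parallel":            # embarrassingly parallel
        return W / p
    if model == "amdahl":              # sequential fraction gamma
        return W / p + gamma * W
    if model == "kernel":              # numerical kernels
        return W / p + gamma * W ** (2 / 3) / p ** 0.5
    raise ValueError(model)

print(parallel_work(1e6, 1024, "amdahl", gamma=1e-3))   # 1976.5625
```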

Parallel jobs (continued). (Figure: loss(p_used) as a function of p_used, the number of processors used (log scale), for jobs obeying Amdahl's law on Jaguar, with γ = 0.01 or 0.001 and MTBF = 26.5, 53 or 106 years.)

Problems. Software/hardware techniques to reduce checkpoint, recovery and migration times, and to improve failure prediction. Self-fault-tolerant algorithms (e.g. asynchronous iterative methods). Multi-criteria scheduling problems: throughput/energy/reliability. Add replication and group execution. We need to combine all these approaches!

Outline: 1 Who needs a scheduler? 2 Background: scheduling DAGs 3 Steady-state scheduling 4 Energy-aware scheduling 5 Scheduling with replication 6 Checkpointing strategies 7 Conclusion

Tools for the road. Forget absolute makespan minimization. Resource selection is mandatory. Divisible load (fractional tasks). Single application: period / latency / energy / robustness. Several applications: max-min fairness, min-max stretch. Linear programming: an absolute bound to assess heuristics. Probabilities and stochastic models... unavoidable.

Scheduling for large-scale platforms. If the platform is well identified and relatively stable, try to (i) accurately model its hierarchical structure and (ii) design efficient/robust/energy-aware scheduling algorithms. If the platform is not stable enough, or if it evolves too fast, dynamic schedulers are the only option. Otherwise, grab any opportunity to inject static knowledge into dynamic schedulers. Is this opportunity a niche, or does it encompass a wide range of applications? The future is on our side!


More information

Scheduling multiple bags of tasks on heterogeneous master-worker platforms: centralized versus distributed solutions

Scheduling multiple bags of tasks on heterogeneous master-worker platforms: centralized versus distributed solutions Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL n o 5668 Scheduling multiple bags of tasks on heterogeneous master-worker

More information

Overlay networks maximizing throughput

Overlay networks maximizing throughput Overlay networks maximizing throughput Olivier Beaumont, Lionel Eyraud-Dubois, Shailesh Kumar Agrawal Cepage team, LaBRI, Bordeaux, France IPDPS April 20, 2010 Outline Introduction 1 Introduction 2 Complexity

More information

Distributed Optimization. Song Chong EE, KAIST

Distributed Optimization. Song Chong EE, KAIST Distributed Optimization Song Chong EE, KAIST songchong@kaist.edu Dynamic Programming for Path Planning A path-planning problem consists of a weighted directed graph with a set of n nodes N, directed links

More information

Embedded Systems Design: Optimization Challenges. Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden

Embedded Systems Design: Optimization Challenges. Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden of /4 4 Embedded Systems Design: Optimization Challenges Paul Pop Embedded Systems Lab (ESLAB) Linköping University, Sweden Outline! Embedded systems " Example area: automotive electronics " Embedded systems

More information

CPU scheduling. CPU Scheduling

CPU scheduling. CPU Scheduling EECS 3221 Operating System Fundamentals No.4 CPU scheduling Prof. Hui Jiang Dept of Electrical Engineering and Computer Science, York University CPU Scheduling CPU scheduling is the basis of multiprogramming

More information

CEE 618 Scientific Parallel Computing (Lecture 7): OpenMP (con td) and Matrix Multiplication

CEE 618 Scientific Parallel Computing (Lecture 7): OpenMP (con td) and Matrix Multiplication 1 / 26 CEE 618 Scientific Parallel Computing (Lecture 7): OpenMP (con td) and Matrix Multiplication Albert S. Kim Department of Civil and Environmental Engineering University of Hawai i at Manoa 2540 Dole

More information

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling

Lecture 6. Real-Time Systems. Dynamic Priority Scheduling Real-Time Systems Lecture 6 Dynamic Priority Scheduling Online scheduling with dynamic priorities: Earliest Deadline First scheduling CPU utilization bound Optimality and comparison with RM: Schedulability

More information

Scheduling communication requests traversing a switch: complexity and algorithms

Scheduling communication requests traversing a switch: complexity and algorithms Scheduling communication requests traversing a switch: complexity and algorithms Matthieu Gallet Yves Robert Frédéric Vivien Laboratoire de l'informatique du Parallélisme UMR CNRS-ENS Lyon-INRIA-UCBL 5668

More information

Non-Work-Conserving Non-Preemptive Scheduling: Motivations, Challenges, and Potential Solutions

Non-Work-Conserving Non-Preemptive Scheduling: Motivations, Challenges, and Potential Solutions Non-Work-Conserving Non-Preemptive Scheduling: Motivations, Challenges, and Potential Solutions Mitra Nasri Chair of Real-time Systems, Technische Universität Kaiserslautern, Germany nasri@eit.uni-kl.de

More information

Revisiting Matrix Product on Master-Worker Platforms

Revisiting Matrix Product on Master-Worker Platforms Revisiting Matrix Product on Master-Worker Platforms Jack Dongarra 2, Jean-François Pineau 1, Yves Robert 1,ZhiaoShi 2,andFrédéric Vivien 1 1 LIP, NRS-ENS Lyon-INRIA-UBL 2 Innovative omputing Laboratory

More information

Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS. Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano

Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS. Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano Parallel PIPS-SBB Multi-level parallelism for 2-stage SMIPS Lluís-Miquel Munguia, Geoffrey M. Oxberry, Deepak Rajan, Yuji Shinano ... Our contribution PIPS-PSBB*: Multi-level parallelism for Stochastic

More information

Stochastic Optimization for Undergraduate Computer Science Students

Stochastic Optimization for Undergraduate Computer Science Students Stochastic Optimization for Undergraduate Computer Science Students Professor Joongheon Kim School of Computer Science and Engineering, Chung-Ang University, Seoul, Republic of Korea 1 Reference 2 Outline

More information

1 Markov decision processes

1 Markov decision processes 2.997 Decision-Making in Large-Scale Systems February 4 MI, Spring 2004 Handout #1 Lecture Note 1 1 Markov decision processes In this class we will study discrete-time stochastic systems. We can describe

More information

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value

A 2-Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value A -Approximation Algorithm for Scheduling Parallel and Time-Sensitive Applications to Maximize Total Accrued Utility Value Shuhui Li, Miao Song, Peng-Jun Wan, Shangping Ren Department of Engineering Mechanics,

More information

Maximizing throughput in zero-buffer tandem lines with dedicated and flexible servers

Maximizing throughput in zero-buffer tandem lines with dedicated and flexible servers Maximizing throughput in zero-buffer tandem lines with dedicated and flexible servers Mohammad H. Yarmand and Douglas G. Down Department of Computing and Software, McMaster University, Hamilton, ON, L8S

More information

Scheduling I. Today. Next Time. ! Introduction to scheduling! Classical algorithms. ! Advanced topics on scheduling

Scheduling I. Today. Next Time. ! Introduction to scheduling! Classical algorithms. ! Advanced topics on scheduling Scheduling I Today! Introduction to scheduling! Classical algorithms Next Time! Advanced topics on scheduling Scheduling out there! You are the manager of a supermarket (ok, things don t always turn out

More information

Embedded Systems 15. REVIEW: Aperiodic scheduling. C i J i 0 a i s i f i d i

Embedded Systems 15. REVIEW: Aperiodic scheduling. C i J i 0 a i s i f i d i Embedded Systems 15-1 - REVIEW: Aperiodic scheduling C i J i 0 a i s i f i d i Given: A set of non-periodic tasks {J 1,, J n } with arrival times a i, deadlines d i, computation times C i precedence constraints

More information

Average-Case Performance Analysis of Online Non-clairvoyant Scheduling of Parallel Tasks with Precedence Constraints

Average-Case Performance Analysis of Online Non-clairvoyant Scheduling of Parallel Tasks with Precedence Constraints Average-Case Performance Analysis of Online Non-clairvoyant Scheduling of Parallel Tasks with Precedence Constraints Keqin Li Department of Computer Science State University of New York New Paltz, New

More information

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors

Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors Parallelization of Multilevel Preconditioners Constructed from Inverse-Based ILUs on Shared-Memory Multiprocessors J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1 1 Deparment of Computer

More information

Parallel Performance Theory

Parallel Performance Theory AMS 250: An Introduction to High Performance Computing Parallel Performance Theory Shawfeng Dong shaw@ucsc.edu (831) 502-7743 Applied Mathematics & Statistics University of California, Santa Cruz Outline

More information

Lecture 2: Metrics to Evaluate Systems

Lecture 2: Metrics to Evaluate Systems Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video

More information

Periodic I/O Scheduling for Supercomputers

Periodic I/O Scheduling for Supercomputers Periodic I/O Scheduling for Supercomputers Guillaume Aupy 1, Ana Gainaru 2, Valentin Le Fèvre 3 1 Inria & U. of Bordeaux 2 Vanderbilt University 3 ENS Lyon & Inria PMBS Workshop, November 217 Slides available

More information

Modeling Fixed Priority Non-Preemptive Scheduling with Real-Time Calculus

Modeling Fixed Priority Non-Preemptive Scheduling with Real-Time Calculus Modeling Fixed Priority Non-Preemptive Scheduling with Real-Time Calculus Devesh B. Chokshi and Purandar Bhaduri Department of Computer Science and Engineering Indian Institute of Technology Guwahati,

More information

End-to-End Scheduling to Meet Deadlines in Distributed Systems

End-to-End Scheduling to Meet Deadlines in Distributed Systems Appeared in: Proceedings of the 12th International Conference on Distributed Computing Systems, Yokohama, Japan, June 1992, pages 452--459. End-to-End Scheduling to Meet Deadlines in Distributed Systems

More information

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco

Parallel programming using MPI. Analysis and optimization. Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Parallel programming using MPI Analysis and optimization Bhupender Thakur, Jim Lupo, Le Yan, Alex Pacheco Outline l Parallel programming: Basic definitions l Choosing right algorithms: Optimal serial and

More information

Power-Aware Linear Programming Based Scheduling for Heterogeneous Computer Clusters

Power-Aware Linear Programming Based Scheduling for Heterogeneous Computer Clusters Power-Aware Linear Programming Based Scheduling for Heterogeneous Computer Clusters Hadil Al-Daoud Department of Computing and Software McMaster University Email: aldaouhb@mcmasterca Issam Al-Azzoni INRIA

More information

Data redistribution algorithms for homogeneous and heterogeneous processor rings

Data redistribution algorithms for homogeneous and heterogeneous processor rings Data redistribution algorithms for homogeneous and heterogeneous processor rings Hélène Renard, Yves Robert and Frédéric Vivien LIP, UMR CNRS-INRIA-UCBL 5668 ENS Lyon, France {Helene.Renard Yves.Robert

More information

Flow Shop and Job Shop Models

Flow Shop and Job Shop Models Outline DM87 SCHEDULING, TIMETABLING AND ROUTING Lecture 11 Flow Shop and Job Shop Models 1. Flow Shop 2. Job Shop Marco Chiarandini DM87 Scheduling, Timetabling and Routing 2 Outline Resume Permutation

More information

FOR decades it has been realized that many algorithms

FOR decades it has been realized that many algorithms IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS 1 Scheduling Divisible Loads with Nonlinear Communication Time Kai Wang, and Thomas G. Robertazzi, Fellow, IEEE Abstract A scheduling model for single

More information

Module 5: CPU Scheduling

Module 5: CPU Scheduling Module 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 5.1 Basic Concepts Maximum CPU utilization obtained

More information

Steady-state scheduling of task graphs on heterogeneous computing platforms

Steady-state scheduling of task graphs on heterogeneous computing platforms Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON n o 5668 Steady-state scheduling of task graphs on heterogeneous computing platforms

More information

Lecture 13. Real-Time Scheduling. Daniel Kästner AbsInt GmbH 2013

Lecture 13. Real-Time Scheduling. Daniel Kästner AbsInt GmbH 2013 Lecture 3 Real-Time Scheduling Daniel Kästner AbsInt GmbH 203 Model-based Software Development 2 SCADE Suite Application Model in SCADE (data flow + SSM) System Model (tasks, interrupts, buses, ) SymTA/S

More information

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems

Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Online Scheduling Switch for Maintaining Data Freshness in Flexible Real-Time Systems Song Han 1 Deji Chen 2 Ming Xiong 3 Aloysius K. Mok 1 1 The University of Texas at Austin 2 Emerson Process Management

More information

Spatio-Temporal Thermal-Aware Scheduling for Homogeneous High-Performance Computing Datacenters

Spatio-Temporal Thermal-Aware Scheduling for Homogeneous High-Performance Computing Datacenters Spatio-Temporal Thermal-Aware Scheduling for Homogeneous High-Performance Computing Datacenters Hongyang Sun a,, Patricia Stolf b, Jean-Marc Pierson b a Ecole Normale Superieure de Lyon & INRIA, France

More information

Analytical Modeling of Parallel Systems

Analytical Modeling of Parallel Systems Analytical Modeling of Parallel Systems Chieh-Sen (Jason) Huang Department of Applied Mathematics National Sun Yat-sen University Thank Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar for providing

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Algorithm Evaluation 6.1 Basic Concepts Maximum CPU utilization obtained

More information

Parallel Performance Theory - 1

Parallel Performance Theory - 1 Parallel Performance Theory - 1 Parallel Computing CIS 410/510 Department of Computer and Information Science Outline q Performance scalability q Analytical performance measures q Amdahl s law and Gustafson-Barsis

More information

arxiv: v1 [cs.dc] 10 Feb 2014

arxiv: v1 [cs.dc] 10 Feb 2014 We Are Impatient: Algorithms for Geographically Distributed Load Balancing with (Almost) Arbitrary Load Functions arxiv:1402.2090v1 [cs.dc] 10 Feb 2014 Piotr Skowron Institute of Informatics University

More information

Saving Energy in Sparse and Dense Linear Algebra Computations

Saving Energy in Sparse and Dense Linear Algebra Computations Saving Energy in Sparse and Dense Linear Algebra Computations P. Alonso, M. F. Dolz, F. Igual, R. Mayo, E. S. Quintana-Ortí, V. Roca Univ. Politécnica Univ. Jaume I The Univ. of Texas de Valencia, Spain

More information

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Anoop Bhagyanath and Klaus Schneider Embedded Systems Chair University of Kaiserslautern

More information

Dense Arithmetic over Finite Fields with CUMODP

Dense Arithmetic over Finite Fields with CUMODP Dense Arithmetic over Finite Fields with CUMODP Sardar Anisul Haque 1 Xin Li 2 Farnam Mansouri 1 Marc Moreno Maza 1 Wei Pan 3 Ning Xie 1 1 University of Western Ontario, Canada 2 Universidad Carlos III,

More information

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm

Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Failure Tolerance of Multicore Real-Time Systems scheduled by a Pfair Algorithm Yves MOUAFO Supervisors A. CHOQUET-GENIET, G. LARGETEAU-SKAPIN OUTLINES 2 1. Context and Problematic 2. State of the art

More information

Lecture 2: Scheduling on Parallel Machines

Lecture 2: Scheduling on Parallel Machines Lecture 2: Scheduling on Parallel Machines Loris Marchal October 17, 2012 Parallel environment alpha in Graham s notation): P parallel identical Q uniform machines: each machine has a given speed speed

More information

416 Distributed Systems

416 Distributed Systems 416 Distributed Systems RAID, Feb 26 2018 Thanks to Greg Ganger and Remzi Arapaci-Dusseau for slides Outline Using multiple disks Why have multiple disks? problem and approaches RAID levels and performance

More information

CHAPTER 5 - PROCESS SCHEDULING

CHAPTER 5 - PROCESS SCHEDULING CHAPTER 5 - PROCESS SCHEDULING OBJECTIVES To introduce CPU scheduling, which is the basis for multiprogrammed operating systems To describe various CPU-scheduling algorithms To discuss evaluation criteria

More information

Aperiodic Task Scheduling

Aperiodic Task Scheduling Aperiodic Task Scheduling Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 Germany Springer, 2010 2017 年 11 月 29 日 These slides use Microsoft clip arts. Microsoft copyright

More information

P, NP, NP-Complete, and NPhard

P, NP, NP-Complete, and NPhard P, NP, NP-Complete, and NPhard Problems Zhenjiang Li 21/09/2011 Outline Algorithm time complicity P and NP problems NP-Complete and NP-Hard problems Algorithm time complicity Outline What is this course

More information

Divisible load theory

Divisible load theory Divisible load theory Loris Marchal November 5, 2012 1 The context Context of the study Scientific computing : large needs in computation or storage resources Need to use systems with several processors

More information

1 Introduction. 1.1 The Problem Domain. Self-Stablization UC Davis Earl Barr. Lecture 1 Introduction Winter 2007

1 Introduction. 1.1 The Problem Domain. Self-Stablization UC Davis Earl Barr. Lecture 1 Introduction Winter 2007 Lecture 1 Introduction 1 Introduction 1.1 The Problem Domain Today, we are going to ask whether a system can recover from perturbation. Consider a children s top: If it is perfectly vertically, you can

More information

Assessing the impact and limits of steady-state scheduling for mixed task and data parallelism on heterogeneous platforms

Assessing the impact and limits of steady-state scheduling for mixed task and data parallelism on heterogeneous platforms Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL n o 5668 Assessing the impact and limits of steady-state scheduling for

More information

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2

Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Real-time Scheduling of Periodic Tasks (1) Advanced Operating Systems Lecture 2 Lecture Outline Scheduling periodic tasks The rate monotonic algorithm Definition Non-optimality Time-demand analysis...!2

More information