Energy-aware scheduling for GreenIT in large-scale distributed systems

1 Energy-aware scheduling for GreenIT in large-scale distributed systems. Pascal Bouvry, University of Luxembourg. GreenIT CORE/FNR project.

2 Outline: Context and Motivation; Problem Description; Proposed Solution; Performance Evaluation and Experiments; Conclusions and Perspectives.

3 GreenIT project. The aim: to provide a holistic, autonomic, energy-efficient solution to manage, provision, and administer the various resources within large-scale distributed systems. Main research challenges: (1) development of meta-models, to adequately define a unified performance metric of the system, the system's properties, constraints, and optimization criteria; (2) development of scheduling and resource management methodologies, resulting in multi-objective, multi-constraint problems; (3) development of autonomic resource management, for which the use of multi-agent systems (MAS) is planned.

4 Context and Motivations. Energy consumption in distributed computing systems raises several issues. Environmental concerns: carbon emissions. Monetary issues: energy bills; acquisition and maintenance of cooling systems. Performance concerns: reliability; efficiency/scalability.

5 Current State & Efforts. Hardware approach: energy-efficient microprocessors; solid-state drives; energy-efficient monitors. Software approach: virtualization; energy-aware scheduling and resource allocation.

7 Software Approach: Current State & Efforts. CPU throttling using dynamic voltage scaling and dynamic frequency scaling (DVS/DFS, together DVFS). Intel SpeedStep technology, Pentium M: 1.6 GHz (1.484 V) down to 0.6 GHz (0.956 V). AMD Cool'n'Quiet technology, Athlon 64: 2.4 GHz (89 W) down to 0.8 GHz (32/22 W). AMD PowerNow! technology, Turion 64 (Lion): 2.4 GHz (35 W) down to 2.0 GHz (31 W).

8 Software Approach: Current State & Efforts. Energy-aware scheduling and resource allocation. Power: $P = C V^2 f$. [Figure not recoverable; credit: Albert Y Zomaya.]

10 Problem Description: System Model. The large-scale system is composed of a set M of m heterogeneous, DVFS-enabled processors that are fully interconnected. Inter-processor communication is assumed to run at the same speed on all links, without contention. It is also assumed that a message can be transmitted while tasks are executing.
Table 1. Voltage-relative speed pairs [7, 10]. Rows: level. Columns, for each of Pair 1 to Pair 6: voltage $V_k$; relative speed (%).

11 Problem Description: Application Model. A parallel program is modeled as a directed acyclic graph (DAG) G = (T, E), consisting of a set T of n tasks and a set E of e edges. [Figure: example task graph.]
Table 2. Computation cost ($p_i$ at level $L_0$) and task priorities (b-level and t-level). Columns: task; $p_i$ on $m_0$; $p_i$ on $m_1$; $p_i$ on $m_2$; $p_i$; b-level; t-level.
$$b\text{-level}(t_i) = p_i + \max_{t_j \in succ(t_i)} \{\, b\text{-level}(t_j) + c_{ij} \,\}$$
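The b-level recurrence can be illustrated with a short sketch. The four-task diamond graph, its costs, and the function name `b_levels` are hypothetical and chosen only for illustration; they are not the example graph from the slide.

```python
from functools import lru_cache

def b_levels(cost, succ, comm):
    """Return the b-level of every task.

    cost[i]      -- computation cost p_i of task i
    succ[i]      -- list of successors of task i (missing key = exit task)
    comm[(i, j)] -- communication cost c_ij on edge (i, j)
    """
    @lru_cache(maxsize=None)
    def b(i):
        # Longest remaining path through any successor; 0 for exit tasks.
        tail = max((b(j) + comm[(i, j)] for j in succ.get(i, [])), default=0)
        return cost[i] + tail

    return {i: b(i) for i in cost}

# Hypothetical diamond-shaped DAG: 0 -> {1, 2} -> 3
cost = {0: 2, 1: 3, 2: 4, 3: 1}
succ = {0: [1, 2], 1: [3], 2: [3]}
comm = {(0, 1): 1, (0, 2): 2, (1, 3): 1, (2, 3): 1}

levels = b_levels(cost, succ, comm)
print(levels)  # {0: 10, 1: 5, 2: 6, 3: 1}
```

The entry task has the largest b-level because it heads the longest path (0, 2, 3), which is why b-level is commonly used as a task priority in list scheduling.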

12 Problem Description: Energy Model. Derived from the power consumption model of complementary metal-oxide semiconductor (CMOS) logic circuits.
Capacitive power: $P_c = A C_{ef} V^2 f$
Our energy model:
$$E_c = \sum_{i=1}^{n} A C_{ef} V_i^2 f \cdot p_i = \sum_{i=1}^{n} K V_i^2 p_i$$
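A minimal numeric sketch of the energy model $E_c = \sum_i K V_i^2 p_i$ follows. It assumes K = 1 and that $p_i$ is the execution time of task i at its assigned voltage; the doubling of execution time at 0.90 V is a hypothetical figure, since the relative-speed values of Table 1 are not available here.

```python
def energy(schedule, k=1.0):
    """Energy of a schedule given as (voltage V_i, execution time p_i) pairs."""
    return sum(k * v ** 2 * p for v, p in schedule)

# One task run at 1.75 V for 10 time units ...
fast = energy([(1.75, 10.0)])
# ... versus the same task at 0.90 V, assumed (hypothetically) to take twice as long.
slow = energy([(0.90, 20.0)])

print(fast, slow)
```

Because voltage enters quadratically while execution time enters linearly, the lower voltage still uses less energy in this example even though the task runs longer.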

13 Problem Description: Scheduling Model. Allocate a set N of n tasks to a set M of m processors, without violating precedence constraints, so as to minimize makespan while keeping energy consumption as low as possible. [Figure: mapping of the task graph onto DVS-enabled processors $m_0$, $m_1$, $m_2$, with supply voltage levels including 1.40 V, 1.20 V, and 0.90 V.]

14 Proposed Solution. We treat the problem as a weighted multi-criteria scheduling problem on unrelated parallel machines, and propose a scheduling algorithm based on cellular genetic algorithms (cGAs).

15 Cellular Genetic Algorithms (cGA). A GA with a structured population: individuals are arranged on a two-dimensional toroidal mesh. This locality is known as isolation by distance: individuals interact only within their neighborhoods. It slows down convergence and gives a good balance between exploration and exploitation.

16 Toroidal Population. [Figure: toroidal population and the most typically used neighborhoods: L5, L9, C9, C13, C21, C25.]
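As an illustration of the toroidal mesh, the L5 neighborhood (the cell itself plus its north, south, west, and east neighbors) can be computed with wrap-around indices. The function name and the (row, column) convention are assumptions made for this sketch.

```python
def l5_neighborhood(row, col, rows, cols):
    """Cells of the L5 neighborhood of (row, col) on a rows x cols torus."""
    return [
        (row, col),                # the individual itself
        ((row - 1) % rows, col),   # north, wrapping past the top edge
        ((row + 1) % rows, col),   # south, wrapping past the bottom edge
        (row, (col - 1) % cols),   # west, wrapping past the left edge
        (row, (col + 1) % cols),   # east, wrapping past the right edge
    ]

# Corner cell of a 5 x 10 grid: its neighbors wrap to the opposite edges.
print(l5_neighborhood(0, 0, 5, 10))
# [(0, 0), (4, 0), (1, 0), (0, 9), (0, 1)]
```

The modulo arithmetic is what makes the mesh toroidal: every cell, including those on the border, has exactly the same neighborhood size.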

17 Canonical cGA Algorithm
Algorithm 1. Pseudo-code for a canonical cGA (asynchronous).
    while not StopCondition() do
        for all ind in population do
            neigh ← get_neighborhood(ind)
            parents ← select(neigh)
            offspring ← recombine(p_comb, parents)
            mutate(p_mut, offspring)
            evaluate(offspring)
            replace(ind, offspring)
        end for
    end while
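A runnable sketch of the asynchronous loop is given below on a toy problem. Everything specific here is an assumption for illustration: the OneMax fitness (count of 1 bits), the bit-string encoding, the "replace if not worse" policy, the mutation rate, and the generation count. It is not the scheduling algorithm itself.

```python
import random

def neighbors(r, c, rows, cols):
    """L5 neighborhood of cell (r, c) on a toroidal grid."""
    return [(r, c), ((r - 1) % rows, c), ((r + 1) % rows, c),
            (r, (c - 1) % cols), (r, (c + 1) % cols)]

def run_cga(rows=5, cols=10, n_bits=20, generations=30,
            p_comb=0.85, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    fitness = sum  # OneMax: number of 1 bits (toy objective, to be maximized)
    pop = [[[rng.randint(0, 1) for _ in range(n_bits)]
            for _ in range(cols)] for _ in range(rows)]
    for _ in range(generations):
        # Asynchronous update: each cell is replaced in place, so later
        # cells in the sweep already see the new individuals.
        for r in range(rows):
            for c in range(cols):
                neigh = [pop[i][j] for i, j in neighbors(r, c, rows, cols)]
                # Binary tournament inside the neighborhood, once per parent.
                p1 = max(rng.sample(neigh, 2), key=fitness)
                p2 = max(rng.sample(neigh, 2), key=fitness)
                child = list(p1)
                if rng.random() < p_comb:          # single-point recombination
                    cut = rng.randrange(1, n_bits)
                    child = p1[:cut] + p2[cut:]
                if rng.random() < p_mut:           # single-point mutation
                    child[rng.randrange(n_bits)] ^= 1
                if fitness(child) >= fitness(pop[r][c]):  # replace if not worse
                    pop[r][c] = child
    return max((ind for row in pop for ind in row), key=fitness)

best = run_cga()
print(sum(best), "ones out of", len(best))
```

A synchronous variant would instead write offspring into a temporary grid and swap it in after the full sweep.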

18 Proposed Solution: Hybrid cGA (EACS), L5 neighborhood
Algorithm 2. Pseudo-code for the hybrid cGA.
    while not StopCondition() do
        for all ind in population do
            neigh ← get_neighborhood(ind)
            parents ← select(neigh)
            offspring ← recombine(p_comb, parents)
            mutate(p_mut, offspring)
            evaluate(offspring)
            replace(ind, offspring)
            local_search(best_offspring)
            elitism(replace_worst_offspring)
        end for
    end while

19 Solution representation. [Figure: example task graph.]
Tasks:      t_0 t_1 t_2 t_3 t_4 t_5 t_6 t_7
Processors: m_0 m_2 m_0 m_0 m_1 m_2 m_1 m_2
Voltage:    V_4 V_2 V_1 V_1 V_4 ...

20 Genetic Operators. Selection: tournament, best of 2. Recombination: single-point recombination. [Figure: two parents over tasks t_0 to t_7 are cut at the same recombination point to form a new offspring.]
Parent 1 processors:      m_0 m_2 m_0 m_0 m_1 m_2 m_1 m_2
Parent 2 processors:      m_0 m_0 m_0 m_0 m_1 m_2 m_1 m_2
New offspring processors: m_0 m_2 m_0 m_0 m_1 m_2 m_1 m_2
Voltage entries as extracted: parents V_4 V_2 V_1 V_1 V_4 V_1 V_1 V_1; offspring V_4 V_2 V_1 ...

21 Genetic Operators. Mutation: single-point mutation. [Figure: an offspring over tasks t_0 to t_7 before and after mutation.]
Offspring processors:         m_0 m_2 m_0 m_0 m_1 m_2 m_1 m_2 (voltages V_4 V_2 V_1 ...)
Mutated offspring processors: m_0 m_2 m_0 m_2 m_0 m_2 m_1 m_2 (voltages V_4 V_2 V_1 V_1 ...)
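The two operators can be sketched on the task-indexed representation as follows. The encoding as a list of (processor, voltage level) pairs, the function names, and all voltage values are assumptions for illustration; only the processor rows of the two parents are taken from the slide.

```python
import random

def recombine(parent1, parent2, cut):
    """Single-point recombination: head of parent1, tail of parent2."""
    return parent1[:cut] + parent2[cut:]

def mutate(individual, n_procs, n_levels, rng):
    """Single-point mutation: reassign one randomly chosen task."""
    child = list(individual)
    pos = rng.randrange(len(child))
    child[pos] = (rng.randrange(n_procs), rng.randrange(n_levels))
    return child

# Gene i = (processor index, voltage level index) for task t_i.
# Voltage indices below are hypothetical.
parent1 = [(0, 4), (2, 2), (0, 1), (0, 1), (1, 4), (2, 1), (1, 1), (2, 1)]
parent2 = [(0, 1), (0, 1), (0, 1), (0, 1), (1, 1), (2, 1), (1, 1), (2, 1)]

offspring = recombine(parent1, parent2, cut=4)
mutated = mutate(offspring, n_procs=3, n_levels=6, rng=random.Random(0))

print([p for p, _ in offspring])  # [0, 2, 0, 0, 1, 2, 1, 2]
```

Because each gene position corresponds to a fixed task, both operators always yield a complete assignment, and precedence constraints only need to be handled when the schedule is evaluated.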

22 Fitness function. Normalized fitness function, limited to the range [0, 1]:
$$\bar{f}_i(x) = \frac{f_i(x) - \min_{x \in F} f_i(x)}{\max_{x \in F} f_i(x) - \min_{x \in F} f_i(x)}$$
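A minimal sketch of this min-max normalization, applied to the makespan and energy values reported on the performance evaluation slide. Treating those three schedules as the whole set F, and returning 0 when all values are equal, are assumptions made for the example.

```python
def normalize(values):
    """Min-max normalize objective values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All solutions tie on this objective; avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# HEFT without DVS, HEFT + DVS, proposed solution
makespans = normalize([89, 89, 74])
energies = normalize([380, 333, 236])

print(makespans)  # [1.0, 1.0, 0.0]
print(energies)
```

Normalizing each objective separately puts makespan and energy on the same scale, so a weighted combination is not dominated by whichever objective has the larger raw values.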

23 Local Search. Iterative random local search with DVFS.
Algorithm 2. Pseudo-code for random local search and voltage scaling.
    searchstep ← 0
    while searchstep < MAXSTEPS do
        Pick a task t_i randomly
        Pick a processor m_j randomly
        Pick a voltage v_k randomly from the set of voltages of m_j
        Assign t_i to processor m_j at operating voltage v_k if this reduces both the energy and the earliest finish time (EFT) of t_i, or reduces the energy without increasing the EFT
        searchstep ← searchstep + 1
    end while
[Figure: DVFS property.]
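The acceptance rule of the local search can be sketched on a simplified model. The assumptions are substantial and labeled in the code: tasks are treated as independent, so a task's finish time is just its execution time (no precedence or queuing), the relative speeds per voltage level are hypothetical, and the cost matrix is invented for the example. Only the four voltage levels come from the slides.

```python
import random

VOLTS = [1.75, 1.40, 1.20, 0.90]   # supply voltage levels shown on the slides
SPEED = [1.0, 0.8, 0.6, 0.4]       # hypothetical relative speed per level

def exec_time(cost, task, proc, level):
    """Execution time of a task on a processor at a voltage level."""
    return cost[task][proc] / SPEED[level]

def task_energy(cost, task, proc, level):
    """Energy K * V^2 * p with K = 1 (assumption)."""
    return VOLTS[level] ** 2 * exec_time(cost, task, proc, level)

def total_energy(cost, assign):
    return sum(task_energy(cost, t, p, l) for t, (p, l) in enumerate(assign))

def local_search(assign, cost, max_steps=200, seed=0):
    rng = random.Random(seed)
    assign = list(assign)
    n_procs = len(cost[0])
    for _ in range(max_steps):
        t = rng.randrange(len(assign))                       # random task
        cand = (rng.randrange(n_procs), rng.randrange(len(VOLTS)))
        cur = assign[t]
        # Accept only if energy drops and the finish time does not grow.
        if (task_energy(cost, t, *cand) < task_energy(cost, t, *cur)
                and exec_time(cost, t, *cand) <= exec_time(cost, t, *cur)):
            assign[t] = cand
    return assign

# cost[t][m]: hypothetical full-speed cost of task t on processor m
cost = [[10, 14, 12], [8, 6, 9], [5, 5, 7]]
start = [(1, 0), (0, 0), (2, 0)]   # deliberately poor initial mapping
result = local_search(start, cost)

print(total_energy(cost, start), "->", total_energy(cost, result))
```

In a full scheduler the finish-time check would use the EFT computed from the task graph and the processor's current load rather than the bare execution time.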

24 Experiment results. Real application graphs from the literature and the Standard Task Graph set.
Table: benchmarks employed in the simulations and their main characteristics. Columns: application; # of tasks; # of edges. Applications: fpppp, sparse matrix, LIGO, MDCode.
Sets of processors: {8, 16, 32}. Communication-to-computation ratios (CCR): 0.1, 0.5, 1, 2, 5. Performance metrics: makespan and energy consumption.

25 Performance Evaluation. [Figure: Gantt charts of the example schedule on three processors under each approach, with voltage levels 1.75 V, 1.40 V, 1.20 V, and 0.90 V.]
HEFT without DVS: MKSP = 89, Energy = 380
HEFT + DVS: MKSP = 89, Energy = 333
Proposed solution: MKSP = 74, Energy = 236

26 Simulations framework. EACS is compared with a simple genetic algorithm (GAC). GAC characteristics: unstructured population; tournament selection; single-point crossover; single-point mutation. Simulation parameters: P_mutation = [value lost]; P_crossover = 0.85; Lambda = 0.75; population: 50 individuals (10x5 grid for EACS); stopping condition: 500 generations.

27 Sparse Application. [Figures: average makespan and average consumed energy for problem sparse with 16 processors; series EACS_0.5, GAC_0.5, EACS_1.0, GAC_1.0, EACS_2.0, GAC_2.0, EACS_5.0, GAC_5.0.]

28 Sparse Application. [Figures: average makespan and average consumed energy for problem sparse with 32 processors; series EACS_0.5, GAC_0.5, EACS_1.0, GAC_1.0, EACS_2.0, GAC_2.0, EACS_5.0, GAC_5.0.]

29 Comparison against a list scheduling algorithm. HEFT: list-based scheduling algorithm for heterogeneous machines. Same benchmark. Aggregated average results.

30 Results regarding Processors. Table 2. Improvement according to the number of processors. Columns: processors; makespan (%); energy (%).

31 Results according to CCR. Table 3. Improvement according to the CCR. Columns: CCR; makespan (%); energy (%).

32 Results Experiments. Average makespan and energy for fpppp334 DAGs. [Figures, panels titled "LIGO76 application": average makespan (sec.) and average energy (millijoules) of HEFT and EACS versus CCR.]

33 Conclusions and Perspectives. We investigated the energy efficiency and scheduling problem on scalable computing systems. We proposed an evolutionary algorithm based on a cGA plus local search, combining iterative random local search with the DVS technique to minimize energy consumption without degrading makespan. The proposed solution outperforms the compared approaches (HEFT and GAC) in terms of both energy and completion time.

34 Conclusions and Perspectives (cont.). Future work includes investigating the proposed approach on applications with arbitrary structure and a large number of tasks. We plan to validate the approach using greencloud.gforge.uni.lu and real environments.

35 Questions? Thank you for your attention!
