Global EDF Scheduling for Parallel Real-Time Tasks


Washington University in St. Louis
Washington University Open Scholarship
Engineering and Applied Science Theses & Dissertations, Engineering and Applied Science, Spring 2014

Global EDF Scheduling for Parallel Real-Time Tasks
Jing Li, Washington University in St. Louis

Part of the Computer Engineering Commons, and the Computer Sciences Commons

Recommended Citation: Li, Jing, "Global EDF Scheduling for Parallel Real-Time Tasks" (2014). Engineering and Applied Science Theses & Dissertations.

This thesis is brought to you for free and open access by the Engineering and Applied Science at Washington University Open Scholarship. It has been accepted for inclusion in Engineering and Applied Science Theses & Dissertations by an authorized administrator of Washington University Open Scholarship. For more information, please contact digital@wumail.wustl.edu.

Washington University in St. Louis
School of Engineering and Applied Science
Department of Computer Science and Engineering

Thesis Examination Committee: Chenyang Lu, Kunal Agrawal, Roger Chamberlain

Global EDF Scheduling for Parallel Real-Time Tasks
by Jing Li

A thesis presented to the Graduate School of Arts and Sciences of Washington University in partial fulfillment of the requirements for the degree of Master of Science

May 2014
Saint Louis, Missouri

copyright by Jing Li 2014

Contents

List of Tables
List of Figures
Acknowledgments
Abstract
1 Introduction
2 Related Work
3 Task Model and Definitions
4 Capacity Augmentation Bound of 4 − 2/m for GEDF
    4.1 Notation
    4.2 Proof for Capacity Augmentation Bound
5 Fixed Point Schedulability Test
    5.1 Basic Schedulability Test
    Improving the Carry-In Work Calculation
    Improving the Calculation for Completion Time
6 Resource Augmentation Bound of 2 − 1/m for GEDF
    6.1 Proof for Resource Augmentation Bound
    An Example Providing an Intuition for the Proof
7 Lower Bound on Capacity Augmentation Bound of GEDF
8 Simulation Evaluation
    Task Sets and Experimental Setup
    Simulation Results
        Erdos-Renyi Method
        Synchronous Method
9 Parallel GEDF Platform
    9.1 Background
        OpenMP Overview
        LITMUS^RT Overview
    PGEDF Platform Implementation
        Programming Interface
        PGEDF Operation
10 Experimental Evaluation of PGEDF
    Experimental Machines
    Task Set Generation
    Baseline Platform
    Experiment Results
11 Conclusions

List of Tables

10.1 Task Set Characteristics

List of Figures

3.1 Example task with work C = 8 and critical-path length L = 6
4.1 Example task set execution trace
Examples of task set execution on 2 cores
Structure of the task set that demonstrates GEDF does not provide a capacity augmentation bound less than (3 + √5)/2
Execution of the task set under GEDF at speed [...]
Failure ratio of GEDF (solid line) vs. DECOMP (dashed line) for G(n, p) tasks with different task set utilization percentages (speedups). The left three figures show the results for 64 cores, and the right three for 16 cores. From top down, figures show results with small, medium and large values of p respectively
The left figure shows the effect of varying p on the speedup required to make all task sets schedulable. The right figure shows the effect of varying m on the speedup required to make all task sets schedulable
(harmonic period) Performance of GEDF (solid line) vs. DECOMP (dashed line) for different values of m. GEDF is always better than DECOMP. In general, increasing the number of processors increases failure rates
Task Program Format
Format of the Configuration File
Main Structure of Each Real-Time Task in PGEDF
Failure ratio of PGEDF vs. RT-OpenMP with different percentages of utilization (speedup) for 14-core task sets with low slack and harmonic periods
Failure ratio of PGEDF vs. RT-OpenMP with different percentages of utilization (speedup) for 14-core task sets with high slack and harmonic periods
Failure ratio of PGEDF vs. RT-OpenMP with different percentages of utilization (speedup) for 14-core task sets with low slack and high parallelism
Failure ratio of PGEDF vs. RT-OpenMP with different percentages of utilization (speedup) for 7-core task sets

Acknowledgments

I would like to first thank both my academic advisors, Dr. Chenyang Lu and Dr. Kunal Agrawal, who have been guiding my study and teaching me how to do great research, from the big picture (like finding important and hard problems to tackle) to every detail (such as presentation skills, writing skills and proving techniques). I am grateful to Dr. Christopher Gill, who has been extremely supportive of all of my research. I am also thankful for the many collaborators who made this research possible: Abusayeed Saifullah, who started the research on parallel real-time scheduling and led me into this research area; David Ferry, who started the implementation of the first parallel real-time platform and with whom I have been working closely and learning; Kevin Kieselbach, who helped me to build the platform interface and framework; Zheng Luo, who helped me to implement the prototype system and to run extensive measurements to test the system.

Finally, this research was supported in part by NSF grants CCF (CPS), CCF and CCF.

Jing Li
Washington University in Saint Louis
May 2014

ABSTRACT OF THE THESIS

Global EDF Scheduling for Parallel Real-Time Tasks
by Jing Li
Master of Science in Computer Science
Washington University in St. Louis, May 2014
Research Advisors: Professor Chenyang Lu, Professor Kunal Agrawal

As multicore processors become ever more prevalent, it is important for real-time programs to take advantage of intra-task parallelism in order to support computation-intensive applications with tight deadlines. In this thesis, we consider the Global Earliest Deadline First (GEDF) scheduling policy for task sets consisting of parallel tasks. Each task can be represented by a directed acyclic graph (DAG) where nodes represent computational work and edges represent dependences between nodes. In this model, we prove that GEDF provides a capacity augmentation bound of $4 - \frac{2}{m}$ and a resource augmentation bound of $2 - \frac{1}{m}$. The capacity augmentation bound acts as a linear-time schedulability test since it guarantees that any task set with total utilization of at most $m/(4 - \frac{2}{m})$, where each task's critical-path length is at most $1/(4 - \frac{2}{m})$ of its deadline, is schedulable on $m$ cores under GEDF. In addition, we present a pseudo-polynomial time fixed-point schedulability test for GEDF; this test uses a carry-in work calculation based on the proof for the capacity bound.

Finally, we present and evaluate a prototype platform called PGEDF for scheduling parallel tasks using GEDF. PGEDF is built by combining the GNU OpenMP runtime system and the LITMUS^RT operating system. This platform allows programmers to write parallel OpenMP tasks and specify real-time parameters such as deadlines for tasks. We perform two kinds of experiments to evaluate the performance of GEDF for parallel tasks: (1) we run numerical simulations for DAG tasks, and (2) we execute randomly generated tasks using PGEDF. Both sets of experiments indicate that GEDF performs surprisingly well and outperforms an existing scheduling technique that involves task decomposition.

Chapter 1

Introduction

During the last decade, the increase in performance of processor chips has come primarily from increasing numbers of cores. This has led to extensive work on real-time scheduling techniques that can exploit multicore and multiprocessor systems. Most prior work has concentrated on inter-task parallelism, where each task runs sequentially (and therefore can only run on a single core) and multiple cores are exploited by increasing the number of tasks. This type of scheduling is called multiprocessor scheduling. When a model is limited to inter-task parallelism, each individual task's total execution requirement must be smaller than its deadline, since individual tasks cannot run any faster than on a single-core machine. In order to enable tasks with higher execution demands and tighter deadlines, such as those used in autonomous vehicles, video surveillance, computer vision, radar tracking and real-time hybrid testing [54], we must enable parallelism within tasks.

In this paper, we are interested in parallel scheduling, where in addition to inter-task parallelism, task sets contain intra-task parallelism, which allows threads from one task to run in parallel on more than a single core. While there has been some recent work in this area, many of these approaches are based on task decomposition [43, 64, 63], which first decomposes each parallel task into a set of sequential subtasks with assigned intermediate release times and deadlines, and then schedules these sequential subtasks using a known multiprocessor scheduling algorithm. In this work, we are interested in analyzing the performance of global EDF (GEDF) schedulers without any decomposition. We consider a general task model, where each task is represented as a directed acyclic graph (DAG), each node represents a sequence of instructions (thread) and each edge represents a dependency between nodes. A node is ready to be executed when all its predecessors have been executed. GEDF works as follows: for ready nodes at each time step, the scheduler first tries to schedule as many jobs with the earliest deadline as it can; then it schedules jobs with the next earliest deadline, and so on, until either all cores are busy or no more nodes are ready.

Compared with other schedulers, GEDF has benefits, such as automatic load balancing. Efficient and scalable implementations of GEDF for sequential tasks are available for Linux [47] and LITMUS^RT [16], which can be used to implement GEDF for parallel tasks if decomposition is not required. Prior theory analyzing GEDF for parallel tasks is either restricted to a single recurring task [10] or considers response time analysis for soft real-time tasks [51]. In this paper, we consider task sets with n tasks and analyze their schedulability under a GEDF scheduler in terms of augmentation bounds.

We distinguish between two types of augmentation bounds, both of which are called resource augmentation in the previous literature. By standard definition, a scheduler S provides a resource augmentation bound of b if the following condition holds: if an ideal scheduler can schedule a task set on m unit-speed cores, then S can schedule that task set on m cores of speed b. Note that the ideal scheduler (optimal schedule) is only a hypothetical scheduler, meaning that if a feasible schedule exists for a task set, then this ideal scheduler is guaranteed to schedule it. Unfortunately, Fisher et al. [32] proved that optimal online multiprocessor scheduling of sporadic task systems is impossible. Since there may be no way to tell whether the ideal scheduler can schedule a given task set on unit-speed cores, a resource augmentation bound may not provide a schedulability test. Therefore, we distinguish resource augmentation from a capacity augmentation bound, which can serve as an easy schedulability test.
If, on unit-speed cores, a task set has total utilization of at most m and the critical-path length of each task is at most its deadline, then a scheduler S with capacity augmentation bound b can schedule this task set on m cores of speed b. Note that the ideal scheduler cannot schedule any task set that does not meet these utilization and critical-path length bounds on unit-speed cores; therefore, a capacity augmentation bound of b implies a resource augmentation bound of b. Capacity augmentation bounds have the advantage that they directly lead to schedulability tests, since one can easily check the bounds on utilization and critical-path length for any task set.

The contributions presented in this paper are as follows:

1. For a system with m identical cores, we prove a capacity augmentation bound of $4 - \frac{2}{m}$ (which approaches 4 as m approaches infinity) for sporadic task sets with implicit deadlines, in which the relative deadline of each task is equal to its period. Another way to understand this bound is: if a task set has total utilization at most $m/(4 - \frac{2}{m})$ and the critical-path length of each task is at most $1/(4 - \frac{2}{m})$ of its deadline, then it can be scheduled using GEDF on unit-speed cores.

2. While the capacity augmentation bound functions as a linear-time schedulability test, we further provide a fixed-point schedulability test that may admit more task sets but takes pseudo-polynomial time to compute.

3. For a system with m identical cores, we prove a resource augmentation bound of $2 - \frac{1}{m}$ (which approaches 2 as m approaches infinity) for sporadic task sets with arbitrary deadlines.

4. We also show that GEDF's capacity bound for parallel task sets (even with implicit deadlines) is lower bounded by $2 - \frac{1}{m}$. In particular, we show example task sets with utilization m where the critical-path length of each task is no more than its deadline, while GEDF misses a deadline on m cores with speed less than [...].

5. We conduct simulation experiments to show that the capacity augmentation bound is safe for task sets with different DAG structures (as mentioned above, checking the resource augmentation bound is difficult since we cannot compute the optimal schedule). Simulations show that GEDF performs surprisingly well. All simulated random task sets meet their deadlines with 50% utilization (core speed of 2). We also compare GEDF with a scheduling technique that decomposes parallel tasks and then schedules decomposed subtasks using GEDF [63]. For all of the DAG task sets considered in our experiments, the GEDF scheduler without decomposition has better performance.

6. To demonstrate the feasibility of parallel GEDF scheduling in real systems, we implement a prototype platform named PGEDF. PGEDF supports standard OpenMP programs with parallel for-loops.
Therefore, it supports a subset of DAGs, namely synchronous tasks, where the program consists of a sequence of segments that can be parallel or sequential, and parallel segments are represented using parallel for-loops. While not as general as DAGs, these synchronous tasks constitute a large subset of interesting parallel programs. PGEDF integrates the GNU OpenMP runtime system [60] and the LITMUS^RT patched Linux kernel [16], where the former executes each task with parallel threads and the latter is responsible for scheduling the threads of all tasks under GEDF scheduling.

7. We evaluate the performance of PGEDF with randomly generated synthetic task sets. With those task sets, all deadlines are met when total utilization is less than 30% (core speed of 3.3) in PGEDF. We compare PGEDF with an existing parallel real-time platform, RT-OpenMP [30], which was also designed for synchronous tasks but under decomposed fixed priority scheduling. We find that for most task sets, PGEDF performs better.

In the rest of the paper, Chapter 2 reviews related work and Chapter 3 describes the DAG task model with intra-task parallelism. The proof for a capacity augmentation bound of $4 - \frac{2}{m}$ and a fixed-point schedulability test based on the capacity augmentation bound are presented in Chapters 4 and 5 respectively. We prove a resource augmentation bound of $2 - \frac{1}{m}$ in Chapter 6. In Chapter 7, we present an example to show the lower bound on the capacity bound for GEDF. Chapter 8 shows the simulation results. Then we describe the implementation of our PGEDF platform in Chapter 9 and evaluate it in Chapter 10. Finally, Chapter 11 gives concluding remarks.

Chapter 2

Related Work

Most prior work on hard real-time scheduling atop multiprocessors has concentrated on sequential tasks [22]. In this context, many sufficient schedulability tests for GEDF and other global fixed priority scheduling algorithms have been proposed [3, 66, 34, 12, 8, 7, 44, 11, 13]. In particular, for implicit deadline hard real-time tasks, the best known utilization bound is 50% using partitioned fixed priority scheduling [4] or partitioned EDF [9, 52]; this trivially implies a capacity bound of 2. [9] proved that global EDF has a capacity augmentation bound of $2 - 1/m$ for sequential tasks on multiprocessors.

Earlier work considering intra-task parallelism makes strong assumptions on task models [45, 20, 55]. For more realistic parallel tasks, e.g. synchronous tasks, Kato et al. [38] proposed a gang scheduling approach. The synchronous model, a special case of the more general DAG model, represents tasks with a sequence of multi-threaded segments with synchronization points between them (such as those generated by parallel for-loops).

Most other approaches for scheduling synchronous tasks involve decomposing parallel tasks into independent sequential subtasks, which are then scheduled using known multiprocessor scheduling techniques, such as deadline monotonic [31] or GEDF [8]. For a restricted set of synchronous tasks, Lakshmanan et al. [43] prove a capacity augmentation bound of 3.42 using deadline monotonic scheduling for decomposed tasks. For more general synchronous tasks, Saifullah et al. [64] proved a capacity augmentation bound of 4 for GEDF and 5 for deadline monotonic scheduling. The decomposition strategy was improved in [58] to use fewer cores. For the same general synchronous model, the best known augmentation bound is 3.73 [39], also using decomposition. The decomposition approach in [64] was recently extended to general DAGs [63] to achieve a capacity augmentation bound of 4 under GEDF on decomposed tasks (note that in that work GEDF is used to schedule sequential decomposed tasks, not parallel tasks directly). This is the best augmentation bound known for task sets with multiple DAGs. For scheduling synchronous tasks without decomposition, [19] and [6] presented schedulability tests for GEDF and partitioned fixed priority scheduling respectively.

More recently, there has been some work on scheduling general DAGs without decomposition. Nogueira et al. [59] explored the use of work-stealing for real-time scheduling. The paper is mostly experimental and focused on soft real-time performance. The bounds for hard real-time scheduling only guarantee that tasks meet deadlines if their utilization is smaller than 1. Liu and Anderson [51] analyzed the response time of GEDF without decomposition for soft real-time tasks. A resource augmentation bound of $2 - \frac{1}{m}$ for GEDF was proved for a staged DAG model [5]. Baruah et al. [10] proved that when the task set is a single DAG task with arbitrary deadlines, GEDF provides a resource augmentation bound of 2. For multiple DAGs, Bonifaci et al. [14] also show the same resource augmentation bound of $2 - \frac{1}{m}$, but do not consider capacity augmentation. They also proved that global deadline monotonic scheduling has a resource augmentation bound of $3 - \frac{1}{m}$.

Various platforms support sequential real-time tasks on parallel machines [16, 47]. Our platform prototype, PGEDF, is based on LITMUS^RT [16]. As for parallel tasks, we are aware of two systems [39, 30] that support parallel real-time tasks based on different decomposition strategies. Kim et al. [39] used a reservation-based OS to implement a system that can run parallel real-time programs for an autonomous vehicle application, demonstrating that parallelism can enhance performance for complex tasks. Ferry et al. [30] developed a parallel real-time scheduling service on standard Linux. However, since both systems adopted task decomposition approaches, they require users to provide exact task structures and subtask execution time details in order to decompose tasks correctly.
The system presented in [30] also requires modifications to the compiler and runtime system to decompose, dispatch and execute parallel applications. The platform prototype presented here does not require decomposition or such detailed information.

Scheduling parallel tasks without deadlines has been addressed by parallel-computing researchers [62, 26, 23, 1]. Soft real-time scheduling has been studied for various optimization criteria, such as cache misses [17, 2], makespan [67] and total work done by tasks that meet deadlines [42].

Chapter 3

Task Model and Definitions

This chapter presents a model for DAG tasks. We consider a system with $m$ identical unit-speed cores. The task set $\tau$ consists of $n$ tasks $\tau = \{\tau_1, \tau_2, \ldots, \tau_n\}$. Each task $\tau_i$ is represented by a directed acyclic graph (DAG), and has a period $P_i$ and deadline $D_i$. We represent the $j$-th subtask of the $i$-th task as node $W_i^j$. A directed edge from node $W_i^j$ to $W_i^k$ means that $W_i^k$ can only be executed after $W_i^j$ has finished executing. A node is ready to be executed as soon as all of its predecessors have been executed. Each node has its own worst-case execution time $C_i^j$. Multiple source nodes and sink nodes are allowed in the DAG, and the DAG is not required to be fully connected. Figure 3.1 shows an example of a task consisting of 5 subtasks in the DAG structure.

For each task $\tau_i$ in task set $\tau$, let $C_i = \sum_j C_i^j$ be the total worst-case execution time on a single core, also called the work of the task. Let $L_i$ be the critical-path length (i.e. the worst-case execution time of the task on an infinite number of cores). In Figure 3.1, the critical path (i.e. the longest path) starts from node $W_1^1$, goes through $W_1^3$ and ends at node $W_1^4$, so the critical-path length of DAG $W_1$ is 6. The work and the critical-path length of any job generated by task $\tau_i$ are the same as those of task $\tau_i$.

We also define the notion of remaining work and remaining critical-path length of a partially executed job. The remaining work is the total work minus the work that has already been done. The remaining critical-path length is the length of the longest path in the unexecuted portion of the DAG (including partially executed nodes). For example, in Figure 3.1, if $W_1^1$ and $W_1^2$ are completely executed, and $W_1^3$ is partially executed such that 1 unit (out of 3) of work has been done for it, then the remaining critical-path length is 4.
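The definitions of work, critical-path length and their "remaining" variants can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the edge set and the individual node execution times below are assumptions chosen only so that the totals agree with the numbers quoted for Figure 3.1 (C = 8, L = 6, remaining critical-path length 4); the actual figure is not reproduced here. The function names are likewise hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def work(exec_time):
    """Total (remaining) work: the sum of the nodes' execution times."""
    return sum(exec_time.values())


def critical_path_length(exec_time, preds):
    """Length of the longest path, weighting each node by its execution time.

    exec_time maps node -> execution time still to be done for that node;
    preds maps node -> list of predecessor nodes (missing key = source node).
    """
    graph = {node: preds.get(node, ()) for node in exec_time}
    longest_to = {}
    for node in TopologicalSorter(graph).static_order():
        start = max((longest_to[p] for p in preds.get(node, ())), default=0)
        longest_to[node] = start + exec_time[node]
    return max(longest_to.values(), default=0)


# ASSUMED DAG (hypothetical edges and per-node times, see lead-in).
wcet = {"W1": 1, "W2": 1, "W3": 3, "W4": 2, "W5": 1}
preds = {"W2": ["W1"], "W3": ["W1"], "W4": ["W2", "W3"], "W5": ["W1"]}

C = work(wcet)                         # work of the task
L = critical_path_length(wcet, preds)  # critical-path length

# Partially executed job: W1 and W2 finished, 1 of 3 units of W3 done.
remaining = {"W1": 0, "W2": 0, "W3": 2, "W4": 2, "W5": 1}
remaining_work = work(remaining)
remaining_cpl = critical_path_length(remaining, preds)
```

Passing the remaining execution times to the same longest-path routine works because finished nodes contribute zero length, so only the unexecuted portion of the DAG is measured.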

Figure 3.1: Example task with work C = 8 and critical-path length L = 6.

Nodes do not have individual release offsets and deadlines when scheduled by the GEDF scheduler; they share the same absolute deadline of their jobs. Therefore, to analyze the GEDF scheduler, we do not require any knowledge of the DAG structure beyond the total worst-case execution time $C_i$, deadline $D_i$, period $P_i$ and critical-path length $L_i$. We also define the utilization of a task $\tau_i$ as $u_i = C_i / P_i$.

On unit-speed cores, a task set is not schedulable (by any scheduler) unless the following conditions hold:

The critical-path length of each task is at most its deadline:

$$L_i \le D_i \tag{3.1}$$

The total utilization is at most the number of cores:

$$\sum_i u_i \le m \tag{3.2}$$

In addition, we denote $J_{k,a}$ as the $a$-th job instance of task $k$ in system execution. For example, the $i$-th node of $J_{k,a}$ is represented as $W_{k,a}^i$. We denote $r_{k,a}$ and $d_{k,a}$ as the absolute release time and absolute deadline of job $J_{k,a}$ respectively. The relative deadline $D_k$ is equal to $d_{k,a} - r_{k,a}$. Since in this paper we address sporadic tasks, the absolute release time has the following properties:

$$r_{k,a+1} \ge d_{k,a}$$

$$r_{k,a+1} - r_{k,a} \ge d_{k,a} - r_{k,a} = D_k$$
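To make the GEDF policy from Chapter 1 concrete in terms of the job notation above, here is a small unit-time simulation sketch. It is an illustration under stated assumptions, not the scheduler analyzed or implemented in this thesis: time advances in integer steps on unit-speed cores, ties between equal deadlines are broken by job index and node name, and the job dictionaries, the function name `gedf_simulate` and the example DAG (the same hypothetical one used to match C = 8 and L = 6) are all invented for the example.

```python
def gedf_simulate(jobs, m, horizon):
    """Simulate GEDF in unit time steps on m unit-speed cores.

    Each job is a dict with 'release' and 'deadline' (absolute times),
    'wcet' (node -> execution time) and 'preds' (node -> predecessor list).
    Returns each job's completion time, or None if unfinished at the horizon.
    """
    remaining = [dict(job["wcet"]) for job in jobs]
    finish = [None] * len(jobs)
    for t in range(horizon):
        # Collect ready nodes: job released, node unfinished, predecessors done.
        ready = []
        for i, job in enumerate(jobs):
            if job["release"] > t or finish[i] is not None:
                continue
            for node, rem in remaining[i].items():
                preds_done = all(remaining[i][p] == 0
                                 for p in job["preds"].get(node, ()))
                if rem > 0 and preds_done:
                    ready.append((job["deadline"], i, node))
        # Nodes inherit their job's absolute deadline; earliest deadline first.
        ready.sort()
        for _, i, node in ready[:m]:
            remaining[i][node] -= 1
        # Record jobs that completed at the end of this time step.
        for i, job in enumerate(jobs):
            if (finish[i] is None and job["release"] <= t
                    and not any(remaining[i].values())):
                finish[i] = t + 1
    return finish


# ASSUMED example jobs (hypothetical DAG and parameters).
dag_job = {
    "release": 0, "deadline": 8,
    "wcet": {"W1": 1, "W2": 1, "W3": 3, "W4": 2, "W5": 1},
    "preds": {"W2": ["W1"], "W3": ["W1"], "W4": ["W2", "W3"], "W5": ["W1"]},
}
seq_job = {"release": 0, "deadline": 3, "wcet": {"S": 2}, "preds": {}}

alone_two_cores = gedf_simulate([dag_job], m=2, horizon=20)
alone_one_core = gedf_simulate([dag_job], m=1, horizon=20)
shared_two_cores = gedf_simulate([dag_job, seq_job], m=2, horizon=20)
```

Run alone on two cores, the DAG job finishes in a time equal to its critical-path length; on one core it needs its full work. When the earlier-deadline sequential job shares the two cores, it takes priority and delays the DAG job by one step, yet both still finish by their deadlines in this instance.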

Chapter 4

Capacity Augmentation Bound of $4 - \frac{2}{m}$ for GEDF

In this chapter, we propose a capacity augmentation bound of $4 - \frac{2}{m}$ for implicit deadline tasks, which yields an easy schedulability test. In particular, we show that GEDF can successfully schedule a task set if the task set satisfies two conditions: (1) its total utilization is at most $m/(4 - \frac{2}{m})$ and (2) the critical-path length of each task is at most $1/(4 - \frac{2}{m})$ of its period (and deadline). Note that this is equivalent to saying that if a task set meets the conditions from Inequalities 3.1 and 3.2 on processors of unit speed, then it can be scheduled on $m$ cores of speed $4 - \frac{2}{m}$ (which approaches 4 as $m$ approaches infinity).

The gist of the proof is the following: at a job's release time, we can bound the remaining work from other tasks under GEDF with speedup $4 - \frac{2}{m}$. Bounded remaining work leads to bounded interference from other tasks, and hence GEDF can successfully schedule all of them.

4.1 Notation

We first define a notion of interference. Consider a job $J_{k,a}$, which is the $a$-th instance of task $\tau_k$. Under GEDF scheduling, only jobs that have absolute deadlines earlier than the absolute deadline of $J_{k,a}$ can interfere with $J_{k,a}$. We say that a job is unfinished if the job has been released but has not completed yet. Due to implicit deadlines ($D_i = P_i$), at most one job of each task can be unfinished at any time.

There are two sources of interference for job $J_{k,a}$. (1) Carry-in work is the work from jobs that were released before $J_{k,a}$, did not finish before $J_{k,a}$ was released, and have deadlines before the deadline of $J_{k,a}$. Let $R_i^{k,a}$ be the carry-in work due to task $\tau_i$ and let $R^{k,a} = \sum_i R_i^{k,a}$ be the total carry-in from the entire task set onto the job $J_{k,a}$. (2) Other than carry-in work, the jobs that were released after (or at the same time as) $J_{k,a}$ was released can also interfere with it if their deadlines are either before or at the same time as that of $J_{k,a}$. Let $n_i^{k,a}$ be the number of jobs of task $\tau_i$ that are released after the release time of $J_{k,a}$ but have deadlines no later than the deadline of $J_{k,a}$ (that is, the number of jobs from task $\tau_i$ that fall entirely in between the release time and deadline of $J_{k,a}$, i.e. the time interval $[r_{k,a}, d_{k,a}]$). For example, in the right-hand side of Figure 4.1, one entire job $J_{1,3}$ falls within the time interval $[r_{3,1}, d_{3,1}]$ of job $J_{3,1}$, so $n_1^{3,1} = 1$.

By definition (and $D_i = P_i$), every task $\tau_i$ has the property that

$$n_i^{k,a} D_i \le D_k \tag{4.1}$$

Figure 4.1: Example task set execution trace

Therefore, the total amount of work $A^{k,a}$ that can interfere with $J_{k,a}$ (including $J_{k,a}$'s work) and (to prevent any deadline misses) must be finished before the deadline of $J_{k,a}$ is the sum of the carry-in work and the work that was released at or after $J_{k,a}$'s release:

$$A^{k,a} = R^{k,a} + \sum_i u_i n_i^{k,a} D_i \tag{4.2}$$

Note that the work of the job $J_{k,a}$ itself is also included in this formula. That is, in this formulation, each job interferes with itself.

4.2 Proof for Capacity Augmentation Bound

Consider a GEDF schedule with $m$ cores, each of speed $b$. Each time step can be divided into $b$ sub-steps such that each core can do one unit of work in each sub-step. We say a sub-step is complete if all cores are working during that sub-step, and otherwise we say it is incomplete.

First, a couple of straightforward lemmas.

Lemma 1 On every incomplete sub-step, the remaining critical-path length of each unfinished job reduces by 1.

Lemma 2 In any $t$ contiguous time steps ($bt$ sub-steps) with unfinished jobs, if there are $t^*$ incomplete sub-steps, then the total work done during this time, $F^t$, is at least

$$F^t \ge bmt - (m-1)t^*$$

Proof. The total number of complete sub-steps during $t$ steps is $bt - t^*$, and the total work during these complete sub-steps is $m(bt - t^*)$. On an incomplete sub-step, at least one unit of work is done. Therefore, the total work done in incomplete sub-steps is at least $t^*$. Adding the two gives $m(bt - t^*) + t^* = bmt - (m-1)t^*$, which is the bound.

We now prove a sufficient condition for the schedulability of a job.

Lemma 3 If the interference $A^{k,a}$ on a job $J_{k,a}$ is bounded by

$$A^{k,a} \le bmD_k - (m-1)D_k,$$

then job $J_{k,a}$ can meet its deadline on $m$ identical cores with speed $b$.

Proof. Note that there are $D_k$ time steps (therefore $bD_k$ sub-steps) between the release time and deadline of this job. There are two cases:

Case 1: The total number of incomplete sub-steps between the release time and deadline of $J_{k,a}$ is more than $D_k$, and therefore also more than $L_k$. In this case, $J_{k,a}$'s critical-path length reduces on all of these sub-steps. After at most $L_k$ incomplete sub-steps, the critical-path length is 0 and the job has finished executing. Therefore, it cannot miss the deadline.

Case 2: The total number of incomplete sub-steps between the release and deadline of $J_{k,a}$ is smaller than $D_k$. Therefore, the total amount of work done during this time is more than $bmD_k - (m-1)D_k$ by the condition in Lemma 2. Since the total interference (including $J_{k,a}$'s work) is at most this quantity, the job cannot miss its deadline.

We now define additional notation in order to prove that if the carry-in work for a job is bounded, then GEDF guarantees a capacity augmentation bound of $b$. Let $\alpha_i^{k,a}$ be the number of time steps between the absolute release time of $J_{k,a}$ and the absolute deadline of the carry-in job of task $i$. Hence, for $J_{k,a}$ and its carry-in job $J_{j,b}$ of task $j$,

$$\alpha_j^{k,a} = d_{j,b} - r_{k,a} \tag{4.3}$$

To make the notation clearer, we give an example that is also illustrated in Figure 4.1. There are 3 sporadic tasks with implicit deadlines: the (execution time, deadline, period) for tasks $\tau_1$, $\tau_2$ and $\tau_3$ are (2, 3, 3), (7, 7, 7) and (6, 6, 6) respectively. For simplicity, assume they are sequential tasks. Since tasks are sporadic, $r_{1,2} > d_{1,1}$. $\alpha_1^{3,1}$ is the number of time steps between the release time of job $J_{3,1}$ and the deadline of the carry-in job $J_{1,2}$ from task 1. In this example, $\alpha_1^{3,1} = 2$. Similarly, $\alpha_2^{3,1} = 3$. Also, $n_1^{3,1} = 1$.
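The quantities in this example can be recomputed from a concrete trace. The sketch below is only illustrative: the absolute release times are assumptions, picked so that they respect the sporadic release constraint and reproduce the values quoted in the text ($\alpha_1^{3,1} = 2$, $\alpha_2^{3,1} = 3$, $n_1^{3,1} = 1$); they are not read off Figure 4.1, and the helper names are hypothetical.

```python
def alpha(carry_in_deadline, release_k):
    """Eq. (4.3): steps from the release of J_{k,a} to the carry-in job's deadline."""
    return carry_in_deadline - release_k


def count_entire_jobs(releases_i, D_i, release_k, deadline_k):
    """n_i^{k,a}: jobs of task i released at or after r_{k,a} whose
    deadlines are no later than d_{k,a}."""
    return sum(1 for r in releases_i
               if r >= release_k and r + D_i <= deadline_k)


# Relative deadlines (= periods) of the three tasks from the text.
D = {1: 3, 2: 7, 3: 6}

# ASSUMED absolute release times (hypothetical trace, see lead-in).
releases = {1: [0, 4, 7], 2: [1], 3: [5]}

r_31 = releases[3][0]
d_31 = r_31 + D[3]

alpha_1 = alpha(releases[1][1] + D[1], r_31)  # carry-in job J_{1,2}
alpha_2 = alpha(releases[2][0] + D[2], r_31)  # carry-in job J_{2,1}
n_1 = count_entire_jobs(releases[1], D[1], r_31, d_31)
```

With these release times, job $J_{1,2}$ is released strictly after the deadline of $J_{1,1}$, and the single job $J_{1,3}$ fits entirely inside $[r_{3,1}, d_{3,1}]$, so Inequality 4.1 holds with $n_1^{3,1} D_1 = 3 \le 6 = D_3$.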

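For either periodic or sporadic tasks, task $\tau_i$ has the property

$$\alpha_i^{k,a} + n_i^{k,a} D_i \le D_k \tag{4.4}$$

This holds because $\alpha_i^{k,a}$ is the remaining length of the carry-in job and $n_i^{k,a}$ is the number of jobs of task $\tau_i$ falling entirely within the period (relative deadline) of job $J_{k,a}$. For instance, in Figure 4.1, $\alpha_1^{3,1} + n_1^{3,1} D_1 = 2 + 3 = 5 < 6 = D_3$.

Lemma 4 If the cores' speed is $b \ge 4 - \frac{2}{m}$ and the total carry-in work $R^{k,a}$ from every task $\tau_i$ satisfies the condition

$$R^{k,a} \le \sum_i u_i \alpha_i^{k,a} + m \max_i \left(\alpha_i^{k,a}\right),$$

then job $J_{k,a}$ always meets its deadline under global EDF.

Proof. The total amount of interfering work (including $J_{k,a}$'s work) is $A^{k,a} = R^{k,a} + \sum_i u_i n_i^{k,a} D_i$. Hence, according to the condition in Lemma 4, the total amount of work is:

$$A^{k,a} = R^{k,a} + \sum_i u_i n_i^{k,a} D_i$$
$$\le \sum_i u_i \alpha_i^{k,a} + m \max_i \left(\alpha_i^{k,a}\right) + \sum_i u_i n_i^{k,a} D_i$$
$$= \sum_i u_i \left(\alpha_i^{k,a} + n_i^{k,a} D_i\right) + m \max_i \left(\alpha_i^{k,a}\right)$$

Using eq. (4.4) to substitute $D_k$ into the formula, then

$$A^{k,a} \le \sum_i u_i D_k + m D_k$$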

Since the total task set utilization does not exceed the number of cores $m$, by eq. (3.2), we replace $\sum_i u_i$ with $m$. And since $b \ge 4 - \frac{2}{m}$ and $m \ge 1$, we get

$$A^{k,a} \le 2mD_k \le (3m-1)D_k = \left(4 - \tfrac{2}{m}\right) m D_k - (m-1)D_k \le bmD_k - (m-1)D_k$$

Finally, according to Lemma 3, since the interference satisfies the bound, job $J_{k,a}$ can meet its deadline.

We now complete the proof by showing that the carry-in work is bounded as required by Lemma 4 for every job.

Lemma 5 If the core's speed is $b \ge 4 - \frac{2}{m}$, then, for either periodic or sporadic task sets with implicit deadlines, the total carry-in work $R^{k,a}$ for every job $J_{k,a}$ in the task set is bounded by

$$R^{k,a} \le \sum_i u_i \alpha_i^{k,a} + m \max_i \left(\alpha_i^{k,a}\right)$$

Proof. We prove this lemma by induction from absolute time 0 to the release time of job $J_{k,a}$.

Base Case: For the very first job of all the tasks released in the system (denoted $J_{l,1}$), no carry-in jobs are released before this job. Therefore, the condition trivially holds and the job can meet its deadline by Lemma 4:

$$R^{l,1} = 0 \le \sum_i u_i \alpha_i^{l,1} + m \max_i \left(\alpha_i^{l,1}\right)$$

Inductive Step: Assume that for every job with an earlier release time than $J_{k,a}$, the condition holds. Therefore, according to Lemma 4, every earlier released job meets its deadline. Now we prove that the condition also holds for job $J_{k,a}$.

For job $J_{k,a}$, if there is no carry-in work from jobs released earlier than $J_{k,a}$, so that $R^{k,a} = 0$, the property trivially holds. Otherwise, there is at least one unfinished job (a job with carry-in work) at the release time of $J_{k,a}$.

We now define $J_{j,b}$ as the job with the earliest release time among all the unfinished jobs at the time that $J_{k,a}$ was released. For example, at release time $r_{3,1}$ of $J_{3,1}$ in Figure 4.1, both $J_{1,2}$ and $J_{2,1}$ are unfinished, but $J_{2,1}$ has the earliest release time. By the inductive assumption, the carry-in work $R^{j,b}$ at the release time of job $J_{j,b}$ is bounded by

$$R^{j,b} \le \sum_i u_i \alpha_i^{j,b} + m \max_i \left(\alpha_i^{j,b}\right) \tag{4.5}$$

Let $t$ be the number of time steps between the release time $r_{j,b}$ of $J_{j,b}$ and the release time $r_{k,a}$ of $J_{k,a}$:

$$t = r_{k,a} - r_{j,b}$$

Note that $J_{j,b}$ has not finished at time $r_{k,a}$, but by assumption it can meet its deadline. Therefore its absolute deadline $d_{j,b}$ is later than the release time $r_{k,a}$. So, by eq. (4.3),

$$t + \alpha_j^{k,a} = r_{k,a} - r_{j,b} + \alpha_j^{k,a} = d_{j,b} - r_{j,b} = D_j \tag{4.6}$$

In Figure 4.1, $t + \alpha_2^{3,1} = r_{3,1} - r_{2,1} + \alpha_2^{3,1} = d_{2,1} - r_{2,1} = D_2$.

For each $\tau_i$, let $n_i^t$ be the number of jobs that are released after the release time $r_{j,b}$ of $J_{j,b}$ but before the release time $r_{k,a}$ of $J_{k,a}$. The last such job may have a deadline after the release time $r_{k,a}$, but its release time is before $r_{k,a}$. In other words, $n_i^t$ is the number of jobs of task $\tau_i$ that fall entirely into the time interval $[r_{j,b}, r_{k,a} + D_i]$.

By the definition of $\alpha_i^{k,a}$, with respect to job $J_{k,a}$, the deadline of the unfinished job of task $\tau_i$ is $r_{k,a} + \alpha_i^{k,a}$. Therefore, for every $\tau_i$,

$$\alpha_i^{j,b} + n_i^t D_i \le r_{k,a} + \alpha_i^{k,a} - r_{j,b} = t + \alpha_i^{k,a} \tag{4.7}$$

As in the example in Figure 4.1, one entire job of task $\tau_1$ falls within $[r_{2,1}, r_{3,1} + D_1]$, making $n_1^t = 1$ and $d_{1,2} = r_{3,1} + \alpha_1^{3,1}$. Also, since $d_{1,1} \le r_{1,2}$,

$$\alpha_1^{2,1} + n_1^t D_1 = \alpha_1^{2,1} + D_1 \le d_{1,2} - r_{2,1} = r_{3,1} + \alpha_1^{3,1} - r_{2,1} = t + \alpha_1^{3,1} \le t + D_1.$$

Comparing $t$ and $\alpha_j^{k,a}$: when $t \le \frac{1}{2} D_j$, by eq. (4.6), $\alpha_j^{k,a} = D_j - t \ge \frac{1}{2} D_j \ge t$. There are two cases:

Case 1: $t \le \frac{1}{2} D_j$ and hence $\alpha_j^{k,a} \ge t$:

Since by definition $J_{j,b}$ is the earliest carry-in job, other carry-in jobs to $J_{k,a}$ are released after the release time of $J_{j,b}$ and therefore are not carry-in jobs to $J_{j,b}$. In other words, the carry-in jobs to $J_{j,b}$ must have been finished before the release time of $J_{k,a}$, which means that the carry-in work $R^{j,b}$ is not part of the carry-in work $R^{k,a}$. So the carry-in work $R^{k,a}$ is the sum of the work of those jobs released later than $J_{j,b}$:

$$R^{k,a} = \sum_i u_i n_i^t D_i \le \sum_i u_i \left(t + \alpha_i^{k,a}\right) \quad \text{(from eq. (4.7))}$$

By the assumption of Case 1, $t \le \alpha_j^{k,a} \le \max_i \left(\alpha_i^{k,a}\right)$. Hence, replacing $\sum_i u_i$ with $m$ using eq. (3.2), we can prove that

$$R^{k,a} \le \sum_i u_i \alpha_i^{k,a} + m \max_i \left(\alpha_i^{k,a}\right)$$

Case 2: $t > \frac{1}{2} D_j$:

Since $J_{j,b}$ has not finished executing at the release time of $J_{k,a}$, the total number of incomplete sub-steps during the $t$ time steps $(r_{j,b}, r_{k,a}]$ is less than $L_j$. Therefore, the total work done during this time is at least $F^t$, where

$$F^t = bmt - (m-1)L_j \quad \text{(from Lemma 2)}$$
$$\ge bmt - (m-1)D_j \quad \text{(from eq. (3.1))}$$

The total amount of work from jobs that are released in the time interval $(r_{j,b}, r_{k,a}]$ (i.e., entire jobs that fall in between the release time of job $J_{j,b}$ and the release time of job $J_{k,a}$ plus its deadline) is $\sum_i u_i n_i^t D_i$, by the definition of $n_i^t$. The carry-in work $R^{k,a}$ at the release time of job $J_{k,a}$ is the sum of the carry-in work $R^{j,b}$ and the newly released work $\sum_i u_i n_i^t D_i$, minus the work finished during the time interval $t$, which is

$$R^{k,a} = R^{j,b} + \sum_i u_i n_i^t D_i - F^t \le R^{j,b} + \sum_i u_i n_i^t D_i - \left(bmt - (m-1)D_j\right) \tag{4.8}$$

By the assumption in eq. (4.5), we can replace $R^{j,b}$ and get

$$R^{k,a} \le \sum_i u_i \alpha_i^{j,b} + m \max_i \left(\alpha_i^{j,b}\right) + \sum_i u_i n_i^t D_i - bmt + (m-1)D_j$$
$$\le \sum_i u_i \left(\alpha_i^{j,b} + n_i^t D_i\right) + m \max_i \left(\alpha_i^{j,b}\right) - bmt + (m-1)D_j$$

According to eq. (4.7), we can replace $\alpha_i^{j,b} + n_i^t D_i$ with $t + \alpha_i^{k,a}$, reorganize the formula, and get

$$R^{k,a} \le \sum_i u_i \left(t + \alpha_i^{k,a}\right) + m \max_i \left(\alpha_i^{j,b}\right) - bmt + (m-1)D_j$$
$$= \left(\sum_i u_i \left(t + \alpha_i^{k,a}\right) - mt\right) + m \max_i \left(\alpha_i^{j,b}\right) + (m-1)D_j - (b-1)mt$$

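Using eq.(3.2) to replace $m$ with $\sum_i u_i$ in the first item, using eq.(4.4) to get $\max_i\left(\alpha_i^{j,b}\right) \le D_j$ and to replace $\max_i\left(\alpha_i^{j,b}\right)$ with $D_j$ in the second item, and since $t > \frac{1}{2} D_j$,

$$\begin{aligned} R^{k,a} &\le \sum_i u_i \alpha_i^{k,a} + mD_j + (m-1)D_j - (b-2)mt - mt \\ &\le \sum_i u_i \alpha_i^{k,a} + mD_j - mt + 2(m-1)t - (b-2)mt \\ &\le \sum_i u_i \alpha_i^{k,a} + m(D_j - t) + 0 && \text{(since } b \ge 4 - \tfrac{2}{m}\text{)} \\ &\le \sum_i u_i \alpha_i^{k,a} + m\alpha_j^{k,a} && \text{(from eq.(4.6))} \end{aligned}$$

The third line holds because $2(m-1)t - (b-2)mt = (4m - 2 - bm)\,t$, which is at most $0$ exactly when $b \ge 4 - \frac{2}{m}$.

Finally, since $\alpha_j^{k,a} \le \max_i\left(\alpha_i^{k,a}\right)$, we can prove that

$$R^{k,a} \le \sum_i u_i \alpha_i^{k,a} + m \max_i\left(\alpha_i^{k,a}\right)$$

Hence, by induction, if the core speed $b \ge 4 - \frac{2}{m}$, for every $J_{k,a}$ in the task set

$$R^{k,a} \le \sum_i u_i \alpha_i^{k,a} + m \max_i\left(\alpha_i^{k,a}\right)$$

From Lemmas 4 and 5, we can easily derive the following capacity augmentation bound theorem.

Theorem 1 If, on unit-speed cores, the utilization of a sporadic task set is at most $m$, and the critical-path length of each job is at most its deadline, then the task set can meet all of its implicit deadlines on $m$ cores of speed $4 - \frac{2}{m}$.

Theorem 1 proves the speedup factor of GEDF, and it can also be restated as follows: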

Corollary 1 Given that a sporadic task set $\tau$ with implicit deadlines satisfies the following conditions: (1) the total utilization is at most $1/(4 - \frac{2}{m})$ of the total system capacity $m$, and (2) the critical-path length $L_i$ of every task $\tau_i \in \tau$ is at most $D_i/(4 - \frac{2}{m})$, then GEDF can schedule this task set $\tau$ on $m$ cores.
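The two conditions of Corollary 1 can be checked in linear time. The following is a minimal sketch of such a check, not code from the thesis: the `Task` record and the function name `passes_capacity_test` are hypothetical names introduced here, and each task is assumed to be summarized by its total work $C_i$, critical-path length $L_i$ and implicit deadline $D_i$.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical summary of one parallel task (names are assumptions)."""
    work: float           # C_i: total worst-case execution time
    critical_path: float  # L_i: critical-path length
    deadline: float       # D_i: implicit relative deadline (equal to the period)


def passes_capacity_test(tasks, m):
    """Check the two conditions of Corollary 1 for m unit-speed cores."""
    b = 4.0 - 2.0 / m  # capacity augmentation bound of Theorem 1
    total_utilization = sum(t.work / t.deadline for t in tasks)
    utilization_ok = total_utilization <= m / b
    critical_paths_ok = all(t.critical_path <= t.deadline / b for t in tasks)
    return utilization_ok and critical_paths_ok


if __name__ == "__main__":
    example = [Task(work=2, critical_path=1, deadline=10),
               Task(work=3, critical_path=2, deadline=10)]
    print(passes_capacity_test(example, m=4))
```

A task set that fails this check is not necessarily unschedulable; the corollary only gives a sufficient condition.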

Chapter 5 Fixed Point Schedulability Test

In Chapter 4, we described a capacity augmentation bound for the GEDF scheduler, which acts as a simple linear-time schedulability test. In this chapter, we describe a tighter fixed point schedulability test for parallel task sets under a GEDF scheduler. We start with a schedulability test similar to one for sequential tasks. Then, we improve the calculation of the carry-in work; this improvement is based on some of the equations used in the proof of our capacity augmentation bound. Finally, we further improve the interference calculation by considering the calculated finish time, and altogether derive the fixed point schedulability test.

5.1 Basic Schedulability Test

Given a task set, we denote $R_i^k$ as an upper bound on the carry-in work from task $\tau_i$ to a job of task $\tau_k$, and $R^k = \sum_i R_i^k$ as an upper bound on the total carry-in work from the entire task set to a job of task $\tau_k$. We also denote $\hat{A}_i^k$ and $\hat{A}^k$ as the corresponding upper bounds on individual and total interference to task $\tau_k$. In addition, $n_i^k$ is an upper bound on the number of task $\tau_i$'s interfering jobs which are not part of the carry-in jobs, but interfere with task $\tau_k$. Finally, we use $f_k$ to denote an upper bound on the relative completion time of task $\tau_k$. If $f_k \le D_k$, then task $\tau_k$ is schedulable; otherwise it is not.

Then from equation 4.2, we can derive

$$\hat{A}_i^k \le R_i^k + u_i n_i^k D_i = R_i^k + n_i^k C_i \tag{5.1}$$

$$\hat{A}^k = \sum_i \hat{A}_i^k \le \sum_i \left(R_i^k + n_i^k C_i\right) = R^k + \sum_i \left(n_i^k C_i\right) \tag{5.2}$$

From Lemma 2, we can easily derive that on a unit-speed system with $m$ cores, the maximum completion time of task $\tau_k$ is

$$f_k \le \frac{1}{m}\left(\hat{A}^k + (m-1)L_k\right) \tag{5.3}$$

This is simply because the maximum number of incomplete steps before the completion of task $\tau_k$ is its critical-path length $L_k$, and the maximum total available work (having deadlines no later than the completion time) is the maximum total interference $\hat{A}^k$. Note that the execution time of task $\tau_k$ is incorporated in the calculation of total interference, which we will show below.

Consider a job $J_{k,a}$ of task $\tau_k$ which finishes at its absolute deadline $d_{k,a}$. Note that, to achieve the maximum interference when calculating the upper bound $\hat{A}^k$, the last job of task $\tau_i$ which interferes with $J_{k,a}$ should have the same absolute deadline as $J_{k,a}$, that is, $d_{k,a}$. Hence, in the worst case, the upper bound on the number of interfering jobs that begin after $J_{k,a}$ is released (that is, they are not carry-in jobs) is

$$n_i^k = \left\lfloor \frac{D_k}{D_i} \right\rfloor \tag{5.4}$$

Note that the execution time of task $\tau_k$ itself is considered as part of its interference as well, i.e. $n_k^k = 1$.

Obviously there can be at most one carry-in job of task $\tau_i$ to the job $J_{k,a}$ of task $\tau_k$. Moreover, if in the worst case of $\hat{A}^k$ this job has already finished before the release time of $J_{k,a}$, then $R_i^k = 0$. By the definition of carry-in jobs and Equation (5.4) for $n_i^k$, we can see that the length between the deadline of the carry-in job and the release time of job $J_{k,a}$ is $D_k - n_i^k D_i$. If the carry-in job has not finished when job $J_{k,a}$ is released, then $D_k - n_i^k D_i$ has to be longer than $D_i - f_i$, where $f_i$ is the upper bound on task $\tau_i$'s completion time.

We denote $X_i^k$ below as the upper bound for the maximum carry-in work:

$$X_i^k = \begin{cases} C_i & \text{if } D_k - n_i^k D_i > D_i - f_i \\ 0 & \text{if } D_k - n_i^k D_i \le D_i - f_i \end{cases} \;=\; \mathbb{1}\!\left[D_k - n_i^k D_i > D_i - f_i\right] C_i$$

where $\mathbb{1}[\cdot]$ equals 1 when the condition holds and 0 otherwise. Then obviously, the upper bound on the total carry-in work to task $\tau_k$ is

$$R^k = \sum_i R_i^k \le \sum_i X_i^k \tag{5.5}$$

Combining the above calculations, we can derive the basic fixed point calculation of the maximum completion time of task $\tau_k$:

$$f_k \le \frac{1}{m}\left(R^k + \sum_i \left(n_i^k C_i\right) + (m-1)L_k\right) \tag{5.6}$$

$$\phantom{f_k} \le \frac{1}{m}\left(\sum_i \left(\left(\mathbb{1}\!\left[D_k - n_i^k D_i > D_i - f_i\right] + \left\lfloor \frac{D_k}{D_i} \right\rfloor\right) C_i\right) + (m-1)L_k\right) \tag{5.7}$$

The fixed point schedulability test works as follows. In the beginning, we set the completion time $f_k$ of each task to be the same as its relative deadline $D_k$. Then we iteratively use Equation (5.7) to calculate a new value of the completion time for every $\tau_k$; we only update $f_k$ if the newly calculated value is less than $D_k$. The calculation stops when no $f_k$ is updated any more. In the end, we use Equation (5.7) once more to calculate the final upper bound on the completion time of each task: if this final value is at most $D_k$ for all tasks, then the task set is deemed schedulable; otherwise, it is not.

Obviously, before the last step of calculating the final value, in each iteration $f_k$ will not be larger than $D_k$. After the first iteration, each $f_k$ will either stay at $D_k$ or decrease (because it is only updated to a value less than $D_k$). More importantly, $f_k$ will decrease or stay the same when at least one $f_i$ of another task $\tau_i$ decreases. In conclusion, $f_k$ will not increase in any iteration. Therefore, the fixed point calculation will converge.
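The iteration described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the thesis's implementation: the `Task` record and function names are hypothetical, tasks have implicit deadlines, and the update rule "only if the new value is less than $D_k$" is implemented as "only if the new value is less than the current $f_k$", which is equivalent here because $f_k$ starts at $D_k$ and never increases.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical summary of one parallel task (names are assumptions)."""
    work: float           # C_i
    critical_path: float  # L_i
    deadline: float       # D_i (implicit deadline)


def completion_bound(tasks, m, k, f):
    """Right-hand side of Equation (5.7) for task k, given current bounds f."""
    d_k = tasks[k].deadline
    interference = 0.0
    for i, task in enumerate(tasks):
        n = math.floor(d_k / task.deadline)  # Equation (5.4)
        # The carry-in job of task i is counted only if it may still be unfinished.
        carry_in = 1 if d_k - n * task.deadline > task.deadline - f[i] else 0
        interference += (carry_in + n) * task.work
    return (interference + (m - 1) * tasks[k].critical_path) / m


def basic_fixed_point_test(tasks, m):
    """Return (schedulable, final completion-time bounds) for m unit-speed cores."""
    f = [task.deadline for task in tasks]  # start from f_k = D_k
    changed = True
    while changed:
        candidates = [completion_bound(tasks, m, k, f) for k in range(len(tasks))]
        changed = False
        for k, value in enumerate(candidates):
            if value < f[k]:  # f[k] <= D_k always holds, so value < D_k as well
                f[k] = value
                changed = True
    final = [completion_bound(tasks, m, k, f) for k in range(len(tasks))]
    schedulable = all(v <= t.deadline for v, t in zip(final, tasks))
    return schedulable, final


if __name__ == "__main__":
    tasks = [Task(work=1, critical_path=1, deadline=4),
             Task(work=2, critical_path=1, deadline=10)]
    print(basic_fixed_point_test(tasks, m=2))
```

In the small example, the carry-in terms are counted in the first iteration and dropped in the second once the completion-time bounds have shrunk, which is the tightening effect the fixed point iteration is meant to capture.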

Note that there is a subtlety in this calculation. Because of the assumption $f_i \le D_i$ behind Equation (5.4), Equation (5.7) is only correct when the finish time of each task in the task set is no more than its relative deadline. This is the reason why, in the fixed point calculation, we do not update $f_k$ if the newly calculated value is larger than $D_k$. After the last step of the fixed point calculation (calculating the final value of $f_k$), if the task set is schedulable, i.e. the assumption is satisfied, we did correctly calculate an upper bound on the interference and therefore an upper bound on the completion time. Therefore, if this test says that a task set is schedulable, it is indeed schedulable. If the test says that the task set is unschedulable, then the test may be underestimating the interference. In this case, however, the inaccuracy does not matter: since even the underestimate makes the task set unschedulable, the correct estimate would also deem the task set unschedulable.

5.2 Improving the Carry-In Work Calculation

In the basic test, we calculate the carry-in work using Equation (5.5). However, this upper bound $X^k$ may be pessimistic if task $\tau_k$ has a very short period while task $\tau_i$ has a very long period. This is because if the carry-in job of $\tau_i$ to $\tau_k$ has not finished before $\tau_k$ is released, then the entire $C_i$ will be counted as interference. However, GEDF, as a greedy algorithm, might have already executed most of the computation of the carry-in job.

Inspired by the proof of the capacity augmentation bound for GEDF, we propose another upper bound for $R^k$. Note that in the proof of Lemma 5, there are two cases. The calculation of $X^k = \sum_i X_i^k$ in the basic test is similar to Case 1, but without knowing the first carry-in job. Therefore, from Case 2, we can also obtain another upper bound $Y^k$ for $R^k$ without knowing the first carry-in job. After getting the two upper bounds on $R^k$, we can simply take the minimum of $X^k$ and $Y^k$ and obtain a schedulability test.

For $R^k$, if there is no unfinished carry-in job, then $R^k = 0$ for job $J_{k,a}$.
Otherwise, say $J_{j,b}$ is the carry-in job with the earliest release time among all the unfinished jobs at the release time of $J_{k,a}$. From Inequality (4.8), on $m$ unit-speed cores,

$$R^{k,a} \le R^{j,b} + \sum_i n_i^t C_i + (m-1)L_j - mt$$

where $t$ is the interval between the release time $r_{j,b}$ of $J_{j,b}$ and the release time $r_{k,a}$ of $J_{k,a}$, and $n_i^t$ is the number of jobs of task $\tau_i$ that are released during this time.

In the worst case for $\hat{A}^k$ (where the last interfering job of each $\tau_i$ has the same deadline as $J_{k,a}$), from Equation (5.4), we can calculate $t$:

$$t = D_j + n_j^k D_j - D_k$$

$$n_i^t \le \left\lfloor \frac{t}{D_i} \right\rfloor = \left\lfloor \frac{D_j + n_j^k D_j - D_k}{D_i} \right\rfloor$$

Therefore, if task $\tau_j$ is indeed the task having the first carry-in job, then the maximum carry-in work $R^k$ of task $\tau_k$ can be bounded by $Y_j^k$, where

$$Y_j^k \le Y^j + \sum_i \left(\left\lfloor \frac{D_j + n_j^k D_j - D_k}{D_i} \right\rfloor C_i\right) + (m-1)L_j - m\left(D_j + n_j^k D_j - D_k\right) \tag{5.8}$$

Note that the bound $Y_j^k$ is an upper bound on $R^k$ only if task $\tau_j$ is indeed the task whose job $J_{j,b}$ is the unfinished carry-in job with the earliest release time. However, we do not know which task is actually task $\tau_j$; in fact, it can be different for each job $J_{k,a}$ of task $\tau_k$. Therefore, we take the maximum of $Y_j^k$ over all the tasks $\tau_j$ in the task set. Without knowing task $\tau_j$, we can thus bound the maximum total carry-in work $R^k$ by overestimating $Y^k$:

$$R^k \le Y^k \le \max_j Y_j^k \tag{5.9}$$

Both $Y^k$ from Inequality (5.9) and $X^k$ from Inequality (5.5) can be used to bound the carry-in work $R^k$. Hence, we can improve the basic test by using

$$R^k \le \min\left(X^k, Y^k\right) \le \min\left(X^k, \max_j Y_j^k\right) \tag{5.10}$$

for the calculation of completion time in Formula (5.6).

5.3 Improving the Calculation for Completion Time

Finally, note that in Formula (5.6), we calculate the maximum number of interfering but not carry-in jobs using Equation (5.4), in which we assume that the completion time of task $\tau_k$ is exactly $D_k$. However, if task $\tau_k$ actually finishes earlier than its deadline, it may suffer from less interference. Such a calculation is no different than for a sequential task set on a single core, so we can similarly derive the improved calculation of $n_i^k$ using

$$n_i^k = \min\left(\left\lfloor \frac{D_k}{D_i} \right\rfloor,\ \ldots + 1\right) \tag{5.11}$$

We can then use this new calculation for $n_i^k$ in our calculation of interference, leading to a potentially tighter interference calculation.

Chapter 6 Resource Augmentation Bound of $2 - \frac{1}{m}$ for GEDF

In this chapter, we prove the resource augmentation bound of $2 - \frac{1}{m}$ for GEDF scheduling of arbitrary deadline tasks.

For the sake of discussion, we convert the DAG representing a task into an equivalent DAG where each sub-node does $\frac{1}{m}$ unit of work. An example of this transformation of Task $\tau_1$ in Figure 3.1 is shown in job $W_1$ in Figure 6.1 (see the upper job). A node with work $w$ is split into a chain of $mw$ sub-nodes with work $\frac{1}{m}$. For example, since $m = 2$ in Figure 6.1, node $W_1^1$ with worst-case execution time of 1 is split into 2 sub-nodes $W_1^{1,1}$ and $W_1^{1,2}$, each with length $\frac{1}{2}$. The original incoming edges come into the first node of the chain, while the outgoing edges leave the last node of the chain. This transformation does not change any other characteristic of the DAG, and the scheduling does not depend on this step; the transformation is done only for clarity of the proof.

6.1 Proof for Resource Augmentation Bound

First, some definitions. Since the GEDF scheduler runs on cores of speed $2 - \frac{1}{m}$, each step under GEDF can be divided into $(2m - 1)$ sub-steps of length $\frac{1}{m}$. In each sub-step, each core can do $\frac{1}{m}$ units of work (i.e. execute one sub-node). In a GEDF scheduler, on an incomplete step, all ready nodes are executed (Observation 1). As in Chapter 4, we say that a sub-step is complete if all cores are busy, and incomplete otherwise. For each sub-step $t$, we define $F_I(t)$ as the set of sub-nodes that have finished executing under an ideal scheduler after sub-step $t$, $R_I(t)$ as the set of sub-nodes that are ready (all their predecessors have been executed) to be executed by the ideal scheduler before sub-step $t$, and $D_I(t)$ as the set of sub-nodes completed by the ideal scheduler in sub-step $t$. Note that $D_I(t) = R_I(t) \cap F_I(t)$. We similarly define $F_G(t)$, $R_G(t)$, and $D_G(t)$ for the GEDF scheduler.

Observation 1 The GEDF scheduler completes all the ready nodes in an incomplete sub-step. That is,

$$D_G(t) = R_G(t), \quad \text{if } t \text{ is an incomplete sub-step} \tag{6.1}$$

Note that for the ideal scheduler, each original step consists of $m$ sub-steps, while for GEDF with speed $2 - \frac{1}{m}$ each step consists of $2m - 1$ sub-steps. For example, in Figure 6.1 for step $t_1$, there are two sub-steps $t_{1(1)}$ and $t_{1(2)}$ under the ideal scheduler, while under GEDF there is an additional $t_{1(3)}$ (since $2m - 1 = 3$).

Theorem 2 If an ideal scheduler can schedule a task set $\tau$ (periodic or sporadic tasks with arbitrary deadlines) on a unit-speed system with $m$ identical cores, then global EDF can schedule $\tau$ on $m$ cores of speed $2 - \frac{1}{m}$.

Proof. In a GEDF scheduler, on an incomplete sub-step, all ready sub-nodes are executed (Observation 1). Therefore, after an incomplete sub-step, GEDF must have finished all the released sub-nodes and hence must have done at least as much work as the ideal scheduler. Thus, for brevity of our proof, we leave out any time interval when all cores under GEDF are idling, since at this time GEDF has finished all available work and the theorem is obviously true. We define time 0 as the first instant when not all cores are idling under GEDF, and time $t$ as any time such that for every sub-step during the time interval $[0, t]$ at least one core under GEDF is working. Therefore, for every incomplete sub-step GEDF will finish at least 1 sub-node (i.e. $\frac{1}{m}$ unit of work). We also define sub-step 0 as the last sub-step before time 0, and hence by definition,

$$F_G(0) \supseteq F_I(0) \quad \text{and} \quad |F_G(0)| \ge |F_I(0)| \tag{6.2}$$

For each time $t \ge 0$, we now prove the following: if the ideal unit-speed system can successfully schedule all tasks with deadlines in the time interval $[0, t]$, then on speed $2 - \frac{1}{m}$ cores, so can GEDF. Note again that during the interval $[0, t]$ an ideal scheduler and GEDF have $tm$ and $2tm - t$ sub-steps respectively.

Case 1: In $[0, t]$, GEDF has at most $tm$ incomplete sub-steps.

Since there are at least $(2tm - t) - tm = tm - t$ complete sub-steps, the system can complete

$$|F_G(t)| - |F_G(0)| \ge m(tm - t) + (tm) = tm^2$$

work, since each complete sub-step can finish executing $m$ sub-nodes and each incomplete sub-step can finish executing at least 1 sub-node. We define $I(t)$ as the set of all sub-nodes from jobs with absolute deadlines no later than $t$. Since the ideal scheduler can schedule this task set, we know that $|I(t)| - |F_I(0)| \le mt \cdot m = tm^2$, since the ideal scheduler can finish at most $m$ sub-nodes in each sub-step and during $[0, t]$ there are $mt$ sub-steps for the ideal scheduler. Hence, we have $|F_G(t)| - |F_G(0)| \ge |I(t)| - |F_I(0)|$. By eq.(6.2), we get $|F_G(t)| \ge |I(t)|$.

Note that jobs in $I(t)$ have earlier deadlines than the other jobs, so under GEDF, no other jobs can interfere with them. The GEDF scheduler will never execute other sub-nodes unless there are no ready sub-nodes from $I(t)$. Since $|F_G(t)| \ge |I(t)|$, i.e. GEDF has finished at least as many sub-nodes as the number in $I(t)$, this implies that GEDF must have finished all sub-nodes in $I(t)$. Therefore, GEDF can meet all deadlines, since it has finished all work that needed to be done by time $t$.

Case 2: In $[0, t]$, GEDF has more than $tm$ incomplete sub-steps.

For each integer $s$ we define $f(s)$ as the first time instant such that the number of incomplete sub-steps in the interval $[0, f(s)]$ is exactly $s$. Note that the sub-step $f(s)$ is always incomplete, since otherwise it would not be the first such instant. We show, via induction, that $F_I(s) \subseteq F_G(f(s))$. In other words, after $f(s)$ sub-steps, GEDF has completed all the nodes that the ideal scheduler has completed after $s$ sub-steps.

Base Case: For $s = 0$, $f(s) = 0$. By eq.(6.2), the claim is vacuously true.

Inductive Step: Suppose that for $s - 1$ the claim $F_I(s-1) \subseteq F_G(f(s-1))$ is true. Now, we prove that $F_I(s) \subseteq F_G(f(s))$.

In $(s-1, s]$, the ideal system has exactly 1 sub-step. So,

$$F_I(s) = F_I(s-1) \cup D_I(s) \subseteq F_I(s-1) \cup R_I(s) \tag{6.3}$$

Since $F_I(s-1) \subseteq F_G(f(s-1))$, all the sub-nodes that are ready before sub-step $s$ for the ideal scheduler will either have already been executed or are also ready for the GEDF scheduler one sub-step after sub-step $f(s-1)$; that is,

$$F_I(s-1) \cup R_I(s) \subseteq F_G(f(s-1)) \cup R_G(f(s-1) + 1) \tag{6.4}$$

For GEDF, from sub-step $f(s-1) + 1$ to $f(s)$, all the ready sub-nodes with earliest deadlines will be executed and then new sub-nodes will be released into the ready set. Hence,

$$\begin{aligned} F_G(f(s-1)) \cup R_G(f(s-1) + 1) &\subseteq F_G(f(s-1) + 1) \cup R_G(f(s-1) + 2) \\ &\subseteq \dots \\ &\subseteq F_G(f(s) - 1) \cup R_G(f(s)) \end{aligned} \tag{6.5}$$

Since sub-step $f(s)$ for GEDF is always incomplete,

$$\begin{aligned} F_G(f(s)) &= F_G(f(s) - 1) \cup D_G(f(s)) \\ &= F_G(f(s) - 1) \cup R_G(f(s)) && \text{(from eq.(6.1))} \\ &\supseteq F_G(f(s-1)) \cup R_G(f(s-1) + 1) && \text{(from eq.(6.5))} \\ &\supseteq F_I(s-1) \cup R_I(s) && \text{(from eq.(6.4))} \\ &\supseteq F_I(s) && \text{(from eq.(6.3))} \end{aligned}$$

By time $t$, there are $mt$ sub-steps for the ideal scheduler, so GEDF must have finished all the nodes executed by the ideal scheduler at sub-step $f(mt)$. Since there are exactly $mt$ incomplete sub-steps in $[0, f(mt)]$ and since the number of incomplete sub-steps by time $t$ is at least $mt$, the time $f(mt)$ is no later than time $t$. Since the ideal system does not miss any deadline by time $t$, GEDF also meets all deadlines.
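The work count used in Case 1 of the proof can be expanded step by step. In the sketch below, $c$ is a symbol introduced only for this derivation (it does not appear in the original proof) and denotes the number of complete sub-steps in $[0, t]$, so that $c \ge tm - t$ and the remaining $2tm - t - c$ sub-steps are incomplete:

```latex
\begin{aligned}
|F_G(t)| - |F_G(0)|
  &\ge m\,c + (2tm - t - c)
     && \text{($m$ sub-nodes per complete, at least 1 per incomplete sub-step)} \\
  &= (m-1)\,c + 2tm - t \\
  &\ge (m-1)(tm - t) + 2tm - t
     && \text{(since } c \ge tm - t \text{ and } m \ge 1\text{)} \\
  &= tm^2 - 2tm + t + 2tm - t \\
  &= tm^2 .
\end{aligned}
```

The extreme case $c = tm - t$ gives exactly the expression $m(tm - t) + tm$ used in the proof, so having more complete sub-steps than the minimum can only increase the amount of finished work.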

6.2 An Example Providing an Intuition for the Proof

We provide an example in Figure 6.1 to illustrate the proof of Case 2 and to compare the execution trace of an ideal scheduler (this scheduler is only considered ideal in the sense that it makes all the deadlines) and GEDF. In addition to Task $\tau_1$ from Figure 3.1, Task $\tau_2$ consists of two nodes connected to another node, all with execution time of 1 (each split into 2 sub-nodes in the figure). All tasks are released by time $t_0$. The system has 2 cores, so GEDF has a resource augmentation bound of 1.5. Figure 6.1(a) is the execution for the ideal scheduler on unit-speed cores, while Figure 6.1(b) shows the execution under GEDF on speed $2 - \frac{1}{m}$ cores. One step is divided into 2 and 3 sub-steps, representing the speedup of 1 and 1.5 for the ideal scheduler and GEDF respectively.

Since the critical-path length of Task $\tau_1$ is equal to its deadline, intuitively it should be executed immediately even though it has the latest deadline. That is exactly what the ideal scheduler does. However, GEDF (which does not take critical-path length into consideration) will prioritize Task $\tau_2$ first. If GEDF runs only on a unit-speed system, Task $\tau_1$ will miss its deadline. However, when GEDF gets speed-1.5 cores, all jobs are finished in time.

To illustrate Case 2 of the above theorem, consider $s = 2$. Since $t_{2(3)}$ is the second incomplete sub-step under GEDF, $f(s) = 2(3)$. All the nodes finished by the ideal scheduler after the second sub-step (shown above in dark grey) have also been finished under GEDF by sub-step $t_{2(3)}$ (shown below in dark grey).
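The definition of $f(s)$ used in Case 2 can be made concrete with a short sketch. The function names and the busy-core trace below are hypothetical; the trace is not read off Figure 6.1 and is only chosen to agree with the statement above that, with $m = 2$ and 3 sub-steps per step, the second incomplete sub-step under GEDF is $t_{2(3)}$.

```python
def first_substep_with_incomplete_count(busy_cores, m, s):
    """Return f(s): the 1-based index of the first sub-step by which exactly
    s incomplete sub-steps have occurred (f(0) = 0), or None if the trace
    contains fewer than s incomplete sub-steps."""
    if s == 0:
        return 0
    seen = 0
    for index, busy in enumerate(busy_cores, start=1):
        if busy < m:  # a sub-step is incomplete when some core is idle
            seen += 1
            if seen == s:
                return index
    return None


def substep_label(index, substeps_per_step):
    """Convert a 1-based sub-step index into (step, sub-step within step)."""
    step, offset = divmod(index - 1, substeps_per_step)
    return step + 1, offset + 1


if __name__ == "__main__":
    m = 2
    # Hypothetical number of busy cores in each GEDF sub-step.
    trace = [2, 2, 2, 2, 1, 1]
    f2 = first_substep_with_incomplete_count(trace, m, 2)
    print(f2, substep_label(f2, 2 * m - 1))
```

Because $f(s)$ is defined as the first instant at which the count reaches $s$, the sub-step it returns is always an incomplete one, which is the property the inductive step relies on.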

[Figure 6.1: Examples of task set execution on 2 cores. (a) Scheduled under unit-speed ideal scheduler. (b) Scheduled under GEDF scheduler; annotations in the figure mark the first and second incomplete sub-steps, and note that Task $\tau_1$ does not finish at time 7 and misses its deadline on speed-1 cores, while it meets its deadline on speed-1.5 cores. Task parameters shown: $C_1 = 8$, $D_1 = 6$, $u_1 = C_1/D_1 = 4/3$; $C_2 = 3$, $D_2 = 5$, $u_2 = C_2/D_2 = 3/5$.]

Chapter 7 Lower Bound on Capacity Augmentation Bound of GEDF

While the above proof guarantees a bound, since the ideal scheduler is not known, given a task set, we cannot tell if it is feasible on speed-1 cores. Therefore, we cannot tell if it is schedulable by GEDF on cores with speed $2 - \frac{1}{m}$.

One standard way to prove resource augmentation bounds is to use lower bounds on the ideal scheduler, such as Inequalities 3.1 and 3.2. As previously stated, we call the resource augmentation bound proven using these lower bounds a capacity augmentation bound in order to distinguish it from the augmentation bound described above. To prove a capacity augmentation bound of $b$ under GEDF, one must prove that if Inequalities 3.1 and 3.2 hold for a task set on $m$ unit-speed cores, then GEDF can schedule that task set on $m$ cores of speed $b$. Hence, the capacity augmentation bound is also an easy schedulability test.

First, we demonstrate a counter-example to show that proving a capacity augmentation bound of 2 for GEDF is impossible.

In particular, in Figure 7.1 we show a task set that satisfies Inequalities 3.1 and 3.2, but cannot be scheduled on $m$ cores of speed 2 by GEDF. In this example, $m = 6$ as shown in Figure 7.2. The task set has two tasks. All values are measured on a unit-speed system, as shown in Figure 7.1. Task $\tau_1$ has 13 nodes with total execution time of 440 and period of 88, so its utilization is 5. Task $\tau_2$ is a single node, with execution time and implicit deadline both 60 and hence utilization of 1. Note that the total utilization (6) is exactly equal to $m$, satisfying Inequality 3.2. The critical-path length of each task is equal to its deadline, satisfying Inequality 3.1.

[Figure 7.1: Structure of the task set that demonstrates GEDF does not provide a capacity augmentation bound less than $(3 + \sqrt{5})/2$]

The execution trace of the task set on a 2-speed 6-core processor under GEDF is shown in Figure 7.2. The first task is released at time 0 and is immediately executed by GEDF. Since the system under GEDF is at speed 2, $W_1^{1,1}$ finishes executing at time 28. GEDF then executes 6 out of the 12 parallel nodes from Task $\tau_1$. At time 29, task $\tau_2$ is released. However, its deadline is $r_2 + D_2 = 29 + 60 = 89$, which is later than the deadline 88 of task $\tau_1$. Nodes from task $\tau_1$ are not preempted by task $\tau_2$ and continue to execute until all of them finish their work at time 60. Task $\tau_1$ successfully meets its deadline. The GEDF scheduler finally gets to execute task $\tau_2$ and finishes it at time 90, so task $\tau_2$ just fails to meet its deadline of 89. Note that this is not a counter-example for the resource augmentation bound shown in Theorem 2, since no scheduler can schedule this task set on a unit-speed system either.

[Figure 7.2: Execution of the task set under GEDF at speed 2]

Second, we demonstrate that one can construct task sets that require capacity augmentation of at least $\frac{3 + \sqrt{5}}{2}$ to be schedulable by GEDF. We generate task sets with two tasks whose structure depends on $m$, speedup factor $b$ and a parallelism factor $n$, and show that for large enough $m$ and $n$, the capacity augmentation $b$ required is at least $\frac{3 + \sqrt{5}}{2}$.

As in the lower part of Figure 7.1, task $\tau_1$ is structured as a single node with work $x$ followed by $nm$ nodes with work $y$. Its critical-path length is $x + y$ and so is its deadline. The utilization of task $\tau_1$ is set to be $m - 1$, hence

$$m - 1 = \frac{x + nmy}{x + y} \tag{7.1}$$

Task $\tau_2$ is structured as a single node with work and deadline equal to $x + y - \frac{x}{b}$ (hence utilization 1). Therefore, the total task utilization is $m$ and Inequalities 3.1 and 3.2 are met. As the lower part of Figure 7.2 shows, Task $\tau_2$ is released at time $\frac{x}{b}$
