Fixed-Priority Multiprocessor Scheduling with Liu & Layland's Utilization Bound


Nan Guan, Martin Stigge, Wang Yi and Ge Yu
Department of Information Technology, Uppsala University, Sweden
Department of Computer Science and Technology, Northeastern University, China

(This work was partially sponsored by CoDeR-MP, UPMARC, and NSF of China under Grant No. 60973017 and 60773220.)

Abstract: Liu and Layland discovered the famous utilization bound N(2^(1/N) − 1) for fixed-priority scheduling on single-processor systems in the 1970s. Since then, it has been a long-standing open problem to find fixed-priority scheduling algorithms with the same bound for multiprocessor systems. In this paper, we present a partitioning-based fixed-priority multiprocessor scheduling algorithm with Liu and Layland's utilization bound.

Keywords: real-time systems; utilization bound; multiprocessor; fixed-priority scheduling

I. INTRODUCTION

The utilization bound is a well-known concept first introduced by Liu and Layland in their seminal paper [18]. A utilization bound can be used as a simple and practical way to test the schedulability of real-time task sets, as well as a good metric to evaluate the quality of a scheduling algorithm. It was shown that the utilization bound of Rate Monotonic Scheduling (RMS) on single processors is N(2^(1/N) − 1). For simplicity of presentation we let Θ(N) = N(2^(1/N) − 1).

Multiprocessor scheduling is usually categorized into two paradigms [10]: global scheduling, in which each task can execute on any available processor at run time, and partitioned scheduling, in which each task is assigned to a processor beforehand, and at run time each task can only execute on this particular processor. Although global scheduling on average utilizes computing resources better, the best known utilization bound of global fixed-priority scheduling is only 38% [3], which is much lower than the best known result for partitioned fixed-priority scheduling, 50% [7]. 50% is also known as the maximum utilization bound for both global and partitioned fixed-priority scheduling [4], [19]. Although there exist scheduling algorithms, like the pfair family [2], [9], offering utilization bounds of 100%, these algorithms are not priority-based and incur much higher context-switch overhead [11].

Recently a number of works have studied semi-partitioned scheduling, which can exceed the maximum utilization bound of 50% of partitioned scheduling. In semi-partitioned scheduling, most tasks are statically assigned to one fixed processor as in partitioned scheduling, while a small number of tasks are split into several subtasks, which are assigned to different processors. A recent work [17] has shown that the worst-case utilization bound of semi-partitioned fixed-priority scheduling can reach 65%, which is still lower than 69.3% (the worst-case value of Θ(N) as N goes to infinity). This gap is even larger with a smaller N.

In this paper, we propose a new fixed-priority scheduling algorithm for multiprocessor systems based on semi-partitioned scheduling, whose utilization bound is Θ(N). The algorithm uses RMS on each processor, and has the same task-splitting overhead as previous work. We first propose a semi-partitioned fixed-priority scheduling algorithm whose utilization bound is Θ(N) for the class of task sets in which the utilization of each task is no larger than Θ(N)/(1 + Θ(N)). This algorithm assigns tasks in decreasing period order, and always selects the processor with the least workload assigned so far among all processors to host the next task. Then we remove the constraint on the utilization of each task by introducing an extra task pre-assigning mechanism; the resulting algorithm achieves the utilization bound of Θ(N) for any task set.
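
As a quick numerical illustration of this bound (our own addition, not part of the paper), the following Python sketch evaluates Θ(N) = N(2^(1/N) − 1) for a few values of N and shows that it decreases towards ln 2 ≈ 69.3%:

```python
# Minimal sketch (not from the paper): evaluate the Liu & Layland bound
# Theta(N) = N * (2**(1/N) - 1) and show it decreases towards ln 2 ~ 69.3%.
import math

def theta(n: int) -> float:
    """Liu & Layland utilization bound for n tasks under RMS."""
    return n * (2 ** (1.0 / n) - 1)

if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 100):
        print(f"Theta({n}) = {theta(n):.4f}")
    print(f"limit as N -> infinity: ln 2 = {math.log(2):.4f}")
```
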
The rest of the paper is structured as follows: Section II reviews the prior work on semi-partitioned scheduling; Section III introduces the notation and the basic concepts of semi-partitioned scheduling. The first and second proposed algorithms, as well as their worst-case utilization bound properties, are presented in Sections IV and V respectively. Finally, conclusions are drawn in Section VI.

II. PRIOR WORK

Semi-partitioned scheduling has been studied with both EDF scheduling [1], [8], [5], [6], [12], [13], [16] and fixed-priority scheduling [14], [15], [17].

The first semi-partitioned scheduling algorithm is EDF-fm [1] for soft real-time systems based on EDF scheduling. Andersson et al. proposed EKG [8] for hard real-time systems, in which split tasks are forced to execute in certain time slots. Later EKG was extended to sporadic and arbitrary-deadline task systems [5], [6] with a similar idea. Kato et al. proposed EDDHP and EDDP [12], [13], in which split tasks are scheduled based on priority rather than time slots. The worst-case utilization bound of EDDP is 65%. Later Kato et al. proposed EDF-WM, which can significantly reduce the context-switch overhead compared with previous work.

There are relatively fewer works on the fixed-priority scheduling side. Kato et al. proposed RMDP [14] and DM-PM [15], both with the worst-case utilization bound of 50%, which is the same as partitioned scheduling without task splitting. Recently, Lakshmanan et al. [17] proposed the algorithm PDMS_HPTS_DS, which can achieve the worst-case utilization bound of 65%, and can achieve the bound 69.3% for a special type of task sets that consist of light tasks. They also conducted case studies on an Intel Core 2 Duo processor to characterize the practical overhead of task splitting, and showed that the cache overheads due to task splitting can be expected to be negligible on multi-core platforms.

III. BASIC CONCEPTS

We first introduce the processor platform and task model. The multiprocessor platform consists of M identical processors {P_1, P_2, ..., P_M}. A task set τ = {τ_1, τ_2, ..., τ_N} consists of N independent tasks. Each task τ_i is a 2-tuple ⟨C_i, T_i⟩, where C_i is the worst-case execution time and T_i is the minimum inter-release separation (also called period). T_i is also τ_i's relative deadline. Tasks in τ are sorted in non-decreasing period order, i.e., i < j implies T_i ≤ T_j. Since our proposed algorithms use rate-monotonic scheduling (RMS) as the scheduling algorithm on each processor, we can use the task indices to represent the task priorities, i.e., τ_i has higher priority than τ_j if and only if i < j. The utilization of each task τ_i is defined as U_i = C_i/T_i.

We recall the classical result of Liu and Layland:

Theorem 1 ([18]). On a single-processor system, each task set τ with

    Σ_{τ_i ∈ τ} U_i ≤ N(2^(1/N) − 1)

is schedulable using rate-monotonic scheduling (RMS).

The utilization bound of our proposed semi-partitioned scheduling algorithm is built upon this result. In the remainder of this paper, we use Θ(N) to denote the above utilization bound for N tasks:

    Θ(N) = N(2^(1/N) − 1)    (1)

We further define the utilization of a task set τ in multiprocessor scheduling on M processors as

    U(τ) = (Σ_{τ_i ∈ τ} U_i) / M    (2)

[Figure 1. Subtasks of a split task τ_i: the body subtasks τ_i^1, τ_i^2 and the tail subtask τ_i^3, whose ready times are deferred by the response times R_i^1 and R_i^1 + R_i^2.]

For simplicity of presenting our algorithms, we assume each task τ_i ∈ τ has utilization U_i ≤ Θ(N). Note that this assumption does not invalidate our results on task sets containing tasks with utilization higher than Θ(N): if in a task set with U(τ) ≤ Θ(N) there are tasks with a higher (individual) utilization than Θ(N), we can just let each of them run exclusively on its own processor. The remaining task set on the remaining processors still has a utilization of at most Θ(N). If we are able to show its schedulability, then together this results in the desired bound for the full task set.

A semi-partitioned scheduling algorithm consists of two parts: the partitioning algorithm, which determines how to split and assign each task (or rather each of its parts) to a fixed processor, and the scheduling algorithm, which determines how to schedule the tasks assigned to each processor. With the partitioning algorithm, most tasks are assigned to a processor and only execute on this processor at run time. We call these tasks non-split tasks. The other tasks are called split tasks; they are split into several subtasks. Each subtask of a split task τ_i is assigned to (and thereby executes on) a different processor, and the sum of the execution times of all subtasks equals C_i.
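
The task model above maps directly onto a few lines of code. The sketch below is our own illustration (the names `Task`, `ll_test` and `multiproc_utilization` are ours, not the paper's): it encodes a task as ⟨C_i, T_i⟩, the single-processor test of Theorem 1, and the multiprocessor utilization U(τ) of Equation (2).

```python
# Illustrative sketch of the task model of Section III; the class and
# function names are ours, not from the paper.
from dataclasses import dataclass
from typing import List

@dataclass
class Task:
    C: float  # worst-case execution time
    T: float  # period = minimum inter-release separation = relative deadline

    @property
    def U(self) -> float:
        return self.C / self.T

def theta(n: int) -> float:
    """Liu & Layland bound Theta(n) = n(2^(1/n) - 1)."""
    return n * (2 ** (1.0 / n) - 1)

def ll_test(tasks: List[Task]) -> bool:
    """Theorem 1: sufficient RMS schedulability test on one processor."""
    return sum(t.U for t in tasks) <= theta(len(tasks))

def multiproc_utilization(tasks: List[Task], M: int) -> float:
    """Equation (2): U(tau) = sum(U_i) / M."""
    return sum(t.U for t in tasks) / M
```
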
For example, in Figure 1 the task τ_i is split into three subtasks τ_i^1, τ_i^2 and τ_i^3, executing on processors P_1, P_2 and P_3, respectively. The subtasks of a task need to be synchronized to execute correctly. For example, in Figure 1, τ_i^2 cannot start execution until τ_i^1 is finished. This equals deferring the actual ready time of τ_i^2 by up to R_i^1 (relative to τ_i's original release time), where R_i^1 is the worst-case response time of τ_i^1. One can regard this as shortening the actual relative deadline of τ_i^2 by up to R_i^1. Similarly, the actual ready time of τ_i^3 is deferred by up to R_i^1 + R_i^2, and τ_i^3's actual relative deadline is shortened by up to R_i^1 + R_i^2. We use τ_i^k to denote the k-th subtask of a split task τ_i, and define τ_i^k's synthetic deadline as

    Δ_i^k = T_i − Σ_{l ∈ [1, k−1]} R_i^l    (3)

Thus, we represent each subtask τ_i^k by a 3-tuple ⟨c_i^k, T_i, Δ_i^k⟩, in which c_i^k is the execution time of τ_i^k, T_i is the original period, and Δ_i^k is the synthetic deadline. For consistency, each non-split task τ_i can be represented by a single subtask τ_i^1 with c_i^1 = C_i and Δ_i^1 = T_i. The normal utilization of a subtask τ_i^k is U_i^k = c_i^k / T_i, and we define another new metric, the synthetic utilization V_i^k, to describe τ_i^k's workload with its synthetic deadline:

    V_i^k = c_i^k / Δ_i^k    (4)

We call the last subtask of τ_i its tail subtask, denoted by τ_i^t, and the other subtasks its body subtasks, as shown in Figure 1. We use τ_i^{b_j} to denote the j-th body subtask.

We use τ_i → P_q to denote that τ_i is assigned to processor P_q, and say that P_q is the host processor of τ_i. A task set τ is schedulable under a semi-partitioned scheduling algorithm A if, after assigning tasks to processors by A's partitioning algorithm, each task τ_i ∈ τ can meet its deadline under A's scheduling algorithm.

IV. THE FIRST ALGORITHM SPA1

A significant difference between SPA1 and the algorithms in previous work is that SPA1 employs a worst-fit partitioning, while all previous algorithms employ a first-fit partitioning [17], [14], [15]. The basic procedure of first-fit partitioning is as follows: one selects a processor and assigns tasks to this processor as much as possible to fill its capacity, then picks the next processor and repeats the procedure. In contrast, worst-fit partitioning always selects the processor with the minimal total utilization of tasks that have been assigned to it so far, so the occupied capacities of all processors are increased roughly in turn.

The reason for us to prefer worst-fit partitioning is intuitively explained as follows. A subtask τ_i^k's actual deadline Δ_i^k is shorter than τ_i's original deadline T_i, and the sum of the synthetic utilizations of all of τ_i's subtasks is larger than τ_i's original utilization U_i, which is the key difficulty for semi-partitioned scheduling to achieve the same utilization bound as on single processors. With worst-fit partitioning, the occupied capacities of all processors are increased in turn, and task splitting only occurs when the capacity of a processor is completely filled. Then, if one partitions all tasks in increasing priority order, the split tasks in worst-fit partitioning will generally have relatively high priority levels on each processor. This is good for the schedulability of the task set, since tasks with high priorities usually have a better chance to be schedulable, so they can tolerate the shortened deadlines better. Consider an extreme scenario: if one can make sure that all split tasks' subtasks have the highest priority on their host processors, then there is no need to consider the shortened deadlines of these subtasks, since, being of the highest priority level on each processor, they are schedulable anyway. Thus, as long as the split tasks with shortened deadlines do not cause any problem, Liu and Layland's utilization bound can be easily achieved. The philosophy behind our proposed algorithms is making the split subtasks get as high a priority as possible on each processor. In contrast, with first-fit partitioning, a split subtask may get quite a low priority on its host processor.¹ For instance, with the algorithm in [17] that achieves the utilization bound of 65%, in the worst case the second subtask of a split task will always get the lowest priority on its host processor.

As will be seen later in this section, SPA1 does not completely solve the problem. More precisely, SPA1 is restricted to a class of light task sets, in which the utilization of each task is no larger than Θ(N)/(1 + Θ(N)). Intuitively, this is because if a task's utilization is very large, its tail subtask might still get a relatively low priority on its host processor, even using worst-fit partitioning. (We will solve this problem with SPA2 in Section V.)
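
The following small sketch (our own illustration; the `Subtask` class is hypothetical) makes the 3-tuple ⟨c_i^k, T_i, Δ_i^k⟩ and the two utilization measures concrete, and shows numerically the difficulty just mentioned: once a body subtask's response time defers the tail, the synthetic utilizations of the subtasks add up to more than the original U_i.

```python
# Illustration of subtasks (Section III); names are ours, not the paper's.
from dataclasses import dataclass

@dataclass
class Subtask:
    c: float      # execution time of this subtask
    T: float      # original period of the task it belongs to
    delta: float  # synthetic deadline, Equation (3)

    @property
    def U(self) -> float:          # normal utilization, c / T
        return self.c / self.T

    @property
    def V(self) -> float:          # synthetic utilization, Equation (4)
        return self.c / self.delta

# Example: a task with C = 4, T = 10 split into a body part (c = 3) and a
# tail part (c = 1). If the body's response time is 3, the tail's synthetic
# deadline is 10 - 3 = 7 (Equation (3)).
body = Subtask(c=3.0, T=10.0, delta=10.0)
tail = Subtask(c=1.0, T=10.0, delta=10.0 - 3.0)
print(body.V + tail.V)   # ~0.443 > original U = 0.4
```
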
In the following, we will introduce SPA1 as well as its utilization bound property. The remaining part of this section is structured as follows: we first present the partitioning algorithm of SPA1, and show that any task set τ satisfying U(τ) ≤ Θ(N) can be successfully partitioned by SPA1. Then we introduce how the tasks assigned to each processor are scheduled. Next, we prove that if a light task set is successfully partitioned by SPA1, then all tasks can meet their deadlines under the scheduling algorithm of SPA1. Together, this implies that any light task set with U(τ) ≤ Θ(N) is schedulable by SPA1, and finally indicates that the utilization bound of SPA1 is Θ(N) for light task sets.

1:  if U(τ) > Θ(N) then abort
2:  UQ := [τ_N^1, τ_{N−1}^1, ..., τ_1^1]
3:  Ψ[1...M] := all zeros
4:  while UQ ≠ ∅ do
5:    P_q := the processor with the minimal Ψ
6:    τ_i^k := pop_front(UQ)
7:    if (U_i^k + Ψ[q] ≤ Θ(N)) then
8:      τ_i^k → P_q
9:      Ψ[q] := Ψ[q] + U_i^k
10:   else
11:     split τ_i^k into two parts τ_i^k and τ_i^{k+1} such that U_i^k + Ψ[q] = Θ(N)
12:     τ_i^k → P_q
13:     Ψ[q] := Θ(N)
14:     push_front(τ_i^{k+1}, UQ)
15:   end if
16: end while

Algorithm 1: The partitioning algorithm of SPA1.

¹ Under the algorithms in [15], a split subtask's priority is artificially advanced to the highest level on its host processor, which breaks the RMS priority order and thereby leads to a lower utilization bound.
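
Algorithm 1 translates almost line by line into executable code. The sketch below is our reading of the listing (helper names and the floating-point tolerance are ours); it performs only the partitioning and utilization bookkeeping:

```python
# Hedged sketch of Algorithm 1 (SPA1 partitioning); our transcription, not the
# authors' code. Tasks are (C, T) pairs, indexed so that a smaller index means
# higher priority (shorter or equal period).
from collections import deque

def theta(n):
    return n * (2 ** (1.0 / n) - 1)

def spa1_partition(tasks, M):
    """Return per-processor lists of (task_index, exec_time_share)."""
    N = len(tasks)
    bound = theta(N)
    util = [C / T for (C, T) in tasks]
    if sum(util) / M > bound:
        raise ValueError("U(tau) exceeds Theta(N); abort (line 1)")
    # UQ holds (index, remaining execution time), lowest priority at the front.
    UQ = deque((i, tasks[i][0]) for i in reversed(range(N)))
    psi = [0.0] * M                      # utilization assigned to each processor
    assignment = [[] for _ in range(M)]
    while UQ:
        q = min(range(M), key=lambda p: psi[p])   # worst-fit: minimal Psi
        i, c = UQ.popleft()
        u = c / tasks[i][1]
        if u + psi[q] <= bound + 1e-12:
            assignment[q].append((i, c))
            psi[q] += u
        else:
            c_first = (bound - psi[q]) * tasks[i][1]  # fill P_q exactly to Theta(N)
            assignment[q].append((i, c_first))
            psi[q] = bound
            UQ.appendleft((i, c - c_first))           # remaining part goes back
    return assignment
```

The only scheduling-relevant decision encoded here is the worst-fit selection in the `min(...)` call; everything else is bookkeeping of the remaining execution time of a possibly split task.
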

A. SPA1: Partitioning and Scheduling

The partitioning algorithm of SPA1 is very simple and can be briefly described as follows: We assign tasks in increasing priority order, and always select the processor on which the total utilization of the tasks assigned so far is minimal among all processors. When a task (subtask) cannot be assigned entirely to the currently selected processor, we split it into two parts and assign the first part such that the total utilization of the currently selected processor becomes Θ(N), and assign the second part to the next selected processor.

The precise description of the partitioning algorithm is given in Algorithm 1. UQ is the list accommodating the unassigned tasks, sorted in increasing priority order. UQ is initialized as [τ_N^1, τ_{N−1}^1, ..., τ_1^1], in which each element τ_i^1 = ⟨c_i^1 = C_i, T_i, Δ_i^1 = T_i⟩ is the initial subtask form of task τ_i. Each element Ψ[q] of the array Ψ[1...M] denotes the sum of the utilizations of the tasks that have been assigned to processor P_q.

The work flow of SPA1 is as follows. In each loop iteration, we pick the task at the front of UQ, denoted by τ_i^k, which has the lowest priority among all unassigned tasks. We try to assign τ_i^k to the processor P_q which has the minimal Ψ[q] among all elements of Ψ[1...M]. If U_i^k + Ψ[q] ≤ Θ(N), then we can assign the entire τ_i^k to P_q, since there is enough capacity available on P_q. Otherwise, we split τ_i^k into two subtasks τ_i^k and τ_i^{k+1}, such that U_i^k + Ψ[q] = Θ(N) (note that by U_i^k = c_i^k/T_i we denote the utilization of subtask τ_i^k). We further set Ψ[q] := Θ(N), which means this processor P_q is full and we will not assign any more tasks to P_q. Then we insert τ_i^{k+1} back to the front of UQ, to assign it in the next loop iteration. We continue this procedure until all tasks have been assigned.

It is easy to see that all task sets below the desired utilization bound can be successfully partitioned by SPA1:

Lemma 1. Any task set τ with

    U(τ) ≤ Θ(N)    (5)

can be successfully partitioned onto M processors with SPA1.

Note that the partitioning algorithm alone gives no schedulability guarantee; schedulability will be proved in the next subsection.

After the tasks are assigned (and possibly split) to the processors by the partitioning algorithm of SPA1, they are scheduled using RMS on each processor locally, i.e., with their original priorities. The subtasks of a split task respect their precedence relations, i.e., a split subtask τ_i^k is ready for execution when its preceding subtask τ_i^{k−1} on some other processor has finished.

[Figure 2. Each subtask τ_i^k can be viewed as an independent task with period T_i and deadline Δ_i^k.]

B. Schedulability

We first show an important property of SPA1:

Lemma 2. After partitioning according to SPA1, each body subtask has the highest priority on its host processor.

Proof: In the partitioning algorithm of SPA1, task splitting only occurs when a processor becomes full. Thus, after a body subtask has been assigned to a processor, no more tasks will be assigned to it. Further, the tasks are partitioned in increasing priority order, so all tasks assigned to that processor before have lower priority.

By Lemma 2, we further know that the response time of each body subtask equals its execution time, so the synthetic deadline Δ_i^t of each tail subtask τ_i^t can be calculated as follows:

    Δ_i^t = T_i − Σ_{j ∈ [1, B_i]} c_i^{b_j} = T_i − (C_i − c_i^t)    (6)

So we can view the scheduling in SPA1 on each processor without considering the synchronization between the subtasks of a split task, and just regard every split subtask τ_i^k as an independent task with period T_i and a shorter relative deadline Δ_i^k calculated by Equation (6), as shown in Figure 2.
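
Because each subtask is treated as an independent task ⟨c_i^k, T_i, Δ_i^k⟩ under RMS, its deadline can also be checked with standard response-time analysis. The sketch below is a textbook RTA iteration (not the utilization-bound argument used in this paper's proofs), assuming synchronous releases and no release jitter:

```python
# Standard RMS response-time iteration for one processor; illustration only,
# this is not the utilization-bound argument used in the paper's proofs.
import math

def response_time(c, hp, limit):
    """Worst-case response time of a task with execution time c and
    higher-priority tasks hp = [(c_j, T_j), ...]; gives up beyond `limit`."""
    R = c
    while True:
        R_next = c + sum(math.ceil(R / Tj) * cj for (cj, Tj) in hp)
        if R_next == R:
            return R
        if R_next > limit:
            return float("inf")   # does not converge within the deadline
        R = R_next

def subtask_schedulable(c, delta, hp):
    """Check a subtask with execution time c against its synthetic deadline delta."""
    return response_time(c, hp, delta) <= delta
```
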
In the following we prove the schedulability of non-split tasks, body subtasks and tail subtasks, respectively.

1) Non-split Tasks:

Lemma 3. If a task set τ with U(τ) ≤ Θ(N) is partitioned by SPA1, then any non-split task of τ can meet its deadline.

Proof: The tasks on each processor are scheduled by RMS, and the sum of the utilizations of all tasks on a processor is no larger than Θ(N). Further, the deadlines of the non-split tasks are unchanged and therefore still equal their periods. Thus, each non-split task is schedulable. Note that although the synthetic deadlines of other subtasks are shorter than their original periods, this does not affect the schedulability of the non-split tasks, since only the periods of these subtasks are relevant to the schedulability of the non-split tasks.

2) Body Subtasks:

Lemma 4. If a task set τ with U(τ) ≤ Θ(N) is partitioned by SPA1, then any body subtask of τ can meet its deadline.

Proof: The body subtasks have the highest priorities on their host processors and will therefore always meet their deadlines. (This holds even though the deadlines were shortened because of the task splitting.)

[Figure 3. Illustration of X_i^{b_j}, X_i^t and Y_i^t.]
[Figure 4. Illustration of Γ and Γ'.]

3) Tail Subtasks: Now we prove the schedulability of an arbitrary tail subtask τ_i^t, during which we only focus on τ_i^t and do not consider whether other tail subtasks are schedulable or not. Since the same reasoning can be applied to every tail subtask, the proofs guarantee that all tail subtasks are schedulable.

Suppose task τ_i is split into B_i body subtasks and one tail subtask. Recall that we use τ_i^{b_j}, j ∈ [1, B_i], to denote the j-th body subtask of τ_i, and τ_i^t to denote τ_i's tail subtask. U_i^{b_j} = c_i^{b_j}/T_i and U_i^t = c_i^t/T_i denote τ_i^{b_j}'s and τ_i^t's original utilization, respectively. Additionally, we use the following notations (cf. Figure 3):

For each body subtask τ_i^{b_j}, let X_i^{b_j} denote the sum of the utilizations of all the tasks assigned to P_i^{b_j} with lower priority than τ_i^{b_j}.

For the tail subtask τ_i^t, let X_i^t denote the sum of the utilizations of all the tasks assigned to P_i^t with lower priority than τ_i^t.

For the tail subtask τ_i^t, let Y_i^t denote the sum of the utilizations of all the tasks assigned to P_i^t with higher priority than τ_i^t.

We can now use these for the schedulability of the tail subtasks:

Lemma 5. Suppose a tail subtask τ_i^t is assigned to processor P_i^t. If τ_i^t satisfies

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ Θ(N),    (7)

then τ_i^t can meet its deadline.

Proof: The proof idea is as follows: We consider the set Γ consisting of τ_i^t and all tasks with higher priority than τ_i^t on the same processor, i.e., the tasks contributing to Y_i^t. For this set, we construct a new task set Γ', in which the periods that are larger than Δ_i^t are all reduced to Δ_i^t. The main idea is to first show that the counterpart of τ_i^t is schedulable within this new set Γ' by RMS because of the utilization bound, and then to prove that this implies the schedulability of τ_i^t in the original set Γ.

In particular, let P_i^t be the processor to which τ_i^t is assigned. We define Γ as follows:

    Γ = {τ_h^k | τ_h^k → P_i^t ∧ h ≤ i}    (8)

We now give the construction of Γ': For each task τ_h^k ∈ Γ, we have a counterpart τ'_h^k in Γ'. The only difference is that we possibly reduce the periods:

    c'_h^k = c_h^k;    T'_h = T_h if T_h ≤ Δ_i^t,    T'_h = Δ_i^t if T_h > Δ_i^t

We also keep the same priority order of tasks in Γ' as their counterparts in Γ, which is still a rate-monotonic ordering. Figure 4 illustrates the construction. In Figure 4(a), Γ contains three tasks. τ_1 has a period that is smaller than Δ_i^t, and τ_2 has a larger one. Further, τ_i^t is contained in Γ. According to the construction, Γ' in Figure 4(b) also has three tasks τ'_1, τ'_2 and τ'_i^t, where only the periods of τ'_2 and τ'_i^t are reduced to Δ_i^t.

Now we show the schedulability of τ'_i^t in Γ'. We do this by showing the sufficient upper bound of Θ(N) on the total utilization of Γ':

    U(Γ') = Σ_{τ'_h^k ∈ Γ'} c_h^k / T'_h = Σ_{τ'_h^k ∈ Γ'\{τ'_i^t}} c_h^k / T'_h + V_i^t    (9)

We now do a case distinction for tasks τ'_h^k ∈ Γ', according to whether their periods were reduced or not.

If T_h ≤ Δ_i^t, we have T'_h = T_h. Since T_i > Δ_i^t, we have:

    c_h^k / T'_h = c_h^k / T_h = U_h^k < U_h^k · T_i/Δ_i^t

If T_h > Δ_i^t, we have T'_h = Δ_i^t. Because priorities are ordered by periods, we have T_h ≤ T_i. Thus:

    c_h^k / T'_h = c_h^k / Δ_i^t ≤ (c_h^k / T_h) · T_i/Δ_i^t = U_h^k · T_i/Δ_i^t

Both cases lead to c_h^k / T'_h ≤ U_h^k · T_i/Δ_i^t, so we can apply this to (9) from above:

    U(Γ') ≤ Σ_{τ'_h^k ∈ Γ'\{τ'_i^t}} U_h^k · T_i/Δ_i^t + V_i^t    (10)

Since Y_i^t = Σ_{τ_h^k ∈ Γ\{τ_i^t}} U_h^k, we have:

    U(Γ') ≤ Y_i^t · T_i/Δ_i^t + V_i^t

Finally, by the assumption from Condition (7) we know that the right-hand side is at most Θ(N), and thus U(Γ') ≤ Θ(N). Therefore, τ'_i^t is schedulable. Note that in Γ' there could exist other tail subtasks whose deadlines are shorter than their periods. However, this does not invalidate that the condition U(Γ') ≤ Θ(N) is sufficient to guarantee the schedulability of τ'_i^t under RMS.

Now we need to see that this implies the schedulability of τ_i^t. Recall that the only difference between Γ and Γ' is that the period of a task in Γ is possibly larger than that of its counterpart in Γ'. So the interference τ_i^t suffers from the higher-priority tasks in Γ is no larger than the interference τ'_i^t suffers in Γ', and since the deadlines of τ_i^t and τ'_i^t are the same, the schedulability of τ'_i^t implies the schedulability of τ_i^t.

It remains to show that Condition (7) holds, which was the assumption for this lemma and thus a sufficient condition for tail subtasks to be schedulable. As stated in the introduction of this section, this condition does not hold in general for SPA1, but only for certain light task sets:

Definition 1. A task τ_i is a light task if

    U_i ≤ Θ(N) / (1 + Θ(N)).

Otherwise, τ_i is a heavy task. A task set τ is a light task set if all tasks in τ are light tasks.

Lemma 6. Suppose a tail subtask τ_i^t is assigned to processor P_i^t. If τ_i is a light task, we have

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ Θ(N).

Proof: We will first derive a general upper bound on Y_i^t based on the properties of X_i^{b_j}, X_i^t and the subtasks' utilizations. Based on this, we derive the bound we want to show, using the assumption that τ_i is a light task.

For deriving the upper bound on Y_i^t, we note that as soon as a task is split into a body subtask and a rest, the processor hosting this new body subtask is full, i.e., its utilization is Θ(N). Further, each body subtask has by construction the highest priority on its host processor, so we have:

    ∀ j ∈ [1, B_i] : U_i^{b_j} + X_i^{b_j} = Θ(N)

We sum over all B_i of these equations, and get:

    Σ_j U_i^{b_j} + Σ_j X_i^{b_j} = B_i · Θ(N)    (11)

Now we consider the processor containing τ_i^t, denoted by P_i^t. Its total utilization is X_i^t + U_i^t + Y_i^t and is at most Θ(N), i.e., X_i^t + U_i^t + Y_i^t ≤ Θ(N). We combine this with (11) and get:

    Y_i^t ≤ (Σ_j U_i^{b_j} + Σ_j X_i^{b_j}) / B_i − U_i^t − X_i^t    (12)

In order to simplify this, we recall that during the partitioning phase we always select the processor with the smallest total utilization of tasks that have been assigned to it so far (recall line 5 in Algorithm 1). This implies X_i^{b_j} ≤ X_i^t for all subtasks τ_i^{b_j}. Thus, the sum over all X_i^{b_j} is bounded by B_i · X_i^t and we can cancel out both terms in (12):

    Y_i^t ≤ (Σ_j U_i^{b_j}) / B_i − U_i^t

Another simplification is possible using that B_i ≥ 1 and that τ_i's utilization U_i is the sum of the utilizations of all of its subtasks, i.e., Σ_j U_i^{b_j} = U_i − U_i^t:

    Y_i^t ≤ U_i − 2·U_i^t

We are now done with the first part, i.e., deriving an upper bound for Y_i^t. This can easily be transformed into an upper bound on the term we are interested in:

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ (U_i − 2·U_i^t) · T_i/Δ_i^t + V_i^t    (13)

For the rest of the proof, we bound the right-hand side from above by Θ(N), which will complete the proof. The key is to bring it into a form that is suitable to use the assumption that τ_i is a light task. As a first step, we use that the synthetic deadline of τ_i^t is the period T_i reduced by the total computation time of τ_i's body subtasks, i.e., Δ_i^t = T_i − (C_i − c_i^t), cf. Equation (6). Further, we use the definitions U_i = C_i/T_i, U_i^t = c_i^t/T_i and V_i^t = c_i^t/Δ_i^t to derive:

    (U_i − 2·U_i^t) · T_i/Δ_i^t + V_i^t = (C_i − c_i^t) / (T_i − (C_i − c_i^t))

Since c_i^t > 0, we can find a simple upper bound of the right-hand side:

    (C_i − c_i^t) / (T_i − (C_i − c_i^t)) < C_i / (T_i − C_i)

Since τ_i is a light task, we have U_i ≤ Θ(N)/(1 + Θ(N)), and by applying U_i = C_i/T_i to this we obtain

    C_i / (T_i − C_i) ≤ Θ(N).

Thus, we have established that Θ(N) is an upper bound of Y_i^t · T_i/Δ_i^t + V_i^t, with which we started in (13).
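
As a quick numeric sanity check of the final chain of inequalities in this proof (our own example with made-up numbers, not from the paper): take a light task with C_i = 4, T_i = 10, a tail execution time c_i^t = 1, and Θ(N) = 0.7.

```python
# Sanity check of the inequality chain at the end of Lemma 6's proof,
# using hypothetical numbers (C = 4, T = 10, c_t = 1, Theta = 0.7).
Theta = 0.7
C, T, c_t = 4.0, 10.0, 1.0

U = C / T
assert U <= Theta / (1 + Theta)                      # tau_i is a light task

lhs = (C - c_t) / (T - (C - c_t))                    # = (U - 2*U_t)*T/Delta + V_t
assert lhs < C / (T - C) <= Theta                    # the two bounding steps
print(f"{lhs:.3f} < {C/(T-C):.3f} <= {Theta}")
```
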
From Lemmas 5 and 6, the desired property follows directly:

Lemma 7. If a task set τ with U(τ) ≤ Θ(N) is partitioned by SPA1, then any tail subtask of a light task of τ can meet its deadline.

C. Utilization Bound

By Lemma 1 we know that a task set τ can be successfully partitioned by the partitioning algorithm of SPA1 if U(τ) is no larger than Θ(N). If τ has been successfully partitioned, by Lemmas 3 and 4 we know that all the non-split tasks and body subtasks are schedulable. By Lemma 7 we know a tail subtask τ_i^t is also schedulable if τ_i is a light task. Since, in general, it is a priori unknown which tasks will be split, we pose this constraint of being light on all tasks in τ to obtain a sufficient schedulability test condition:

Theorem 2. Let τ be a task set only containing light tasks. τ is schedulable with SPA1 on M processors if

    U(τ) ≤ Θ(N)    (14)

In other words, the utilization bound of SPA1 is Θ(N) for task sets only containing tasks with utilization no larger than Θ(N)/(1 + Θ(N)).

Θ(N) is a decreasing function with respect to N, which means the utilization bound is higher for task sets with fewer tasks. We use N* to denote the maximal number of tasks (subtasks) assigned to each processor, so Θ(N*), which is strictly larger than Θ(N), also serves as the utilization bound on each processor. Therefore we can use Θ(N*) to replace Θ(N) in the derivation above, and get that the utilization bound of SPA1 is Θ(N*) for task sets only containing tasks with utilization no larger than Θ(N*)/(1 + Θ(N*)). It is easy to see that there is at least one task assigned to each processor, and two subtasks of a task cannot be assigned to the same processor. Therefore the number of tasks executing on each processor is at most N − M + 1, which can be used as an over-approximation of N*.

Table I. AN EXAMPLE TASK SET
Task  C_i   T_i  Heavy task?  Priority
τ_1   3     4    yes          highest
τ_2   4.25  10   no           middle
τ_3   4.25  10   no           lowest

[Figure 5. The tail subtask of a task with large utilization may have a low priority level.]

V. THE SECOND ALGORITHM SPA2

In this section we introduce our second semi-partitioned fixed-priority scheduling algorithm SPA2, which has the utilization bound of Θ(N) for task sets without any constraint. As discussed at the beginning of Section IV, the key point for our algorithms to achieve high utilization bounds is to give each split task as high a priority as possible on its host processor. With SPA1, the tail subtask of a task with very large utilization could have a relatively low priority on its host processor, as the example in Figure 5 illustrates. This is why the utilization bound of SPA1 is not applicable to task sets containing heavy tasks.

To solve this problem, we propose the second semi-partitioned algorithm SPA2 in this section. The main idea of SPA2 is to pre-assign each heavy task whose tail subtask might get a low priority, before partitioning the other tasks, so that these heavy tasks will not be split. Note that if one simply pre-assigns all heavy tasks, it is still possible for some tail subtask to get a low priority level on its host processor. Consider the task set in Table I on 2 processors, and for simplicity assume Θ(N) = 0.8 and Θ(N)/(1 + Θ(N)) = 4/9. If we pre-assign the heavy task τ_1 to processor P_1 and then assign τ_2 and τ_3 by the partitioning algorithm of SPA1, the task partitioning looks as follows:

1) τ_1 → P_1,
2) τ_3 → P_2,
3) τ_2 cannot be entirely assigned to P_2, so it is split into two subtasks τ_2^1 = ⟨3.75, 10, 10⟩ and τ_2^2 = ⟨0.5, 10, 6.25⟩, and τ_2^1 → P_2,
4) τ_2^2 → P_1.

Then the tasks on each processor are scheduled by RMS. We can see that the tail subtask τ_2^2 has the lowest priority on P_1 and will miss its deadline due to the higher-priority task τ_1. However, if we do not pre-assign τ_1 and just do the partitioning with SPA1, this task set is schedulable. To overcome this problem, a more sophisticated pre-assigning mechanism is employed in our second algorithm SPA2.
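
The failure of this naive pre-assignment can also be seen from the tail-subtask condition of Lemma 5. The following check (our own, using the numbers from the partitioning steps above and the assumed Θ(N) = 0.8) shows that Condition (7) is violated for τ_2^2 on P_1:

```python
# Check of Condition (7) for the Table I example with tau_1 pre-assigned;
# numbers are taken from the partitioning steps in the text.
Theta = 0.8                 # assumed bound, as in the example

# Tail subtask tau_2^2 on P_1: <c, T, Delta> = <0.5, 10, 6.25>
c_t, T, Delta = 0.5, 10.0, 6.25
V_t = c_t / Delta           # synthetic utilization of the tail
Y_t = 3.0 / 4.0             # higher-priority utilization on P_1 (tau_1: 3/4)

lhs = Y_t * T / Delta + V_t
print(f"{lhs:.2f} <= {Theta}? {lhs <= Theta}")   # 1.28 > 0.8: condition (7) fails
```
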
Intuitively, SPA2 pre-assigns exactly those heavy tasks for which pre-assigning them will not cause any tail subtask to miss its deadline. This is checked using a simple test. Those heavy tasks that do not satisfy this test will be assigned (and possibly split) later, together with the light tasks. The key for this to work is that, for these heavy tasks, we can use the property of failing the test in order to show that their tail subtasks will not miss their deadlines either.

A. SPA2: Partitioning and Scheduling

We first introduce some notation. If a heavy task τ_i is pre-assigned to a processor P_q in SPA2, we call τ_i a pre-assigned task, otherwise a normal task, and call P_q a pre-assigned processor, otherwise a normal processor. The partitioning algorithm of SPA2 consists of three steps:

1) We first pre-assign the heavy tasks that satisfy a particular condition to one processor each.
2) We do task partitioning with the remaining (i.e. normal) tasks and the remaining (i.e. normal) processors using SPA1, until all the normal processors are full.
3) The remaining tasks are assigned to the pre-assigned processors; the assignment selects one processor and assigns as many tasks as possible to it, until it becomes full, then selects the next processor.

The precise description of the partitioning algorithm of SPA2 is shown in Algorithm 2.

1:  if U(τ) > Θ(N) then abort
2:  PQ := [P_1, P_2, ..., P_M]
3:  PQ_pre := ∅
4:  UQ := ∅
5:  Ψ[1...M] := all zeros
6:  for i := 1 to N do
7:    if τ_i is heavy ∧ Σ_{j>i} U_j ≤ (|PQ| − 1)·Θ(N) then
8:      P_q := pop_front(PQ)
9:      pre-assign τ_i to P_q
10:     push_front(P_q, PQ_pre)
11:     Ψ[q] := Ψ[q] + U_i
12:   else
13:     push_front(τ_i^1, UQ)
14:   end if
15: end for
16: while UQ ≠ ∅ do
17:   τ_i^k := pop_front(UQ)
18:   if ∃ P_q ∈ PQ : Ψ[q] < Θ(N) then
19:     P_q := the element in PQ with the minimal Ψ
20:   else
21:     P_q := pop_front(PQ_pre)
22:   end if
23:   if U_i^k + Ψ[q] ≤ Θ(N) then
24:     τ_i^k → P_q
25:     Ψ[q] := Ψ[q] + U_i^k
26:     if P_q came from PQ_pre then
27:       push_front(P_q, PQ_pre)
28:     end if
29:   else
30:     split τ_i^k into two parts τ_i^k and τ_i^{k+1} such that U_i^k + Ψ[q] = Θ(N)
31:     τ_i^k → P_q
32:     Ψ[q] := Θ(N)
33:     push_front(τ_i^{k+1}, UQ)
34:   end if
35: end while

Algorithm 2: The partitioning algorithm of SPA2.

We first introduce the data structures used in the algorithm:

PQ is the list of all processors. It is initially [P_1, P_2, ..., P_M], and processors are always taken out of and put back at the front.

PQ_pre is the list accommodating the pre-assigned processors, initially empty.

UQ is the list accommodating the unassigned tasks after Step 1). Initially it is empty, and during Step 1), each task τ_i that is determined not to be pre-assigned is put into UQ (already in its subtask form τ_i^1).

Ψ[1...M] is an array with the same meaning as in SPA1: each element Ψ[q] denotes the sum of the utilizations of the tasks that have been assigned to processor P_q.

Table II. AN EXAMPLE DEMONSTRATING SPA2
Task  C_i  T_i  Heavy task?  Priority
τ_1   0.5  10   no           highest
τ_2   4.5  10   yes
τ_3   6    10   yes
τ_4   4    10   no
τ_5   3    10   no
τ_6   6    10   yes
τ_7   3    10   no           lowest

In the following we use the task set example in Table II with 4 processors to demonstrate how the partitioning algorithm of SPA2 works. For simplicity, we assume Θ(N) = 0.7; then the utilization threshold for light tasks, Θ(N)/(1 + Θ(N)), is around 0.41. The initial state of the data structures is as follows:

    PQ = [P_1, P_2, P_3, P_4]
    PQ_pre = ∅
    UQ = ∅
    Ψ[1...4] = [0, 0, 0, 0]

In Step 1) (lines 6 to 15), each task τ_i in τ is visited in increasing index order, i.e., decreasing priority order. If τ_i is a heavy task, we evaluate the following condition (line 7):

    Σ_{j>i} U_j ≤ (|PQ| − 1)·Θ(N)    (15)

in which |PQ| is the number of processors left in PQ so far. A heavy task τ_i is determined to be pre-assigned to a processor if this condition is satisfied. The intuition for this is: if we pre-assign this task τ_i, then there is enough space on the remaining processors to accommodate all remaining lower-priority tasks. That way, no lower-priority tail subtask will end up on the processor to which we assign τ_i.

In our example, we first visit the first task τ_1^1. It is a light task, so we put it at the front of UQ (line 13). The next task τ_2 is heavy, but Condition (15) with |PQ| = 4 is not satisfied, so we put τ_2^1 at the front of UQ. The next task τ_3 is heavy, and Condition (15) with |PQ| = 4 is satisfied. Thus, we pre-assign τ_3 to P_1, and put P_1 at the front of PQ_pre (lines 8 to 10). τ_4 and τ_5 are both light tasks, so we put them into UQ in turn. τ_6 is heavy, and Condition (15) with |PQ| = 3 (P_1 has been taken out of PQ and put into PQ_pre) is satisfied, so we pre-assign τ_6 to P_2, and put P_2 at the front of PQ_pre. The last task τ_7 is light, so it is put at the front of UQ. At this point the Step 1) phase is finished, and the state of the data structures is as follows:

    PQ = [P_3, P_4]
    PQ_pre = [P_2, P_1]
    UQ = [τ_7^1, τ_5^1, τ_4^1, τ_2^1, τ_1^1]
    Ψ[1...4] = [0.6, 0.6, 0, 0]

Note that the processors in PQ_pre are in decreasing priority order of the pre-assigned tasks on them, and the tasks in UQ are in decreasing priority order.

Steps 2) and 3) are both performed in the while loop of lines 16 to 35. In Step 2), the remaining tasks (which are now in UQ) are assigned to normal processors (the ones in PQ). Only once all processors in PQ are full does the algorithm enter Step 3), in which tasks are assigned to processors in PQ_pre (decision in lines 18 to 22).

The operation of assigning a task τ_i^k (lines 23 to 34) is basically the same as in SPA1. If τ_i^k can be entirely assigned to P_q without task splitting, then τ_i^k → P_q and Ψ[q] is updated (lines 24 to 28). If P_q is a pre-assigned processor, P_q is put back at the front of PQ_pre (lines 26 to 28), so that it will be selected again in the next loop iteration; otherwise no putting-back operation is needed, since we never take elements out of PQ but just select the proper one in it (line 19). If τ_i^k cannot be assigned to P_q entirely, τ_i^k is split into a new τ_i^k and another subtask τ_i^{k+1}, such that P_q becomes full after the new τ_i^k has been assigned to it, and then we put τ_i^{k+1} back into UQ (lines 29 to 33).

Note that there is an important difference between assigning tasks to normal processors and to pre-assigned processors. When tasks are assigned to normal processors, the algorithm always selects the processor with the minimal Ψ (the same as in SPA1). In contrast, when tasks are assigned to pre-assigned processors, always the processor at the front of PQ_pre is selected, i.e., we assign as many tasks as possible to the processor in PQ_pre whose pre-assigned task has the lowest priority, until it is full. As will be seen later in the schedulability proof, this particular order of selecting pre-assigned processors, together with the evaluation of Condition (15), is the key to guaranteeing the schedulability of heavy tasks.

With our running example, the remaining tasks are first assigned to the normal processors P_3 and P_4 in the same way as by SPA1. Thus, τ_7^1 → P_3, then τ_5^1 → P_4, then τ_4^1 → P_3, then τ_2^1 is split into τ_2^1 = ⟨4, 10, 10⟩ and τ_2^2 = ⟨0.5, 10, 6⟩, and τ_2^1 → P_4. Now all normal processors are full, and the state of the data structures is as follows:

    PQ = [P_3, P_4] (both P_3 and P_4 are full)
    PQ_pre = [P_2, P_1]
    UQ = [τ_2^2, τ_1^1]
    Ψ[1...4] = [0.6, 0.6, 0.7, 0.7]

Then the remaining tasks in UQ are assigned to the pre-assigned processors. First τ_2^2 → P_2, after which P_2 is not full and still at the front of PQ_pre. So the next task τ_1^1 is also assigned to P_2. There are no unassigned tasks any more, so the algorithm terminates.

It is easy to see that any task set below the desired utilization bound can be successfully partitioned by SPA2:

Lemma 8. Any task set τ with U(τ) ≤ Θ(N) can be successfully partitioned onto M processors with SPA2.

Having described the partitioning part of SPA2, we also need to describe the scheduling part. It is the same as in SPA1: on each processor the tasks are scheduled by RMS, respecting the precedence relations between the subtasks of a split task, i.e., a subtask is ready for execution as soon as the execution of its preceding subtask has finished. Note that under SPA2, each body subtask also has the highest priority on its host processor, which is the same as in SPA1.
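
The walkthrough above can be reproduced mechanically. The sketch below is our hedged Python reading of Algorithm 2 (partitioning and utilization bookkeeping only; synthetic deadlines are not derived here), applied to the Table II example with the assumed Θ = 0.7:

```python
# Hedged sketch of Algorithm 2 (SPA2 partitioning); our transcription.
# tasks: list of (C, T) in priority order (index 0 = highest priority).
from collections import deque

def spa2_partition(tasks, M, bound):
    util = [C / T for (C, T) in tasks]
    heavy = [u > bound / (1 + bound) for u in util]
    PQ = list(range(M))            # normal processors
    PQ_pre = deque()               # pre-assigned processors, front = leftmost
    UQ = deque()                   # unassigned (index, remaining C), front = lowest priority
    psi = [0.0] * M
    pre_assigned = {}
    # Step 1: pre-assign heavy tasks satisfying Condition (15).
    for i, (C, T) in enumerate(tasks):
        if heavy[i] and sum(util[i + 1:]) <= (len(PQ) - 1) * bound:
            q = PQ.pop(0)
            pre_assigned[q] = i
            PQ_pre.appendleft(q)
            psi[q] += util[i]
        else:
            UQ.appendleft((i, C))
    # Steps 2 and 3: assign the remaining tasks, splitting at the bound.
    assignment = {q: [] for q in range(M)}
    while UQ:
        i, c = UQ.popleft()
        normals = [q for q in PQ if psi[q] < bound - 1e-12]
        q = min(normals, key=lambda p: psi[p]) if normals else PQ_pre.popleft()
        u = c / tasks[i][1]
        if u + psi[q] <= bound + 1e-12:
            assignment[q].append((i, c))
            psi[q] += u
            if q not in PQ:                    # came from PQ_pre: put it back
                PQ_pre.appendleft(q)
        else:
            c_first = (bound - psi[q]) * tasks[i][1]
            assignment[q].append((i, c_first))
            psi[q] = bound
            UQ.appendleft((i, c - c_first))
    return pre_assigned, assignment, psi

# Table II example (C, T), tasks tau_1 ... tau_7, with the assumed Theta = 0.7.
tasks = [(0.5, 10), (4.5, 10), (6, 10), (4, 10), (3, 10), (6, 10), (3, 10)]
pre, asg, psi = spa2_partition(tasks, M=4, bound=0.7)
print(pre)   # expected: {0: 2, 1: 5}, i.e. tau_3 on P_1 and tau_6 on P_2 (0-based)
print(psi)   # expected: [0.6, 0.7, 0.7, 0.7], matching the walkthrough
```

Running it reproduces the pre-assignment of τ_3 and τ_6, the split of τ_2 on P_4, and the final per-processor utilizations from the walkthrough.
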
So we can view the scheduling on each processor as RMS with a set of independent tasks, in which each subtask's deadline is shortened by the sum of the execution times of all its preceding subtasks.

B. Properties

Now we introduce some useful properties of SPA2.

Lemma 9. Let τ_i be a heavy task, and let there be η_i pre-assigned tasks with higher priority than τ_i. Then we know:

If τ_i is a pre-assigned task, it satisfies

    Σ_{j>i} U_j ≤ (M − η_i − 1)·Θ(N)    (16)

If τ_i is not a pre-assigned task, it satisfies

    Σ_{j>i} U_j > (M − η_i − 1)·Θ(N)    (17)

Proof: The proof directly follows the partitioning algorithm of SPA2.

Lemma 10. Each pre-assigned task has the lowest priority on its host processor.

Proof: Without loss of generality, we sort all processors in a list Q as follows: we first sort all pre-assigned processors in Q, in decreasing priority order of the pre-assigned tasks on them; then the normal processors follow in Q in an arbitrary order. We use P^x to denote the x-th processor in Q. Suppose τ_i is a heavy task pre-assigned to P^q. τ_i is a pre-assigned task, and the number of pre-assigned tasks with higher priority than τ_i is q − 1, so by Lemma 9 we know the following condition is satisfied:

    Σ_{j>i} U_j ≤ (M − q)·Θ(N)    (18)

In the partitioning algorithm of SPA2, normal tasks are assigned to pre-assigned processors only when all normal processors are full, and the pre-assigned processors are selected in increasing priority order of the pre-assigned tasks on them, so normal tasks can be assigned to processor P^q only when the processors P^{q+1}, ..., P^M are all full. The total capacity of processors P^{q+1}, ..., P^M is (M − q)·Θ(N) (in our algorithms a processor is full as soon as the total utilization on it is Θ(N)), and by (18) we know that when we start to assign tasks to P^q, the tasks with lower priority than τ_i have all already been assigned to processors P^{q+1}, ..., P^M. So all normal tasks (subtasks) assigned to P^q have higher priority than τ_i.

Lemma 11. Each body subtask has the highest priority on its host processor.

Proof: Consider a body subtask τ_i^{b_j} assigned to processor P_i^{b_j}. Since task splitting only occurs when a processor is full, and all the normal tasks are assigned in increasing priority order, we know τ_i^{b_j} has the highest priority among all normal tasks on P_i^{b_j}. Additionally, by Lemma 10 we know that if P_i^{b_j} is a pre-assigned processor, the pre-assigned task on P_i^{b_j} also has lower priority than τ_i^{b_j}. So τ_i^{b_j} has the highest priority on P_i^{b_j}.

C. Schedulability

By Lemma 11 we know that under SPA2 each body subtask has the highest priority on its host processor, so all body subtasks are schedulable. The scheduling algorithm of SPA2 is still RMS, and the deadline of a non-split task still equals its period, so the schedulability of non-split tasks can be proved in the same way as in SPA1 (Lemma 3). In the following we prove the schedulability of tail subtasks.

Suppose τ_i is split into B_i body subtasks and one tail subtask. Recall that we use τ_i^{b_j}, j ∈ [1, B_i], to denote the j-th body subtask of τ_i, and τ_i^t to denote τ_i's tail subtask. X_i^t, Y_i^t and X_i^{b_j} are defined as in Section IV-B.

First we recall Lemma 5, which was used to prove the schedulability of tail subtasks in SPA1: if a tail subtask τ_i^t satisfies

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ Θ(N)    (19)

then τ_i^t can meet its deadline. This conclusion also holds for SPA2, since the scheduling algorithm of SPA2 is also RMS, which is the only property required by the proof of Lemma 5. So proving the schedulability of tail subtasks is reduced to proving Condition (19) for tail subtasks under SPA2.

We call τ_i^t a tail-of-heavy if τ_i is heavy, otherwise a tail-of-light. In the following we prove Condition (19) for τ_i^t in three cases:

1) τ_i^t is a tail-of-light, and P_i^t is a normal processor,
2) τ_i^t is a tail-of-light, and P_i^t is a pre-assigned processor,
3) τ_i^t is a tail-of-heavy.

Case 1) can be proved in the same way as in SPA1, since both the partitioning and the scheduling algorithm of SPA2 on normal processors are the same as in SPA1. Actually, one can regard the partitioning and scheduling of SPA2 on normal processors as the partitioning and scheduling of SPA1 with a subset of tasks (those assigned to normal processors) on a subset of processors (the normal processors). So the schedulability of τ_i^t in this case can be proved by exactly the same reasoning as for Lemma 6.

Now we prove Case 2), where τ_i^t is a tail-of-light and P_i^t is a pre-assigned processor.

Lemma 12. Suppose τ_i^t is a tail-of-light assigned to a pre-assigned processor P_i^t under SPA2. We have

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ Θ(N).

Proof: By Lemma 10 we know τ_i^t has higher priority than the pre-assigned task of P_i^t, so X_i^t is no smaller than the utilization of this pre-assigned task. And since a pre-assigned task must be heavy, we have

    X_i^t > Θ(N) / (1 + Θ(N))    (20)

On the other hand, since τ_i is light, we know C_i/T_i ≤ Θ(N)/(1 + Θ(N)). We use c_i^B to denote the total execution time of all of τ_i's body subtasks.
Since c_i^B < C_i and Θ(N) < 1, we have

    c_i^B / T_i < Θ(N) / (1 + Θ(N))
    ⟹ T_i · (1 − Θ(N)/(1 + Θ(N))) < T_i − c_i^B
    ⟹ T_i / (T_i − c_i^B) < 1 + Θ(N)
    ⟹ (T_i/Δ_i^t) · (Θ(N)/(1 + Θ(N)) − U_i^t) + V_i^t < Θ(N)    (21)

(the last step uses Δ_i^t = T_i − c_i^B and V_i^t = U_i^t · T_i/Δ_i^t). By (21) and (20) we have

    (T_i/Δ_i^t) · (Θ(N) − X_i^t − U_i^t) + V_i^t < Θ(N)

and since the total utilization on each processor is bounded by Θ(N), i.e., Y_i^t ≤ Θ(N) − X_i^t − U_i^t, we finally have Y_i^t · T_i/Δ_i^t + V_i^t < Θ(N).

Now we prove Case 3), where τ_i^t is a tail-of-heavy. Note that in this case P_i^t can be either a pre-assigned or a normal processor.

Lemma 13. If τ_i^t is the tail subtask of a normal heavy task τ_i, then we have

    Y_i^t · T_i/Δ_i^t + V_i^t ≤ Θ(N).

Proof: By the property in Lemma 9 concerning normal heavy tasks we know τ_i satisfies the condition

    Σ_{j>i} U_j > (M − η_i − 1)·Θ(N)

in which η_i is the number of pre-assigned tasks with higher priority than τ_i. We use ℳ to denote the set of all processors, so |ℳ| = M, and use ℋ to denote the set of pre-assigned processors whose pre-assigned tasks have higher priority than τ_i, so |ℋ| = η_i; thus we have:

    Σ_{j>i} U_j > (|ℳ| − |ℋ| − 1)·Θ(N)    (22)

By Lemma 10 we know any normal task assigned to a pre-assigned processor has higher priority than the pre-assigned task of this processor. Therefore, τ_i's body and tail subtasks are all assigned to processors in ℳ \ ℋ. Moreover, when we start to assign τ_i, all tasks with lower priority than τ_i have already been assigned (or pre-assigned) to processors in ℳ \ ℋ, since pre-assigned tasks have already been assigned before dealing with the normal tasks, and all normal tasks are assigned in increasing priority order. We use 𝒦 to denote the set of processors in ℳ \ ℋ that contain neither τ_i's body nor tail subtasks, and for each processor P_k ∈ 𝒦 we use X_k to denote the total utilization of the tasks with lower priority than τ_i assigned to P_k. Then we have

    X_i^t + Σ_j X_i^{b_j} + Σ_{k ∈ [1, |𝒦|]} X_k = Σ_{j>i} U_j

Since |𝒦| = |ℳ| − |ℋ| − (B_i + 1), and ∀ P_k ∈ 𝒦 : X_k ≤ Θ(N), we have

    X_i^t + Σ_j X_i^{b_j} ≥ Σ_{j>i} U_j − (|ℳ| − |ℋ| − (B_i + 1))·Θ(N)    (23)

By Inequalities (22) and (23) we have

    X_i^t + Σ_j X_i^{b_j} > B_i·Θ(N)    (24)

Now we look at processor P_i^t, the total utilization of which is bounded by Θ(N), so we have:

    Y_i^t ≤ Θ(N) − X_i^t − U_i^t    (25)

By (24) and (25) we have

    Y_i^t < Θ(N) − (B_i·Θ(N) − Σ_j X_i^{b_j}) − U_i^t

and since U_i^t + Σ_j U_i^{b_j} = U_i, we have

    Y_i^t < Θ(N) − B_i·Θ(N) − U_i + Σ_{j ∈ [1, B_i]} (X_i^{b_j} + U_i^{b_j})    (26)

Since each body subtask has the highest priority on its host processor, and the total utilization of any processor containing a body subtask is Θ(N), we have

    Σ_{l ∈ [1, B_i]} (X_i^{b_l} + U_i^{b_l}) = B_i·Θ(N)    (27)

By (26) and (27) we have

    Y_i^t < Θ(N) − U_i

By applying U_i = C_i/T_i and V_i^t = c_i^t/Δ_i^t to the right-hand side of the above inequality, we get

    Y_i^t · T_i/Δ_i^t + V_i^t < Θ(N)·T_i/Δ_i^t − C_i/Δ_i^t + c_i^t/Δ_i^t    (28)

We use c_i^B to denote the sum of the execution times of all of τ_i's body subtasks, so we have c_i^t + c_i^B = C_i and Δ_i^t = T_i − c_i^B. We apply these to the right-hand side of (28) and get

    Y_i^t · T_i/Δ_i^t + V_i^t < (Θ(N)·T_i − c_i^B) / (T_i − c_i^B)    (29)

Since Θ(N) < 1, we have Θ(N)·c_i^B < c_i^B, and therefore

    (Θ(N)·T_i − c_i^B) / (T_i − c_i^B) < (Θ(N)·T_i − Θ(N)·c_i^B) / (T_i − c_i^B) = Θ(N)    (30)

So by Inequalities (29) and (30) we have

    Y_i^t · T_i/Δ_i^t + V_i^t < Θ(N).

D. Utilization Bound

We now know that any task set τ with U(τ) ≤ Θ(N) can be successfully partitioned onto M processors by SPA2 (Lemma 8). In the last subsection, we have shown that under the scheduling algorithm of SPA2 the body subtasks are schedulable, since they always have the highest priority level on their host processors; the non-split tasks are also schedulable, since the utilization on each processor is bounded by Θ(N). The schedulability of the tail subtasks is proved by case distinction: the schedulability of the light tail subtasks on normal processors is proved by the same reasoning as for Lemma 6, that of the light tail subtasks on pre-assigned processors is proved by Lemma 12, and that of the heavy tail subtasks is proved by Lemma 13. So we have the following theorem:

Theorem 3. τ is schedulable by SPA2 on M processors if

    U(τ) ≤ Θ(N).

So Θ(N) is the utilization bound of SPA2 for any task set.

For the same reason as presented at the end of Section IV-C, we can use Θ(N*), where N* is the maximal number of tasks (subtasks) assigned to each processor, to replace Θ(N) in Theorem 3.

E. Task Splitting Overhead

With the algorithms proposed in this paper, a task could be split into more than two subtasks. However, since task splitting only occurs when a processor is full, for any task set that is schedulable by SPA2 the number of task splittings is at most M − 1, which is the same as in previous semi-partitioned fixed-priority scheduling algorithms [17], [14], [15]. As shown in the case studies conducted in [17], this overhead can be expected to be negligible on multi-core platforms.

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we have developed a semi-partitioned fixed-priority scheduling algorithm for multiprocessor systems with the well-known Liu and Layland utilization bound for RMS on single processors. The algorithm enjoys the following property: if the utilization bound is used for the schedulability test, and a task set is determined schedulable by fixed-priority scheduling on a single processor of speed M, then it is also schedulable by our algorithm on M processors of speed 1 (under the assumption that each task's execution time on the processors of speed 1 is still smaller than its deadline). Note that the utilization bound test is only sufficient but not necessary. As future work, we will address the problem of constructing algorithms with the same property with respect to exact schedulability analysis.

REFERENCES

[1] J. Anderson, V. Bud, and U. C. Devi. An EDF-based scheduling algorithm for multiprocessor soft real-time systems. In Euromicro Conference on Real-Time Systems (ECRTS), 2005.
[2] J. Anderson and A. Srinivasan. Mixed Pfair/ERfair scheduling of asynchronous periodic tasks. In Journal of Computer and System Sciences, 2004.
[3] B. Andersson. Global static-priority preemptive multiprocessor scheduling with utilization bound 38%. In International Conference on Principles of Distributed Systems (OPODIS), 2008.
[4] B. Andersson, S. Baruah, and J. Jonsson. Static-priority scheduling on multiprocessors. In IEEE Real-Time Systems Symposium (RTSS), 2001.
[5] B. Andersson and K. Bletsas. Sporadic multiprocessor scheduling with few preemptions. In Euromicro Conference on Real-Time Systems (ECRTS), 2008.
[6] B. Andersson, K. Bletsas, and S. Baruah. Scheduling arbitrary-deadline sporadic task systems on multiprocessors. In IEEE Real-Time Systems Symposium (RTSS), 2008.
[7] B. Andersson and J. Jonsson. The utilization bounds of partitioned and pfair static-priority scheduling on multiprocessors are 50%. In Euromicro Conference on Real-Time Systems (ECRTS), 2003.
[8] B. Andersson and E. Tovar. Multiprocessor scheduling with few preemptions. In IEEE Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2006.
[9] S. K. Baruah, N. K. Cohen, C. G. Plaxton, and D. A. Varvel. Proportionate progress: A notion of fairness in resource allocation. In Algorithmica, 1996.
[10] J. Carpenter, S. Funk, P. Holman, A. Srinivasan, J. Anderson, and S. Baruah. A categorization of real-time multiprocessor scheduling problems and algorithms. 2004.
[11] U. Devi and J. Anderson. Tardiness bounds for global EDF scheduling on a multiprocessor. In IEEE Real-Time Systems Symposium (RTSS), 2005.
[12] S. Kato and N. Yamasaki. Real-time scheduling with task splitting on multiprocessors. In IEEE Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 2007.
[13] S. Kato and N. Yamasaki. Portioned EDF-based scheduling on multiprocessors. In International Conference on Embedded Software (EMSOFT), 2008.
[14] S. Kato and N. Yamasaki. Portioned static-priority scheduling on multiprocessors. In International Parallel and Distributed Processing Symposium (IPDPS), 2008.
[15] S. Kato and N. Yamasaki. Semi-partitioned fixed-priority scheduling on multiprocessors. In IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2009.
[16] S. Kato, N. Yamasaki, and Y. Ishikawa. Semi-partitioned scheduling of sporadic task systems on multiprocessors. In Euromicro Conference on Real-Time Systems (ECRTS), 2009.
[17] K. Lakshmanan, R. Rajkumar, and J. Lehoczky. Partitioned fixed-priority preemptive scheduling for multi-core processors. In Euromicro Conference on Real-Time Systems (ECRTS), 2009.
[18] C. L. Liu and J. W. Layland. Scheduling algorithms for multiprogramming in a hard-real-time environment. In Journal of the ACM, 1973.
[19] D. Oh and T. P. Baker. Utilization bounds for N-processor rate monotone scheduling with static processor assignment. In Real-Time Systems, 1998.