
Preemptive Weighted Completion Time Scheduling of Parallel Jobs

Uwe Schwiegelshohn
Computer Engineering Institute, University Dortmund, Dortmund, Germany, uwe@carla.e-technik.uni-dortmund.de

Abstract. In this paper we present a new algorithm for the off-line scheduling of parallel and independent jobs on a parallel processor system. To this end we introduce a machine model which is based on existing multiprocessors and accounts for the penalty of preemption. After examining the relation between the makespan and the total weighted completion time costs for the scheduling of parallel jobs, it is shown that the new algorithm achieves a small approximation factor for both total weighted completion time and makespan scheduling. To fine-tune the algorithm, a fairly simple numerical optimization problem is derived. This way different preemption penalties can be considered when determining the approximation factor. Finally, we compare the costs of the generated preemptive schedules with those of non-preemptive schedules for the same problem.

Key words. Scheduling, approximation algorithms

AMS subject classifications. 68M20, 68Q25, 90B35

1 Introduction

Various scheduling problems have been addressed since the seventies and many of them have been shown to be NP-complete. Thus, many efforts to deal with these problems have been devoted to finding good approximation algorithms. The most commonly used scheduling objectives are the minimization of either the weighted completion time (response time, flow time) or the makespan. In the case of a multiprocessor, we can say informally that weighted completion time scheduling supports the viewpoint of a user who can influence the completion time of his job by assigning a higher weight to it and consequently accepting higher costs. On the other hand, makespan scheduling is closely related to the overall use of a multiprocessor. There, individual job weights are ignored. Thus, it generally reflects the goal of the multiprocessor's owner, see also [5].
In this paper we address the problem of off-line scheduling of parallel and independent jobs with invariant resource requirements. Makespan scheduling of parallel and independent jobs is similar to bin packing and has been addressed frequently in the past. For instance, Garey and Graham [6] have shown that a simple list schedule for a given job system achieves an approximation factor of 2. Weighted completion time scheduling of sequential jobs on a multiprocessor has also been the subject of various research efforts. For this problem Kawaguchi and Kyan [8] presented a list scheduling algorithm called LF with a tight approximation factor of (1 + √2)/2. However, the parallel problem has only recently been addressed when Schwiegelshohn et al.

(A preliminary version of this paper appeared in the Proceedings of the European Symposium on Algorithms ESA'96, Springer Lecture Notes in Computer Science 1136.)

showed that the SMAT algorithm generates shelf-based schedules with an approximation factor of 8.53 [10]. Turek et al. [14] proved that a generalization of Kawaguchi's and Kyan's LF method produces a tight approximation factor of 2 for parallel jobs with unique weights if the resource requirement of each job is at most 50% of the maximum number of processors. However, when allowing arbitrary jobs, this method may result in schedules which deviate significantly from the optimum. Other results include Chakrabarti et al. [1], who addressed for instance the scheduling of malleable jobs similar to [14] and proved an expected performance within 8.67 of optimal for a randomized on-line algorithm. Deng et al. [2] also discussed preemptive response time scheduling for malleable jobs with unique weights and variable resource requirements. But their model is related to a multithreaded environment and differs significantly from ours, as they allow unknown execution times while neither a preemption penalty nor gang scheduling is considered. Here, we present an off-line algorithm which generates a preemptive schedule with small approximation factors for makespan and total weighted completion time costs. This is achieved by combining two restrictive schedules into a general schedule with the help of preemption. A similar idea has been applied by Stein and Wein [13] to show the existence of schedules which approximate both the makespan and the total weighted completion time. In this paper we first introduce our machine model in Section 2. Then the scheduling problem is defined and some aspects of bicriteria scheduling of parallel and independent jobs are presented in Section 3. Some previous results with relevance to our algorithm are discussed in Section 4. Next, the new algorithm is introduced in Section 5 and then analyzed. We show that the total weighted completion time approximation factor can be fine-tuned by solving a numerical optimization problem.
This way a tight approximation factor of.37 can for instance be achieved if the preemption penalty is not taken into account. Further, some special cases are considered in Section 7. Finally, our results are compared to the worst case bounds for non-preemptive SMAT schedules. The Model A system of M independent parallel jobs must be executed on a multiprocessor. Each job has only a single phase of parallelism []. It is described by its invariant resource requirement r i f1; : : : ; g (also called xed allotment in [14]), its execution time h i 1, and a user set priority weight w i > 0. Note, that without restriction of generality the minimal job execution time has been normalized to 1. Our multiprocessor P(; p) consists of identical nodes which use an interconnection network for communication. Each node has its own processor, main memory, and a local hard disk for swapping as in the IBM SP. The multiprocessor allows free variable partitioning [4], that is the resource requirement r i of a job i can be satised by any resource set i f1; ; : : : g with j i j = r i. Also, the execution time of a job does not depend on the assigned partition. The multiprocessor further supports gang scheduling [4] by switching simultaneously the context of all processors belonging to the same partition. This context switch is executed by use of the local processor

memory and/or the local hard disk, while the interconnection network is not affected except for message draining and synchronization. The context switch also causes a preemption penalty, mainly due to processor synchronization, message draining, saving of job status, and page faults. In our model this penalty is assumed to be a constant amount of time p which is encountered at any context switch. Individual nodes are assigned to jobs in an exclusive fashion, i.e. at any time instant any node belongs to at most a single partition. This partition and the corresponding job are said to be active at this time instant. As gang scheduling is used, a job must be either active on all or on none of the nodes of its partition at the same time. All jobs which have already been started but not yet completed and whose partitions include node n are said to be resident at node n. The number of resident jobs at any node may be limited by the swapping space of the local hard disk. The multiprocessor also has an external parallel file system with a distributed interface which is assumed not to form a bottleneck. Therefore, the amount of time required to load a parallel job and save its results is independent of the assigned partition and of any concurrent loading or saving of other jobs. Consequently, it is included in h_i. On the other hand, the cost of any intermediate saving of the job status due to a context switch is contained in the preemption penalty p.

3 The Scheduling Problem

For a given job system τ and a multiprocessor P(m, p) we next introduce a valid preemptive schedule based on the model described in the previous section by defining timing and resource allocation separately.

Definition 1 (Timing). In a preemptive schedule S(τ, P) each job i is assigned
- a non-negative integer d_i,
- a (2d_i + 2)-tuple of time instants t_i(0), …, t_i(2d_i + 1) such that

    Σ_{ν=0}^{d_i} [t_i(2ν + 1) − t_i(2ν)] = h_i + p·d_i

and

    t_i(ν) − t_i(ν − 1) ≥  0    if d_i = 0 or (ν = 2κ for 0 < κ ≤ d_i),
                           p/2  if d_i > 0 and (ν = 1 or ν = 2d_i + 1),
                           p    if ν = 2κ + 1 for 0 < κ < d_i.

d_i denotes the number of preemptions of job i. More precisely, the execution of job i is started at time t_i(0), interrupted at times t_i(2ν − 1), resumed at times t_i(2ν) with 0 < ν ≤ d_i, and completed at its completion time t_i(2d_i + 1).

Definition 2 (Resource Allocation). A resource allocation is defined by assigning to each job i and each ν with 0 ≤ ν ≤ d_i a node set Π_{i,ν} ⊆ {1, 2, …, m} such that |Π_{i,ν}| = r_i. The resource allocation is valid if Π_{j,ν} ∩ Π_{k,μ} = ∅ for all different jobs j and k with t_j(2ν) ≤ t_k(2μ) < t_j(2ν + 1).
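The timing constraints of Definition 1 can be checked mechanically. The sketch below is a hypothetical helper (the function name and the list representation of the time instants are mine, not the paper's): the execution intervals [t(2ν), t(2ν+1)] of a job must sum to h_i + p·d_i, and each interval must leave room for its share of the context-switch penalty.

```python
# Hypothetical validity check for one job against Definition 1 (sketch).

def timing_valid(t, h, p, d, eps=1e-9):
    """t: the 2d+2 time instants t_i(0..2d+1); h: execution time;
    p: preemption penalty; d: number of preemptions."""
    if len(t) != 2 * d + 2:
        return False
    if any(t[k] > t[k + 1] for k in range(len(t) - 1)):
        return False
    # total length of the execution intervals equals h + p*d
    busy = sum(t[2 * k + 1] - t[2 * k] for k in range(d + 1))
    if abs(busy - (h + p * d)) > eps:
        return False
    # minimum interval lengths induced by the penalty p
    for nu in range(1, 2 * d + 2):
        gap = t[nu] - t[nu - 1]
        if d == 0 or nu % 2 == 0:
            lower = 0.0        # preempted periods may be arbitrarily short
        elif nu == 1 or nu == 2 * d + 1:
            lower = p / 2      # first/last interval: half a context switch
        else:
            lower = p          # inner intervals: a switch at both ends
        if gap < lower - eps:
            return False
    return True
```

For instance, a job with h = 2, p = 0.5, and one preemption may run in intervals [0, 1.25] and [3, 4.25], whose lengths sum to h + p = 2.5.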

In most parts of our paper we further assume that for all jobs i and for all 0 ≤ ν_1 < ν_2 ≤ d_i there is Π_{i,ν_1} = Π_{i,ν_2} = Π_i. Then we also say that the schedule does not use job migration. A schedule S(τ, P) has the total weighted completion time c_S and the makespan m_S with

    c_S = Σ_i w_i·t_i(2d_i + 1)    and    m_S = max_i {t_i(2d_i + 1)}.

For a given job system τ and a multiprocessor P we further denote the optimal costs

    c_opt = min over all S(τ, P) of c_S    and    m_opt = min over all S(τ, P) of m_S.

The problem of finding a schedule with c_S = c_opt is known to be NP-complete in the strong sense for arbitrary m even if each job requires only a single processor [7]. The same is true for finding a schedule with m_S = m_opt in the case of parallel jobs [3]. Both results hold for preemptive and non-preemptive schedules. Moreover, in general there is no schedule with c_S = c_opt and m_S = m_opt, as shown in Example 1.

Example 1. Assume the following system of m + 1 jobs:

    h_i  r_i  w_i  jobs
    1    1    1    1 ≤ i ≤ m
    [row garbled in this copy]  i = m + 1

For this job system the following relations hold for the various optimal schedules:

    schedule                                  m_{S_i}   c_{S_i}
    any optimal makespan schedule S_m         [values garbled in this copy]
    any optimal completion time schedule S_c  [values garbled in this copy]

Example 1 also demonstrates the existence of job systems such that any optimal makespan schedule S_m results in c_{S_m} = Θ(m)·c_opt. A similar result with respect to optimal completion time schedules is described in the following lemma.

Lemma 3. There are job systems such that any non-preemptive optimal total weighted completion time schedule S_c results in m_{S_c} = Θ(m)·m_opt.

Proof. We consider a job system of k jobs which complies with the following restrictions:

    h_i  r_i       w_i        jobs
    H    1         H(m + 1)   i = 1
    1    m         H + 1      i = 2
    1    variable  variable   2 < i ≤ k [remaining entries garbled in this copy]

In addition, we have Σ_{i=3}^{k} w_i ≤ (H + 2)/2. Note that delaying job 1 by only a single time unit is more expensive than starting an optimal schedule for all other jobs at time h_1 = H, as H(m + 1) > (2 + (H + 2)/2)·H. Therefore, in the optimal completion time schedule the start time t_1(0) of job 1 must be 0. Also, job 1 cannot be executed concurrently with job 2, as the second job requires all m nodes. Similarly, it is more expensive to delay job 2 by a single time unit than to start the optimal schedule for jobs 3 to k at time h_1 + h_2 = H + 1, as H + 1 > (H + 2)/2. Due to the execution times and node requirements of jobs 3 to k, none of them can be started before time H + 1 in the optimal completion time schedule. By recursively repeating the same construction for all remaining jobs, the single optimal completion time schedule S_c can be determined. With suitably chosen execution times for the remaining jobs [details garbled in this copy], we obtain m_{S_c} = Θ(m)·m_opt. However, if preemption is allowed, the relation between total weighted completion time scheduling and makespan scheduling is different.

Lemma 4. If preemption with job migration and without preemption penalty is allowed, then any optimal weighted completion time schedule S_c also guarantees m_{S_c} < 2·m_opt for any job system.

Proof. Assume job i with r_i ≤ m/2 and t_i(2d_i + 1) = max_{j: r_j ≤ m/2} {t_j(2d_j + 1)}. As schedule S_c has the minimal total weighted completion time, there are fewer than m/2 nodes idle during all time instances t ∈ [0, m_{S_c}[ \ ∪_{ν=0}^{d_i} [t_i(2ν), t_i(2ν + 1)[. Therefore, we have

    m·m_opt > (m/2)·(m_{S_c} − h_i) + h_i·r_i,

and hence m_{S_c} < 2·m_opt − h_i + (2·r_i·h_i)/m ≤ 2·m_opt.

4 Previous Results and Basic Observations

In their LF schedule, Kawaguchi and Kyan [8] first generate a priority list of jobs by arranging the jobs in non-increasing order of w_i/h_i. This ratio is also known as Smith's ratio [12]. Whenever a node becomes available, the next unscheduled job in this list is assigned to the free node until all jobs are completed. The running time of the algorithm is O(M log M) and it guarantees c_S/c_opt < (√2 + 1)/2. McNaughton [9] proved that for this problem the cost of the optimal schedule cannot be reduced by introducing preemption. However, this observation does not hold for parallel jobs, as can be seen by the following simple example [11].

Example 2. Assume a system of 3 jobs as described below:

    h_i  r_i    w_i  jobs
    2    m      1    i = 1
    1    m − 1  1    i = 2
    4    1      1    i = 3

In an optimal non-preemptive schedule S_n, job 2 is started concurrently with job 3. This block and job 1 are then scheduled in any order, resulting in c_{S_n} = 11. In the optimal preemptive schedule S_p, however, first job 2 and job 3 are started together. Then at time t = 1, job 3 is interrupted and job 1 is executed before the execution of job 3 is resumed at time t = 3. Assuming no preemption penalty (p = 0) we have c_{S_p} = 10.

For parallel jobs a modified Smith ratio s_i = w_i/(h_i·r_i) can be defined. Therefore, all parameters of job i can be determined from any three of the four values h_i, r_i, w_i, and s_i. Turek et al. [14] showed that Kawaguchi's and Kyan's LF schedule guarantees c_S/c_opt < 2 if r_i ≤ m/2 for each job i. While the proof in [14] only considers the unique weight case, a generalization to arbitrary weights is straightforward. Also, if all jobs of a job system require more than 50% of the nodes, then it is easy to obtain an optimal total weighted completion time schedule, as no two jobs can be executed concurrently. However, due to the different resource requirements of the various jobs, scheduling the jobs in Smith order will not necessarily be optimal.

Corollary 5. Assume a job system τ with r_i > m/2 for all i. Then a schedule S_c satisfies c_{S_c}/c_opt < 2 if all jobs are executed in Smith order.

Proof. We construct a new job system τ̂ such that for each job i ∈ τ there is a job î ∈ τ̂ with w_î = w_i, h_î = h_i, and r_î = m/2. Therefore, we have h_î·r_î < h_i·r_i. Using the squashed area bound A [15], where each job i is transformed into a job of execution time h_i·r_i/m and resource requirement m, we have c_{S_c} = 2·A(τ̂) < 2·A(τ) ≤ 2·c_opt.

But if both cases are combined, meaning that jobs may be arbitrary, the approximation factor for c_opt may be as bad as Θ(m) [14]. Further, many non-preemptive schedules with small approximation factors for c_opt cannot guarantee a constant approximation factor for m_opt [15]. Therefore, we next address the makespan approximation factors of some non-preemptive schedules with good total weighted completion time performance.
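As a concrete illustration of list scheduling in Smith order, the following minimal sketch handles sequential jobs only (r_i = 1) on m identical machines. The function name and job encoding are mine and this is not Kawaguchi and Kyan's implementation, merely the greedy rule they describe: sort by w/h, then always assign the next job to the earliest available machine.

```python
import heapq

def lf_schedule(jobs, m):
    """Greedy list scheduling of sequential jobs in Smith order.
    jobs: list of (h, w) pairs; m: number of identical machines.
    Returns the total weighted completion time (sketch, not the paper's code)."""
    # Smith order: non-increasing ratio w/h
    order = sorted(jobs, key=lambda job: job[1] / job[0], reverse=True)
    machines = [0.0] * m           # next free time of each machine
    heapq.heapify(machines)
    total = 0.0
    for h, w in order:
        start = heapq.heappop(machines)   # earliest available machine
        finish = start + h
        total += w * finish
        heapq.heappush(machines, finish)
    return total
```

On a single machine this reduces to Smith's rule; e.g. with jobs (h, w) = (2, 1) and (1, 1), the shorter job runs first and the total weighted completion time is 1 + 3 = 4.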
Corollary 6. For any job system τ a λ-SMAT NFIW schedule S_c satisfies m_{S_c}/m_opt < 1 + λ + λ²/(λ − 1).

Proof. First we restrict ourselves to a job system Γ_1 which contains all jobs scheduled on the first shelf of each height component. Assuming λ^k ≤ h_max = max_i h_i ≤ λ^{k+1}, we then obtain for the makespan m_{S_c}(Γ_1) ≤ h_max + λ^{k+1}·λ/(λ − 1), while m_opt(Γ_1) ≥ h_max. Therefore, we have

    m_{S_c}(Γ_1)/m_opt(Γ_1) < 1 + λ²/(λ − 1).

Next we consider the job system Γ_2 = τ \ Γ_1. Using Lemmas 3.5 and 3.6 in [10] we obtain

    m_{S_c}(Γ_2) < (λ/m)·Σ_{i∈τ} h_i·r_i ≤ λ·m_opt(τ).

The combination of both parts yields

    m_{S_c}(τ)/m_opt(τ) < 1 + λ + λ²/(λ − 1).

Corollary 7. Assume a job system τ with r_i ≤ m/2 for all i. Then an LF schedule S_c satisfies m_{S_c}/m_opt < 3. This bound is tight.

Proof. Assume job i with t_i(1) = m_{S_c}. As S_c is a non-preemptive LF schedule, there must be fewer than m/2 nodes idle during all time instances t ∈ [0, t_i(0)[. Therefore, we have h_i ≤ m_opt and m·m_opt > (m/2)·(m_{S_c} − h_i), which results in

    m_{S_c} = (m_{S_c} − h_i) + h_i < 2·m_opt + m_opt = 3·m_opt.

Further, consider the following system of 12k + 1 jobs with 3k + 1 = m/2. As all jobs have the same Smith ratio, we assume that the order is given by the job index.

    h_i         r_i       w_i      jobs
    y + 2j      3         h_i·r_i  i = 3j + 1 and 0 ≤ j < k
    y + 2j + 1  variable  h_i·r_i  i = 3j + 2 and 0 ≤ j < k
    1           variable  h_i·r_i  i = 3j and 0 < j ≤ k
    2ky         1         h_i·r_i  i = 12k + 1
    [several entries garbled in this copy]

For this job system we obtain expressions for m_{S_c} and m_opt [garbled in this copy] which result for y = k in

    lim_{k→∞} m_{S_c}/m_opt = 3.

It is easy to see that an LF schedule S_c guarantees m_{S_c}/m_opt < 2 if r_i = 1 for all jobs i.

5 The Algorithm

In this section we introduce the new preemptive algorithm PSRS, shown in Table 1. PSRS stands for Preemptive Smith Ratio Scheduling. Without restriction of generality, m is assumed to be even. The algorithm uses the following elements:

    Q            a priority queue of jobs based on the modified Smith ratio
    t_s          a variable denoting the earliest starting time of the next job
    T(r)         min{t | t ≥ t_s and m − r ≥ Σ_{j: t_j(2d_j + 1) > t} r_j}
    test         a boolean function for testing whether the current job will cause preemption
    start_value  a function to determine the non-negative start delay for the current job if it causes preemption

    Create a priority list Q for all jobs such that job i precedes job j if s_i > s_j;
    t_s = 0;
    while (Q ≠ ∅) {
        pick the next job i and delete it from Q;
        d_i = 0;
        if (r_i > m/2 and test) {
            t_i(0) = T(m/2) + p/2 + start_value;
            t_s = t_i(0) + h_i + p/2;
            t_i(1) = t_i(0) + h_i;
            for all jobs j with t_j(2d_j + 1) > t_i(0) − p/2 do {
                d_j = d_j + 1;
                t_j(2d_j + 1) = t_j(2d_j − 1) + h_i + p;
                t_j(2d_j − 1) = t_i(0);
                t_j(2d_j) = t_s − p/2;
            }
        }
        else { t_s = t_i(0) = T(r_i); t_i(1) = t_i(0) + h_i; }
    }

    Table 1. The preemptive algorithm PSRS

The following notation is used for variables which are changed during the iteration of algorithm PSRS in which job i is removed from Q: q(i) and q̂(i) denote the value of variable q at the beginning and at the end of this iteration, respectively. Note that only a job with r_i > m/2 may cause preemption. Further, these jobs cannot be preempted themselves. First, we discuss the validity of the schedule produced by PSRS.

Lemma 8. Consider a job system τ and a multiprocessor P. Independent of the implementation of test and start_value, PSRS always generates a valid schedule which does not require migration.

Proof. First note that at the beginning of any iteration t_i(2d_i) ≤ t_s for any job i ∉ Q. Initially, PSRS assigns to each job i an integer d_i = 0 and time instants t_i(0) and t_i(1) with t_i(1) − t_i(0) = h_i ≥ 1 and t_s(i) ≤ t_i(0) ≤ t̂_s(i). During a later iteration job j may cause preemption and d_i may become d̂_i(j) = d_i(j) + 1 while all t_i(ν) with ν < 2d̂_i(j) − 1 remain unchanged. The validity of the scheduling conditions for t_i(2d̂_i(j) − 1), t_i(2d̂_i(j)), and t_i(2d̂_i(j) + 1) can be easily verified. A job i can be assigned any node set Π_i ⊆ {1, …, m} \ (∪_j {Π_j | t_j(2d_j + 1) > T(r_i)}) with |Π_i| = r_i if i does not cause preemption of any other job in PSRS. Otherwise, job i is neither executed concurrently with any other job j ∉ Q nor with any future job j ∈ Q, as t̂_s(i) ≤ t_i(2d_i(i) + 1) holds. Therefore, job i can be assigned any node set Π_i ⊆ {1, …, m} with |Π_i| = r_i.
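The control flow of Table 1 can be sketched in executable form. The following is a simplified, hypothetical reading of the algorithm (names and data layout are mine; node sets are not assigned, only interval timing is tracked, and test/start_value use the h_i/v rule defined in the analysis section), not a faithful reimplementation of the paper's code:

```python
# Hypothetical sketch of PSRS (Preemptive Smith Ratio Scheduling).

def psrs(jobs, m, p=0.0, v=1.0):
    """jobs: list of dicts with keys h, r, w. Returns a dict mapping each
    job index to its time instants [t(0), ..., t(2d+1)]."""
    # Smith order: non-increasing modified Smith ratio s_i = w_i/(h_i*r_i)
    order = sorted(range(len(jobs)),
                   key=lambda i: jobs[i]["w"] / (jobs[i]["h"] * jobs[i]["r"]),
                   reverse=True)
    t = {}
    t_s = 0.0

    def T(r):
        # earliest time >= t_s at which at least r nodes are free
        cand = sorted({t_s} | {ts[-1] for ts in t.values() if ts[-1] > t_s})
        for time in cand:
            used = sum(jobs[j]["r"] for j, ts in t.items() if ts[-1] > time)
            if m - r >= used:
                return time
        return cand[-1]

    for i in order:
        h, r = jobs[i]["h"], jobs[i]["r"]
        if r > m / 2:
            t_half, natural = T(m // 2), T(r)
            if h / v < natural - t_half:          # test: preempting pays off
                start = t_half + p / 2 + h / v    # start_value = h_i / v
                t[i] = [start, start + h]
                t_s = start + h + p / 2
                for j, ts in t.items():           # preempt overlapping jobs
                    if j != i and ts[-1] > start - p / 2:
                        old = ts[-1]              # previous completion time
                        ts[-1] = start            # interrupted at t_i(0)
                        ts.append(t_s - p / 2)    # resumed
                        ts.append(old + h + p)    # delayed completion
                continue
        t_s = T(r)                                # no preemption caused
        t[i] = [t_s, t_s + h]
    return t
```

With v = 1 and p = 0, a wide job preempts running narrow jobs exactly when waiting for its natural start time T(r_i) would cost more than the detour h_i/v beyond T(m/2), which mirrors the trade-off the analysis below optimizes.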
Note that PSRS is based on a restrictive global gang scheduling model which requires a concurrent context switch of all nodes independent of the partition size. Moreover, no job execution can take place during a preemption penalty. The time complexity of PSRS is max(O(M²), M·(O(test) + O(start_value))), as in the worst case O(M) jobs may preempt O(M) jobs each. However, note that the LF schedule and Turek's generalization are

both included in PSRS. Therefore, if the job system observes the corresponding resource restrictions, PSRS has a time complexity of O(M log M) and guarantees c_S/c_opt < (√2 + 1)/2 and c_S/c_opt < 2, respectively.

6 Approximation Factors

The approximation factors depend on the definition of the functions test and start_value, as for test ≡ 0 there are job systems for which PSRS generates schedules with c_S = Θ(m)·c_opt and m_S = Θ(m)·m_opt [14]. It may seem appropriate to define test and start_value such that a local minimum for the completion time is achieved, that is

    t_i(0) = arg min_{t ≥ t_s} { Σ_{j ∉ Q ∨ j = i} w_j·t_j(2d̂_j(i) + 1) }.

However, as shown in Example 3, this approach cannot guarantee m_S = o(m)·m_opt.

Example 3. Assume the following system of jobs with ε → 0:

    h_i  r_i  w_i  jobs
    [entries garbled in this copy]

Algorithm PSRS with local optimality scheduling produces a schedule S where no two jobs are executed concurrently, thus resulting in m_S = Θ(m)·m_opt. Therefore, we use the following functions for test and start_value, which only depend on T(r_i), T(m/2), and the execution time h_i of the new job i:

    test = (h_i/v < T(r_i) − T(m/2))    and    start_value = h_i/v.

Here v ≤ 1 is a positive constant which may be chosen to minimize the approximation factors. As test and start_value both have a time complexity of O(1), the overall complexity of PSRS is O(M²). Next, the ratio c_S/c_opt is addressed.

Theorem 9. Assume schedule S is obtained by algorithm PSRS. Then the job system below with k, m → ∞, ε → 0, and appropriately selected non-negative parameters x < m/2 and y produces the maximum ratio f_c(p) = c_S/c_opt. There, it is assumed that x + 1 jobs of the first group are always followed by 1 job of the second group in Q. The jobs of the third group are scheduled after all jobs of the first two groups, while the jobs of the fourth group must be at the end of Q.

    h_i  r_i  w_i  number of jobs
    [the four groups' entries are garbled in this copy; they are parameterized by v, x, y, k, and ε]

Note that the modified Smith ratio is 1 for all jobs in the first three groups of the job system in Theorem 9. Therefore, the order among these jobs is arbitrary. The jobs in the last group only have the purpose to delay the completion of all jobs in Group 3. The optimal schedule is obtained by using Turek's generalization of the LF schedule, where the jobs of the various groups are arranged in the order Group 3, Group 2, Group 1, and Group 4. Then the high order terms for c_opt and c_S are given by expressions in x, y, v, k, and p [garbled in this copy]. Thus, Theorem 9 transforms the determination of the best approximation factor f_c(p) for algorithm PSRS into the solution of the following optimization problem:

    f_c(p) = min_{0 < v ≤ 1} max_{y ≥ 0, 0 ≤ x < m/2} (c_S/c_opt).

To prove Theorem 9 we construct a worst case example in several steps and derive a lower bound for c_opt which is tight for the job systems of Theorem 9. First, we restrict ourselves to all jobs i of a job system τ with s_i = max_j(s_j) and call the set of all those jobs τ̂. The order of those jobs in Q is arbitrary. Further, the modified Smith ratios are normalized such that s_i = 1 for all jobs i ∈ τ̂. Then, the ratio between the total weighted completion time ĉ_S for all jobs i ∈ τ̂ in the schedule S produced by PSRS and the cost ĉ_opt of the optimal schedule for τ̂ is determined. Note that in schedule S jobs with a lower s_j may influence the weighted completion time of jobs i ∈ τ̂. The following lemma establishes a relationship between ĉ_S/ĉ_opt and c_S/c_opt.

Lemma 10. If ĉ_S/ĉ_opt ≤ β for all job systems, then c_S/c_opt is also upper bounded by β.

Proof. As test and start_value do not depend on any weight, the proof of Theorem 2 of Kawaguchi's and Kyan's paper [8] can be directly applied to this lemma as well.

Next, we derive a lower bound for ĉ_opt. To this end we define t_b = max_{i ∉ τ̂} {T(m/2)(i)} and τ̂_t = {i ∈ τ̂ | t_i(2d_i + 1) > t_b}.
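The optimization problem for f_c(p) has a plain min-max structure: tune the algorithm parameter v against the worst instance parameters x and y. A grid-search sketch follows; since the exact ratio expressions of Theorem 9 are garbled in this copy, `toy_ratio` below is a hypothetical stand-in with a similar shape, not the paper's formula.

```python
def f_c(ratio, p, vs, xs, ys):
    """Grid approximation of min over v of max over (x, y) of ratio(v,x,y,p)."""
    best = None
    for v in vs:
        worst = max(ratio(v, x, y, p) for x in xs for y in ys)
        if best is None or worst < best:
            best = worst
    return best

# HYPOTHETICAL stand-in ratio, NOT the expression from Theorem 9.
def toy_ratio(v, x, y, p):
    return 1 + x * y / (1 + y) + v * p + (1 - v) * x

vs = [i / 10 for i in range(1, 11)]   # candidate v in (0, 1]
xs = [i / 10 for i in range(10)]      # candidate x in [0, 1)
ys = list(range(5))                   # candidate y in {0, ..., 4}
```

For the stand-in ratio with p = 0 the grid search settles on v = 1, illustrating how a larger preemption penalty shifts the optimal v and raises f_c(p).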
Note that only jobs in τ̂_t can be preempted by jobs j with s_j < 1. To generate a worst case, we therefore assume that all jobs i ∈ τ̂ are followed by a sufficient number of jobs j with h_j = 1, r_j = m, and s_j → 0. This results in t_i(2d_i + 1) − t_b = h̄_i·(1 + v + v·p) for all jobs i ∈ τ̂_t, where h̄_i is the remaining execution time of job i at time t_b. The job system τ̂ is now transformed into a new parameterized sequential job system τ̃(r) with 0 ≤ r < m/2. This system consists of three disjoint job systems:

- τ̃_1: r_i jobs j with r_j = 1, h_j = h̄_i, and s_j = 1 for each job i ∈ τ̂_t,
- τ̃_2: min(r, r_i) jobs j with r_j = 1, h_j = h_i, and s_j = 1 for each job i ∈ τ̂ \ τ̂_t,
- τ̃_3: max(0, r_i − r) jobs j with r_j = 1, h_j = h_i, and s_j = 1 for each job i ∈ τ̂ \ τ̂_t.

τ̃_1 is further partitioned into sets τ̃_11 and τ̃_12 such that |τ̃_11| = min(r, |τ̃_1|) and h_i ≥ h_j for all jobs i ∈ τ̃_11 and j ∈ τ̃_12. c̃_opt(r) is the cost of the optimal schedule for τ̃(r) under the restriction that all r jobs of τ̃_11 and all jobs of τ̃_3 are scheduled using only resources 1, …, r. The following corollary is a direct consequence of this definition.

Corollary 11. ĉ_opt ≥ max_{0 ≤ r < m/2} c̃_opt(r) holds for all job systems.

In further parts of this section, we will therefore always use the lower bound c̃_opt(r). Job system τ̃(r) consists only of sequential jobs with the same Smith ratio. The total completion time for each resource is (1/2)·((Σ_i h_i)² + Σ_i h_i²), where the sums are taken over all jobs i ∈ τ̃(r) scheduled on this resource [12]. Also, we get a lower bound for c̃_opt(r) if we assume that the makespan on all resources r + 1, …, m is the same and that this makespan is not larger than the makespan on any resource 1, …, r [8]. For the job systems of Theorem 9 this lower bound for c̃_opt(r) is tight and max_{0 ≤ r < m/2} c̃_opt(r) = ĉ_opt holds.

The rest of the proof of Theorem 9 is divided into three corollaries. While the proof is tedious, it is mostly based on a simple concept: job system τ is transformed into a job system τ′ such that max_r(c̃_opt(r))|_τ ≥ max_r(c̃_opt(r))|_τ′ and ĉ_S|_τ − ĉ_S|_τ′ ≤ (1 + 1/v + p)·(max_r(c̃_opt(r))|_τ − max_r(c̃_opt(r))|_τ′). First, we address all jobs i ∈ τ̂ with r_i > m/2.

Corollary 12. Assume a job system τ with ĉ_S/max_r(c̃_opt(r)) ≥ 1 + 1/v + p. Then there is a job system τ′ such that ĉ_S/max_r(c̃_opt(r))|_τ′ ≥ ĉ_S/max_r(c̃_opt(r))|_τ and all jobs i ∈ τ̂′ with r_i > m/2 have h_i = 1 and cause preemption.

Proof. 1. Assume a job i with h_i ≥ 2 causing preemption in S.
Now transform τ into τ' by replacing i in Q with two jobs i_1 and i_2 such that
- r_{i1} = r_{i2} = r_i,
- s_{i1} = s_{i2} = s_i,
- h_{i1} = 1,
- h_{i2} = h_i − 1.
Both jobs i_1 and i_2 will cause preemption in S', and t_{i2}(1)|_{S'} = t_i(1)|_S + p. Further, t_j(d_j + 1)|_S ≥ t_j(d_j + 1)|_{S'} holds for all jobs j ∈ τ ∩ τ'. This yields
  ĉ_S|_τ' ≥ ĉ_S|_τ − r_i(h_i − 1)(1/v + 1 − p),
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − r_i(h_i − 1),
and therefore
  ĉ_S / max_r(c̃*(r)) |_τ' ≥ ĉ_S / max_r(c̃*(r)) |_τ ≥ 1 + 1/v + p.

2. As already mentioned, we always assume that there are enough jobs j with s_j → 0 such that t_i(d_i + 1) = t_b + h_i(1 + v + vp) for all jobs i ∈ τ̂_t or i ∈ τ̂'_t. Transform τ into τ' by replacing each job i ∈ τ̂ with a job i' ∈ τ̂' such that
- r_{i'} = r_i,
- h_{i'} = a·h_i,
- w_{i'} = a·w_i,
using a positive integer a. Then repeatedly split all jobs i' ∈ τ̂' as shown above if they cause preemption and have h_{i'} ≥ 2. If all h_{i'} are rational numbers, then a can be selected such that h_{i'} = 1 for all jobs i' ∈ τ̂' which cause preemption. This scaling procedure also results in
  ĉ_S / max_r(c̃*(r)) |_τ' ≥ ĉ_S / max_r(c̃*(r)) |_τ ≥ 1 + 1/v + p.

3. Assume a job i ∈ τ̂_t with r_i ≤ m/2 which does not cause preemption, that is, h_i/v > T(r_i) − T(m) > 0 in algorithm PSS. Transform τ into τ' by replacing i in Q with two jobs i_1 and i_2 such that
- r_{i1} = r_{i2} = r_i,
- s_{i1} = s_{i2} = s_i,
- h_{i1} = v(T(r_{i1}) − T(m)) − ε with ε → 0,
- h_{i2} = h_i − h_{i1}.
If h_{i1} < 1 or h_{i2} < 1, scale the system appropriately as described above. Now i_1 causes preemption while lim_{ε→0} T(r_{i2}) = T(m) holds for i_2 in algorithm PSS. As before we have
  ĉ_S|_τ' ≥ ĉ_S|_τ − r_i(h_{i1}h_{i2} − h_{i1}p − h_{i2}p),
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − r_i h_{i1} h_{i2}.

4. Assume a job i ∈ τ̂_t with r_i > m/2 and T(r_i) = T(m) in algorithm PSS. Transform τ into τ' by replacing i in Q with j = r_i − m/2 + 1 jobs i_1, …, i_j such that
- h_{i1} = … = h_{ij} = h_i,
- s_{i1} = … = s_{ij} = s_i,
- r_{i1} = m/2,
- r_{i2} = … = r_{ij} = 1.
Then we have
  ĉ_S|_τ' = ĉ_S|_τ,
  c̃*(r)|_τ' ≤ c̃*(r)|_τ.

Note that Corollary 12 is valid independently of p. Next, we introduce a further transformation of the job system in order to increase the number of preemptions.

Corollary 13. Assume a job system τ with ĉ_S / max_r(c̃*(r)) |_τ ≥ 1 + 1/v + p. Then there is another job system τ' and a parameter ε > 0 with lim_{ε→0} max{ĉ_S / max_r(c̃*(r)) |_τ − ĉ_S / max_r(c̃*(r)) |_τ', 0} = 0 and the following properties of τ':
- r_i > m/2 or r_i = 1 for all jobs i ∈ τ̂'.

- t_i(0) ≤ t_b − ε and h_i = h_j for all jobs i, j ∈ τ̂'_t.
- h_i = 1 + 1/v + ε and d_i = 1 for all jobs i ∈ τ̂' \ τ̂'_t with r_i = 1.
- m + 1 − r_j jobs are preempted by each job j ∈ τ̂' with r_j > m/2.

Proof: Based on Corollary 12 we can assume that h_i = 1 for all jobs i ∈ τ̂ with r_i > m/2.

1. Let i ∈ τ̂ be a job with r_i ≤ m/2 and h_i ≥ 2. Then schedule S and job system τ are transformed into schedule S' and job system τ' by replacing i with two jobs i_1 and i_2 such that
- r_{i1} = r_{i2} = r_i,
- s_{i1} = s_{i2} = s_i,
- h_{i2} = h_i − h_{i1} ≤ h_i,
- d_{i1} + d_{i2} = d_i,
- t_{i1}(δ) = t_i(δ) for all 0 ≤ δ ≤ d_{i1},
- t_{i1}(d_{i1} + 1) = t_{i2}(0),
- t_{i2}(δ) = t_i(δ − d_{i1}) for all 0 < δ ≤ d_{i2} + 1.
Then t_{i2}(d_{i2} + 1) ≤ t_{i2}(0) + h_{i2}(1 + v + vp) holds. Therefore, we have
  ĉ_{S'}|_τ' ≥ ĉ_S|_τ − r_i h_{i1} h_{i2}(1 + 1/v + p),
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − r_i h_{i1} h_{i2}.
However, the new schedule may not necessarily be a PSS schedule. Now, τ is transformed into τ' by repeatedly splitting jobs i ∈ τ̂ with d_i ≥ 1 at time t_j(1) + p + ε in schedule S if job j ∈ τ̂ causes preemption of job i. In Q, job i_1 then replaces job i, and job i_2 is introduced just after job j. In the resulting PSS schedule we have 0 ≤ t_{i1}(d_{i1} + 1) − t_{i2}(0) ≤ ε. Therefore, the conditions described above are valid for ε → 0. Further, the completion time of any job is reduced by at most t_b·v/(1 + v + vp), which leads to
  lim_{ε→0} max{ĉ_S / max_r(c̃*(r)) |_τ − ĉ_S / max_r(c̃*(r)) |_τ', 0} = 0.
If the job splitting would result in jobs with h_i < 1, the job system is scaled beforehand as described in the proof of Corollary 12.

2. Let j_1, j_2 ∈ τ̂ be two jobs with r_{j1}, r_{j2} > m/2 and t_{j2}(0) > t_{j1}(1) such that there is no other job i with r_i > m/2 and t_{j2}(0) > t_i(1) > t_{j1}(1). If t_{j2}(0) > t_{j1}(1) + 1/v + p, then all jobs i with t_i(0) ≤ t_{j2}(0) − 1/v − p < t_i(d_i + 1) are split at time t_{j2}(0) − 1/v − p. If necessary, the job system is scaled such that a = (t_{j2}(0) − 1/v − p − t_{j1}(1))/(1 + 1/v) becomes an integer and 1 ≤ h_i ≤ 1/v + ε for all jobs i with t_{j1}(1) + p ≤ t_i(0) < t_{j2}(0) − 1/v − p. Note that at any time instance between t_{j1}(1) + p and t_{j2}(0) − 1/v −
p, more than 50% of the nodes are used. Next, job system τ is transformed into τ' by replacing the set {i ∈ τ̂ | t_{j1}(1) + p ≤ t_i(0) < t_{j2}(0) − 1/v − p} with a jobs having r_i = m/2 + 1, h_i = 1, s_i = 1 and a·m/2 jobs having r_i = 1, h_i = 1/v + ε, s_i = 1. m/2 sequential jobs are then always followed by one job i with r_i = m/2 + 1 in Q. Because of Corollary 12 and the results above, this procedure guarantees
  ĉ_S / max_r(c̃*(r)) |_τ' ≥ ĉ_S / max_r(c̃*(r)) |_τ.

The same construction can be used for the time frames [0, min{t_i(0) | i ∈ τ̂ \ τ̂_t with r_i > m/2} − p[ and [max{t_i(1) | i ∈ τ̂ \ τ̂_t with r_i > m/2} + p, t_b[ if necessary.

3. Next, replace each job i ∈ τ̂ having r_i ≤ m/2 with r_i identical jobs j such that r_j = 1, h_j = h_i, and s_j = 1. This does not affect the cost of the PSS schedule if ε → 0. Also, transforming τ into τ' by removing a job i ∈ τ̂ \ τ̂_t with r_i < m/2 and d_i = 0 results in
  ĉ_S|_τ' > ĉ_S|_τ − h_i r_i t_b,
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − h_i r_i t_b/(1 + 1/v + p).
Similarly, assume that job j ∈ τ̂ preempts b sequential jobs. Then τ can be transformed into τ' by removing b + r_j − m − 1 of these sequential jobs. This does not affect the completion time of any other job. Therefore we have
  ĉ_S / max_r(c̃*(r)) |_τ' ≥ ĉ_S / max_r(c̃*(r)) |_τ.

4. Finally, job system τ is transformed into τ' by replacing each job i ∈ τ̂_t with a job i' such that
- r_{i'} = 1,
- s_{i'} = 1,
- h_{i'} = (Σ_{i∈τ̂_t} h_i)/|τ̂_t|.
This results in
  ĉ_S|_τ' = ĉ_S|_τ − (Σ_{i∈τ̂_t} h_i² − (Σ_{i∈τ̂_t} h_i)²/|τ̂_t|)(1 + 1/v + p),
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − Σ_{i∈τ̂_t} h_i² + (Σ_{i∈τ̂_t} h_i)²/|τ̂_t|.

Corollary 14. Assume a job system τ as described in Corollaries 12 and 13 with ĉ_S / max_r(c̃*(r)) |_τ > 1 + 1/v + p and a given parameter r with 0 ≤ r < m. Then there is another job system τ' with r_i = m − r for all preemption causing jobs i ∈ τ̂' and ĉ_S / max_r(c̃*(r)) |_τ ≤ ĉ_S / max_r(c̃*(r)) |_τ'.

Proof: 1. Assume a job i with r_i < m − r. Then transform τ into τ' by removing one job j which is preempted by i, that is, r_j = 1, h_j = 1/v, and s_j = 1. Further, i is replaced by i' such that
- r_{i'} = r_i + 1,
- s_{i'} = s_i = 1,
- h_{i'} = h_i.
As Σ_{i∈τ̃} h_i > (m − r)t_b/(1 + 1/v + p) and Σ_{i∈τ̃_2∪τ̃_3} h_i > t_b/(1 + 1/v + p) hold, this results in
  ĉ_S|_τ' ≥ ĉ_S|_τ − t_b(1/v − 1),
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − t_b(1/v − 1)/(1 + 1/v + p).
Therefore, it is sufficient to assume that r_i ≥ m − r for all preemption causing jobs i ∈ τ̂.

2. Transform τ into τ' such that Σ_{i∈τ̂'\τ̂'_t} h_i r_i = Σ_{i∈τ̂\τ̂_t} h_i r_i and the first ⌊|τ̃_3|/r⌋ preemption causing jobs in the PSS schedule S|_τ' all have r_i = m, while at most one other job j ∈ τ̂' has r_j > m − r. Then, we have
  ĉ_S|_τ' ≥ ĉ_S|_τ,
  c̃*(r)|_τ' = c̃*(r)|_τ.
By the use of scaling we can further achieve that all jobs i ∈ τ̂ with r_i > 1 either have r_i = m − r or r_i = m.

3. Assume a job i ∈ τ̂ with r_i = m. Transform τ into τ' by removing those jobs j with t_j(0) ≤ 1/v + p. This results in
  ĉ_S|_τ' ≥ ĉ_S|_τ − (1 + 1/v + p)Σ_{i∈τ̂} w_i,
  c̃*(r)|_τ' ≤ c̃*(r)|_τ − Σ_{i∈τ̂} w_i.

This concludes the proof of Theorem 9. To obtain the minimal value f_c(0) = 2.366 we choose v = 0.836, x = 0.183, and y = 2. Note that f_c(0) > 1/v + 1 = 2.196.

In the next theorem we address the makespan costs of a PSS schedule:

Theorem 15. Algorithm PSS produces schedules with a ratio f_m(p) < 2 + 1/v + p. This bound is tight.

Proof: We have m·m* ≥ Σ_i r_i h_i and m* ≥ max_i h_i. Further, let j be the last job in Q. This results in
  f_m(p) = m_S/m* = (m_S − t_s(j))/m* + t_s(j)/m* ≤ max_i h_i/m* + (1 + 1/v + p)·Σ_i r_i h_i/(m·m*) < 1 + 1 + 1/v + p = 2 + 1/v + p.

The job system below with k, m → ∞ and ε → 0 produces the maximum ratio f_m(p) = 2 + 1/v + p. There it is assumed that jobs of the first group are always followed by one job of the second group in Q. The single job of the third group is the last job of Q.

h_i     | r_i   | w_i     | number of jobs
1/v + ε | 1     | 1/v + ε | k
1 − ε   | m − 1 | 1/k     | k(1 + v(m − 1))
1       | m     | 1       | 1

7 Extensions

It would also be interesting to know whether allowing job migration could significantly decrease the ratio f_c. For this purpose, we consider a variant of PSS with an additional optimization phase. This phase is executed for each job in the order given by the initial list Q. During optimization, a job i is rescheduled such that there is no time interval below the completion time of i in which at least r_i resources are available and i is not scheduled. To address this case, corollaries similar to those introduced in Section 6 can be used.
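The rescheduling rule of the optimization phase can be sketched as follows. This is a minimal illustration under simplifying assumptions (a discretized time axis, integral processing times, a sufficiently large horizon, nonpreemptive placement), not the PSS implementation itself; the function `reschedule` and its job encoding are hypothetical.

```python
def reschedule(jobs, m, horizon):
    """Greedy optimization pass (sketch): place each job, in list order,
    at the earliest time at which r_i resources are free for h_i steps.
    Afterwards there is no interval below a job's completion time with
    r_i resources available in which the job is not scheduled."""
    free = [m] * horizon          # free resources per unit time step
    start = {}                    # job id -> chosen start time
    for job_id, (r, h) in jobs:   # jobs: list of (id, (r_i, h_i))
        t = 0
        # slide the job forward until its resource demand fits;
        # the horizon is assumed large enough for every job to fit
        while any(free[u] < r for u in range(t, t + h)):
            t += 1
        for u in range(t, t + h):
            free[u] -= r
        start[job_id] = t
    return start

# usage: three rigid jobs on m = 4 resources
jobs = [("a", (3, 2)), ("b", (2, 1)), ("c", (1, 2))]
print(reschedule(jobs, m=4, horizon=10))  # job "c" fills the gap left by "a"
```

Note that job "c" starts at time 0 beside "a" even though "b" was placed first, which mirrors the migration-like gap filling that the optimization phase permits.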
However, here we just present a type of job system which cannot be improved by the optimization phase:

Example 4. Assume a job system as described below:

h_i            | r_i   | w_i              | number of jobs
k(1 + 1/v + p) | m/3   | 3k(1 + 1/v + p)  | 1
1/v + 2        | 1     | (1/v + 2)(m − 1) | k
1              | 2     | 1 + 1/k          | ky
1              | m − 1 | ky(m − 1)        | 1
1              | 1     | 1                | k

The first job in the table is also the first job in Q. Next, one job of the second group alternates with one job of the third group. Then follows the job of Group 4, while all jobs of Group 5 are positioned at the end of Q. For k, m → ∞ and ε → 0, the same numerical optimization problem as described in Section 6 is obtained with x = const. Then v = 0.816 and y = 2.449 yield the value 2.1. However, note that 2.1 < 1/v + 1 = 2.225. Nevertheless, the example demonstrates that the achievable gain for allowing migration is limited. Moreover, this gain may come with an increased preemption penalty.

Further, we may assume that the modified Smith ratio is 1 for all jobs of a job system. In other words, the cost of any node-second is constant. Note that jobs may still have different weights depending on their resource requirements. However, Group 4 in Theorem 9 cannot exist anymore. Nevertheless, the proof of Theorem 9 remains valid. There is only a new high order expression for c_S:
  c_S = k²(xy(y + 1/v + 1 + p) + ½(x/v + 2 − x)(1/v + 1 + p)).
In this case, Group 3 of Theorem 9 cannot increase f_c, and we obtain f_c(0) < 2 for v = 1.

8 Conclusion

First, we addressed bicriteria scheduling of parallel jobs in general and gave a few new results. Then, we presented an algorithm which generates preemptive off-line schedules for parallel and independent jobs with fixed resource requirements. This algorithm is obtained by combining two algorithms with good performance for restricted input sets. The schedule is based on a priority list and has small approximation factors for both total weighted completion time and makespan costs. The method belongs to the class of list scheduling algorithms. It is carefully analyzed and a tight worst case approximation factor is determined.
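Since the method belongs to the class of list scheduling algorithms, the basic framework can be illustrated with a small sketch. This is a generic nonpreemptive list scheduler for rigid jobs, not PSS itself: jobs are started strictly in list order (no backfilling), a job with resource requirement r_i waits until r_i machines are simultaneously idle, and both cost criteria are accumulated. All names and the job encoding are illustrative.

```python
import heapq

def list_schedule(jobs, m):
    """Nonpreemptive list scheduling of rigid jobs given as (r_i, h_i, w_i)
    in priority-list order on m machines; returns the makespan and the
    total weighted completion time of the resulting schedule."""
    free = m            # currently idle machines
    running = []        # min-heap of (finish_time, machines_held)
    t = 0.0             # current scheduling time
    makespan = 0.0
    twct = 0.0          # total weighted completion time
    for r, h, w in jobs:
        # advance time, reclaiming machines, until r machines are idle
        while free < r:
            t, r_done = heapq.heappop(running)
            free += r_done
        heapq.heappush(running, (t + h, r))
        free -= r
        makespan = max(makespan, t + h)
        twct += w * (t + h)
    return makespan, twct

# usage: four jobs (r_i, h_i, w_i) on m = 2 machines, in list order
jobs = [(2, 3.0, 1.0), (1, 2.0, 1.0), (1, 1.0, 1.0), (2, 1.0, 1.0)]
print(list_schedule(jobs, m=2))
```

The sketch makes the priority-list mechanism concrete: the order of `jobs` alone determines the schedule, and reordering the list trades the two cost criteria against each other.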
Moreover, the analysis provides information about the structure of 'bad cases'. Also, we derived a numerical optimization problem which can be used to fine tune the total weighted completion time approximation factor. The generated schedules are based upon our machine model, which is derived from existing parallel computers. To our knowledge, it is also the first time that a preemption penalty is considered in the analysis of such an algorithm. Compared with the non-preemptive SMART schedules, our approximation

factors are significantly better even if we assume that a context switch is as time consuming as the minimal completion time of a job, including loading the job and storing its results, that is, p = 1. As shown below, PSS and SMART schedules can be fine tuned to minimize either c_S or m_S.

Schedule       | c_S  | m_S
SMART          | 8.53 | 5.19
SMART          | 9    | 5
PSS with p = 0 | 2.37 | 3.20
PSS with p = 0 | 2.41 | 3
PSS with p = 1 | 3.41 | 4.31
PSS with p = 1 | 3.61 | 4

PSS schedules further have the advantage that they use preemption only for jobs which require at most 50% of the nodes. Even in this case, there are at most two jobs resident on any node at the same time. Moreover, PSS schedules only need global preemption, which may be easier to implement than other forms of gang scheduling with respect to running messages in the interconnection network. The big difference between the total weighted completion time approximation factors for preemptive and non-preemptive scheduling of parallel jobs leads to the question whether better non-preemptive methods are possible at all. This may be a subject of future research.

Acknowledgement. The author is grateful to Joel Wein for a helpful discussion on bicriteria scheduling.

References

1. S. Chakrabarti, C. Phillips, A.S. Schulz, D.B. Shmoys, C. Stein, and J. Wein. Improved approximation algorithms for minsum criteria. In Proceedings of the 1996 International Colloquium on Automata, Languages and Programming. Springer-Verlag, Lecture Notes in Computer Science, 1996.
2. X. Deng, N. Gu, T. Brecht, and K. Lu. Preemptive scheduling of parallel jobs on multiprocessors. In Proceedings of the 7th SIAM Symposium on Discrete Algorithms, January 1996.
3. J. Du and J. Leung. Complexity of scheduling parallel task systems. SIAM Journal on Discrete Mathematics, 2(4):473–487, November 1989.
4. D.G. Feitelson and L. Rudolph. Parallel job scheduling: Issues and approaches. In D.G. Feitelson and L. Rudolph, editors, IPPS'95 Workshop: Job Scheduling Strategies for Parallel Processing, pages 1–18.
Springer-Verlag, Lecture Notes in Computer Science 949, 1995.
5. D.G. Feitelson and L. Rudolph. Towards convergence in job schedulers for parallel supercomputers. In D.G. Feitelson and L. Rudolph, editors, IPPS'96 Workshop: Job Scheduling Strategies for Parallel Processing, pages 1–26. Springer-Verlag, Lecture Notes in Computer Science 1162, 1996.
6. M. Garey and R. Graham. Bounds for multiprocessor scheduling with resource constraints. SIAM Journal on Computing, 4(2):187–200, June 1975.
7. M. Garey and D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.
8. T. Kawaguchi and S. Kyan. Worst case bound of an LRF schedule for the mean weighted flow-time problem. SIAM Journal on Computing, 15(4):1119–1129, November 1986.
9. R. McNaughton. Scheduling with deadlines and loss functions. Management Science, 6(1):1–12, October 1959.

10. U. Schwiegelshohn, W. Ludwig, J.L. Wolf, J.J. Turek, and P. Yu. Smart SMART bounds for weighted response time scheduling. SIAM Journal on Computing. Accepted for publication.
11. U. Schwiegelshohn, J.J. Turek, and J.L. Wolf. Preemptive scheduling of parallel tasks. Technical Report RC 20104 (8893), IBM Research Division, June 1995.
12. W. Smith. Various optimizers for single-stage production. Naval Research Logistics Quarterly, 3:59–66, 1956.
13. C. Stein and J. Wein. On the existence of schedules that are near-optimal for both makespan and total weighted completion time. Preprint, 1996.
14. J.J. Turek, W. Ludwig, J.L. Wolf, L. Fleischer, P. Tiwari, J. Glasgow, U. Schwiegelshohn, and P. Yu. Scheduling parallelizable tasks to minimize average response time. In Proceedings of the 6th Annual Symposium on Parallel Algorithms and Architectures, Cape May, NJ, pages 200–209, June 1994.
15. J.J. Turek, U. Schwiegelshohn, J.L. Wolf, and P. Yu. Scheduling parallel tasks to minimize average response time. In Proceedings of the 5th SIAM Symposium on Discrete Algorithms, pages 112–121, January 1994.