Ancient and new algorithms for load balancing in the L_p norm

Adi Avidor, Yossi Azar, Jiří Sgall

July 7, 1997

Abstract. We consider the on-line load balancing problem where there are m identical machines (servers) and a sequence of jobs. The jobs arrive one by one and should be assigned to one of the machines in an on-line fashion. The goal is to minimize the sum over all machines of the squares of the loads, instead of the traditional maximum load. We show that for the sum of the squares the greedy algorithm performs within 4/3 of the optimum, and no on-line algorithm achieves a better competitive ratio. Interestingly, we show that the performance of greedy is not monotone in the number of machines. More specifically, the competitive ratio is 4/3 for any number of machines divisible by 3, but strictly less than 4/3 in all the other cases (although it approaches 4/3 for a large number of machines). To prove that greedy is optimal, we show a lower bound of 4/3 for any algorithm for 3 machines. Surprisingly, we provide a new on-line algorithm that performs within 4/3 − ε of the optimum, for some fixed ε > 0, for any sufficiently large number of machines. This implies that the asymptotic competitive ratio of our new algorithm is strictly better than the competitive ratio of any possible on-line algorithm. Such a phenomenon is not known to occur for the classic maximum load problem. Minimizing the sum of the squares is equivalent to minimizing the load vector with respect to the L_2 norm. We extend our techniques and analyze the exact competitive ratio of greedy with respect to the L_p norm. This ratio turns out to be 2 − Θ((ln p)/p). We show that greedy is optimal for two machines, but design an algorithm whose asymptotic competitive ratio is better than the ratio of greedy.

Department of Computer Science, Tel-Aviv University. E-mail: adi@math.tau.ac.il.
Department of Computer Science, Tel-Aviv University. E-mail: azar@math.tau.ac.il. Research supported in part by the Alon Fellowship and by the Israel Science Foundation administered by the Israel Academy of Sciences.
Mathematical Institute, AV ČR, Žitná 25, 115 67 Praha 1, Czech Republic. E-mail: sgall@math.cas.cz. Partially supported by grant A1019602 of AV ČR.

1 Introduction

Consider a set of jobs that are created in an on-line fashion and should be assigned to disks. Each job has a weight, which is the frequency of access to the disk. Define the load on a disk to be the sum of the weights of the jobs assigned to it. If we would like to minimize the maximum delay of an access request of a job to the disk, then this is equivalent to minimizing the maximum load on a disk. However, in many cases the more appropriate goal would be to minimize the average delay of all access requests [6]. That is equivalent to minimizing the sum of the squares of the loads, since the delay of an access request is proportional to the load on the disk. That motivates us to define the following machine load balancing problem. We are given m parallel identical machines and a number of independent jobs (tasks) arriving one by one at arbitrary times. Each job has an associated weight and should be assigned immediately to exactly one of the machines, based only on the previous jobs, without any knowledge of the future jobs. The load of a machine is the sum of the weights of the jobs assigned to it. The cost of an assignment for an input sequence of jobs is the sum of the squares of the machines' load vector (after all jobs are assigned). The goal of an assignment algorithm is to assign all the jobs while minimizing the cost. We measure the performance of an on-line algorithm by its competitive ratio. An on-line algorithm is c-competitive if for each input the cost of the assignment produced by the algorithm is at most c times larger than the cost of the optimal assignment. We show that for minimizing the sum of the squares of loads the greedy algorithm is 4/3-competitive over any number of machines. We also show that no on-line algorithm can achieve a better ratio than 4/3 for all m, which implies that greedy may be considered an optimal algorithm. Interestingly, the competitive ratio of the greedy algorithm is not monotone in the number of machines.
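The greedy rule and the sum-of-squares objective just described can be sketched in a few lines; a minimal simulation (function names are mine, and the brute-force optimum is exponential, so it is only usable on tiny instances like the 3-machine worst-case sequence discussed later in the paper):

```python
from itertools import product

def greedy_loads(jobs, m):
    """Assign each arriving job to a currently least-loaded machine."""
    loads = [0] * m
    for w in jobs:
        loads[loads.index(min(loads))] += w
    return loads

def sum_of_squares(loads):
    return sum(L * L for L in loads)

def optimal_cost(jobs, m):
    """Brute-force optimal sum of squares (exponential; tiny inputs only)."""
    best = None
    for assignment in product(range(m), repeat=len(jobs)):
        loads = [0] * m
        for w, i in zip(jobs, assignment):
            loads[i] += w
        c = sum_of_squares(loads)
        best = c if best is None else min(best, c)
    return best

jobs, m = [1, 1, 1, 3, 3, 3, 12], 3
g = sum_of_squares(greedy_loads(jobs, m))   # loads (16, 4, 4) -> 288
opt = optimal_cost(jobs, m)                 # partition {12},{3,3},{3,1,1,1} -> 216
print(g, opt, g / opt)                      # ratio 4/3
```

On this sequence greedy spreads the unit and weight-3 jobs evenly and is then forced to place the final job on a machine of load 4, realizing the 4/3 ratio.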
It is exactly 4/3 for m ≡ 0 (mod 3) but strictly smaller than 4/3 for m ≢ 0 (mod 3). However, the asymptotic competitive ratio, which is the limit of the competitive ratio for m → ∞, is 4/3. More precisely, for any ε > 0 there is some M such that for all m ≥ M the competitive ratio of greedy is at least 4/3 − ε. The reason that the greedy algorithm is optimal if we allow any number of machines is that we prove a lower bound of 4/3 on any on-line algorithm for 3 machines. The greedy algorithm is also optimal for 2 and 4 machines; in both cases the competitive ratio is (3 + √5)/4 ≈ 1.3090. Surprisingly, we provide a new on-line algorithm with competitive ratio 4/3 − ε, for some fixed ε > 0, for any sufficiently large number of machines. This implies that the asymptotic competitive ratio of our new algorithm is strictly better than the competitive ratio of any possible on-line algorithm, and strictly better than the asymptotic competitive ratio of greedy. Such a phenomenon is not known to occur for the classic makespan problem. In fact, it is conjectured in [7] that the competitive ratio for the makespan is monotone increasing with the number of machines. Also, we are not aware of a natural on-line problem whose competitive ratio is different from its asymptotic competitive ratio (in particular, such a ratio must be non-monotone). We also consider the general L_p norm (for any p > 1). Note that minimizing the sum of the squares is equivalent to minimizing the L_2 norm of the load vector. We can also define the cost to be the L_p norm of the machines' load vector for any p > 1. In particular, minimizing the L_∞ norm precisely means minimizing the maximum load. By using the triangle inequality, it is not hard to show that for any L_p norm the worst case performance

of the greedy algorithm is at most 2. We determine the exact worst case performance of the greedy algorithm over an arbitrary number of machines. We show that this performance is 2 − Θ((ln p)/p). Then we present a lower bound for any on-line assignment algorithm, for any fixed number of machines. We show that for any p > 1 the greedy algorithm is optimal for 2 machines. In contrast, we design an algorithm whose asymptotic competitive ratio is strictly better than the asymptotic competitive ratio of greedy. The case p = ∞ (i.e., L_∞) is the classic ancient problem of scheduling jobs on identical machines minimizing the makespan (or maximum load). Graham [11] showed that the greedy load balancing algorithm is (2 − 1/m)-competitive in this case. The greedy algorithm is optimal only for m ≤ 3; for any m > 3 better algorithms exist [9, 7]. Bartal et al. [5] were the first to show an algorithm whose competitive ratio is below 2 − ε for some constant ε > 0 and arbitrary m. Very recently, Albers [1] designed a 1.923-competitive algorithm and showed a lower bound of 1.852. Chandra and Wong [6] were the first to consider the problem of minimizing the sum of the squares of the machines' load vector. Cody and Coffman [8], in their study of placing a set of records on a sectored drum to minimize the average latency, considered essentially the same minimization problem. In [6] it is shown that if the jobs arrive in non-increasing weight order then the greedy algorithm is within 25/24 of the optimal assignment in the worst case. This result was slightly improved by Leung and Wei [12]. Chandra and Wong [6] also considered the general L_p norm (for any p > 1) and showed that the greedy algorithm on the sorted items achieves a constant performance bound. The constant depends on p and grows to 3/2 when p grows to ∞. Off-line scheduling/load balancing with respect to the L_p norm has been considered in [2]. The off-line minimization problem is known to be NP-hard in the strong sense [10]. Alon et al.
[2] provided a polynomial approximation scheme for scheduling jobs with respect to the L_p norm for any p > 1. An example in which the optimal assignment for the sum of the squares is different from the optimal assignment in the L_p norm is also given in [2]. Results on load balancing for unrelated machines with respect to the L_p norm and the L_2 (and L_∞) norm appear in [3, 4]. The competitive ratio for unrelated machines is much worse (Θ(p)) than the trivial bounds for identical machines, and the techniques used are totally different.

2 Definitions and preliminaries

In the load balancing problem we are given m identical machines (servers) and a finite sequence of jobs (tasks). Each job j has a weight w_j ≥ 0. A schedule S is an assignment which assigns each job j to a single machine i, 1 ≤ i ≤ m. For every schedule S, the load of machine i, denoted L_i(S), is the sum of the weights of all jobs assigned to i in S. The vector of loads is L(S) = (L_1(S), …, L_m(S)). An assignment algorithm is an algorithm which for an input sequence σ produces a schedule which assigns every job in σ to one of the m machines. An on-line assignment algorithm must assign a job j at its arrival to a machine based only on the previous jobs and their assignments; the decision is made without any knowledge about future job arrivals. Our measure of cost is the L_p

norm. Hence the cost of a schedule S is defined as

‖L(S)‖_p = ( Σ_{i=1}^m (L_i(S))^p )^{1/p}.

Note that we can reorder the machines with no change in the cost. The optimal cost, denoted OPT(S), is the minimal cost of a schedule S′ which assigns the same jobs as S; this schedule can be computed by an off-line algorithm which knows all the jobs in advance. We measure the performance of our algorithms by the competitive ratio. For a fixed p > 1, the competitive ratio of a schedule S is defined as C(S) = ‖L(S)‖_p / OPT(S) (putting 0/0 = 1 to handle the empty schedule). Let A be an on-line assignment algorithm. The competitive ratio of A over a fixed number m of machines is defined as

C_{A,m} = sup{ C(S) : S is a schedule produced by A on m machines }.

The competitive ratio of A over an arbitrary number of machines is defined as

C_A = sup{ C_{A,m} : m ≥ 1 }.

The previous definitions cover also the case when we measure the sum of the squares of loads, since then the cost is (‖L(S)‖_2)^2. Consequently, the competitive ratios w.r.t. the sum of the squares of loads are equal to C^2(S), C^2_{A,m} and C^2_A w.r.t. the L_2 norm. Now we define the notion of a shape of a schedule, which is an abstraction of a schedule where for every machine, all jobs assigned to it except for one are replaced by very small jobs with the same total load. In general it may be impossible to produce such a schedule by the same algorithm as the original one. Nevertheless, the concept of a shape is very useful for proving upper bounds on the competitive ratio, since the optimal assignment may improve (by partitioning the jobs) while the cost of the assignment does not change. Hence a shape is a pessimistic estimate of a schedule. A shape characterizes each machine by two numbers: a_i is the total load of the small jobs, and u_i is (a lower bound on) the weight of one large job. Formally, a shape is a pair R = (a, u), where a and u are vectors of m nonnegative reals. The vector of loads of a shape is defined as L(R) = a + u.
The shape R = (a, u) is a shape of a schedule S if L(R) = L(S) and for every 1 ≤ i ≤ m with u_i > 0 there exists a job with weight w_j ≥ u_i assigned to the machine i in S. The optimal cost of a shape R is the infimum of the optimal costs of all schedules S with the shape R; formally, OPT(R) = inf{ OPT(S) : R is a shape of S }. As we shall see, the infimum can be replaced by a minimum. The competitive ratio of a shape R is C(R) = ‖L(R)‖_p / OPT(R). It is possible to compute the optimal cost of the shape R = (a, u) explicitly. It is the cost of a schedule in which some big jobs are scheduled each on a separate machine and the rest of the jobs are balanced evenly on the rest of the machines. Let the machines be ordered so that the u_i are nondecreasing. For 1 ≤ l ≤ m let h_l = ( Σ_{i=1}^m a_i + Σ_{i=1}^l u_i ) / l. Let k be the largest l such that h_l ≥ u_l (k is always defined, since h_1 ≥ u_1). We define the height of the shape to be h(R) = h_k. It is easy to see that a good candidate for an optimal schedule for the shape R is to put on each machine one job of size exactly u_i and divide the a_i into a few jobs so that they can be

balanced exactly on the k machines; then the load vector is (h_k, …, h_k, u_{k+1}, …, u_m). See Figures 1 and 2 for examples where a_i = 1 for all i.

[Figure 1: A shape R. Figure 2: Optimal assignment of R.]

Lemma 2.1 Let h = h(R). Then OPT(R) = ‖(h, …, h, u_{k+1}, …, u_m)‖_p.

Proof: We have seen above that there exists a schedule which achieves this value. It remains to prove that for any schedule S with shape R, the cost ‖L(S)‖_p is at least the bound in the statement of the lemma. The case of h(R) = 0 is trivial, since then equality holds. Otherwise let h = h(R) and let k be such that h = h_k in the previous definition. From the definition of h(R) it follows that h < u_i for every i > k (otherwise we would have chosen a larger k). For i > k, let j_i be a job assigned to i in S with weight at least u_i (it exists, since u_i > 0). Let S′ be the optimal schedule for the jobs in S. First, S′ has a machine with load at most h: there are at least k machines on which no job j_i, i > k, is scheduled, and their total load is at most kh. Second, if S′ is optimal, then for any i > k, no other job is assigned to the same machine as the job j_i. Assume that the job j_i is scheduled on a machine with the load b > 0 of other jobs. We know that there is a machine with load c ≤ h < u_i ≤ w_{j_i}. However, if we replace the two machines with loads c and b + w_{j_i} by two machines with loads b + c and w_{j_i}, the total cost decreases due to the convexity of the function x^p. Consequently, after a renumbering of the machines, the vector of loads L = L(S′) satisfies L_i ≥ u_i for each i > k and Σ_{i=1}^m L_i = hk + Σ_{i=k+1}^m u_i. Using convexity again, the cost of any such schedule is at least ‖(h, …, h, u_{k+1}, …, u_m)‖_p.

3 The greedy algorithm

In this section we determine the competitive ratio of the greedy algorithm defined below.
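The optimal cost of a shape from Lemma 2.1 can be evaluated directly; a small sketch of the h_l/k construction (function name is mine, and the formula follows the reconstruction of the lemma above):

```python
def shape_opt_cost(a, u, p):
    """Optimal cost of a shape R = (a, u): big jobs u_{k+1..m} alone,
    the remaining load balanced at height h_k (Lemma 2.1)."""
    m = len(a)
    us = sorted(u)                       # order machines by nondecreasing u_i
    total_small = sum(a)
    k, h = 1, total_small + us[0]        # h_1 >= u_1 always holds
    for l in range(1, m + 1):
        h_l = (total_small + sum(us[:l])) / l
        if h_l >= us[l - 1]:             # k = largest l with h_l >= u_l
            k, h = l, h_l
    cost_p = k * h ** p + sum(x ** p for x in us[k:])
    return cost_p ** (1 / p)

# Three machines with unit small load each, one big job of weight 12:
# k = 2, h = 1.5, so OPT^2 = 2 * 1.5^2 + 12^2 = 148.5.
print(shape_opt_cost([1, 1, 1], [12, 0, 0], 2) ** 2)
```

With no big jobs at all the formula degenerates to perfectly balancing the small load, as expected.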
Algorithm Greedy: Upon arrival of a job j, assign it to the machine with the current minimum load (ties are broken arbitrarily).

To bound the performance of Greedy, we show that each schedule can be replaced by a very special shape so that the competitive ratio does not decrease. Computing the

competitive ratio is then shown to be equivalent to computing a maximum of a certain function over the reals. A shape R = (a, u) is called flat if all the components of a are the same. Abusing notation, we then denote by a both the value of the component and the vector of m such components. (For an example of a flat shape, see Figure 1.)

Lemma 3.1 Let S be a schedule obtained by Greedy. Then there exists a flat shape R which is a shape of S.

Proof: Let L = L(S) be the vector of loads of S; w.l.o.g. assume that L_1 is the smallest component of L. We claim that the flat shape (a, u), where a = L_1 and u_i = L_i − a, is a shape of S. Clearly the loads are the same. Consider a machine i with u_i > 0. Let j be the last job assigned to the machine i. At the time of its assignment, the load of the machine i must have been at most a, as otherwise Greedy would have scheduled j on machine 1 (whose load never exceeds a). Hence w_j ≥ L_i − a = u_i.

Lemma 3.2 Let R = (a, u) be a flat shape. Then there exists a flat shape R′ = (a′, u′) such that C(R) ≤ C(R′) and for any i, u′_i = 0 or u′_i ≥ h(R′).

Proof: Assume that 0 < u_1 < h = h(R). We claim that there exists a flat shape R′ such that ‖L(R′)‖_p ≥ ‖L(R)‖_p, OPT(R′) = OPT(R), h(R′) = h, u′_1 is either 0 or h, and the other components of the vector u′ remain unchanged. Applying this claim inductively finishes the proof of the lemma. To prove the claim, we define two candidate shapes and prove that one of them satisfies all the conditions. Let R′ = (a′, u′) and R″ = (a″, u″), where u′ = (0, u_2, …, u_m), a′ = a + u_1/m, u″ = (h, u_2, …, u_m), and a″ = a − (h − u_1)/m. It is easy to verify that a″ ≥ 0, OPT(R′) = OPT(R″) = OPT(R), and h(R′) = h(R″) = h. Moreover,

L(R) = ((h − u_1)/h) L(R′) + (u_1/h) L(R″),

i.e., the old vector of loads is a weighted average of the two new vectors of loads. From the convexity of the function x^p it follows that

(‖L(R)‖_p)^p ≤ ((h − u_1)/h)(‖L(R′)‖_p)^p + (u_1/h)(‖L(R″)‖_p)^p.

Hence either ‖L(R)‖_p ≤ ‖L(R′)‖_p or ‖L(R)‖_p ≤ ‖L(R″)‖_p, and the claim is proved.
Lemma 3.3 Let 1 ≤ k ≤ m and let R = (a, u) be a flat shape such that u_i ≥ h(R) for i = 1, …, k. Then there exists x such that for R′ = (a, (x, …, x, u_{k+1}, …, u_m)) it holds that C(R) ≤ C(R′), OPT(R′) = OPT(R), h(R′) = h(R), and x ≥ h(R′).

Proof: Let x be the p-th power mean of the u_i, 1 ≤ i ≤ k, i.e., x = ( Σ_{i=1}^k u_i^p / k )^{1/p}. Clearly x is at least the minimum of u_1, …, u_k, hence the optimal schedule changes only in the load of the machines with these single jobs, and x ≥ h(R) = h(R′). The contribution of the k big jobs to (OPT(R))^p is Σ_{i=1}^k u_i^p, and their contribution to (OPT(R′))^p is k x^p. These contributions are the same and hence the optimal cost does not change. It remains to prove

that kl(r 0 )k kl(r)k. Comaring the contribution of the rst k machines, we need to rove that kx i= (a + u i ) k(a + x) : We aly Minkowski inequality (the triangle inequality for the L norm) to the vectors (a; : : : ; a) and (u ; : : : ; u k ) and obtain kx i= (a + u i )! = k = a + kx i= Raising the inequality to th ower nishes the roof. u i! = = k = (a + x): The above lemmas imly that in order to nd the cometitive ratio of the greedy algorithm we need to solve a restricted otimization roblem. Dene the function f (x; ) = ( )( + x) + ( )x + Theorem 3.4 The cometitive ratios of the greedy algorithm are C Greedy;m = suff (x; M=m) = j 0 x; M = 0 : : : ; m g; C Greedy = suff (x; ) = j 0 x; 0 g: Proof: Let u(x; M; m) be a vector with m M comonents with value x and M zeros, and let R = (; u(x; M; m)) be the corresonding at shae. The function f is dened so that f(x; M=m) = = C(R) if x h(r). If x < h(r), the denominator of f is the th ower of the cost of a valid but not otimal schedule, hence f(x; M=m) = < C(R) in this case. By the above lemmas for every schedule S generated by Greedy there exists x and M such that the at shae R = (; u(x; M; m)) satises C(S) C(R) and x > h = h(r): First aly Lemma 3. to obtain a at shae of S; by Lemma 2.. the cometitive ratio can only increase. Next aly Lemma 3.2 and Lemma 3.3 to obtain a at shae (a; u) with u of the desired form. Last, normalize the shae so that a =, i.e., ut R = (; u=a) (if a = 0, the greedy schedule is otimal). From the construction it follows that x > h = m=m and M < m. Hence C(R) = f(x; M=m) =. This roves the uer bounds. To rove the lower bound for a xed m, it is sucient, given x and M, to nd a sequence with the same cost ratio as that of the shae R = (; u(x; M; m)). From the denition of R it follows that h(r) = m=m. The sequence of mm jobs with weight =M followed by m M jobs of weight x has the desired ratio. For unrestricted m, given x and, x a sequence of rational numbers M i =m i converging to. 
The result now follows from the lower bound for fixed m, since f_p is continuous. In Theorem A.1 in the appendix we prove that the supremum of f_p(x, α) over x > 0, 0 < α < 1, is achieved at a unique point; moreover, we give a method to compute the supremum and hence the competitive ratio. Here we state the results for the most interesting case of the sum of squares, i.e., evaluating C^2_Greedy for p = 2.
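The stated optimum of f_p for p = 2 can be checked numerically under the reconstruction of f_p given above (the form of the denominator, α^{1−p}, is part of that reconstruction, so this is a consistency check rather than a proof):

```python
def f(x, alpha, p):
    """Reconstructed f_p: cost^p ratio of the flat shape in which a
    (1 - alpha) fraction of machines carry load 1 + x and the rest
    carry load 1, versus the rebalanced optimum."""
    num = (1 - alpha) * (1 + x) ** p + alpha
    den = (1 - alpha) * x ** p + alpha ** (1 - p)
    return num / den

# For p = 2 the supremum is 4/3, attained at x = 3, alpha = 2/3.
assert abs(f(3, 2 / 3, 2) - 4 / 3) < 1e-12

# A coarse grid search over (x, alpha) stays below 4/3, as expected.
best = max(f(x / 100, a / 100, 2)
           for x in range(1, 1001) for a in range(1, 100))
assert best <= 4 / 3 + 1e-9
print(best)
```

At x = 3, α = 2/3 the numerator is (1/3)·16 + 2/3 = 6 and the denominator is (1/3)·9 + 3/2 = 9/2, giving exactly 4/3.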

Theorem 3.5 For p = 2 the performance of Greedy is:

C^2_Greedy = 4/3;
C^2_{Greedy,m} = 4/3 for m ≡ 0 (mod 3);
C^2_{Greedy,m} < 4/3 for m ≢ 0 (mod 3);
C^2_{Greedy,m} = (3 + √5)/4 ≈ 1.3090 for m = 2, 4.

Proof: According to Theorem A.1, the supremum of f_2 is achieved at x* = 3, α* = 2/3, and C^2_Greedy = f_2(x*, α*) = 4/3. The claim for m divisible by 3 is true, since the supremum of f_2 is achieved for M = 2m/3. The claim for m not divisible by 3 is true, since the supremum of f_2 is achieved at a unique point, and M/m ≠ 2/3 if m is not divisible by 3. The last claim is obtained by maximizing the function f_2(x, 1/2) for m = 2, and additionally f_2(x, 1/4) and f_2(x, 3/4) for m = 4. The calculus leads to simple quadratic equations, which give the optimal solution x = 1 + √5 and M = m/2. Details are omitted.

4 Lower Bounds

In this section we prove that for p = 2 Greedy is optimal for fixed m = 2, 3, 4, and hence also for arbitrary m (since the worst performance of Greedy is achieved for m = 3). In Theorem 4.2 we give weaker lower bounds for m > 4, and in Appendix C we prove some bounds for general p.

Theorem 4.1 For any on-line assignment algorithm A, C^2_A ≥ C^2_{A,3} ≥ 4/3 and C^2_{A,2}, C^2_{A,4} ≥ (3 + √5)/4 ≈ 1.3090.

Proof: Consider the sequence (1, 1, 1, 3, 3, 3, 12) for 3 machines. First the 3 jobs of weight 1 arrive. If the algorithm A assigns two or more jobs with weight 1 on the same machine, it does not get any other job. Its cost is at least 5, the optimal cost is 3, and we are done. Otherwise, A assigns on every machine one job with weight 1. Now the next 3 jobs of weight 3 arrive. If A assigns two or more of these jobs on the same machine, it does not get any other job. Its cost is at least (1 + 3 + 3)^2 + (1 + 3)^2 + 1^2, whereas the optimum cost is 3 · 4^2, which again yields a ratio greater than 4/3. Otherwise all machines have load 4 before the last job of weight 12 arrives. The cost of A on the full sequence is (12 + 4)^2 + 2 · 4^2, whereas the optimum cost is 12^2 + (3 + 3)^2 + (3 + 1 + 1 + 1)^2, which yields a ratio of 4/3.
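The case analysis above can be verified with exact arithmetic; a sketch (all costs are squared costs, i.e., sums of squares of loads):

```python
from fractions import Fraction as F

# Branch 1: two unit jobs share a machine, the sequence stops.
assert F(2**2 + 1**2, 3 * 1**2) >= F(4, 3)        # 5/3 >= 4/3

# Branch 2: two weight-3 jobs share a machine, the sequence stops.
online2 = (1 + 3 + 3)**2 + (1 + 3)**2 + 1**2      # 66
assert F(online2, 3 * 4**2) > F(4, 3)             # 66/48 > 4/3

# Branch 3: all machines at load 4, the final job of weight 12 arrives.
online3 = (12 + 4)**2 + 2 * 4**2                  # 288
opt3 = 12**2 + (3 + 3)**2 + (3 + 1 + 1 + 1)**2    # 216
assert F(online3, opt3) == F(4, 3)

print("every branch forces a squared-cost ratio of at least 4/3")
```

Whichever branch the on-line algorithm ends up in, the adversary realizes a ratio of at least 4/3, matching the theorem.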
Consider the sequences (1, 1, 1 + √5) for m = 2 and (1, 1, 1, 1, 1 + √5, 1 + √5) for m = 4. If the algorithm assigns two jobs of weight 1 to the same machine, the ratio is at least 3/2 > (3 + √5)/4. Otherwise, together with the large job(s), we get a ratio of (3 + √5)/4.

Theorem 4.2 For any number of machines m ≥ 2 and any on-line assignment algorithm A, C^2_{A,m} ≥ √5 − 1 ≈ 1.236.

Proof: Denote β = √5/2 − 1 ≈ 0.118. We assume that m is even; the proof for odd m is similar and omitted in this abstract. Consider the sequence of m jobs of weight 1 followed by m/2 jobs of weight 1 + √5. At the beginning, m jobs of weight 1 arrive. Let l be

the number of empty machines after the on-line algorithm assigns the m jobs of weight 1.

[Figure 3: Best on-line assignment of the first m jobs with l empty machines. Figure 4: Best on-line assignment of all the jobs with l empty machines.]

An example of the best such assignment is shown in Figure 3. First we assume βm ≤ l ≤ m/2. In this case the cost of the on-line algorithm is at least l · 2^2 + (m − 2l) · 1^2, whereas the cost of the optimum is m · 1^2. This yields a ratio of at least 1 + 2l/m ≥ 1 + 2β = √5 − 1, which completes the proof of this case. Since any assignment with l ≥ m/2 results in a cost greater than the assignment with l = m/2, we may assume l < βm in the remaining case. Now the next m/2 jobs of weight √5 + 1 arrive. An example of the best on-line assignment for these jobs is shown in Figure 4. The competitive ratio of the algorithm is at least

( l · 2^2 + (m/2 − l)(2 + √5)^2 + (m/2 − l) · 1^2 + l (√5 + 1)^2 ) / ( (m/2)(√5 + 1)^2 + (m/2)(1 + 1)^2 ).

Since l/m < β, the competitive ratio is at least ((1 − β) · 2√5 + 5)/(5 + √5) = √5 − 1.

5 A better algorithm

Our algorithm uses the same intuition as the algorithms better than greedy for the classical makespan scheduling problem [5, 1]. Namely, instead of trying to balance all the machines, we maintain a constant fraction of the machines that are reserved for large jobs. In our case we schedule the small jobs on the remaining machines greedily. We define an algorithm A(t, α) for any t > 0 and 0 < α < 1. The choice of t and α clearly influences the performance of the algorithm. However, since the exact analysis is beyond our current techniques, we are not trying to optimize them.

Algorithm A(t, α): Let k = ⌊αm⌋. Before an arrival of a job, reorder the machines so that the loads are non-decreasing; let L_{k+1} be the (k + 1)-st smallest load. Upon arrival of a job j with weight w, if w ≥ t L_{k+1}, schedule the job on machine 1, otherwise schedule it on machine k + 1.

Theorem 5.1 For every p, there exist m_0, t, α, and ε > 0 such that for any m > m_0, C_{A(t,α),m} ≤ C_Greedy − ε.

The proof of Theorem 5.1 occupies the rest of this section and has the following structure.
First, in Lemma 5.2 we prove that if the competitive ratio of a flat shape is very close to C_Greedy, then the shape must look very similar to the unique worst case example for Greedy found in Theorem 3.4. This is intuitively clear, but the precise proof is technical. We give the proof of Lemma 5.2 in Appendix D. Using this lemma we prove that for sufficiently

small δ, in any schedule S produced by the algorithm such that C(S) is close to C_Greedy, there must be many machines with load at least 1 + t. Lemma 5.3 shows that in such a case there are k jobs with large load that are scheduled on machines where the load of the remaining jobs is at most y < 1; here we use the crucial properties of the algorithm. Last, we show that the existence of such k jobs, together with the previously established conditions, implies that the competitive ratio is bounded away from C_Greedy.

Let α* and x* be the unique maximum of the function f_p(x, α) from the analysis of the greedy algorithm, Theorem A.1. Fix constants t and h_1 such that 1/α* < h_1 < t < x*; such constants exist, since we have proved that x* > 1/α*. Fix a constant y < 1 such that H = y(1 + t) > h_1 > 1/α*. Such y clearly exists. Note that it follows that yt > 1.

Lemma 5.2 There exist constants γ, B, δ > 0 and m_1 such that for every m > m_1, if Q = (1, v) is a flat shape satisfying C(Q) ≥ C_Greedy − δ, then h(Q) ≤ h_1, OPT(Q)^p ≤ Bm, and there are at least γm machines with L_i(Q) ≥ 1 + t.

Fix the constants γ, B, δ and m_1 so that the previous lemma holds. Now consider a schedule S generated by the algorithm A(t, α). W.l.o.g. we assume that the (k+1)st smallest load is 1. (If it is 0, then S is the optimal schedule. Otherwise multiply all the weights by an appropriate constant; this does not change the behavior of the algorithm.) Define a shape Q̄ = (b, v̄) by b_i = min{1, L_i(S)} and v̄_i = L_i(S) − b_i. For each i, when the last job was scheduled on the machine i, the load was at most 1, therefore Q̄ is a shape of S and C(S) ≤ C(Q̄). Now let Q = (1, v) be a flat shape corresponding to Q̄. We want to show that C(Q) is close to C(Q̄) for a sufficiently small α. The load vector of Q is the same as L(Q̄) except that the load of k machines increased to 1. Clearly ‖L(Q)‖ ≥ ‖L(Q̄)‖. Adding the extra load of at most k increases, in the optimal schedule, the load of some machines up to at most h(Q). This contributes to OPT(Q)^p at most k·p·h(Q)^{p−1}, since the derivative of x^p is increasing.
Using also the facts that m·h(Q)^p ≤ OPT(Q)^p and h(Q) ≥ 1, we obtain

OPT(Q)^p − OPT(Q̄)^p ≤ k·p·h(Q)^{p−1} ≤ k·p·OPT(Q)^p/(m·h(Q)) ≤ αp·OPT(Q)^p,

or OPT(Q) ≤ OPT(Q̄)/(1 − αp)^{1/p}. Fix some δ_2 < δ and put α_0 = (1/p)·(1 − ((C_Greedy − δ)/(C_Greedy − δ_2))^p). Assume that C(S) ≥ C_Greedy − δ_2 and α < α_0. It follows that

C(Q) ≥ C(Q̄)·(1 − αp)^{1/p} ≥ C(S)·(1 − αp)^{1/p} ≥ (C_Greedy − δ_2)·(1 − α_0·p)^{1/p} = C_Greedy − δ.

Hence Q satisfies the assumptions of Lemma 5.2, and, in particular, there must be l = γm machines with load L_i(S) = L_i(Q) ≥ 1 + t.

Lemma 5.3 For any l ≥ 0 and any schedule S generated by the algorithm A(t, α) (normalized as above), if there are l + k machines with L_i(S) ≥ 1 + t, then there exists z ≤ 1 and a shape R = (a, u) of S such that the machines are partitioned into k "small" machines with z ≤ a_i and u_i = 0, l "bad" machines with a_i = z and u_i ≥ t + 1 − z, k "good" machines with a_i = y and u_i ≥ yt, and m − l − 2k "ordinary" machines with a_i = 1 and u_i arbitrary.

Proof: Let c_i be the load of the machine i before the last job scheduled on it. We will construct R so that always c_i ≤ a_i and u_i = L_i − a_i. This implies that R is a shape of S. Let z be the smallest load in S. For the k machines with smallest load we put a_i = L_i. These are the small machines. By the properties of the algorithm, no job was so far assigned to a machine with load more than 1, hence every remaining machine can be assigned a_i = 1 and be an ordinary one. If L_i ≥ 1 + t, then the last job assigned to i has weight at least t, therefore according to the algorithm it was assigned to a machine with the smallest load at some previous time. Hence c_i ≤ z and we may assign a_i = z, making this machine bad. It remains to find the k good machines. If c_i ≤ y for all machines with L_i ≥ 1 + t, we choose k of them and make them good by putting a_i = y. Otherwise, at some time τ a big job was assigned to a machine with load at least y. Hence at time τ the load of all machines was at least y. Let τ_0 be the first time when the (k+1)st smallest load was at least y. At that time all the k smallest loads were strictly less than y. Hence between time τ_0 and τ, the algorithm scheduled k jobs on the k small machines. These jobs have size at least ty, by the condition in the algorithm. Furthermore, no other jobs were scheduled on these machines later, as ty > 1. Hence these k machines have c_i ≤ y and can be made good. Some of them may be the machines with L_i ≥ 1 + t; however, we still have at least l machines that can be made bad.

Let the shape R and the value z be as in Lemma 5.3 and let the shape Q be the same as before Lemma 5.3. To finish the proof of Theorem 5.1, we show that for suitable parameters OPT(R) is significantly larger than OPT(Q).
The optimal schedule for R is obtained from the optimal schedule for Q so that we (i) move weight at least l(1 − z) from the machines with load h = h(Q) to the big jobs from the bad machines, which have weight at least t, (ii) move weight at least k(1 − y) from the machines with load h to the big jobs from the good machines, which have weight at least H = y(1 + t), and (iii) remove a weight of at most k(1 − z) (corresponding to the small machines) from machines with load h. Since H ≤ t, we are always moving the weight from machines with load at most h to machines with load at least H, and using the fact that the derivative of x^p is nondecreasing, we bound the resulting difference as follows:

OPT(R)^p − OPT(Q)^p ≥ (l(1 − z) + k(1 − y))·p·(H^{p−1} − h^{p−1}) − k(1 − z)·p·h^{p−1}
  ≥ (1 − z)·p·m·(γ(H^{p−1} − h^{p−1}) − α·h^{p−1}) + ⌊αm⌋(1 − y)·p·(H^{p−1} − h^{p−1}).

Since γ, α, H, h_1 are constants and h satisfies h ≤ h_1 < H, we can choose α sufficiently small so that the first term is always positive. Then, for sufficiently large m, the second term is at least εm for some ε > 0. Hence, using OPT(Q)^p ≤ Bm, we obtain

OPT(R)^p ≥ OPT(Q)^p + εm ≥ OPT(Q)^p·(1 + ε/B)

and

C(S) ≤ C(R) = ‖L(R)‖/OPT(R) ≤ ‖L(Q)‖/(OPT(Q)·(1 + ε/B)^{1/p}) = C(Q)/(1 + ε/B)^{1/p} ≤ C_Greedy − δ'

for a sufficiently small δ' > 0. This completes the proof of Theorem 5.1.
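To see the greedy bound that Theorem 5.1 improves on, the following sketch (m and the instance are arbitrary illustrative choices) runs plain Greedy on the worst-case-style instance for p = 2, where Theorem A.1 gives x* = 3, α* = 2/3 and a ratio of (4/3)^{1/2}:

```python
def greedy(jobs, m):
    # Greedy: always assign the next job to a currently least loaded machine.
    loads = [0.0] * m
    for w in jobs:
        loads[loads.index(min(loads))] += w
    return loads

def cost(loads, p=2):
    return sum(x ** p for x in loads) ** (1.0 / p)

m = 30
# m unit jobs, then (1 - alpha*) * m = m/3 jobs of weight x* = 3.
jobs = [1.0] * m + [3.0] * (m // 3)
on_line = cost(greedy(jobs, m))
# Off-line optimum: each weight-3 job alone, unit jobs spread to height 3/2.
opt = ((m // 3) * 3.0 ** 2 + (m - m // 3) * 1.5 ** 2) ** 0.5
print(on_line / opt)  # -> (4/3)^(1/2) = 1.1547...
```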

References

[1] S. Albers. Better bounds for on-line scheduling. In Proc. 29th ACM Symp. on Theory of Computing, 1997. To appear.
[2] N. Alon, Y. Azar, G. Woeginger, and T. Yadid. Approximation schemes for scheduling. In Proc. 8th ACM-SIAM Symp. on Discrete Algorithms, 1997. To appear.
[3] J. Aspnes, Y. Azar, A. Fiat, S. Plotkin, and O. Waarts. On-line load balancing with applications to machine scheduling and virtual circuit routing. In Proc. 25th ACM Symp. on Theory of Computing, pages 623–631, 1993.
[4] B. Awerbuch, Y. Azar, E. Grove, M. Kao, P. Krishnan, and J. Vitter. Load balancing in the lp norm. In Proc. 36th IEEE Symp. on Foundations of Computer Science, pages 383–391, 1995.
[5] Y. Bartal, A. Fiat, H. Karloff, and R. Vohra. New algorithms for an ancient scheduling problem. In Proc. 24th ACM Symp. on Theory of Computing, 1992.
[6] A.K. Chandra and C.K. Wong. Worst-case analysis of a placement algorithm related to storage allocation. SIAM Journal on Computing, 4(3):249–263, 1975.
[7] B. Chen, A. van Vliet, and G. Woeginger. New lower and upper bounds for on-line scheduling. Operations Research Letters, 16:221–230, 1994.
[8] R.A. Cody and E.G. Coffman, Jr. Record allocation for minimizing expected retrieval costs on drum-like storage devices. J. Assoc. Comput. Mach., 23(1):103–115, January 1976.
[9] G. Galambos and G. Woeginger. An on-line scheduling heuristic with better worst case ratio than Graham's list scheduling. SIAM Journal on Computing, 22(2):349–355, 1993.
[10] M.R. Garey and D.S. Johnson. Computers and Intractability. W.H. Freeman and Company, San Francisco, 1979.
[11] R.L. Graham. Bounds for certain multiprocessor anomalies. Bell System Technical Journal, 45:1563–1581, 1966.
[12] J.Y.T. Leung and W.D. Wei. Tighter bounds on a heuristic for a partition problem. Information Processing Letters, 56:51–57, 1995.

Appendix

A Competitive Ratio of Greedy

Theorem A.1 Define

G = {(x, α) ∈ R^2 | 0 < x, 0 < α < 1}

and the system of equations S,

S: { (p − 1)x = p^{p/(p−1)}·(1 − (1 + x)^{1−p}),  αx = p^{1/(p−1)} }.

Then the supremum of f_p(x, α) over x > 0, 0 < α < 1 is achieved as a unique maximum in G. The maximum is achieved at the unique point which is the solution to the system S in the region G. Furthermore, the value of the competitive ratio is

C_Greedy = ((1 + x*)/x*)^{(p−1)/p},

where x* is the unique solution of the system S in G.

Proof: It is easy to verify that

f_p(0, α) = α^{p−1} < 1,
lim_{x→∞} f_p(x, α) = 1  (0 < α < 1),
lim_{α→0+} f_p(x, α) = 0  (x > 0),
sup_G f_p(x, α) > 1,

and therefore the supremum of the function f_p(x, α) is achieved as a local maximum in G. Since the function f_p(x, α) has derivatives in G, the point that achieves the maximum is a solution of the system of equations

∂f/∂x = 0 and ∂f/∂α = 0.

In Lemma A.2, we prove this system of equations implies the system S, i.e.,

(p − 1)x = p^{p/(p−1)}·(1 − (1 + x)^{1−p})    (1)
αx = p^{1/(p−1)}.    (2)

Next we prove that the solution to the system S in G is unique. Denote:

LHS(x) = (p − 1)x,
RHS(x) = p^{p/(p−1)}·(1 − (1 + x)^{1−p}).

Clearly,

LHS(0) = RHS(0) = 0. In (0, ∞), the function LHS(x) is linear, whereas RHS(x) is concave and bounded, and

0 < LHS'(0) < RHS'(0),

and therefore equation (1) has a unique solution for x in (0, ∞), which implies the system S has a unique solution in G. Thus, f_p(x, α) has a unique maximum in G. By Lemma A.2 its value is ((1 + x*)/x*)^{p−1}, which completes the proof.

Lemma A.2 For p > 1 and (x, α) ∈ G the system of equations

∂f/∂x = 0, ∂f/∂α = 0

implies

αx = p^{1/(p−1)} and (p − 1)x = p^{p/(p−1)}·(1 − (1 + x)^{1−p})

and

f_p(x, α) = ((1 + x)/x)^{p−1}.

Proof: Denote:

M = (1 − α)(1 + x)^p + α    (3)
N = (1 − α)x^p + α^{1−p}.   (4)

Clearly f_p = M/N, and therefore

∂f/∂x = 0, ∂f/∂α = 0  ⟹  f_p = M/N = (∂M/∂x)/(∂N/∂x) = (∂M/∂α)/(∂N/∂α).    (5)

The partial derivatives are

∂M/∂x = (1 − α)·p·(1 + x)^{p−1}    (6)
∂M/∂α = 1 − (1 + x)^p = −((1 + x)^p − 1)
∂N/∂x = (1 − α)·p·x^{p−1}    (7)
∂N/∂α = −x^p + (1 − p)·α^{−p} = −(x^p + (p − 1)·α^{−p}).

Substituting the above derivatives in equation (5) gives

((1 + x)/x)^{p−1} = ((1 + x)^p − 1)/(x^p + (p − 1)·α^{−p}).

Solving for α we get

α^{−p} = x^{p−1}·((1 + x)^{p−1} − 1)/((p − 1)·(1 + x)^{p−1})    (8)

or,

x^{p−1} = (p − 1)·(1 + x)^{p−1}·α^{−p}/((1 + x)^{p−1} − 1).    (9)

Substituting equations (3), (4), (6), (7) in equation (5) gives

(1 − α)(1 + x)^p·x^{p−1} + α·x^{p−1} = ((1 − α)x^p + α^{1−p})·(1 + x)^{p−1},

which yields

α^{1−p} = x^{p−1}·(1 − α + α(1 + x)^{1−p}).

Substituting equation (9) gives

α^{1−p} = (p − 1)·(1 + x)^{p−1}·α^{−p}·(1 − α + α(1 + x)^{1−p})/((1 + x)^{p−1} − 1),

or,

α = (p − 1)·(1 + x)^{p−1}/(p·((1 + x)^{p−1} − 1)).    (10)

Substituting equation (10) in equation (8) gives

(p·((1 + x)^{p−1} − 1)/((p − 1)·(1 + x)^{p−1}))^p = x^{p−1}·((1 + x)^{p−1} − 1)/((p − 1)·(1 + x)^{p−1})

or,

p^p·((1 + x)^{p−1} − 1)^{p−1} = (p − 1)^{p−1}·x^{p−1}·(1 + x)^{(p−1)^2}.

Raising to the power of 1/(p − 1) we get

p^{p/(p−1)}·((1 + x)^{p−1} − 1) = (p − 1)·x·(1 + x)^{p−1},

which is equivalent to

(p − 1)x = p^{p/(p−1)}·(1 − (1 + x)^{1−p}).

Substituting the last equation in equation (10) yields

α = (p − 1)·(1 + x)^{p−1}/(p·((1 + x)^{p−1} − 1)) = p^{p/(p−1)}·(1 − (1 + x)^{1−p})/(p·x·(1 − (1 + x)^{1−p})) = p^{1/(p−1)}/x,

or,

αx = p^{1/(p−1)}.

Furthermore, by (5),

f_p(x, α) = (∂M/∂x)(x, α)/(∂N/∂x)(x, α) = ((1 − α)·p·(1 + x)^{p−1})/((1 − α)·p·x^{p−1}) = ((1 + x)/x)^{p−1},

which completes the proof.

B Approximating C_Greedy

Lemma B.1 p − 1 + p^{p/(p−1)}·2^{1−p} < p^{p/(p−1)} for p > 1.

Proof: Denote:

s(p) = (p − 1)/p^{p/(p−1)} + 2^{1−p} for p > 1,
t(p) = (p·ln p)/(p − 1) for p > 1,

so that p^{p/(p−1)} = e^{t(p)} and the claim of the lemma is equivalent to s(p) < 1. If p ≥ 2, then p^{p/(p−1)} > p and p·2^{1−p} ≤ 1, and therefore

s(p) < (p − 1)/p + 1/p = 1.

Otherwise, 1 < p < 2. For all z ∈ R, z ≤ e^{z−1}, hence ln p ≤ p − 1, and therefore

ds/dp = (ln p/(p − 1))·e^{−t(p)} − (ln 2)·2^{1−p} ≤ e^{−t(p)} − (ln 2)·2^{1−p} < 0.

The last inequality holds since 2^{1−p}·ln 2 > e^{−t(p)} for 1 < p < 2. Therefore s(p) is monotone decreasing in (1, 2). Since lim_{p→1+} s(p) = 1, the proof of the lemma is completed.

For p > 1 denote:

g_p(x) = p^{p/(p−1)}·(1 − (1 + x)^{1−p})/(p − 1) for x ∈ (0, ∞).

For a fixed p > 1, define (a_n)_{n=0}^∞ and (b_n)_{n=0}^∞ by a_0 = 1, a_{n+1} = g_p(a_n) and b_0 = p^{p/(p−1)}/(p − 1), b_{n+1} = g_p(b_n).

Lemma B.2

a_0 < a_1 < ... < a_n < a_{n+1} < x* < b_{n+1} < b_n < ... < b_1 < b_0

and

lim_{n→∞} a_n = lim_{n→∞} b_n = x*.

Proof: Lemma B.1 claims that p − 1 + p^{p/(p−1)}·2^{1−p} < p^{p/(p−1)} for p > 1. Therefore, a_0 < b_0,

a_0 = 1 < p^{p/(p−1)}·(1 − 2^{1−p})/(p − 1) = a_1 and b_1 = g_p(b_0) < p^{p/(p−1)}/(p − 1) = b_0.

By the proof of Theorem A.1 the equation x = g_p(x) has a unique solution in (0, ∞). In addition, g_p(x) is continuous and monotone increasing in (0, ∞). As a consequence, we may derive the desired result.

The following theorem may be derived by using Lemma B.2.

Theorem B.3 C_Greedy = 2 − Θ((ln p)/p).

Proof: Lemma B.2 implies that a_1 < x* < b_0, and therefore we get the following approximation to x*:

p^{p/(p−1)}·(1 − 2^{1−p})/(p − 1) < x* < p^{p/(p−1)}/(p − 1).

Since p^{p/(p−1)} = p·p^{1/(p−1)} and p^{1/(p−1)} = e^{(ln p)/(p−1)} = 1 + (ln p)/p + O(ln^2 p/p^2), this yields

x* = 1 + (ln p + 1)/p + O(ln^2 p/p^2) = 1 + ln(ep)/p + O(ln^2 p/p^2).

By Theorem 3.4 and the approximation to x* we get

C_Greedy = ((1 + x*)/x*)^{(p−1)/p} = (2 − ln(ep)/p + O(ln^2 p/p^2))^{(p−1)/p}.

Since a^x = 1 + x·ln a + O(x^2·ln^2 a),

C_Greedy = 2^{(p−1)/p}·(1 − ln(ep)/(2p) + O(ln^2 p/p^2))^{(p−1)/p}
        = 2·(1 − (ln 2)/p + O(1/p^2))·(1 − ln(ep)/(2p) + O(ln^2 p/p^2))
        = 2 − (2·ln 2 + ln(ep))/p + O(ln^2 p/p^2)
        = 2 − ln(4ep)/p + O(ln^2 p/p^2).

Hence,

C_Greedy = 2 − Θ((ln p)/p).
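The sequences (a_n) and the closed form above can be checked numerically. The sketch below (the iteration count and sample values of p are arbitrary choices) iterates a_{n+1} = g_p(a_n) from a_0 = 1 and evaluates C_Greedy = ((1 + x*)/x*)^((p−1)/p):

```python
import math

def g(x, p):
    # g_p(x) = p^(p/(p-1)) * (1 - (1+x)^(1-p)) / (p-1)
    return p ** (p / (p - 1)) * (1.0 - (1.0 + x) ** (1.0 - p)) / (p - 1)

def x_star(p, iters=200):
    x = 1.0                       # a_0 = 1; the a_n increase to x*
    for _ in range(iters):
        x = g(x, p)
    return x

def c_greedy(p):
    x = x_star(p)
    return ((1.0 + x) / x) ** ((p - 1) / p)

print(x_star(2.0))     # -> 3.0 (the worst case for p = 2)
print(c_greedy(2.0))   # -> (4/3)^(1/2) = 1.1547...
# The estimate 2 - ln(4ep)/p is asymptotic, meaningful only for large p:
for p in (10.0, 100.0):
    print(p, c_greedy(p), 2.0 - math.log(4.0 * math.e * p) / p)
```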

It can be easily shown that when p decreases to 1, x* increases to x^+, the unique solution in (0, ∞) to the equation ln(1 + x) = x/e, and the competitive ratio decreases to 1. In addition, when p grows to ∞, x* decreases to 1 and the competitive ratio increases to 2. Figure 5 shows the competitive ratio, x*, α* and α*·x* for various p, up to four decimal digits of precision. Calculations are based on the sequences (a_n)_{n=0}^∞, (b_n)_{n=0}^∞. The last column shows the number of iterations required to satisfy the desired precision for all entries in the table.

p       C_Greedy   x*       α*       α*·x* = p^{1/(p−1)}   No. of iterations
1.001   1.0002     4.7557   0.5713   2.7169                21
1.01    1.0019     4.7256   0.5724   2.7048                20
1.1     1.0186     4.4498   0.5829   2.5937                18
1.25    1.0449     4.0734   0.5994   2.4414                16
1.5     1.0850     3.6042   0.6243   2.2500                14
1.75    1.1214     3.2617   0.6465   2.1089                12
2       1.1547     3.0000   0.6667   2.0000                10
2.5     1.2137     2.6253   0.7017   1.8420                8
3       1.2646     2.3692   0.7311   1.7321                7
4       1.3485     2.0413   0.7776   1.5874                5
5       1.4150     1.8405   0.8125   1.4953                4
7       1.5135     1.6085   0.8599   1.3831                3
10      1.6096     1.4346   0.9003   1.2915                2
15      1.7032     1.3001   0.9333   1.2134                2
25      1.7952     1.1912   0.9600   1.1435                1
50      1.8804     1.1052   0.9800   1.0831                1
100     1.9321     1.0582   0.9900   1.0476                1
1000    1.9907     1.0079   0.9990   1.0069                1
10000   1.9988     1.0010   0.9999   1.0009                1

(As p → 1+: C_Greedy ↓ 1, x* ↑ x^+ where ln(1 + x^+) = x^+/e, α*·x* → e. As p → ∞: C_Greedy ↑ 2, x* ↓ 1, α* ↑ 1, α*·x* ↓ 1.)

Figure 5: Approximation of the competitive ratio for various p

Experimental results show that C_Greedy is monotone in p. In fact this can be easily (but technically) proved for p ≥ 10. Similarly to the approximation of C_Greedy, an approximation for C_Greedy,2, i.e., the competitive ratio of Greedy for 2 machines, can be evaluated. Figure 6 shows the competitive ratio and the solution to equation (11) for various p, up to four decimal digits of precision. Again, the last column shows the number of iterations required to satisfy the desired precision for all entries in the table.

p       C_Greedy,2   x*       No. of iterations
1.001   1.0002       4.8252   24
1.01    1.0019       4.7967   21
1.1     1.0183       4.5373   16
1.25    1.0438       4.1879   13
1.5     1.0817       3.7620   10
1.75    1.1149       3.4610   8
2       1.1441       3.2361   7
2.5     1.1927       2.9290   5
3       1.2312       2.7320   4
4       1.2870       2.5006   3
5       1.3248       2.3738   3
7       1.3714       2.2446   2
10      1.4083       2.1600   1
15      1.4380       2.1015   1
25      1.4624       2.0586   1
50      1.4811       2.0285   1
100     1.4905       2.0140   1

(As p → 1+: C_Greedy,2 ↓ 1 and x* ↑ 2(1 + √2). As p → ∞: C_Greedy,2 ↑ 3/2 and x* ↓ 2.)

Figure 6: Approximation of the competitive ratio over two machines for various p

C General p

C.1 Two Machines

Theorem C.1 The competitive ratio of Greedy for two machines is

C_Greedy,2 = sup_{x≥0} ((1 + (1 + x)^p)/(2^p + x^p))^{1/p},

where the supremum is attained at x* ∈ (0, ∞), the unique solution of the equation

x^{p−1}·(1 + (1 + x)^{1−p}) = 2^p.    (11)

Proof: For x ≥ 0 define:

u_p(x) = (1 + (1 + x)^p)/(2^p + x^p),
v_p(x) = x^{p−1}·(1 + (1 + x)^{1−p}).

By Theorem 3.4,

C_Greedy,2 = sup_{x≥0} ((1 + (1 + x)^p)/(2^p + x^p))^{1/p} = sup_{x≥0} u_p(x)^{1/p}.

Since u_p(0) = 2^{1−p} < 1 and lim_{x→∞} u_p(x) = 1, u_p(x) achieves its maximum in (0, ∞). Hence x*, the maximum of u_p(x) in (0, ∞), satisfies u_p'(x*) = 0. The last equation is equivalent to the equation v_p(x) = 2^p. Since v_p'(x) > 0 for x > 0, the equation v_p(x) = 2^p (or u_p'(x) = 0) has a unique solution in (0, ∞).

Theorem C.2 For any p > 1 the competitive ratio of any on-line algorithm over two machines is at least C_Greedy,2.

Proof: Consider the sequence (1, 1, x*). If the on-line algorithm assigns the first two jobs to the same machine, it does not get another job; it produces a cost of 2, whereas the optimal cost is 2^{1/p}, i.e., a ratio of 2^{(p−1)/p}. It is easy to show by convexity arguments for the function y^p that for all x ≥ 0,

1 + (1 + x)^p ≤ 2^{p−1}·(2^p + x^p),

and therefore

((1 + (1 + x)^p)/(2^p + x^p))^{1/p} ≤ 2^{(p−1)/p}.

Hence, by Theorem C.1 we are done. Otherwise, the last job is given to the on-line algorithm. The resulting assignment of the sequence is the same as the worst case example of Greedy, and we are done again.

C.2 Lower Bound for general p

Similarly to Theorem 4.2, a lower bound for general p > 1 and an arbitrary number m ≥ 2 of machines can be proved. The lower bound depends on p and grows to 3/2 as p grows to ∞.

Theorem C.3 For any p > 1 the competitive ratio of any on-line algorithm A over m ≥ 2 machines satisfies

C_{A,m} ≥ c_p = (3/2)·((1 − 3^{1−p})/(2·(1 + (3/4)^p + 4^{−p} − 2^{2−p})))^{1/p}.

Proof: Similarly to the proof of Theorem 4.2. Denote

λ = (c_p^p − 1)/(2^p − 2).

It can be easily seen that λ < 1/2 and c_p < 3/2 for p > 1. W.l.o.g. m is even. Consider the following sequence of jobs:

(1, ..., 1, 2, ..., 2),

with m jobs of weight 1 followed by m/2 jobs of weight 2.

Let l be the number of idle machines after the on-line algorithm assigns the first m jobs. Assume λm ≤ l ≤ m/2 (see Figure 3 for a similar example). In this case the ratio raised to the power p is at least

(2^p·l + (m − 2l))/m = 1 + (2^p − 2)·l/m,

which is at least c_p^p. Since any assignment for l ≥ m/2 results in a cost greater than the assignment for l = m/2, we may assume l < λm < m/2. In this case the cost of the optimum is 2·m^{1/p}, whereas the best assignment for the on-line algorithm (see Figure 4 for a similar example) yields a cost to the power p of

2l·2^p + (m/2 − l)·3^p + (m/2 − l)·1^p.

This cost (to the power p) is at least c_p^p·2^p·m, and therefore the proof is completed.

D The proof of Lemma 5.2

Lemma 5.2 says that if the competitive ratio of a flat shape is very close to C_Greedy, then the shape must look very similar to the worst case example for Greedy found in Theorem 3.4. This is intuitively clear, since the worst case example is unique by Theorem A.1; however, the precise formulation and proof are a bit tedious; in particular we have to reexamine the transformations from Lemma 3.2 and Lemma 3.3.

Lemma D.1 (5.2) There exist constants γ, B, δ > 0 and m_1 such that for every m > m_1, if Q = (1, v) is a flat shape satisfying C(Q) ≥ C_Greedy − δ, then h(Q) ≤ h_1, OPT(Q)^p ≤ Bm, and there are at least γm machines with L_i(Q) ≥ 1 + t.

Proof: Fix Δ > 0 such that t < (1 − Δ)x* and such that 1/(α*(1 − Δ)) < h_1. Now choose δ > 0 such that f_p(x, α)^{1/p} ≥ C_Greedy − δ implies |x − x*| ≤ Δx* and |α − α*| ≤ Δα*. Such a δ exists since f_p is continuous and has a unique maximum (and f_p → 1 for x → ∞).

Let Q be a shape from the assumption of the lemma, and let h = h(Q). Now we transform Q as follows. First, if there are machines i, i' with 0 < v_i ≤ v_{i'} < h, we decrease v_i and increase v_{i'} by the same amount min{v_i, h − v_{i'}}. This does not change the load vector of the optimal schedule, and the value of the modified shape can only increase due to convexity of x^p. We continue this process until there is at most one machine with 0 < v_i < h. Next we apply Lemma 3.2 and then Lemma 3.3.
As a result we obtain a flat shape Q' = (a, v') such that all nonzero components of v' are the same, h(Q') = h, OPT(Q') = OPT(Q), C(Q') ≥ C(Q), and 1 − h/m ≤ a ≤ 1 + h/m. The bound on a follows since a can change only in Lemma 3.2, and this is applied only to the one machine with 0 < v_i < h. (In fact, this is the reason why we need to apply a different transformation if there are two or more such machines.) The competitive ratio of Q' is bounded by

C_Greedy − δ ≤ C(Q) ≤ C(Q') = f_p(x/a, a/h)^{1/p}

for x equal to the non-zero components of v'. By the choice of δ, |x/a − x*| ≤ Δx* and |a/h − α*| ≤ Δα*. Choose m_0 ≥ 2 such that 1/(α*(1 − Δ) − 1/m_0) ≤ h_1. Now it follows that

α*(1 − Δ)·h ≤ a ≤ 1 + h/m,

hence h h. Put B = 2 ( + x + x ). By the choice of m 0 it also follows that a 2. Hence (OPT(Q)) = (OPT(Q 0 )) m(a + x) m(a + a(x + x )) mb: Let l be the number of machines with L i (Q) + t. It remains to rove that l m for suitable constant. First we rove that we can assume that all the loads are bounded by some constant. Given a number s > h, let Q 00 = (; v 00 ) be dened by vi 00 = minfs; v i g. Note that vi 00 t for exactly l machines. Let (a; b) = ((a + b) a ) =. Let v i = s +. If we change only this v i, (kl(q)k ) decreases by (s + ; ) and (OPT(Q)) decreases by (s; ). A standard calculation shows that lim su (s + ; ) = : s! >0 (s; ) Hence for suciently large s and arbitrary > 0 (s + ; ) (s; ) C Greedy : Fix such s. Now it follows that C(Q 00 ) C Greedy, since C(Q) C Greedy, and the ratio of the dierences in each ste is at most C Greedy. Now construct Q 000 = (a 000 ; v 000 ) from Q 00 by the same transformations as Q 0 from Q. Let x be the resulting common value of all non-zero comonents of v 000. Since C(Q 000 ) C Greedy, it must be the case that x=a 000 x. Since a 000 h=m, we have x t for some constant t > t and suciently large m; x m m 0 aroriately. Examining the transformations, x is the th ower mean of l numbers uer-bounded by s and m l numbers uer-bounded by t. Since x t > t, l m for some constant. 22