Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating on uniformly bounded non-negative random variables, we are able to imrove uon revious results due to Johnson, Schechtman and Zinn 1985 and Lata la 1997. Our basic results rovide bounds involving Stirling numbers of the second kind and Bell numbers. By deriving novel effective bounds on Bell numbers and the related Bell function, we are able to translate our moment bounds to exlicit ones, which are tighter than revious bounds. The study was motivated by a roblem in oeration research, in which it was required to estimate the L -moments of sums of uniformly bounded non-negative random variables reresenting the rocessing times of jobs that were assigned to some machine in terms of the exectation of their sum. Keywords: Sums of random variables, moments, bounds on moments, binomial distribution, Poisson distribution, Stirling numbers, Bell numbers. AMS 2000 Subject Classification: Primary: 60E15 Secondary: 05A18, 11B73, 26D15 Deartments of Mathematics and of Comuter Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel, E-mail: berend@math.bgu.ac.il, and Deartment of Mathematics, Rice University, Houston, TX 77251, USA, E-mail: db3905@math.rice.edu. Deartment of Mathematics and Comuter Science, The Oen University of Israel, Ra anana, Israel. E-mail: tamirta@oenu.ac.il. 1

1 Introduction Numerous results in robability theory relate to sums of random variables. The Law of Large Numbers and the Central Limit Theorem are such instances. In this aer we deal, more secifically, with bounds for moments of sums of indeendent random variables. A classical result in this direction, due to Khintchine [11], is the following: Let X 1, X 2,..., X n be i.i.d. symmetric random variables assuming only the values 1 and 1. Then, for every > 0 there exist ositive constants C 1, C 2, deending only on, such that for every choice of a 1, a 2,..., a n R, n 1/2 C 1 ak 2 n n 1/2 a k X k C 2 ak 2, where X = [E X ] 1/ is the L -norm of X. For more on this tye of results we refer, for examle, to [6]. Hereinafter we consider ossibly infinite sequences of indeendent non-negative random variables X i, 1 i t, and aim at estimating the -moments, EX, 1, of their sum, X = t X i. The work of Lata la [12] rovides the last word in this subject. One of the corollaries of his work is the following one: Theorem A [12, Th. 1 & Lemma 8] Let X i, 1 i t <, be indeendent non-negative random variables, X = t X i, and 1. Then, for every c > 0, X 2e max { 1 + c c E 1, 1 + 1 } 1/ E, 1 c where t 1/k E k = EXi k, k = 1,. 2 In [12, Cor. 3], Lata la suggests to take c = ln in 1. This yields the following uniform bound: X 2e 1 + max{e 1, E }, > 1. 3 ln In fact, taking c = 1/ 1, the coefficients of both E 1 and E in 1 coincide and equal 1/ 1 1. Since the latter exression is bounded from above by / ln, we may imrove 3 and arrive at the following: Theorem A. Under the assumtions of Theorem A, X 2e ln max{e 1, E }, > 1. 4 2

Finally, by finding the value of c that minimizes the uer bound in 1, we arrive at the following exlicit version of Lata la s theorem the roof of which is ostoned to the Aendix. Theorem A. Under the assumtions of Theorem A, X 2e 2e 1 1 E1, E 1 1 1 E / 1 1/ E 1/ 1 E 1/ 1 1, otherwise. 12 / E, 5 An estimate similar to that of Theorem A, but with a better constant, was established already in [10]. It was shown there Theorem 2.5 that X K ln max{e 1, E }, K = 2, > 1. 6 In addition, it was shown that estimate 6 cannot hold with K < e 1 see Proosition 2.9 there. We note that Theorem A may offer better estimates than 6, even with K < e 1. For examle, if E 1 1 1 E 12 /, then the uer bound in Theorem A is less than 2e 2 E 1, while the uer bound in 6 is at least 2 E ln 1. In other cases, however, 6 may be sharer than Theorem A. In this aer we deal with the same setu, but restrict our attention to the case of uniformly bounded random variables, where 0 X i 1 for all 1 i t. In this case, E E 1/ 1, whence the existing bounds yield bounds deending only on E 1. We derive here imroved bounds in that case that deend only on E 1. Our bounds offer sharer constants and are almost tight as 1 +, where the revious exlicit bounds, 3 and 6, exlode. Our results aly also to ossibly infinite sequences. Some of our bounds involve Stirling numbers of the second kind and Bell numbers. A key ingredient in our estimates are new and effective bounds on Bell numbers. While revious bounds on Bell numbers were only asymtotic, we rovide here an effective bound on Bell numbers that alies to all natural numbers. That result is an interesting result on its own right and may have alications for other roblems of discrete mathematics. Section 2 includes our main results. In Subsection 2.1 we rovide a short discussion of Stirling and Bell numbers, and a statement of our bounds on Bell numbers and the related Bell function. In Subsection 2.2 we state our main results concerning bounds for sums of random variables. The roofs of our results are given in Section 3. In Section 4 we discuss our results and comare them with the best reviously known ones namely, with 6, due to Johnson et al.. Finally, the Aendix includes roofs of some of our statements Subsections 5.1, 5.2 and 5.3 and a descrition of the oeration research roblem that motivated this study Subsection 5.4. 3

2 Main Results 2.1 Effective Bounds on Bell Numbers The Stirling numbers of the second kind and Bell numbers are related to counting the number of artitions of sets. The first counts, for integers n k 1, the number of artitions of a set of size n into k non-emty sets. This number is denoted by Sn, k or sometimes by { } n k. The second counts the number of all artitions of a set of size n, and is denoted by B n. For more information we refer to Riordan [16]. Let An, k denote the number of colorings of n elements using exactly k colors namely, each color must be used at least once in the coloring. An, k is given by An, k = { } n : r = r 1,..., r k N k, r = n and r i 1, 1 i k, 7 r where r = k r i and n r = n!/r1! r k!. With this notation, the Stirling number of the second kind may be exressed as Sn, k = An, k k! The Bell number B n may be written in terms of the Stirling numbers:. 8 n B n = Sn, k. 9 In [2], de Bruijn derives the following asymtotic estimate for the Bell number B n, ln B n n = ln n ln ln n 1 + ln ln n ln n + 1 ln n + 1 2 2 [ ] ln ln n ln ln n + O. ln n ln n 2 In articular, we conclude that for every ε > 0 there exists n 0 = n 0 ε such that for all n > n 0, n n n n < Bn <. 10 e ln n e 1 ε ln n The roblem with estimate 10 is that it is ineffective in the sense that the value of n 0 = n 0 ε is imlicit in the asymtotic analysis in [2]. We rove here an uer bound that is less tight than 10, but alies for all n. Theorem 2.1 The Bell numbers satisfy B n < n 0.792n, n N. 11 lnn + 1 4

By Dobinski s formula [5, 15], B n = 1 e k n k!. 12 This suggests a natural extension of the Bell numbers for any real index, B = 1 e k k!. 13 We refer to B as the Bell function. Note that, for 0, we have B = EY, where Y P 1 is a Poisson random variable with mean EY = 1. For the sake of roving Theorem 2.1, we establish effective asymtotic bounds for the Bell function: Theorem 2.2 The Bell function B, 13, satisfies for all ε > 0, where and d is given by B < e 0.6+ε, > 0 ε, 14 ln + 1 0 ε = max { e 4, d 1 ε } 15 d = ln ln + 1 ln ln + 1 + e 1 ln. 16 2.2 Bounding Moments of Sums of Random Variables Throughout this aer we let X i, 1 i t, be a ossibly infinite sequence of indeendent random variables for which P 0 X i 1 = 1, X = t X i, and µ = EX. Our first result is an estimate for moments of integral order. Theorem 2.3 Letting S, k denote the Stirling number of the second kind, the following estimate holds: EX mint, S, k EX k e k 1k/2t, N. 17 The above estimate imlies that EX S, kexk for all t, Hence, using 9 we infer the following bound: Corollary 2.4 Letting B denote the -th Bell number, EX B max{ex, EX }, N. 18 In Section 5.3 we give an alternative roof of Corollary 2.4 that relies on the results of de la Peña, Ibragimov and Sharakhmetov [14]. Relying on Corollary 2.4 and an interolation argument, we arrive at the following exlicit bound for all real 1: 5

Theorem 2.5 For all 1, where X 0.792 ν ν = ln + 1 max{ex1/, EX}, 19 1 + 1 {} 1 {} and and {} denote the integer and fractional arts of, resectively. We note that for all 1 ν 1 + 1 1 4 2 1/4, so that inequality 19 may be simlified into the weaker form X 0.942 ln + 1 max{ex1/, EX}, 1. 21 Finally, using the asymtotic uer bounds for the Bell numbers rovided by 10 and by Theorem 2.2, we may obtain better estimates for large moment orders: Theorem 2.6 For all 1 and ε > 0, let ν be as given in 20 and 0 ε be as given in 15 and 16. Then for all ε > 0 and > 0 ε, X e 0.6+ε ν ln + 1 max{ex1/, EX}. 22 In addition, for every ε > 0 there exists ˆ 0 ε > 1 such that for all > ˆ 0 ε, X e 1+ε ν ln + 1 max{ex1/, EX}. 23 Note that the constant e 1+ε in 23 cannot be relaced by any constant smaller than e 1, as imlied by the lower bound from 10 on Bell numbers, and by the reviously mentioned result of Johnson, Schechtman and Zinn [10, Proosition 2.9]. 3 Proofs of the Main Results 3.1 Proofs of Theorems 2.1 and 2.2 Lemma 3.1 Let x 0 = x 0 be defined for all > 0 by Then where α = 1 + 1/e. 20 x 0 ln x 0 =. 24 x 0 < α ln, > ee+1, 25 6

Proof. As the function ux = x ln x is increasing for all x 1, it suffices to show that α u = α α ln ln ln >, > e e+1. ln This is easily seen to be equivalent to or First, we observe that As α 1 + α ln α ln α ln ln ln > 0, > e e+1, v := α 1 ln + α ln α α ln ln > 0, > e e+1. 26 ve e+1 = v e α/α 1 α α = α 1 + α ln α α ln α 1 α 1 v = α 1 inequality 26 follows from 27 and 28. = 0. 27 α ln > 0, > eα/α 1 = e e+1, 28 Proof of Theorem 2.2. Emloying a version of Stirling s formula cf. [17], we obtain by 13 that B 1 n e < e n n 2πnn/e n n. n n=1 Define hx = h x = ln ex x = x + ln x x ln x, x 1. x x As h x = ln x, the function h increases on [1, x x 0] and decreases on [x 0,, where x 0 = x 0 > 1 is determined by 24. Hence, hx hx 0 = x 0 + ln x 0 x 0 ln x 0 = x 0 + ln x 0, x 1. Invoking the bound 25 on x 0, we infer that n=1 hx < α + ln ln ln + ln α 1, x 1, ln for all > e e+1. Using the definition of d in 16, we conclude that hx < d + ln ln ln + 1 + ln α 1. It is easy to check that d is decreasing for > 1 and mas the interval 1, onto the interval 0,. Hence, for every ε > 0 there exists a single 0 = 0 ε > 1 such that d 0 = ε and d < ε for all > 0. We conclude that for every ε > 0 and > max {e e+1, d 1 ε}, hx < ln ln ln + 1 + ε + ln α 1. 29 7

For larger x we can do much better by observing that hx = x x ln x < x x 2 ln x < x, x > 2, > e4. 30 Combining 29 and 30, we conclude that for > max {e 4, d 1 ε}: 2 B < e hn = e hn + e hn < 31 n=1 n=1 n= 2 +1 < 2 e ln ln ln+1+ε+ln α 1 + n= 2 +1 e n. The first term on the right-hand side of 31 may be bounded as follows for all > e 4 : 2 e ln ln ln+1+ε+ln α 1 = 2 e ε+ln α 1 < ln + 1 e 0.6007+ε. 32 ln + 1 The second term on the right-hand side of 31 may be bounded by evaluating the sum of the geometric series, e n = e 2 +1 < e 2. 33 1 e 1 n= 2 +1 Using 32 and 33 in 31 we arrive at B < Finally, as it is easy to verify that e 0.6007+ε + e 2, > 0 ε. 34 ln + 1 e 0.6007+ε e + e 2 0.6+ε, > e 4, 35 ln + 1 ln + 1 the required estimate 14 follows from 34 and 35. Proof of Theorem 2.1. Using ε = de 4 0.346 in Theorem 2.2, we get that Since it may be verified that B < B < 0.776, > e 4. 36 ln + 1 0.792, = 1, 2,..., 54 = e 4, 37 ln + 1 the lemma follows from 36 and 37. 8

3.2 Proof of Theorem 2.3 The main tool in roving Theorem 2.3 is the following general-urose roosition: Proosition 3.2 For every convex function f, EfX EfY, 38 where Y is a binomial random variable with distribution Y B t, µ t in case t <, and a Poisson random variable with distribution Y P µ otherwise. A simlified version of this roosition, for the case where X is a sum of a finite sequence of Bernoulli random variables, aears in [9, Theorem 3]. For the sake of comleteness, we include a roof of this roosition in the aendix. We now roceed to rove our first estimate that is stated in Theorem 2.3. Denote µ = EX. Consider first the case of infinite sequences, t =. In that case, according to Proosition 3.2, EX EY where Y P µ is a Poisson random variable. Denoting m µ = EY, it may be shown that m +1 µ = µ m µ + dm µ dµ see [13]. This recursive relation imlies, as shown in [13], that Hence EY = S, k µ k. EX S, k µ k, 39 in accord with inequality 17 for t =. As for finite sequences, t <, the bounds offered by Proosition 3.2 involve moments of the binomial distribution. The moment generating function of a binomial random variable Y Bt, q is t t t M Y z = Ez Y = P Y = lz l = q l 1 q t l z l = qz + 1 q t. 40 l=0 l=0 l This function enables the comutation of the factorial moments of Y through Since k 1 M k Y 1 = E Y i. Y = i=0 k 1 S, k Y i 9 i=0

see [7,. 72], then As 40 imlies that EY = S, k M k Y 1. 41 M k k 1 Y 1 = q k t i 42 note that when k t + 1 we get M k Y 1 = 0, we conclude by 41 and 42 that EY = mint, i=0 k 1 S, k q k t i. 43 Using 43 in Proosition 3.2, where Y Bt, µ, we conclude that, when t < and N, t i=0 Finally, as we infer that This concludes the roof. EX k 1 i=0 mint, 1 i k 1 t EX mint, S, k µ k i=0 k 1 i=0 1 i. t e i/t = e k 1k/2t, S, k µ k e k 1k/2t. Estimate 17 of Theorem 2.3 may be relaxed as follows: EX mint, S, k EX k, N. 44 We roceed to describe a straightforward roof of 44 that does not deend on Proosition 3.2 and is much shorter than the roof given above. Proof of 44. As X = t X i, we conclude that X = X r, r {r: r =} where r = r 1,..., r t N t is a multi-index of non-negative values, X = X 1,..., X t, and X r = t X r i i. By the indeendence of the X i s and their being bounded between 0 and 1 we conclude that EX = EX r i i EX i. 45 r {r: r =} 1 i t r {r: r =} 1 i t r i >0 10

The sum on the right-hand side of 45 may be slit into artial sums, Σ = Σ 1 + Σ 2 +... + Σ mint,, where Σ k denotes the sum of terms in which the roduct consists of k multilicands. Thus, Σ 1 is the artial sum consisting of all terms of the form EX i, 1 i t, the artial sum Σ 2 includes all terms of the form EX i EX j, 1 i < j t, and so forth. With this in mind, we may rewrite 45 as mint, EX k A, k EX ij. 46 1 i 1 <i 2 <...<i k t j=1 Next, we observe that for all k, 1 k mint,, t k EX k k = EX i k! EX ij. 47 1 i 1 <i 2 <...<i k t j=1 Hence, 44 follows from 46, 47 and 8. 3.3 Proofs of Theorems 2.5 and 2.6 The roofs of Theorems 2.5 and 2.6 rely uon an interolation argument, given by the next lemma. Lemma 3.3 Let 1 be real and let N and γ = [0, 1 denote its integer and fractional arts, resectively. Then for any random variable X where θ = 1 γ 0, 1]. Proof. Using Hölder s inequality, X X θ X 1 θ +1, EX Y X s Y s/s 1, s 1,, we may bound the -th moment of a random variable X in the following manner: X = EX = EX θ X 1 θ X θ s X 1 θ s/s 1, s 1,. For the sake of simlicity, and without restricting generality, we assume herein that X is non-negative. Since X α σ = EX ασ 1/σ = X α ασ, σ 1, α 1/σ, we conclude that X X θ θs X 1 θ 1 θs/s 1, s 1,. 48 11

Now set s = = 1. Since θs =, while θ 1 γ 1 θs s 1 = 1 1 γ 1 γ 1 1 = + 1, 49 1 γ the desired inequality follows from 48 and 49. Proof of Theorem 2.5. In view of Corollary 2.4 and Theorem 2.1, we conclude that X c ln + 1 max{ex1/, EX}, N, 50 where we ut c = 0.792. Fixing N and γ [0, 1, we roceed to estimate X +γ. By Lemma 3.3, X +γ X θ X +1, 1 θ 1 γ θ =. 51 + γ Emloying 50, we obtain in case EX 1, and if EX < 1. Since θ X +γ c ln + 1 θ X +γ c ln + 1 we may combine 52 and 53 as follows: θ X +γ c ln + 1 It is easy to check that 1 θ + 1 EX 52 ln + 2 1 θ + 1 EX θ + 1 θ +1 53 ln + 2 θ + 1 θ + 1 = 1 + γ, 1 θ + 1 max{ex 1/+γ, EX}. 54 ln + 2 θ = 1 γ η and 1 θ = γ + η, 55 where Hence, η = θ + 1 1 θ = 1 γ + 1 γ γ1 γ + γ. 56 1 + 1 η. 57 12

As, by Jensen s inequality, we infer by 54, 57 and 58 that 1 γ + 1 γ 1 γ + γ + 1 = + γ, 58 η + γ 1 + 1 X +γ c ln + 1 θ ln + 2 1 θ max{ex1/+γ, EX}. 59 We claim, and rove later, that Therefore, by 59, 60 and 56, 1 ln + 1 θ ln + 2 1 1 θ ln + γ + 1. 60 X +γ c 1 + 1 γ1 γ +γ + γ ln + γ + 1 max{ex1/+γ, EX}, in accord with 19. Hence, all that remains is to rove 60. To that end, we use the definition of θ, 51, in order to exress γ in terms of θ and rewrite 60 as follows: ln + 1 + 1 θ ln + 1 θ ln + 2 1 θ, N, 0 < θ 1. 61 + θ By extracting the logarithm of both sides of 61, our new goal is roving that f + 1 + where fx = ln ln x. Denoting we aim at showing that 1 θ θf + 1 + 1 θf + 2, N, 0 < θ 1, + θ gθ = f + 1 + 1 θ, + θ gθ 1 θg0 + θg1, 0 θ 1, N. To this end, it suffices to show that g is convex in [0, 1], namely, that g θ 0 for θ [0, 1]. As the latter claim may be roved by standard techniques, we omit the further details. The roof is thus comlete. Proof of Theorem 2.6. The roof of this theorem goes along the same lines of the roof of Theorem 2.5. We omit further details. 13

4 Discussion Here we comare our bounds to the reviously known ones. According to Johnson et al., 6, X 2 ln max{e 1, E }, 1, 62 where E k = t EX k i 1/k. According to our Theorem 2.5, on the other hand, X 0.792 1 + 1 {} 1 {} ln + 1 max{ex1/, EX}, 63 for all 1. First, we note that 63 relies only on E 1 and does not involve E, as does 62. However, 63 alies only to the case of uniformly bounded variables, P0 X i 1 = 1, while 62 is not restricted to that case. It should be noted that, in the case where the variables X i are not uniformly bounded by 1 or some other arbitrary fixed constant, it is imossible to bound EX in terms of E 1 alone. As an examle, let X i a B1, 1, 1 i t, for some at a > 0. Then E 1 = 1 but EX 1/ E = a 1/. As a may be arbitrarily large, EX 1/ could not ossibly be bounded in terms of E 1 only in this case. Estimate 63 is sharer than 62 in the sense that the bound in the latter tends to infinity when 1 +, while that of 63 aroaches 0.792 EX 1.14EX a small ln 2 constant multilicative factor above the limit of the left-hand side, EX = lim 1 + X. When E 1 1, estimate 63 is better than estimate 62 by a factor of at most q = 0.792 2 1 + 1 {} 1 {} ln ln + 1 0.792 2 1 4 1 0.471. 2 For examle, q2 = 0.792 1 + 1 0 ln 2 2 2 ln 3 0.25. However, when E 1 1, estimate 62 may become better than 63, as exemlified below. Examle 4.1 Let = 2, E 1 < 1. Assume that all of the random variables are i.i.d. and that t2 X 2 i B1, 1. Hence, E 2 1 = 1/t and E 2 = 2/t 3. The uer bound in 62 is 4 maxe ln 2 1, E 2 = Θ1/t. However, the uer bound in 63 is of order of magnitude Θ1/ t. Hence, in this examle 62 is sharer than 63. Finally, we recall that Theorem 2.6 offers a further imrovement of the estimate in Theorem 2.5 by a multilicative factor of e 0.6+ε for all ε > 0 and > 0.792 0 ε for 0 ε that is exlicitly defined through 15 and 16, or by a multilicative factor of e 1+ε for all ε > 0 0.792 and sufficiently large. Acknowledgment. The authors would like to exress their gratitude to Michael Lin for his helful comments on the aer. 14

5 Aendix 5.1 Proof of Theorem A Denote fc = 1 + c c E 1, gc = 1 + 1 1/ E. c One checks easily that f is decreasing on the interval 0, 1/ 1 and increasing on 1/ 1,. If E 1 > 1 1/ E, then fc > gc throughout the ositive axis. Hence, the value of c that minimizes the right-hand side of 1 is that which minimizes fc, namely c = 1/ 1. This falls under the first case in 5 and gives the required result. Next, if E 1 1 1/ E, it may be easily verified that f and g intersect exactly once on the ositive axis, at the oint c 0 = 1 1/ E /E 1 1/ 1 1. There are two subcases to consider here: If f is non-increasing at c 0, which haens if E 1 1 1 12 / E, then the value of c that minimizes the right-hand side of 1 is still the minimum oint of f, which again falls under the first case in 5. If, on the other hand, f increases at c 0, namely E 1 < 1 1 12 / E, then the oint that minimizes the right-hand side of 1 is the intersection oint of f and g. Plugging the common value of f and g in the right-hand side of 1, we obtain the second case in 5. 5.2 Proof of Proosition 3.2 Here, we rove Proosition 3.2. We begin by stating and roving several lemmas. Hereinafter, f is a convex function. Lemma 5.1 Let X be a random variable with P a X b = 1 for some b > a 0, and µ = EX. Let X be a random variable, assuming only the values a and b, having the same exected value, i.e., E X = µ. Then EfX Ef X = fa b µ b a + fb µ a b a. 64 Proof. Let Y U0, 1 be indeendent of X. Define X by: X = { a, Y b X b a, b, otherwise. Then whence E X X = a b X b a + b X a b a E X = EE X X = EX = µ. = X, 65 15

Thus, X indeed has the required distribution. Consequently, by Jensen s inequality and 65, Next, we find that P X = a = Ef X = E Ef X X E fe X X = EfX. 66 b a P X = a X = r P X = rdr = b a b r b µ P X = rdr = b a b a. Hence Ef X equals the value on the right-hand side of 64. This, combined with 66, comletes the roof. Lemma 5.2 Let X i, 1 i t <, be indeendent random variables satisfying P 0 X i a i = 1. For 1 i t, let Xi be a random variable assuming only the values 0 and a i and having the same exectation as X i, namely X i /a i B1, µ i /a i, where µ i = EX i. Then t t E f X i E f X i. 67 Proof. We comute the exected value on the left-hand side of 67 by first fixing the values of X 2,, X t, taking the average with resect to X 1, and then averaging with resect to X 2,, X t : E f t X i = E E f t X i X 2,, X t. 68 For every fixed value of X 2,, X t, the internal exected value on the right-hand side of 68 may be bounded using Lemma 5.1, t t E f X i X 2,, X t E f X 1 + X i X 2,, X t. 69 i=2 Using 69 in 68 we conclude that t t E f X i E f X 1 + X i. i=2 By the same token, we may relace each of the other random variables X i, 2 i t, with the corresonding X i without decreasing the exected value, thus roving 67. Lemma 5.3 Let X i, i = 1, 2, 3, be indeendent random variables, with X i B1, i, i = 1, 2, 1 > 2, and X 3 an arbitrary non-negative variable. Let X 1 B1, 1 ε, X 2 B1, 2 + ε, where 0 ε 1 2, and assume that X 1, X 2, X 3 are also indeendent. Then E fx 1 + X 2 + X 3 E fx 1 + X 2 + X 3. 16

Proof. First, we rove the inequality for the case where X 3 is constant, say X 3 = a. On the one hand, we have E fx 1 + X 2 + a = 1 2 f2 + a + 1 + 2 2 1 2 f1 + a + 1 1 1 2 fa. Setting b = 1 ε 2 + ε 1 2 0, we find that E fx 1 + X 2 + a = 1 2 + bf2 + a + 1 + 2 2 1 2 2bf1 + a +1 1 1 2 + bfa = E fx 1 + X 2 + a + bf2 + a 2f1 + a + fa. Since the last term on the right-hand side is non-negative due to the convexity of f, this roves the inequality for constant X 3. The roof for general X 3 now follows: E fx 1 + X 2 + X 3 = E E fx 1 + X 2 + X 3 X 3 E E fx 1 + X 2 + X 3 X 3 = E fx 1 + X 2 + X 3. Proof of Proosition 3.2. Let us assume first that the sequence is finite. By Lemma 5.2, we may assume that all X i s are Bernoulli distributed, say X i B1, i, 1 i t. If not all i s are equal, we can find two of them, say 1 and 2, such that 1 > µ = t i > t t 2. Emloying Lemma 5.3, we can change 1 and 2, while keeing their sum constant and making one of them equal to µ, without reducing the -th moment of the sum. Reeating t this rocedure over and over again until all i s coincide, we see that EfX EfY, where Y Bt, µ. t For infinite sequences, we use the receding art of the roof to conclude that for all finite values of t t E f EfY t, X i where Y t Bt, µ t. Passing to the limit as t t =, we obtain the required result. 5.3 An Alternative Proof of Corollary 2.4 The second art of Corollary 3.1 in [14] states that, if X 1,..., X n are non-negative random variables with EX s k X 1,..., X k 1 a s k and EX t k X 1,..., X k 1 b k for all 1 k n, then n t E X k E θ t 1 n n t/s max b k, a s k 70 17

for all t 2 and 0 < s 1, where θ1 is a Poisson random variable with arameter 1. When X 1,..., X n are indeendent and bounded between 0 and 1, we may set a s k = EXk s and b k = EX k. Taking s = 1, we infer from 70 that n t E X k E θ t 1 n n t max EX k, EX k. 71 Finally, since, by Dobinski formula 12, E θ t 1 is the Bell number B t, inequality 71 coincides with inequality 18 in Corollary 2.4. 5.4 An Alication to Stochastic Scheduling In the classical multirocessor scheduling roblem, one is given a set J = {J 1,..., J n } of n jobs, with known rocessing times τj j = τ j R +, and a set M = {M 1,..., M m } of m machines. The goal is to find an assignment A : J M of the jobs to the machines, such that the resulting loads on each machine minimize a given cost function, T f A = fl 1,..., L m, where L i = j: AJ j =M i τ j, 1 i m. Tyical choices for the cost function f are the maximum norm in which case the roblem is known as the makesan roblem or, more generally, the l -norms, 1. The case = 2 was studied in [3, 4] and was motivated by storage allocation roblems; the general case, 1 < <, was studied in [1]. In stochastic scheduling, the rocessing times τ j are random variables with known robability distribution functions. The random variables τ j, 1 j n, are non-negative. They are also tyically indeendent and uniformly bounded. The goal is to find an assignment that minimizes the exected cost, T f A = E[fL 1,..., L m ]. As such roblems are strongly NP-hard, even in the deterministic setting, the goal is to obtain a reasonable aroximation algorithm, or to comare the erformance of a given scheduling algorithm to that of an otimal scheduling. We roceed to briefly sketch an alication of our results for estimating the erformance of a simle scheduling algorithm, called List Scheduling, as carried out in [18]. Assume that the target function is the l -norm of the vector of loads, i.e., Then, by Jensen s inequality, T f m A = E L i 1/, > 1. T f m 1/ A EL i. 18

Since L i is a sum of indeendent non-negative uniformly bounded random variables where the uniform bound may be scaled to equal 1, estimate 21, which is the simler version of Theorem 2.5, imlies that Hence, EL i 0.942 max{el i, EL i }. ln + 1 T f A 0.942 m ln + 1 1/ max{el i, EL i }. 72 The List Scheduling algorithm is an online algorithm that always schedules the next job to the machine that currently has the smallest exected load. As imlied by the analysis of Graham in [8], EL i 2µ, 1 i m, 73 where µ := max { nj=1 Eτ j m, max 1 j n Eτ j } is a quantity that deends only on the known exected rocessing times of the jobs and is indeendent of the assignment of jobs to the machines. Combining 73 with 72 we arrive at the following bound on the erformance of the stochastic List Scheduling algorithm, References T f A 0.942 ln + 1 m max{2µ, 2µ } 1/. [1] N. Alon, Y. Azar, G. Woeginger and T. Yadid, Aroximation schemes for scheduling, In Proc. 8th ACM-SIAM Sym. on Discrete Algorithms 1997, 493 500. [2] N. G. de Bruijn, Asymtotic Methods in Analysis, Dover, New York, NY, 1958. [3] A. K. Chandra and C. K. Wong, Worst-case analysis of a lacement algorithm related to storage allocation, SIAM Journal on Comuting 4 1975, 249 263. [4] R. A. Cody and E. G. Coffman, Jr., Record allocation for minimizing exected retrieval costs on drum-like storage devices, J. Assoc. Comut. Mach. 23 1976, 103 115. [5] G. Dobinski, Summierung der reihe n m /m! für m = 1, 2, 3, 4, 5,..., Grunert Archiv Arch. Math. Phys. 61 1877, 333 336. [6] T. Figiel, P. Hitczenko, W. B. Johnson, G. Schechtman and J. Zinn, Extremal roerties of Rademacher functions with alications to the Khintchine and Rosenthal inequalities, Trans. Amer. Math. Soc. 349 1997, 997 1027. 19

[7] B. Fristedt and L. Gray, A Modern Aroach to Probability Theory, Birkhäuser, Boston, MA, 1997. [8] R. L. Graham, Bounds on multirocessing timing anomalies, SIAM Journal of Alied Mathematics 17 1969, 416 429. [9] W. Hoeffding, On the distribution of the number of successes in indeendent trials, Ann. Math. Statist. 27 1956, 713 721 [10] W. B. Johnson, G. Schechtman and J. Zinn, Best constants in moment inequalities for linear combinations of indeendent and exchangeable random variables, Annals of Probability 13 1985, 234 253. [11] A. Khintchine, Uber dyadische Bruche, Math. Z. 18 1923, 109 116. [12] R. Lata la, Estimation of moments of sums of indeendent real random variables, Annals of Probability 25 1997, 1502 1513. [13] A. Paoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, NY, 1984. [14] V. H. de la Peña, R. Ibragimov and S. Sharakhmetov, On extremal distribution and shar L -bounds for sums of multilinear forms, Ann. Probab. 31 2003, 630 675. [15] J. Pitman, Some robabilistic asects of set artitions, Amer. Math. Monthly 104 1997, 201 209. [16] J. Riordan, An Introduction to Combinatorial Analysis, John Wiley, New York, NY, 1980. [17] H. Robbins, A remark on Stirling s formula, Amer. Math. Monthly 62 1955, 26 29. [18] H. Shachnai and T. Tassa, Aroximation ratios for stochastic list scheduling, rerint. 20