arXiv: v1 [math.PR] 12 Jun

The missing factor in Bennett's inequality

arXiv:1206.2592v1 [math.PR] 12 Jun 2012

Xiequan Fan, Ion Grama and Quansheng Liu

Université de Bretagne-Sud, LMBA, UMR CNRS 6205, Campus de Tohannic, 56017 Vannes, France
e-mail: fanxiequan@hotmail.com; ion.grama@univ-ubs.fr; quansheng.liu@univ-ubs.fr

Abstract: Let $S_n$ be a sum of independent centered random variables satisfying Bernstein's condition with parameter $\epsilon$, and let $\sigma^2$ be the variance of $S_n$. Bennett's inequality states that $\mathbb{P}(S_n > x\sigma) \le \exp\{-\hat{x}^2/2\}$, where $\hat{x} = \frac{2x}{1+\sqrt{1+2x\epsilon/\sigma}}$, $x \ge 0$. We give several bounds which improve this inequality, in the spirit of Talagrand's refinement of Hoeffding's inequality. In particular, we sharpen this inequality by adding a missing factor $F(x)$ decaying exponentially fast. The interesting feature of our bound is that it recovers closely the shape of the standard normal tail $1-\Phi(x)$ for all $x \ge 0$, in contrast to Bennett's bound which does not share this property. Also, compared with the classical Cramér large deviations, our inequality has the advantage that it is valid for all $x \ge 0$.

AMS subject classifications: Primary 60G50, 60F10; secondary 60F05.

Keywords and phrases: Sums of independent random variables, large deviations, exponential inequalities, asymptotic expansions, Bennett's inequality, tail probabilities.

1. Introduction

Let $\xi_1, \dots, \xi_n$ be a sequence of independent centered random variables satisfying Bernstein's condition
\[ |\mathbb{E}\xi_i^k| \le \frac{1}{2}\, k!\, \epsilon^{k-2}\, \mathbb{E}\xi_i^2, \quad \text{for } k \ge 3 \text{ and } i = 1, \dots, n, \tag{1.1} \]
for some constant $\epsilon > 0$. Denote
\[ S_n = \sum_{i=1}^n \xi_i \quad \text{and} \quad \sigma^2 = \sum_{i=1}^n \mathbb{E}\xi_i^2. \tag{1.2} \]
Starting from the seminal papers of Cramér [4] and Bernstein [3], the estimation of the tail probabilities $\mathbb{P}(S_n > x)$, for large $x > 0$, has attracted much attention. By employing the exponential Markov inequality and an upper bound for the moment generating function $\mathbb{E}e^{\lambda \xi_i}$, Bennett [1] obtained the following inequality (cf. (8a) of [1]): for all $x \ge 0$,
\[ \mathbb{P}(S_n > x\sigma) \le B\!\left(x, \frac{\epsilon}{\sigma}\right) := \exp\left\{-\frac{\hat{x}^2}{2}\right\}, \quad \text{where } \hat{x} = \frac{2x}{1+\sqrt{1+2x\epsilon/\sigma}}. \tag{1.3} \]
Various generalizations and improvements of inequality (1.3) can be found in Hoeffding [11], Statulevičius [20], Nagaev [12], Petrov [14], Talagrand [22], Dedecker and Prieur [5], Pinelis [16], Doukhan and Neumann [6], Rio [17], [18] and Fan, Grama and Liu [8]. Cramér's large deviation result (cf. (2.2) of Section 2) suggests that Bennett's inequality (1.3) can be substantially refined by adding a missing factor of order $\frac{1}{1+x}$. In the case where the summands $\xi_i$ are assumed to be bounded, results of such type have been obtained by Eaton [7], Pinelis [15], Talagrand [21] and Bentkus [2]. For example, using the conjugate measure technique of Cramér and Bernstein's method, Talagrand (cf. (1.6) of [21]) showed that if the variables $\xi_i$ satisfy $-b \le \xi_i \le 1$

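for some constant $b > 0$, then there exists a universal constant $K$ such that, for all $0 \le x \le \frac{\sigma}{Kb}$,
\begin{align}
\mathbb{P}(S_n > x\sigma) &\le \inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)} \left( M(x) + K\frac{b}{\sigma} \right) \tag{1.4} \\
&\le H_n(x,\sigma) \left( M(x) + K\frac{b}{\sigma} \right), \tag{1.5}
\end{align}
where
\[ H_n(x,\sigma) = \left\{ \left(\frac{\sigma}{x+\sigma}\right)^{x\sigma+\sigma^2} \left(\frac{n}{n-x\sigma}\right)^{n-x\sigma} \right\}^{\frac{n}{n+\sigma^2}} \]
and
\[ M(x) = \big(1-\Phi(x)\big)\exp\left\{\frac{x^2}{2}\right\}, \quad \text{with } \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt; \tag{1.6} \]
here $\sqrt{2\pi}\,M(x)$ is Mill's ratio. Since $M(x) = O\!\left(\frac{1}{1+x}\right)$, (1.5) improves on Hoeffding's bound $H_n(x,\sigma)$ (cf. (2.8) of [11]) by adding a factor of order $\frac{1}{1+x}$ for $0 \le x \le \frac{\sigma}{Kb}$.

The scope of this paper is to give several improvements of the Bennett inequality (1.3), in particular, by adding a missing factor in the spirit of Talagrand's inequalities (1.4) and (1.5). In addition to the fact that the $\xi_i$ are not assumed to be bounded, our bounds will be valid for any $x \ge 0$, unlike the bound (1.5) which holds true only in the range $0 \le x \le \frac{\sigma}{Kb}$. Our results will also imply Talagrand's bound (1.4) under the less restrictive condition (1.1).

Let us explain briefly our main results. Under Bernstein's condition, from Theorem 2.1, we obtain, for any $x \ge 0$,
\[ \mathbb{P}(S_n > x\sigma) \le \big(1-\Phi(\tilde{x})\big) \left[ 1 + 275.36\,(1+\tilde{x})\,\frac{\epsilon}{\sigma} \right], \tag{1.7} \]
where $\tilde{x} = \frac{2x}{1+\sqrt{1+4x\epsilon/\sigma}}$. The bound (1.7) has two advantages. First, if we compare (1.7) with Cramér's large deviation result (see (2.2) of Section 2), inequality (1.7) is valid for all $x \ge 0$, while Cramér's result holds only for $0 \le x = o(\sigma/\epsilon)$. Second, (1.7) recovers closely the shape of the normal tail $1-\Phi(x)$ when $x = o(\sigma/\epsilon)$ and $\epsilon/\sigma \to 0$, contrary to Bennett's bound which is close to $\exp\{-x^2/2\}$ (the exponential part of the normal tail). It is clear that (1.7) improves Bennett's bound $B(x, \frac{\epsilon}{\sigma})$ only for small $x$ (see the figures in Section 2).

A considerably sharper bound is obtained in our most important result, Theorem 2.3, which states that, for all $x \ge 0$,
\[ \mathbb{P}(S_n > x\sigma) \le B_n\!\left(x, \frac{\epsilon}{\sigma}\right) F_2\!\left(x, \frac{\epsilon}{\sigma}\right), \tag{1.8} \]
where $F_2(x, \frac{\epsilon}{\sigma}) \le 1$ for $x \ge 0$, $F_2(x, \frac{\epsilon}{\sigma}) = O\!\left(\frac{1}{1+x}\right)$ for $0 \le x = o(\sigma/\epsilon)$, and
\[ B_n\!\left(x, \frac{\epsilon}{\sigma}\right) = B\!\left(x, \frac{\epsilon}{\sigma}\right) \exp\left\{ -n\,\psi\!\left( \frac{\hat{x}^2}{2n\sqrt{1+2x\epsilon/\sigma}} \right) \right\}, \tag{1.9} \]
with $\psi(t) = t - \log(1+t)$, a nonnegative convex function in $t \ge 0$. The bound in (1.8) improves Bennett's bound $B(x, \frac{\epsilon}{\sigma})$ by the missing factor $F_2(x, \frac{\epsilon}{\sigma}) \exp\left\{ -n\psi\!\left( \frac{\hat{x}^2}{2n\sqrt{1+2x\epsilon/\sigma}} \right) \right\}$. The comparison between $B_n(x, \frac{\epsilon}{\sigma}) F_2(x, \frac{\epsilon}{\sigma})$ and Bennett's bound (1.3) is displayed in Figure 7, and the ratio between $B_n(x, \frac{\epsilon}{\sigma})$ and $B(x, \frac{\epsilon}{\sigma})$ is given in Figure 5.

A lower bound of the tail probability $\mathbb{P}(S_n > x\sigma)$ is obtained in Theorem 2.4, while in Theorem 2.5 we improve inequality (1.8) by the following one term expansion: for all $0 \le x \le 0.1\,\frac{\sigma}{\epsilon}$,
\[ \mathbb{P}(S_n > x\sigma) = \inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)} \left( M(x) + 88.41\,\theta\,\frac{\epsilon}{\sigma} \right), \tag{1.10} \]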

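where $|\theta| \le 1$. Note that equality (1.10) also improves Talagrand's inequality (1.4). Moreover, under Bernstein's condition (1.1), we have $\inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)} \le B_n(x, \frac{\epsilon}{\sigma})$. If the $\xi_i$ are bounded, $\xi_i \le 1$, it holds that $\inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)} \le H_n(x,\sigma)$.

Our approach uses the conjugate distribution technique due to Cramér, which is different from the method used in Bennett's original paper [1]. We refine the technique based on change of probability measure from Grama and Haeusler [10] and derive sharp bounds for the cumulant function to obtain precise upper bounds for tail probabilities under Bernstein's condition.

The paper is organized as follows. In Section 2, we present our main results. In Section 3, we state some auxiliary results to be used in the proofs of theorems. Sections 4, 5, 6 are devoted to the proofs of main results.

2. Main Results

All over the paper $\xi_1, \dots, \xi_n$ is a sequence of independent real random variables with $\mathbb{E}\xi_i = 0$ and satisfying Bernstein's condition (1.1); $S_n$ and $\sigma^2$ are defined by (1.2). We use the notations $a \wedge b = \min\{a,b\}$, $a \vee b = \max\{a,b\}$ and $a^+ = a \vee 0$.

Our first result is the following large deviation inequality valid for all $x \ge 0$.

Theorem 2.1. For any $\delta \in (0,1]$ and $x \ge 0$,
\[ \mathbb{P}(S_n > x\sigma) \le \big(1-\Phi(\tilde{x})\big) \left[ 1 + C_\delta\,(1+\tilde{x})\,\frac{\epsilon}{\sigma} \right], \tag{2.1} \]
where
\[ \tilde{x} = \frac{2x}{1+\sqrt{1+2(1+\delta)x\epsilon/\sigma}} \quad \text{and} \quad C_\delta = \left( \frac{62.493}{\delta} + 212.83 \right) \vee \frac{180.48}{\delta}. \]
Moreover, $C_1 \le 275.36$ and $\tilde{x} = x\left( 1 - \frac{1+\delta}{2}\,\frac{x\epsilon}{\sigma} + o\!\left(\frac{x\epsilon}{\sigma}\right) \right)$ as $\frac{x\epsilon}{\sigma} \to 0$.

The interesting feature of the bound (2.1) is that it recovers closely the shape of the standard normal tail when $x$ is moderate and $r = \frac{\epsilon}{\sigma}$ becomes small, which is not the case of Bennett's bound $B(x, \frac{\epsilon}{\sigma})$ (see Figure 1). Our result can also be compared with Cramér's large deviation result in the i.i.d. case: under Cramér's condition that $\mathbb{E}e^{\delta|\xi_1|} < \infty$ for some $\delta > 0$,
\[ \frac{\mathbb{P}(S_n > x\sigma)}{1-\Phi(x)} = \exp\left\{ \frac{x^3}{\sqrt{n}}\,\lambda\!\left(\frac{x}{\sqrt{n}}\right) \right\} \left[ 1 + O\!\left(\frac{1+x}{\sqrt{n}}\right) \right], \tag{2.2} \]
where $\lambda(\cdot)$ is the Cramér series [13] and $0 \le x = o(\sqrt{n})$ (cf. Cramér [4] or Petrov [13]). Note that in this case Cramér's condition is equivalent to Bernstein's condition (1.1), and $\frac{\epsilon}{\sigma} = O\!\left(\frac{1}{\sqrt{n}}\right)$ as $n \to \infty$. With respect to Cramér's result, the advantage of (2.1) is that it is valid for all $x \ge 0$.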
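The comparison between bound (2.1) and Bennett's bound (1.3) can be explored numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, `r` stands for $\epsilon/\sigma$, and the default constant `c_delta=275.36` is an assumption taken from the reading $C_1 \le 275.36$ of Theorem 2.1 with $\delta = 1$.

```python
import math

def normal_tail(t: float) -> float:
    """1 - Phi(t) for the standard normal law, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def x_hat(x: float, r: float) -> float:
    """Bennett's transformed argument 2x / (1 + sqrt(1 + 2 x r)), with r = eps/sigma."""
    return 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * x * r))

def x_tilde(x: float, r: float, delta: float = 1.0) -> float:
    """Argument of Theorem 2.1: 2x / (1 + sqrt(1 + 2 (1 + delta) x r))."""
    return 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * (1.0 + delta) * x * r))

def bennett_bound(x: float, r: float) -> float:
    """Bennett's bound B(x, r) = exp(-x_hat^2 / 2) from (1.3)."""
    return math.exp(-0.5 * x_hat(x, r) ** 2)

def theorem21_bound(x: float, r: float, delta: float = 1.0,
                    c_delta: float = 275.36) -> float:
    """Right-hand side of (2.1); c_delta is an assumed value of the constant C_delta."""
    xt = x_tilde(x, r, delta)
    return normal_tail(xt) * (1.0 + c_delta * (1.0 + xt) * r)

if __name__ == "__main__":
    r = 1e-4  # a small value of eps/sigma, chosen only for illustration
    for x in (0.5, 1.0, 2.0, 3.0, 4.0):
        ratio = theorem21_bound(x, r) / bennett_bound(x, r)
        print(f"x={x:3.1f}  normal tail={normal_tail(x):.3e}  ratio (2.1)/B={ratio:.3f}")
```

For small `r` the printed ratio stays below 1 at moderate `x`, which matches the qualitative claim that (2.1) follows the normal tail where Bennett's bound only captures its exponential part.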
The numerical comparison between the bound.) with δ and Bennett s bound Bx, ) is given in Figure, and shows that for r. the ratio of the right-hand side of.) to Bx, ) is less than for small x satisfying Bx, ) 8.7 56. An improvement of Bennett s bound.3) for large x can be obtained from Theorem. formulated below. Theorem.. For all x, [ PS n > x) Φ x)) +A x, ).3) ] B x, )F x, ),.4)

Fig 1. We display Bennett's bound $B(x,r)$ and bound (2.1) with $\delta = 1$ as functions of $x$ for a fixed value of $r$. (Panel title: Comparison of various bounds; vertical axis: probabilities; curves: $B(x,r)$, bound (2.1), $1-\Phi(x)$.)

Fig 2. Ratio of bound (2.1) with $\delta = 1$ to $B(x,r)$ as a function of $x$ for various values of $r$.

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 5 where x is defined in.3), A x, ) ) x +84.9 +x/ M x).5) and F x, ) +Ax,/)/..6) π+ x) ) Moreover, F x, +o) π+x) when x o ),, and F x, ) + Mˆx) for all x. The advantage of Theorem. is that in the normal distribution function Φx) we have the expression x instead of the smaller term x figuring in Theorem., which represents a significant improvement. Inequality.3) improves Bennett s inequality.3) by the factor F x, ) of order +o) π+x) for x o ), which, following Talagrand [], we call missing factor. The numerical results displayed in Figures 3 and 4 show that bound.3) performs better than bound.) and significantly better than Bennett s bound Bx, ) especially for small r. A further significant improvement of Bennett s inequality.3) for all x is given by the following theorem: we replace the bound B x, ) by the following smaller one: B n x, ) { )} B x, ) x exp nψ n, +x/ where ψt) t log+t) is a nonnegative convex function in t. Theorem.3. For all x, PS n > x) B n x, ) F x, ),.7) where F x, ) Mx)+7.99R x ),.8) ) Rt) { t+6t ) 3 3t) 3/ t) 7, if t < 3,, if t 3,.9) being an increasing function.moreover, for all x α with α < 3, we have Rx ) Rα). If α., we have 7.99Rα) 88.4. To highlight the improvement of Theorem.3 over Bennett s bound, we note that B n x, ) Bx, ) and ) B n x, { B ) x, exp } 3/x ) / +o)), x,.) and we display the ratio of B n x,r) to Bx,r) in Figure 5 for various r n. The second improvement in the right-hand side of.7) comes from the missing factor F x, ), which is of

Fig 3. We display Bennett's bound $B(x,r)$, bounds (2.1) with $\delta = 1$ and (2.3) as functions of $x$ for a fixed value of $r$. (Panel title: Comparison of various bounds; vertical axis: probabilities; curves: $B(x,r)$, bound (2.1), bound (2.3), $1-\Phi(x)$.)

Fig 4. Ratio of bound (2.3) to $B(x,r)$ as a function of $x$ for various values of $r$.

Fig 5. Ratio of $B_n(x,r)$ to $B(x,r)$ as a function of $x$ for various values of $r$.

Fig 6. The missing factor $F_2(x,r)$ is displayed as a function of $x$ for various values of $r$.

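order $\frac{1}{1+x}$, for moderate values of $x$ satisfying $0 \le x < 0.1\,\frac{\sigma}{\epsilon}$. The numerical values of the missing factor $F_2(x, \frac{\epsilon}{\sigma})$ are displayed in Figure 6. Our numerical results confirm that the bound $B_n(x, \frac{\epsilon}{\sigma}) F_2(x, \frac{\epsilon}{\sigma})$ in (2.7) is significantly better than Bennett's bound $B(x, \frac{\epsilon}{\sigma})$ for all $x$. For comparison, we display the ratios of $B_n(x,r)F_2(x,r)$ to $B(x,r)$ in Figure 7 for various $r$.

Fig 7. Ratio of $B_n(x,r)F_2(x,r)$ to $B(x,r)$ as a function of $x$ for various values of $r$.

The following corollary improves inequality (2.1) of Theorem 2.1 in the range $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha < \frac{1}{3}$.

Corollary 2.1. For all $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha < \frac{1}{3}$,
\[ \mathbb{P}(S_n > x\sigma) \le \big(1-\Phi(\hat{x})\big) \left[ 1 + 70.17\,R(\alpha)\,(1+\hat{x})\,\frac{\epsilon}{\sigma} \right], \tag{2.11} \]
where $\hat{x}$ is defined in (1.3) and $R(t)$ by (2.9). In particular, if $\alpha = 0.1$ we have $70.17\,R(\alpha) \le 221.63$.

For the lower bound of tail probabilities $\mathbb{P}(S_n > x\sigma)$, we have the following result, which complements Corollary 2.1.

Theorem 2.4. For all $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha \le \frac{1}{9.6}$,
\[ \mathbb{P}(S_n > x\sigma) \ge \big(1-\Phi(\check{x})\big) \left[ 1 - c_\alpha\,(1+\check{x})\,\frac{\epsilon}{\sigma} \right], \]
where $\check{x} = \frac{\bar{\lambda}\sigma}{(1-\bar{\lambda}\epsilon)^3}$ with $\bar{\lambda} = \frac{2x/\sigma}{1+\sqrt{1-9.6x\epsilon/\sigma}}$, and
\[ c_\alpha = 67.38\,R\!\left( \frac{2\alpha}{1+\sqrt{1-9.6\alpha}} \right) \le 1753.3. \]
Moreover, $\check{x} = x\left( 1 + 5.4\,\frac{x\epsilon}{\sigma} + o\!\left(\frac{x\epsilon}{\sigma}\right) \right)$ as $\frac{x\epsilon}{\sigma} \to 0$. In particular, if $\alpha = 0.1$ we have $c_\alpha \le 67.38\,R\!\left(\frac{1}{6}\right) \le 682.89$.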
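The numerical constants quoted in Corollary 2.1 and Theorem 2.4 can be cross-checked against each other. The sketch below is only a sanity check under an assumption: it reads (2.9) as $R(t) = (1-t+6t^2)^3 / \big((1-3t)^{3/2}(1-t)^7\big)$ for $0 \le t < 1/3$ and $R(t) = \infty$ otherwise, and it reads the leading constants as 27.99, 70.17, 26.88 and 67.38. The helper names are hypothetical.

```python
import math

def R(t: float) -> float:
    """Assumed form of the function R(t) in (2.9); infinite for t >= 1/3."""
    if t >= 1.0 / 3.0:
        return math.inf
    return (1.0 - t + 6.0 * t * t) ** 3 / ((1.0 - 3.0 * t) ** 1.5 * (1.0 - t) ** 7)

def mills_M(x: float) -> float:
    """M(x) = (1 - Phi(x)) exp(x^2 / 2), as in (1.6); fine for moderate x."""
    return math.exp(0.5 * x * x) * 0.5 * math.erfc(x / math.sqrt(2.0))

SQRT_2PI = math.sqrt(2.0 * math.pi)

# Argument of R in the lower-bound constant c_alpha for alpha = 0.1 (equals 1/6).
ARG_ALPHA_01 = 2.0 * 0.1 / (1.0 + math.sqrt(1.0 - 9.6 * 0.1))

if __name__ == "__main__":
    print("27.99 * R(0.1)     =", 27.99 * R(0.1))
    print("70.17 * R(0.1)     =", 70.17 * R(0.1))
    print("27.99 * sqrt(2 pi) =", 27.99 * SQRT_2PI)
    print("26.88 * sqrt(2 pi) =", 26.88 * SQRT_2PI)
    print("67.38 * R(1/6)     =", 67.38 * R(ARG_ALPHA_01))
    print("67.38 * R(1/4.8)   =", 67.38 * R(1.0 / 4.8))
```

Under these readings, the constant 70.17 is what one gets from 27.99 after dividing by the lower bound $M(t) \ge \frac{1}{\sqrt{2\pi}(1+t)}$, and 67.38 arises from 26.88 in the same way.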

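Combining Corollary 2.1 and Theorem 2.4, we obtain, for all $0 \le x \le 0.1\,\frac{\sigma}{\epsilon}$,
\[ \mathbb{P}(S_n > x\sigma) = \left( 1 - \Phi\!\left( x\left(1 + \theta_1 c_1 \frac{x\epsilon}{\sigma}\right) \right) \right) \left[ 1 + \theta_2 c_2 (1+x) \frac{\epsilon}{\sigma} \right], \tag{2.12} \]
where $c_1, c_2 > 0$ are some absolute constants and $|\theta_1|, |\theta_2| \le 1$.

To close this section, we give an improvement of Talagrand's inequality (1.4).

Theorem 2.5. For all $0 \le x < \frac{1}{3}\frac{\sigma}{\epsilon}$,
\[ \mathbb{P}(S_n > x\sigma) = \inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)}\, F_3\!\left(x, \frac{\epsilon}{\sigma}\right), \tag{2.13} \]
where
\[ F_3\!\left(x, \frac{\epsilon}{\sigma}\right) = M(x) + 27.99\,\theta\,R\!\left(\frac{x\epsilon}{\sigma}\right) \frac{\epsilon}{\sigma}, \tag{2.14} \]
$R(t)$ is defined by (2.9) and $|\theta| \le 1$. Moreover, $\inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)} \le B_n(x, \frac{\epsilon}{\sigma}) \le B(x, \frac{\epsilon}{\sigma})$. In addition, for $0 \le x \le 0.1\,\frac{\sigma}{\epsilon}$, we have $27.99\,R\!\left(\frac{x\epsilon}{\sigma}\right) \le 88.41$.

It is clear that our equality (2.13) implies Talagrand's inequality (1.4) with an information on Talagrand's constant $K$, under a less restrictive condition (Talagrand supposed that the $\xi_i$ are bounded: $-b \le \xi_i \le 1$). Notice that (2.13) can be written in the following form: for $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ and $\alpha \in [0, \frac{1}{3})$,
\[ \frac{\mathbb{P}(S_n > x\sigma)}{M(x)\,\inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)}} = 1 + 27.99\,\theta_1 \frac{R(\alpha)}{M(x)} \frac{\epsilon}{\sigma} = 1 + 70.17\,\theta_2\,R(\alpha)\,(1+x)\,\frac{\epsilon}{\sigma}, \tag{2.15} \]
where $|\theta_1|, |\theta_2| \le 1$ and the last step holds since
\[ \frac{1}{\sqrt{2\pi}\,(1+t)} \le M(t) \le \frac{1}{\sqrt{\pi}\,(1+t)}, \quad t \ge 0, \tag{2.16} \]
(see Feller [9]). Equality (2.15) implies that the relative error between $\mathbb{P}(S_n > x\sigma)$ and $M(x)\inf_{\lambda \ge 0} \mathbb{E}e^{\lambda(S_n - x\sigma)}$ converges to 0 uniformly in the range $0 \le x = o(\sigma/\epsilon)$ as $\epsilon/\sigma \to 0$.

3. Auxiliary results

We consider the positive random variable
\[ Z_n(\lambda) = \prod_{i=1}^n \frac{e^{\lambda\xi_i}}{\mathbb{E}e^{\lambda\xi_i}}, \quad 0 \le \lambda < \epsilon^{-1}, \]
(the Esscher transformation) so that $\mathbb{E}Z_n(\lambda) = 1$. We introduce the conjugate probability measure $\mathbb{P}_\lambda$ defined by
\[ d\mathbb{P}_\lambda = Z_n(\lambda)\,d\mathbb{P}. \tag{3.1} \]
Denote by $\mathbb{E}_\lambda$ the expectation with respect to $\mathbb{P}_\lambda$. Then for any positive and measurable function $f$,
\[ \mathbb{E}_\lambda f(\xi_i) = \frac{\mathbb{E}f(\xi_i)e^{\lambda\xi_i}}{\mathbb{E}e^{\lambda\xi_i}}, \quad i = 1, \dots, n. \]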
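The change of measure (3.1) is easy to see on a single variable with finitely many values. The sketch below is an illustration only: the two-point law, the value `lam = 0.3` and the helper names are assumptions chosen for the example, not objects from the paper.

```python
import math

def esscher_tilt(values, probs, lam):
    """Return the tilted probabilities p_j * exp(lam * v_j) / E exp(lam * xi) and the mgf."""
    weights = [p * math.exp(lam * v) for v, p in zip(values, probs)]
    mgf = sum(weights)  # E exp(lam * xi)
    return [w / mgf for w in weights], mgf

def mean(values, probs, f=lambda v: v):
    """Expectation of f(xi) when xi takes values[j] with probability probs[j]."""
    return sum(p * f(v) for v, p in zip(values, probs))

# A centered two-point variable: xi = -1 with probability 2/3, xi = 2 with probability 1/3.
values = [-1.0, 2.0]
probs = [2.0 / 3.0, 1.0 / 3.0]
lam = 0.3

tilted, mgf = esscher_tilt(values, probs, lam)
b_lam = mean(values, tilted)  # E_lambda xi, the drift of xi under the conjugate measure

if __name__ == "__main__":
    print("E exp(lam*xi) =", mgf)
    print("tilted law    =", tilted)
    print("E_lambda xi   =", b_lam)
```

Under the tilted law the variable is no longer centered: its mean moves to a positive value, which is what lets the proofs place the threshold $x\sigma$ near the mean of $S_n$ under $\mathbb{P}_\lambda$.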

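Setting
\[ b_i(\lambda) = \mathbb{E}_\lambda \xi_i = \frac{\mathbb{E}\xi_i e^{\lambda\xi_i}}{\mathbb{E}e^{\lambda\xi_i}}, \quad i = 1, \dots, n, \]
and
\[ \eta_i(\lambda) = \xi_i - b_i(\lambda), \quad i = 1, \dots, n, \]
we obtain the following decomposition:
\[ S_k = B_k(\lambda) + Y_k(\lambda), \quad k = 1, \dots, n, \tag{3.2} \]
where $S_k = \sum_{i=1}^k \xi_i$, $B_k(\lambda) = \sum_{i=1}^k b_i(\lambda)$ and $Y_k(\lambda) = \sum_{i=1}^k \eta_i(\lambda)$.

In the following, we give some lower and upper bounds of $B_n(\lambda)$, which will be used in the proofs of theorems.

Lemma 3.1. For all $0 \le \lambda < \epsilon^{-1}$,
\[ (1-2.4\lambda\epsilon)\lambda\sigma^2 \le \frac{(1-1.5\lambda\epsilon)(1-\lambda\epsilon)}{1-\lambda\epsilon+6\lambda^2\epsilon^2}\,\lambda\sigma^2 \le B_n(\lambda) \le \frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda\sigma^2. \]

Proof. Since $\mathbb{E}\xi_i = 0$, by Jensen's inequality, we have $\mathbb{E}e^{\lambda\xi_i} \ge 1$. Noting that
\[ \mathbb{E}\xi_i e^{\lambda\xi_i} = \mathbb{E}\xi_i(e^{\lambda\xi_i} - 1) \ge 0, \quad \lambda \ge 0, \]
by Taylor's expansion of $e^x$, we get
\[ B_n(\lambda) \le \sum_{i=1}^n \mathbb{E}\xi_i e^{\lambda\xi_i} = \lambda\sigma^2 + \sum_{i=1}^n \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\,\mathbb{E}\xi_i^{k+1} \le \lambda\sigma^2 + \sum_{i=1}^n \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\,|\mathbb{E}\xi_i^{k+1}|. \tag{3.3} \]
Using Bernstein's condition (1.1), we obtain, for all $0 \le \lambda < \epsilon^{-1}$,
\[ \sum_{i=1}^n \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\,|\mathbb{E}\xi_i^{k+1}| \le \frac{1}{2}\lambda^2\sigma^2\epsilon \sum_{k=2}^{+\infty} (k+1)(\lambda\epsilon)^{k-2} = \frac{3-2\lambda\epsilon}{2(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2\epsilon. \tag{3.4} \]
Combining (3.3) and (3.4), we get the desired upper bound of $B_n(\lambda)$: for all $0 \le \lambda < \epsilon^{-1}$,
\[ B_n(\lambda) \le \lambda\sigma^2 + \frac{3-2\lambda\epsilon}{2(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2\epsilon = \frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda\sigma^2. \]
By Jensen's inequality and Bernstein's condition (1.1),
\[ (\mathbb{E}\xi_i^2)^2 \le \mathbb{E}\xi_i^4 \le 12\epsilon^2\,\mathbb{E}\xi_i^2, \]
from which we get $\mathbb{E}\xi_i^2 \le 12\epsilon^2$.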

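Using again Bernstein's condition (1.1), we have, for all $0 \le \lambda < \epsilon^{-1}$,
\[ \mathbb{E}e^{\lambda\xi_i} \le 1 + \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\,|\mathbb{E}\xi_i^k| \le 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)} \le 1 + \frac{6\lambda^2\epsilon^2}{1-\lambda\epsilon} = \frac{1-\lambda\epsilon+6\lambda^2\epsilon^2}{1-\lambda\epsilon}. \tag{3.5} \]
Notice that $g(t) = e^t - (1+t+\frac{1}{2}t^2)$ satisfies $g(t) > 0$ if $t > 0$, and $g(t) < 0$ if $t < 0$. So $t\,g(t) \ge 0$ for all $t \in \mathbb{R}$. That is, $te^t \ge t(1+t+\frac{1}{2}t^2)$ for all $t \in \mathbb{R}$. Therefore, $\xi_i e^{\lambda\xi_i} \ge \xi_i(1+\lambda\xi_i+\frac{\lambda^2}{2}\xi_i^2)$. Taking expectation, we get
\[ \mathbb{E}\xi_i e^{\lambda\xi_i} \ge \lambda\,\mathbb{E}\xi_i^2 + \frac{\lambda^2}{2}\,\mathbb{E}\xi_i^3 \ge \lambda\,\mathbb{E}\xi_i^2 - \frac{\lambda^2}{2}\cdot\frac{1}{2}\,3!\,\epsilon\,\mathbb{E}\xi_i^2 = (1-1.5\lambda\epsilon)\lambda\,\mathbb{E}\xi_i^2, \]
from which it follows that
\[ \sum_{i=1}^n \mathbb{E}\xi_i e^{\lambda\xi_i} \ge (1-1.5\lambda\epsilon)\lambda\sigma^2. \tag{3.6} \]
Combining (3.5) and (3.6), we obtain the following lower bound of $B_n(\lambda)$: for all $0 \le \lambda < \epsilon^{-1}$,
\[ B_n(\lambda) = \sum_{i=1}^n \frac{\mathbb{E}\xi_i e^{\lambda\xi_i}}{\mathbb{E}e^{\lambda\xi_i}} \ge \frac{(1-1.5\lambda\epsilon)(1-\lambda\epsilon)}{1-\lambda\epsilon+6\lambda^2\epsilon^2}\,\lambda\sigma^2 \ge (1-2.4\lambda\epsilon)\lambda\sigma^2. \tag{3.7} \]
This completes the proof of Lemma 3.1.

We now consider the following cumulant function
\[ \Psi_n(\lambda) = \sum_{i=1}^n \log \mathbb{E}e^{\lambda\xi_i}, \quad 0 \le \lambda < \epsilon^{-1}. \tag{3.8} \]
We have the following elementary bound for $\Psi_n(\lambda)$.

Lemma 3.2. For all $0 \le \lambda < \epsilon^{-1}$,
\[ \Psi_n(\lambda) \le n\log\left( 1 + \frac{\lambda^2\sigma^2}{2n(1-\lambda\epsilon)} \right) \le \frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)} \]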

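and
\[ -\lambda B_n(\lambda) + \Psi_n(\lambda) \ge -\frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)^6}. \]

Proof. By Bernstein's condition (1.1), it is easy to see that, for all $0 \le \lambda < \epsilon^{-1}$,
\[ \mathbb{E}e^{\lambda\xi_i} \le 1 + \sum_{k=2}^{+\infty} \frac{\lambda^k}{k!}\,|\mathbb{E}\xi_i^k| \le 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2} \sum_{k=2}^{+\infty} (\lambda\epsilon)^{k-2} = 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)}. \]
Then, we have
\[ \Psi_n(\lambda) \le \sum_{i=1}^n \log\left( 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)} \right) = n\log\left( \left\{ \prod_{i=1}^n \left( 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)} \right) \right\}^{1/n} \right). \tag{3.9} \]
Since the geometric mean does not exceed the arithmetic mean, we get
\[ \left\{ \prod_{i=1}^n \left( 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)} \right) \right\}^{1/n} \le \frac{1}{n} \sum_{i=1}^n \left( 1 + \frac{\lambda^2\,\mathbb{E}\xi_i^2}{2(1-\lambda\epsilon)} \right) = 1 + \frac{\lambda^2\sigma^2}{2n(1-\lambda\epsilon)}. \tag{3.10} \]
Using (3.10) and the inequality $\log(1+t) \le t$, $t \ge 0$, we obtain the first assertion of the lemma.

Since $\Psi_n(0) = 0$ and $\Psi_n'(\lambda) = B_n(\lambda)$, by Lemma 3.1, for all $0 \le \lambda < \epsilon^{-1}$,
\[ \Psi_n(\lambda) = \int_0^\lambda B_n(t)\,dt \ge \int_0^\lambda t(1-2.4t\epsilon)\sigma^2\,dt = \frac{\lambda^2\sigma^2}{2}(1-1.6\lambda\epsilon). \]
Therefore, using again Lemma 3.1, we see that
\[ -\lambda B_n(\lambda) + \Psi_n(\lambda) \ge -\frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2 + \frac{\lambda^2\sigma^2}{2}(1-1.6\lambda\epsilon) \ge -\frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)^6}, \]
which completes the proof of the second assertion of the lemma.

Denote $\bar{\sigma}^2(\lambda) = \mathbb{E}_\lambda Y_n^2(\lambda)$. By the relation between $\mathbb{E}$ and $\mathbb{E}_\lambda$, we have
\[ \bar{\sigma}^2(\lambda) = \sum_{i=1}^n \left( \frac{\mathbb{E}\xi_i^2 e^{\lambda\xi_i}}{\mathbb{E}e^{\lambda\xi_i}} - \frac{(\mathbb{E}\xi_i e^{\lambda\xi_i})^2}{(\mathbb{E}e^{\lambda\xi_i})^2} \right), \quad 0 \le \lambda < \epsilon^{-1}. \]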
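The bounds of Lemmas 3.1 and 3.2 can be checked on a case where everything is explicit. The sketch below uses i.i.d. Rademacher signs $\xi_i = \pm 1$, for which $B_n(\lambda) = n\tanh\lambda$, $\Psi_n(\lambda) = n\log\cosh\lambda$ and $\sigma^2 = n$. The choice $\epsilon = 1/\sqrt{12}$ is an assumption made for this example (it is the smallest value for which the $k = 4$ case of (1.1) holds), the lower bound of Lemma 3.1 is read with the factor $(1-1.5\lambda\epsilon)$, and the function names are hypothetical.

```python
import math

EPS = 1.0 / math.sqrt(12.0)  # Bernstein parameter assumed for Rademacher signs

def bernstein_holds(eps: float, k_max: int = 40) -> bool:
    """Check |E xi^k| <= (1/2) k! eps^(k-2) E xi^2 for k = 3..k_max, xi = +-1."""
    for k in range(3, k_max + 1):
        moment = 1.0 if k % 2 == 0 else 0.0  # E xi^k for a Rademacher sign
        if moment > 0.5 * math.factorial(k) * eps ** (k - 2) * (1.0 + 1e-12):
            return False
    return True

def check_lemmas(n: int = 50, steps: int = 99) -> bool:
    """Verify the bounds of Lemmas 3.1 and 3.2 on a grid of u = lambda * eps in (0, 1)."""
    sigma2 = float(n)  # sigma^2 = n * E xi^2 = n
    tol = 1e-9
    for j in range(1, steps + 1):
        u = j / (steps + 1.0)
        lam = u / EPS
        b_n = n * math.tanh(lam)               # B_n(lambda)
        psi_n = n * math.log(math.cosh(lam))   # Psi_n(lambda)
        low1 = (1 - 2.4 * u) * lam * sigma2
        low2 = (1 - 1.5 * u) * (1 - u) / (1 - u + 6 * u * u) * lam * sigma2
        up = (1 - 0.5 * u) / (1 - u) ** 2 * lam * sigma2
        q = lam * lam * sigma2 / (2 * (1 - u))
        ok = (low1 <= low2 + tol and low2 <= b_n + tol and b_n <= up + tol
              and psi_n <= n * math.log1p(q / n) + tol
              and n * math.log1p(q / n) <= q + tol
              and -lam * b_n + psi_n >= -lam * lam * sigma2 / (2 * (1 - u) ** 6) - tol)
        if not ok:
            return False
    return True

if __name__ == "__main__":
    print("Bernstein condition with eps = 1/sqrt(12):", bernstein_holds(EPS))
    print("Lemma 3.1 and 3.2 bounds hold on the grid:", check_lemmas())
```

A grid check of this kind proves nothing, but it is a cheap way to catch a misread constant or exponent in the displayed bounds.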

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 3 Lemma 3.3. For all λ <, λ) 3λ) λ+6λ ) λ) λ) 3. 3.) Proof. Denote fλ) Eξ i eλξi Ee λξi Eξ i e λξi ). Then, Thus, f ) Eξ 3 i and f λ) Eξ 4 i eλξi Ee λξi Eξ i eλξi ). fλ) f)+f )λ Eξ i +λeξ3 i. 3.) Using 3.), 3.5) and Bernstein s condition.), we have, for all λ <, Therefore E λ η i Eξ i eλξi Ee λξi Eξ i e λξi ) Ee λξi ) Eξ i +λeξ3 i Ee λξi ) λ λ+6λ λ) 3λ) λ+6λ ) Eξ i. ) Eξ i +λeξ 3 i) λ) λ) 3λ) λ+6λ ). Using Taylor s expansion of e x and Bernstein s condition.) again, we obtain λ) This completes the proof of Lemma 3.3. Eξi eλξi λ) 3. For the random variable Y n λ) with λ <, we have the following result on the rate of convergence to the standard normal law. Lemma 3.4. For all λ <, ) sup P Yn λ) λ λ) y Φy) 3.44 3 λ) λ) 4. y R Proof. Since Y n λ) n η iλ) is the sum of independent and centered respect to P λ ) random variables η i λ), by the rate of convergence in the central limit theorem cf. e.g. Petrov [3], p. 5) we get, for λ <, sup P λ y R ) Yn λ) λ) y Φy) C 3 λ) E λ η i 3,

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 4 where C > is an absolute constant. For λ <, using Bernstein s condition, we have As E λ η i 3 4 E λ ξ i 3 +E λ ξ i ) 3 ) 8 8 E λ ξ i 3 E ξ i 3 exp{ λξ i } 8 E 4 j λ j j! ξ i 3+j j +3)j +)j +)λ) j. j j +3)j +)j +)x j d3 dx 3 j we obtain, for λ <, x j j E λ η i 3 4 λ) 4. Therefore, we have, for λ <, ) sup P Yn λ) λ λ) y Φy) 4C y R 6 x) 4, x <, 3 λ) λ) 4 3.44 3 λ) λ) 4, where the last step holds as C.56 cf. Shevtsova [9]). This completes the proof of Lemma 3.4. Using Lemma 3.4, we easily obtain the following lemma. Lemma 3.5. For all λ., sup P λ Y n λ) y ) Φy) λ.7λ+4.45. y R Proof. Using Lemma 3.3, we have, for all λ < 3, λ λ) λ) λ+6λ λ) 3λ. 3.3)

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 5 It is easy to see that P λ Y n λ) y ) Φy) λ ) P Yn λ) λ λ) y y Φ λ) λ) λ) λ)) ) + Φ y Φy) λ) λ) : I +I. By Lemma 3.4 and 3.3), we get, for all λ < 3, I 3.44 3 λ) λ) 4 3.44Rλ). Using Taylor s expansion and 3.3), we obtain, for all λ < 3, I ye y λ) π λ) λ) ye y λ) λ+6λ π λ) 3λ ) λ λ+6λ eπ λ) λ) 3λ. By simple calculations, we obtain, for all λ., P λ Y n λ) y ) Φy) λ.7λ+4.45. This completes the proof of Lemma 3.5. 4. Proofs of Theorems.-.3 In this section, we give upper bounds for PS n > x). For all x and λ <, by 3.) and 3.), we have: PS n > x) E λ Z n λ) {Sn>x} E λ e λsn+ψnλ) {Sn>x} Setting U n λ) λy n λ)+b n λ) x), we get E λ e λbnλ)+ψnλ) λynλ) {Ynλ)+B nλ) x>}. 4.) PS n > x) e λx+ψnλ) E λ e Unλ) {Unλ)>}. Since, by Fubini s theorem, for any real random variable U, Ee U {U>} we deduce, for all x and λ <, e t P < U t)dt, PS n > x) e λx+ψnλ) e t P λ < U n λ) t)dt. 4.) In the following N, ) denotes a standard normal random variable.

4.. Proof of Theorem. X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 6 From 4.), using Lemma 3., we obtain, for all x and λ <, PS n > x) e λx+ λ λ) e t P λ < U n λ) t)dt. 4.3) For any x and β [,.5), let λ λx) [, ) be the unique solution of the equation This definition and Lemma 3. imply that λ βλ λ) x. λ x/ +x/+ +4 β)x/ and B n λ) x. 4.4) Using 4.3) with λ λ, we get where PS n > x) e + β)λ) x e t P λ < U n λ) t)dt, 4.5) x λ λ. By 4.4) and Lemma 3.5, we have, for λ., e t P λ < U n λ) t)dt e y x P λ < Un λ) y x ) xdy e y x P < N,) y) xdy +.7λ+4.45 ) e y x dφy)+.4λ+84.9 M x)+.4λ+84.9. 4.6) Since } e t P λ < U n λ) t)dt and Φt) { exp t π+t) cf..6)), combining 4.5) and 4.6), we deduce, for all x, PS n > x) e β)λ x x [ {λ>.} +e β)λ x Φ x)+e x.4λ+84.9 )] {λ.} Φ x))i +I ), 4.7) with I exp { } [ ] β)λ x π+ x) {λ>.} 4.8)

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 7 and [ I e β)λ x + π+ x).4λ+84.9 )] {λ.}. Now we shall give estimates for I and I. If λ >., then I and I exp {. β) x } [ π+ x) ]. 4.9) By a simple calculation, I provided that x 8 8 β note that β [,.5)). For x < we get λ x λ) < 8 7. β.) β. Then, using λ >, we obtain β, I + π+ x) + π+ x)λ < + 7 π β + x) + 8.48 β + x). If λ., we have I. Since + π+ x).4λ+84.9 ) +.4 ) π+ x)λ +84.9 π+ x) ) J J, it follows that I exp { β)λ x} J J. Using the inequality +x e x, we deduce { β) x )} I exp λ.4 π+ x) J. If x.65 β, we see that β) x.4 π+ x), so I J. For x <.65 β, we get λ x λ) <.65 β. Then I + π+ x).4λ+84.9 ) + π+ x).4λ+84.9 ) < + π+ x) + ) β +84.9 ) + x)..4.65 6.493 β +.83 Hence, whenever λ <, we have ) 6.493 I +I + β +.83 8.48 ) + x) β. 4.) Therefore substituting λ from 4.4) in the expression of x λ λ obtain, from 4.7) and 4.), inequality.) in Theorem.. and replacing β by δ, we

4.. Proof of Theorem. X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 8 For any x, let λ λx) [, ) be the unique solution of the equation By Lemma 3., it follows that Employing 4.3) with λ λ, we get where x/ λ +x/+ +x/ PS n > x) exp λ.5λ λ) x. 4.) and B n λ) x. 4.) { ˆx } e t P λ < U n λ) t)dt, 4.3) ˆx λ λ. Using Lemma 3.5 and B n λ) x cf. 4.)), we have M ˆx)+ e t P λ < U n λ) t)dt e yˆx P λ < Un λ) yˆx )ˆxdy e yˆx P < N,) y)ˆxdy +.7λ+4.45 ) { λ. } + {λ>. } e yˆx dφy)+ λ+84.9 ) λ+84.9 Combining 4.3) and 4.4), we obtain, for all x, PS n > x) Φˆx)+exp Φˆx)) ). 4.4) { ˆx [ + M ˆx) } λ+84.9 λ+84.9 ) ) ) ) ]. Substituting λ from 4.) in the expression of ˆx λ, we get, for all x, λ [ PS n > x) Φ x)) +A x, ), 4.5) ] where x x + +x/ and A x, ) This completes the proof of Theorem.. ) x +84.9 +x/. M x)

4.3. Proof of Theorem.3 X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality 9 Let λ be defined by 4.). Using Lemma 3.4 and B n λ) x, we have, for all λ <, e t P λ < U n λ) t)dt e yλλ) P λ < Un λ) yλλ) ) λλ)dy e yλλ) P < N,) y)λλ)dy + 3.44 3 λ) λ) 4 e yλλ) dφy)+6.88 3 λ) λ) 4 M λλ) ) +6.88 3 λ) λ) 4. 4.6) Using λ λ and e t P λ < U n λ) t)dt, from 4.) and 4.6), we obtain PS n > x) exp { λx +Ψ n λ) } 4.7) [ M λλ) ) ) ] +6.88. 3 λ) λ) 4 By Lemma 3., inequality 4.7) implies that { λ )} PS n > x) exp λx +nlog + n λ) [ M λλ) ) ) ] +6.88 3. λ) λ) 4 Substituting λ from 4.) in the previous exponential function, we get PS n > x) B n x, ) [ M λλ) ) +6.88 3 λ) λ) 4 Since Mt) is decreasing in t and M t) πt,t >, it follows that Using Lemma 3.3, we deduce M λλ) ) Mx) ) +. x λλ) π λ λ) ) ]. 4.8) M λλ) ) Mx) λ.5λ π λ λ) λ) λ) 3λ) + λ+6λ.5λ) λ+6λ ) λ) 3λ) + π λ λ)4 3λ) + / λ+6λ ).R λ ). 4.9)

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality By Lemma 3.3, it is easy to see that Hence, 6.88 3 λ) λ) 4 6.88R λ ). 4.) M λλ) ) +6.88 3 λ) λ) Mx)+7.99R λ ) 4. 4.) Implementing 4.) into 4.8) and using λ x, we obtain inequality.7). 5. Proof of Theorem.4 In this section, we give a lower bound for PS n > x). From Lemma 3. and 4.), it follows that, for all λ <, PS n > x) exp { λ } λ) 6 E λ e λynλ) {Ynλ)+B nλ) x>}. Let λ λx) [, /4.8] be the unique solution of the equation This definition and Lemma 3. imply that, for all x /9.6), λ.4λ) x. 5.) λ x/ + 9.6x/ and x B n λ). 5.) Therefore, PS n > x) exp { λ } λ) 6 E λ e λynλ) {Ynλ)>}. Setting V n λ) λy n λ), we get PS n > x) exp where ˇx λ λ) 3. By Lemma 3.4, it is easy to see that { ˇx } e t P λ < V n λ) t)dt, 5.3) e t P λ < V n λ) t)dt e λyλ) P λ < Vn λ) λyλ) ) λλ)dy e λyλ) P λ < N,) y)λλ)dy 3.44 e λyλ) dφy) 6.88 3 λ) λ) 4 M λλ) ) 6.88 3 λ) λ) 4. 3 λ) λ) 4

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality Since Mt) is decreasing in t and λ) Returning to 5.3), we obtain e t P λ < V n λ) t)dt M ˇx) 6.88 PS n > x) Φˇx) 6.88exp cf. Lemma 3.3), it follows that λ) 3 { ˇx } Using Lemma 3.3, for all x /9.6), we have λ /4.8 and Therefore, for all x /9.6), } Using the inequality Φt) { exp t 3 λ) λ) 4 λ)7 3λ) 3/ λ+6λ ) 3 3. PS n > x) Φˇx) 6.88R λ ) exp PS n > x) Φˇx)) 3 λ) λ) 4. 3 λ) λ) 4. { ˇx }. π+t) for t, we get, for all x /9.6), [ 67.38R λ ) + ˇx) ]. In particular, for all x α/ with α /9.6, a simple calculation shows that λ α + 9.6α 4.8 and 67.38R λ ) ) α 67.38R + 9.6α This completes the proof of Theorem.4. 6. Proof of Theorem.5 ) 67.38R 753.3. 4.8 We will use 4.). Notice that Ψ n λ) [, ) is increasing in λ. Let λ λx) be the unique solution of the equation x Ψ n λ). This definition implies that B nλ) Ψ n λ) x, U n λ) λy n λ) and e λx+ψnλ) inf λ e λx+ψnλ) inf λ EeλSn x). 6.) Using Lemma 3.4 with λ λ, we have e t P λ < U n λ) t)dt e yλλ) P λ < Yn λ) yλ) ) λλ)dy e yλλ) P < N,) y)λλ)dy + 3.44θ 3 λ) λ) 4 e yλλ) dφy)+ 6.88θ 3 λ) λ) 4 M λλ) ) + 6.88θ 3 λ) λ) 4,

X. Fan, I. Grama et Q. Liu/ The missing factor in Bennett s inequality where θ. Therefore by 4.), we obtain PS n > x) M λλ) ) + 6.88θ ) 3 inf λ) λ) 4 λ eλsn x). 6.) Since Mt) is decreasing in t and M t) πt,t >, it follows that M λλ) ) Mx) x λλ) π λ 6.3) λ) x. By Lemma 3., we have the following two-sided bound.5λ) λ) λ+6λ λ B nλ) x.5λ λ. 6.4) λ) Using the two-sided bound in Lemma 3.3 and 6.4), by a simple calculation, we deduce λ λ) x λ) 3λ) λ+6λ ) λ 6.5) and x λλ) λ.5λ λ) λ) 3λ) +. λ+6λ 6.6) From 6.3), 6.5), 6.6) and Lemma 3.3, we easily obtain By Lemma 3.3, it is easy to see that Combining 6.7) and 6.8), we get, for all λ < 3, M λλ) ) Mx).R λ ). 6.7) 6.88 3 λ) λ) 6.88R λ ) 4. 6.8) M λλ) ) + 6.88θ 3 λ) λ) 4 Mx)+7.99θ R λ ), 6.9) where θ. Implementing 6.9) into 6.) and using λ x, we obtain equality.3) of Theorem.5. References [] Bennett, G. 96). Probability inequalities for the sum of independent random variables. J. Amer. Statist. Assoc. 57 33 45. [] Bentkus, V. 4). On Hoeffding s inequality. Ann. Probab. 3 65 673. [3] Bernstein, S. N. 946). The Theory of Probabilities. Moscow, Leningrad. [4] Cramér, H. 938). Sur un nouveau théorème-limite de la théorie des probabilités. Actualite s Sci. Indust. 736 5 3.

[5] Dedecker, J. and Prieur, C. (2004). Coupling for τ-dependent sequences and applications. J. Theoret. Probab. 17 861–885.
[6] Doukhan, P. and Neumann, M. H. (2007). Probability and moment inequalities for sums of weakly dependent random variables, with applications. Stochastic Process. Appl. 117 878–903.
[7] Eaton, M. L. (1974). A probability inequality for linear combinations of bounded random variables. Ann. Statist. 2, No. 3, 609–614.
[8] Fan, X., Grama, I. and Liu, Q. (2012). Hoeffding's inequality for supermartingales. Stochastic Process. Appl., accepted.
[9] Feller, W. (1971). An Introduction to Probability Theory and Its Applications. J. Wiley and Sons.
[10] Grama, I. and Haeusler, E. (2000). Large deviations for martingales via Cramér's method. Stochastic Process. Appl. 85 279–293.
[11] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13–30.
[12] Nagaev, S. V. (1979). Large deviations of sums of independent random variables. Ann. Probab. 7 745–789.
[13] Petrov, V. V. (1975). Sums of Independent Random Variables. Springer-Verlag, Berlin.
[14] Petrov, V. V. (1995). Limit Theorems of Probability Theory. Oxford University Press, Oxford.
[15] Pinelis, I. (1994). Extremal probabilistic problems and Hotelling's $T^2$ test under a symmetry condition. Ann. Statist. 22 357–368.
[16] Pinelis, I. (2006). Binomial upper bounds on generalized moments and tail probabilities of (super)martingales with differences bounded from above. High Dimensional Probab. 51 33–52.
[17] Rio, E. (2002). A Bennett type inequality for maxima of empirical processes. Ann. Inst. H. Poincaré Probab. Statist. 38 1053–1057.
[18] Rio, E. (2012). About the rate function in Talagrand's inequality for empirical processes. C. R. Acad. Sci. Paris, Ser. I. To appear.
[19] Shevtsova, I. G. (2010). An improvement of convergence rate estimates in the Lyapunov theorem. Doklady Math. 82 862–864.
[20] Statulevičius, V. A. (1966). On large deviations. Probab. Theory Relat. Fields 6 133–144.
[21] Talagrand, M. (1995). The missing factor in Hoeffding's inequalities. Ann. Inst. H. Poincaré Probab. Statist. 31 689–702.
[22] Talagrand, M. (1996). A new look at independence. Ann. Probab. 24 1–34.

arxiv: v1 [math.pr] 12 Jun 2012