arxiv: v1 [math.pr] 12 Jun 2012


The missing factor in Bennett's inequality

arXiv:1206.2592v1 [math.PR] 12 Jun 2012

Xiequan Fan, Ion Grama and Quansheng Liu
Université de Bretagne-Sud, LMBA, UMR CNRS 6205, Campus de Tohannic, 56017 Vannes, France
e-mail: fanxiequan@hotmail.com; ion.grama@univ-ubs.fr; quansheng.liu@univ-ubs.fr

Abstract: Let $S_n$ be a sum of independent centered random variables satisfying Bernstein's condition with parameter $\epsilon$, and let $\sigma^2$ be the variance of $S_n$. Bennett's inequality states that $P(S_n > x\sigma) \le \exp\{-\hat{x}^2/2\}$, where $\hat{x} = \frac{2x}{1+\sqrt{1+2x\epsilon/\sigma}}$, $x \ge 0$. We give several bounds which improve this inequality, in the spirit of Talagrand's refinement of Hoeffding's inequality. In particular, we sharpen this inequality by adding a missing factor $F(x)$ decaying exponentially fast. The interesting feature of our bound is that it recovers closely the shape of the standard normal tail $1-\Phi(x)$ for all $x \ge 0$, in contrast to Bennett's bound, which does not share this property. Also, compared with the classical Cramér large deviations, our inequality has the advantage that it is valid for all $x \ge 0$.

AMS subject classifications: Primary 60G50, 60F10; secondary 60F05.
Keywords and phrases: Sums of independent random variables, large deviations, exponential inequalities, asymptotic expansions, Bennett's inequality, tail probabilities.

1. Introduction

Let $\xi_1,\dots,\xi_n$ be a sequence of independent centered random variables satisfying Bernstein's condition
$$|E\xi_i^k| \le \frac{1}{2}\,k!\,\epsilon^{k-2}\,E\xi_i^2, \quad \text{for } k \ge 3 \text{ and } i = 1,\dots,n, \tag{1.1}$$
for some constant $\epsilon > 0$. Denote
$$S_n = \sum_{i=1}^n \xi_i \quad \text{and} \quad \sigma^2 = \sum_{i=1}^n E\xi_i^2. \tag{1.2}$$
Starting from the seminal papers of Cramér [4] and Bernstein [3], the estimation of the tail probabilities $P(S_n > x\sigma)$, for large $x > 0$, has attracted much attention. By employing the exponential Markov inequality and an upper bound for the moment generating function $Ee^{\lambda\xi_i}$, Bennett [1] obtained the following inequality (cf. (8a) of [1]): for all $x \ge 0$,
$$P(S_n > x\sigma) \le B\left(x,\frac{\epsilon}{\sigma}\right) := \exp\left\{-\frac{\hat{x}^2}{2}\right\}, \quad \text{where } \hat{x} = \frac{2x}{1+\sqrt{1+2x\epsilon/\sigma}}. \tag{1.3}$$
Various generalizations and improvements of inequality (1.3) can be found in Hoeffding [11], Statulevičius [20], Nagaev [12], Petrov [14], Talagrand [22], Dedecker and Prieur [5], Pinelis [16], Doukhan and Neumann [6], Rio [17], [18] and Fan, Grama and Liu [8]. Cramér's large deviation result (cf. (2.2) of Section 2) suggests that Bennett's inequality (1.3) can be substantially refined by adding a missing factor of order $\frac{1}{1+x}$. In the case where the summands $\xi_i$ are assumed to be bounded, results of this type have been obtained by Eaton [7], Pinelis [15], Talagrand [21] and Bentkus [2]. For example, using the conjugate measure technique of Cramér and Bernstein's method, Talagrand (cf. (1.6) of [21]) showed that if the variables $\xi_i$ satisfy $-b \le \xi_i \le 1$

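for some constant $b \ge 1$, then there exists a universal constant $K$ such that, for all $0 \le x \le \frac{\sigma}{Kb}$,
$$P(S_n > x\sigma) \le \inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}\left(M(x) + K\frac{b}{\sigma}\right) \tag{1.4}$$
$$\le H_n(x,\sigma)\left(M(x) + K\frac{b}{\sigma}\right), \tag{1.5}$$
where
$$H_n(x,\sigma) = \left\{\left(\frac{\sigma}{x+\sigma}\right)^{x\sigma+\sigma^2}\left(\frac{n}{n-x\sigma}\right)^{n-x\sigma}\right\}^{\frac{n}{n+\sigma^2}}, \qquad M(x) = (1-\Phi(x))\exp\left\{\frac{x^2}{2}\right\}, \tag{1.6}$$
with $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$; here $\sqrt{2\pi}\,M(x)$ is Mill's ratio. Since $M(x) = O\left(\frac{1}{1+x}\right)$, (1.5) improves on Hoeffding's bound $H_n(x,\sigma)$ (cf. (2.8) of [11]) by adding a factor of order $\frac{1}{1+x}$ for $0 \le x \le \frac{\sigma}{Kb}$.

The scope of this paper is to give several improvements of Bennett's inequality (1.3), in particular by adding a missing factor in the spirit of Talagrand's inequalities (1.4) and (1.5). In addition to the fact that the $\xi_i$ are not assumed to be bounded, our bounds are valid for any $x \ge 0$, unlike the bound (1.5), which holds only in the range $0 \le x \le \frac{\sigma}{Kb}$. Our results also imply Talagrand's bound (1.4) under the less restrictive condition (1.1).

Let us explain briefly our main results. Under Bernstein's condition, from Theorem 2.1 we obtain, for any $x \ge 0$,
$$P(S_n > x\sigma) \le \left(1-\Phi(\tilde{x})\right)\left[1 + 275.316\,(1+\tilde{x})\frac{\epsilon}{\sigma}\right], \tag{1.7}$$
where $\tilde{x} = \frac{2x}{1+\sqrt{1+4x\epsilon/\sigma}}$. The bound (1.7) presents two advantages. First, if we compare (1.7) with Cramér's large deviation result (see (2.2) of Section 2), inequality (1.7) is valid for all $x \ge 0$, while Cramér's result holds only for $x = o(\frac{\sigma}{\epsilon})$. Second, (1.7) recovers closely the shape of the normal tail $1-\Phi(x)$ when $x$ is moderate and $\frac{\epsilon}{\sigma} \to 0$, contrary to Bennett's bound, which is close to $\exp\{-\frac{x^2}{2}\}$ (the exponential part of the normal tail). It is clear that (1.7) improves Bennett's bound $B(x,\frac{\epsilon}{\sigma})$ only for small $x$ (see Figure 1).

A considerably sharper bound is obtained in our most important result, Theorem 2.3, which states that, for all $x \ge 0$,
$$P(S_n > x\sigma) \le B_n\left(x,\frac{\epsilon}{\sigma}\right)F_2\left(x,\frac{\epsilon}{\sigma}\right), \tag{1.8}$$
where $F_2(x,\frac{\epsilon}{\sigma}) \le 1$ for $x \ge 0$, $F_2(x,\frac{\epsilon}{\sigma}) = O\left(\frac{1}{1+x}\right)$ for $0 \le x = o(\frac{\sigma}{\epsilon})$, and
$$B_n\left(x,\frac{\epsilon}{\sigma}\right) = B\left(x,\frac{\epsilon}{\sigma}\right)\exp\left\{-n\,\psi\left(\frac{\hat{x}^2}{2n\sqrt{1+2x\epsilon/\sigma}}\right)\right\}, \tag{1.9}$$
with $\psi(t) = t - \log(1+t)$, a nonnegative convex function in $t \ge 0$. The bound in (1.8) improves Bennett's bound $B(x,\frac{\epsilon}{\sigma})$ by the missing factor $F_2(x,\frac{\epsilon}{\sigma})\exp\left\{-n\,\psi\left(\frac{\hat{x}^2}{2n\sqrt{1+2x\epsilon/\sigma}}\right)\right\}$.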
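As a rough numerical illustration of how the refined exponential bound compares with Bennett's bound, the following sketch evaluates $B(x,r) = \exp\{-\hat{x}^2/2\}$ and $B_n(x,r) = B(x,r)\exp\{-n\psi(\hat{x}^2/(2n\sqrt{1+2xr}))\}$ with $r = \epsilon/\sigma$ and $\psi(t) = t - \log(1+t)$. The function names are hypothetical, and the formulas are taken as they can be read from this text, so treat it as a sketch rather than a reference implementation.

```python
import math


def bennett_x_hat(x: float, r: float) -> float:
    """x_hat = 2x / (1 + sqrt(1 + 2 x r)), where r stands for epsilon / sigma."""
    return 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * x * r))


def bennett_bound(x: float, r: float) -> float:
    """Bennett's bound B(x, r) = exp(-x_hat^2 / 2)."""
    return math.exp(-0.5 * bennett_x_hat(x, r) ** 2)


def psi(t: float) -> float:
    """psi(t) = t - log(1 + t), nonnegative for t >= 0."""
    return t - math.log1p(t)


def refined_bound(x: float, r: float, n: int) -> float:
    """B_n(x, r) = B(x, r) * exp(-n * psi(x_hat^2 / (2 n sqrt(1 + 2 x r))))."""
    x_hat = bennett_x_hat(x, r)
    t = x_hat ** 2 / (2.0 * n * math.sqrt(1.0 + 2.0 * x * r))
    return bennett_bound(x, r) * math.exp(-n * psi(t))


if __name__ == "__main__":
    # Illustrative values only: r = 0.1 and n = 100 are assumptions.
    for x in (1.0, 5.0, 20.0):
        print(x, bennett_bound(x, 0.1), refined_bound(x, 0.1, 100))
```

Because $\psi \ge 0$, the second factor never exceeds 1, so `refined_bound` is never larger than `bennett_bound`; the gap widens as $x$ grows.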
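The comparison between $B_n(x,\frac{\epsilon}{\sigma})F_2(x,\frac{\epsilon}{\sigma})$ and Bennett's bound (1.3) is displayed in Figure 7, and the ratio between $B_n(x,\frac{\epsilon}{\sigma})$ and $B(x,\frac{\epsilon}{\sigma})$ is given in Figure 5. A lower bound of the tail probability $P(S_n > x\sigma)$ is obtained in Theorem 2.4, while in Theorem 2.5 we improve inequality (1.8) by the following one-term expansion: for all $0 \le x \le 0.1\frac{\sigma}{\epsilon}$,
$$P(S_n > x\sigma) = \inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}\left(M(x) + 88.41\,\theta\,\frac{\epsilon}{\sigma}\right), \tag{1.10}$$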

where $|\theta| \le 1$. Note that equality (1.10) also improves Talagrand's inequality (1.4). Moreover, under Bernstein's condition (1.1), we have $\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)} \le B_n(x,\frac{\epsilon}{\sigma})$. If the $\xi_i$ are bounded, $\xi_i \le 1$, it holds that $\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)} \le H_n(x,\sigma)$.

Our approach uses the conjugate distribution technique due to Cramér, which is different from the method used in Bennett's original paper [1]. We refine the technique based on a change of probability measure from Grama and Haeusler [10] and derive sharp bounds for the cumulant function to obtain precise upper bounds for tail probabilities under Bernstein's condition.

The paper is organized as follows. In Section 2, we present our main results. In Section 3, we state some auxiliary results to be used in the proofs of the theorems. Sections 4, 5 and 6 are devoted to the proofs of the main results.

2. Main Results

Throughout the paper, $\xi_1,\dots,\xi_n$ is a sequence of independent real random variables with $E\xi_i = 0$ satisfying Bernstein's condition (1.1), and $S_n$ and $\sigma^2$ are defined by (1.2). We use the notations $a \wedge b = \min\{a,b\}$, $a \vee b = \max\{a,b\}$ and $a^+ = a \vee 0$.

Our first result is the following large deviation inequality valid for all $x \ge 0$.

Theorem 2.1. For any $\delta \in (0,1]$ and $x \ge 0$,
$$P(S_n > x\sigma) \le \left(1-\Phi(\tilde{x})\right)\left[1 + C_\delta\,(1+\tilde{x})\frac{\epsilon}{\sigma}\right], \tag{2.1}$$
where
$$\tilde{x} = \frac{2x}{1+\sqrt{1+2(1+\delta)x\epsilon/\sigma}} \quad \text{and} \quad C_\delta = \left(\frac{62.493}{\delta} + 212.823\right) \vee \frac{180.48}{\delta}.$$
Moreover, $C_1 = 275.316$ and $\tilde{x} = x\left(1 - \frac{1+\delta}{2}\,x\frac{\epsilon}{\sigma} + o\left(x\frac{\epsilon}{\sigma}\right)\right)$ as $x\frac{\epsilon}{\sigma} \to 0$.

The interesting feature of the bound (2.1) is that it recovers closely the shape of the standard normal tail when $x$ is moderate and $r = \frac{\epsilon}{\sigma}$ becomes small, which is not the case for Bennett's bound $B(x,\frac{\epsilon}{\sigma})$ (see Figure 1). Our result can also be compared with Cramér's large deviation result in the i.i.d. case: under Cramér's condition that $Ee^{\delta|\xi_1|} < \infty$ for some $\delta > 0$,
$$\frac{P(S_n > x\sigma)}{1-\Phi(x)} = \exp\left\{\frac{x^3}{\sqrt{n}}\,\lambda\left(\frac{x}{\sqrt{n}}\right)\right\}\left[1 + O\left(\frac{1+x}{\sqrt{n}}\right)\right], \tag{2.2}$$
where $\lambda(\cdot)$ is the Cramér series [13] and $0 \le x = o(\sqrt{n})$ (cf. Cramér [4] or Petrov [13]). Note that in this case Cramér's condition is equivalent to Bernstein's condition (1.1), and $\frac{\epsilon}{\sigma} = O\left(\frac{1}{\sqrt{n}}\right)$ as $n \to \infty$. With respect to Cramér's result, the advantage of (2.1) is that it is valid for all $x \ge 0$.
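To make the comparison between the normal-tail bound of Theorem 2.1 and Bennett's bound concrete, the sketch below evaluates both for a given $x$ and $r = \epsilon/\sigma$. It assumes the forms $\tilde{x} = 2x/(1+\sqrt{1+2(1+\delta)xr})$ and $C_\delta = (62.493/\delta + 212.823) \vee 180.48/\delta$; these constants are read from a damaged copy of the text and should be treated as assumptions. The function names are hypothetical.

```python
import math


def normal_tail(t: float) -> float:
    """1 - Phi(t) for the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def c_delta(delta: float) -> float:
    """Constant C_delta, as read from the text (an assumption, see lead-in)."""
    return max(62.493 / delta + 212.823, 180.48 / delta)


def theorem_2_1_bound(x: float, r: float, delta: float = 1.0) -> float:
    """(1 - Phi(x_tilde)) * (1 + C_delta * (1 + x_tilde) * r)."""
    x_tilde = 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * (1.0 + delta) * x * r))
    return normal_tail(x_tilde) * (1.0 + c_delta(delta) * (1.0 + x_tilde) * r)


def bennett_bound(x: float, r: float) -> float:
    """Bennett's bound exp(-x_hat^2 / 2) with x_hat = 2x / (1 + sqrt(1 + 2 x r))."""
    x_hat = 2.0 * x / (1.0 + math.sqrt(1.0 + 2.0 * x * r))
    return math.exp(-0.5 * x_hat ** 2)


if __name__ == "__main__":
    r = 0.001  # illustrative value of epsilon / sigma
    for x in (0.5, 1.0, 2.0, 4.0):
        print(x, theorem_2_1_bound(x, r), bennett_bound(x, r))
```

For small $r$ and moderate $x$ the first bound is noticeably smaller; for larger $r$ the constant $C_\delta$ dominates and Bennett's bound can be the better of the two.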
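The numerical comparison between the bound (2.1) with $\delta = 1$ and Bennett's bound $B(x,\frac{\epsilon}{\sigma})$ is given in Figure 2, and shows that for $r = $ […] the ratio of the right-hand side of (2.1) to $B(x,\frac{\epsilon}{\sigma})$ is less than 1 for small $x$ satisfying $B(x,\frac{\epsilon}{\sigma}) \ge 8.7 \times 10^{-56}$.

An improvement of Bennett's bound (1.3) for large $x$ can be obtained from Theorem 2.2 formulated below.

Theorem 2.2. For all $x \ge 0$,
$$P(S_n > x\sigma) \le \left(1-\Phi(\hat{x})\right)\left[1 + A\left(x,\frac{\epsilon}{\sigma}\right)\right] \tag{2.3}$$
$$\le B\left(x,\frac{\epsilon}{\sigma}\right)F_1\left(x,\frac{\epsilon}{\sigma}\right), \tag{2.4}$$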

[Figure 1. Comparison of various bounds (vertical axis: probabilities; horizontal axis: $x$). We display Bennett's bound $B(x,r)$, bound (2.1) with $\delta = 1$ and $1-\Phi(x)$ as functions of $x$ for a fixed $r$.]

[Figure 2. Ratio of bound (2.1) with $\delta = 1$ to $B(x,r)$ as a function of $x$ for various values of $r$.]

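where $\hat{x}$ is defined in (1.3),
$$A\left(x,\frac{\epsilon}{\sigma}\right) = \left(\frac{10x}{1+x\epsilon/\sigma} + 84.9\right)\frac{\epsilon}{\sigma\,M(\hat{x})} \tag{2.5}$$
and $F_1(x,\frac{\epsilon}{\sigma})$ is given by (2.6) […]. Moreover, $F_1(x,\frac{\epsilon}{\sigma}) = \frac{1+o(1)}{\sqrt{2\pi}(1+x)}$ when $x = o(\frac{\sigma}{\epsilon})$ and $\frac{\epsilon}{\sigma} \to 0$, and $F_1(x,\frac{\epsilon}{\sigma}) \le 1 + M(\hat{x})$ for all $x \ge 0$.

The advantage of Theorem 2.2 is that in the normal distribution function $\Phi(x)$ we have the expression $\hat{x}$ instead of the smaller term $\tilde{x}$ figuring in Theorem 2.1, which represents a significant improvement. Inequality (2.3) improves Bennett's inequality (1.3) by the factor $F_1(x,\frac{\epsilon}{\sigma})$, of order $\frac{1+o(1)}{\sqrt{2\pi}(1+x)}$ for $0 \le x = o(\frac{\sigma}{\epsilon})$, which, following Talagrand [21], we call the missing factor. The numerical results displayed in Figures 3 and 4 show that bound (2.3) performs better than bound (2.1), and significantly better than Bennett's bound $B(x,\frac{\epsilon}{\sigma})$, especially for small $r$.

A further significant improvement of Bennett's inequality (1.3) for all $x \ge 0$ is given by the following theorem: we replace the bound $B(x,\frac{\epsilon}{\sigma})$ by the smaller one
$$B_n\left(x,\frac{\epsilon}{\sigma}\right) = B\left(x,\frac{\epsilon}{\sigma}\right)\exp\left\{-n\,\psi\left(\frac{\hat{x}^2}{2n\sqrt{1+2x\epsilon/\sigma}}\right)\right\},$$
where $\psi(t) = t - \log(1+t)$ is a nonnegative convex function in $t \ge 0$.

Theorem 2.3. For all $x \ge 0$,
$$P(S_n > x\sigma) \le B_n\left(x,\frac{\epsilon}{\sigma}\right)F_2\left(x,\frac{\epsilon}{\sigma}\right), \tag{2.7}$$
where
$$F_2\left(x,\frac{\epsilon}{\sigma}\right) = \left(M(x) + 27.99\,R\left(x\frac{\epsilon}{\sigma}\right)\frac{\epsilon}{\sigma}\right) \wedge 1, \tag{2.8}$$
$$R(t) = \begin{cases} \dfrac{(1-t+6t^2)^3}{(1-3t)^{3/2}(1-t)^7}, & \text{if } 0 \le t < \frac{1}{3}, \\[2mm] \infty, & \text{if } t \ge \frac{1}{3}, \end{cases} \tag{2.9}$$
being an increasing function. Moreover, for all $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha < \frac{1}{3}$, we have $R(x\frac{\epsilon}{\sigma}) \le R(\alpha)$. If $\alpha = 0.1$, we have $27.99\,R(\alpha) \le 88.41$.

To highlight the improvement of Theorem 2.3 over Bennett's bound, we note that $B_n(x,\frac{\epsilon}{\sigma}) \le B(x,\frac{\epsilon}{\sigma})$ and
$$B_n\left(x,\frac{\epsilon}{\sigma}\right) = B\left(x,\frac{\epsilon}{\sigma}\right)\exp\left\{-\frac{1}{\sqrt{2}}\left(\frac{\sigma}{\epsilon}\right)^{3/2}x^{1/2}\,(1+o(1))\right\}, \quad x \to \infty, \tag{2.10}$$
and we display the ratio of $B_n(x,r)$ to $B(x,r)$ in Figure 5 for various $r = \frac{1}{\sqrt{n}}$. The second improvement in the right-hand side of (2.7) comes from the missing factor $F_2(x,\frac{\epsilon}{\sigma})$, which is of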

[Figure 3. Comparison of various bounds (vertical axis: probabilities; horizontal axis: $x$). We display Bennett's bound $B(x,r)$, bound (2.1) with $\delta = 1$, bound (2.3) and $1-\Phi(x)$ as functions of $x$ for a fixed $r$.]

[Figure 4. Ratio of bound (2.3) to $B(x,r)$ as a function of $x$ for various values of $r$.]

[Figure 5. Ratio of $B_n(x,r)$ to $B(x,r)$ as a function of $x$ for various values of $r = \frac{1}{\sqrt{n}}$.]

[Figure 6. The missing factor $F_2(x,r)$ displayed as a function of $x$ for various values of $r$.]

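[Figure 7. Ratio of bound (2.7), $B_n(x,r)F_2(x,r)$, to $B(x,r)$ as a function of $x$ for various values of $r = \frac{1}{\sqrt{n}}$.]

order $\frac{1}{1+x}$, for moderate values of $x$ satisfying $0 \le x < 0.1\frac{\sigma}{\epsilon}$. The numerical values of the missing factor $F_2(x,\frac{\epsilon}{\sigma})$ are displayed in Figure 6. Our numerical results confirm that the bound $B_n(x,\frac{\epsilon}{\sigma})F_2(x,\frac{\epsilon}{\sigma})$ in (2.7) is significantly better than Bennett's bound $B(x,\frac{\epsilon}{\sigma})$ for all $x \ge 0$. For comparison, we display the ratios of $B_n(x,r)F_2(x,r)$ to $B(x,r)$ in Figure 7 for various $r = \frac{1}{\sqrt{n}}$.

The following corollary improves inequality (2.1) of Theorem 2.1 in the range $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha < \frac{1}{3}$.

Corollary 2.1. For all $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha < \frac{1}{3}$,
$$P(S_n > x\sigma) \le \left(1-\Phi(\hat{x})\right)\left[1 + 70.17\,R(\alpha)\,(1+\hat{x})\frac{\epsilon}{\sigma}\right], \tag{2.11}$$
where $\hat{x}$ is defined in (1.3) and $R(t)$ by (2.9). In particular, if $\alpha = 0.1$ we have $70.17\,R(\alpha) \le 221.63$.

For the lower bound of the tail probabilities $P(S_n > x\sigma)$, we have the following result, which complements Corollary 2.1.

Theorem 2.4. For all $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ with $0 \le \alpha \le \frac{1}{9.6}$,
$$P(S_n > x\sigma) \ge \left(1-\Phi(\check{x})\right)\left[1 - c_\alpha\,(1+\check{x})\frac{\epsilon}{\sigma}\right],$$
where
$$\check{x} = \frac{\underline{\lambda}\sigma}{(1-\underline{\lambda}\epsilon)^3} \quad \text{with} \quad \underline{\lambda} = \frac{2x/\sigma}{1+\sqrt{1-9.6\,x\epsilon/\sigma}}, \quad \text{and} \quad c_\alpha = 67.38\,R\left(\frac{2\alpha}{1+\sqrt{1-9.6\alpha}}\right) \le 1753.23.$$
In particular, if $\alpha = 0.1$ we have $c_\alpha = 67.38\,R\left(\frac{1}{6}\right) \le 682.89$. Moreover, $\check{x} = x\left(1 + 5.4\,x\frac{\epsilon}{\sigma} + o\left(x\frac{\epsilon}{\sigma}\right)\right)$ as $x\frac{\epsilon}{\sigma} \to 0$.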

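Combining Corollary 2.1 and Theorem 2.4, we obtain, for all $0 \le x \le 0.1\frac{\sigma}{\epsilon}$,
$$P(S_n > x\sigma) = \left(1-\Phi\left(x\left(1+\theta_1 c_1 x\frac{\epsilon}{\sigma}\right)\right)\right)\left[1 + \theta_2 c_2\,(1+x)\frac{\epsilon}{\sigma}\right], \tag{2.12}$$
where $c_1, c_2 > 0$ are some absolute constants and $|\theta_1|, |\theta_2| \le 1$.

To close this section, we give an improvement of Talagrand's inequality (1.4).

Theorem 2.5. For all $0 \le x < \frac{1}{3}\frac{\sigma}{\epsilon}$,
$$P(S_n > x\sigma) = \inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}\,F_3\left(x,\frac{\epsilon}{\sigma}\right), \tag{2.13}$$
where
$$F_3\left(x,\frac{\epsilon}{\sigma}\right) = M(x) + 27.99\,\theta\,R\left(x\frac{\epsilon}{\sigma}\right)\frac{\epsilon}{\sigma}, \tag{2.14}$$
$R(t)$ is defined by (2.9) and $|\theta| \le 1$. Moreover, $\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)} \le B_n(x,\frac{\epsilon}{\sigma}) \le B(x,\frac{\epsilon}{\sigma})$. In addition, for $0 \le x \le 0.1\frac{\sigma}{\epsilon}$, we have $27.99\,R(x\frac{\epsilon}{\sigma}) \le 88.41$.

It is clear that our equality (2.13) implies Talagrand's inequality (1.4), with information on Talagrand's constant $K$, under a less restrictive condition (Talagrand supposed that the $\xi_i$ are bounded: $-b \le \xi_i \le 1$). Notice that (2.13) can be written in the following form: for $0 \le x \le \alpha\frac{\sigma}{\epsilon}$ and $\alpha \in [0,\frac{1}{3})$,
$$\frac{P(S_n > x\sigma)}{M(x)\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}} = 1 + 27.99\,\theta\,\frac{R(\alpha)}{M(x)}\frac{\epsilon}{\sigma} = 1 + 70.17\,\theta_1\,R(\alpha)\,(1+x)\frac{\epsilon}{\sigma}, \tag{2.15}$$
where $|\theta|, |\theta_1| \le 1$ and the last step holds since
$$\frac{1}{\sqrt{2\pi}\,(1+t)} \le M(t) \le \frac{1}{\sqrt{\pi}\,(1+t)}, \quad t \ge 0, \tag{2.16}$$
(see Feller [9]). Equality (2.15) implies that the relative error between $P(S_n > x\sigma)$ and $M(x)\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}$ converges to 0 uniformly in the range $0 \le x = o(\frac{\sigma}{\epsilon})$ as $\frac{\epsilon}{\sigma} \to 0$.

3. Auxiliary results

We consider the positive random variable
$$Z_n(\lambda) = \prod_{i=1}^n \frac{e^{\lambda\xi_i}}{Ee^{\lambda\xi_i}}, \quad 0 \le \lambda < \epsilon^{-1},$$
(the Esscher transformation), so that $EZ_n(\lambda) = 1$. We introduce the conjugate probability measure $P_\lambda$ defined by
$$dP_\lambda = Z_n(\lambda)\,dP. \tag{3.1}$$
Denote by $E_\lambda$ the expectation with respect to $P_\lambda$. Then for any positive and measurable function $f$,
$$E_\lambda f(\xi_i) = \frac{Ef(\xi_i)e^{\lambda\xi_i}}{Ee^{\lambda\xi_i}}, \quad i = 1,\dots,n.$$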

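Setting
$$b_i(\lambda) = E_\lambda\xi_i = \frac{E\xi_i e^{\lambda\xi_i}}{Ee^{\lambda\xi_i}}, \quad i = 1,\dots,n,$$
and
$$\eta_i(\lambda) = \xi_i - b_i(\lambda), \quad i = 1,\dots,n,$$
we obtain the following decomposition:
$$S_k = B_k(\lambda) + Y_k(\lambda), \quad k = 1,\dots,n, \tag{3.2}$$
where $S_k = \sum_{i=1}^k \xi_i$, $B_k(\lambda) = \sum_{i=1}^k b_i(\lambda)$ and $Y_k(\lambda) = \sum_{i=1}^k \eta_i(\lambda)$.

In the following, we give some lower and upper bounds for $B_n(\lambda)$, which will be used in the proofs of the theorems.

Lemma 3.1. For all $0 \le \lambda < \epsilon^{-1}$,
$$(1-2.4\lambda\epsilon)\,\lambda\sigma^2 \le \frac{(1-1.5\lambda\epsilon)(1-\lambda\epsilon)}{1-\lambda\epsilon+6\lambda^2\epsilon^2}\,\lambda\sigma^2 \le B_n(\lambda) \le \frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda\sigma^2.$$

Proof. Since $E\xi_i = 0$, by Jensen's inequality, we have $Ee^{\lambda\xi_i} \ge 1$. Noting that
$$E\xi_i e^{\lambda\xi_i} = E\xi_i\left(e^{\lambda\xi_i} - 1\right) \ge 0, \quad \lambda \ge 0,$$
by Taylor's expansion of $e^x$, we get
$$B_n(\lambda) \le \sum_{i=1}^n E\xi_i e^{\lambda\xi_i} = \lambda\sigma^2 + \sum_{i=1}^n\sum_{k=2}^{+\infty}\frac{\lambda^k}{k!}E\xi_i^{k+1}. \tag{3.3}$$
Using Bernstein's condition (1.1), we obtain, for all $0 \le \lambda < \epsilon^{-1}$,
$$\sum_{i=1}^n\sum_{k=2}^{+\infty}\frac{\lambda^k}{k!}\left|E\xi_i^{k+1}\right| \le \frac{1}{2}\lambda^2\sigma^2\epsilon\sum_{k=2}^{+\infty}(k+1)(\lambda\epsilon)^{k-2} = \frac{3-2\lambda\epsilon}{2(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2\epsilon. \tag{3.4}$$
Combining (3.3) and (3.4), we get the desired upper bound of $B_n(\lambda)$: for all $0 \le \lambda < \epsilon^{-1}$,
$$B_n(\lambda) \le \lambda\sigma^2 + \frac{3-2\lambda\epsilon}{2(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2\epsilon = \frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda\sigma^2.$$
By Jensen's inequality and Bernstein's condition (1.1),
$$\left(E\xi_i^2\right)^2 \le E\xi_i^4 \le 12\,\epsilon^2 E\xi_i^2,$$
from which we get $E\xi_i^2 \le 12\,\epsilon^2$.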

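Using again Bernstein's condition (1.1), we have, for all $0 \le \lambda < \epsilon^{-1}$,
$$Ee^{\lambda\xi_i} \le 1 + \sum_{k=2}^{+\infty}\frac{\lambda^k}{k!}\left|E\xi_i^k\right| \le 1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)} \le 1 + \frac{6\lambda^2\epsilon^2}{1-\lambda\epsilon} = \frac{1-\lambda\epsilon+6\lambda^2\epsilon^2}{1-\lambda\epsilon}. \tag{3.5}$$
Notice that $g(t) = e^t - \left(1+t+\frac{1}{2}t^2\right)$ satisfies $g(t) > 0$ if $t > 0$ and $g(t) < 0$ if $t < 0$. So $t\,g(t) \ge 0$ for all $t \in \mathbb{R}$. That is, $te^t \ge t\left(1+t+\frac{1}{2}t^2\right)$ for all $t \in \mathbb{R}$. Therefore, $\xi_i e^{\lambda\xi_i} \ge \xi_i\left(1+\lambda\xi_i+\frac{\lambda^2\xi_i^2}{2}\right)$. Taking expectations, we get
$$E\xi_i e^{\lambda\xi_i} \ge \lambda E\xi_i^2 + \frac{\lambda^2}{2}E\xi_i^3 \ge \lambda E\xi_i^2 - \frac{\lambda^2}{2}\cdot\frac{1}{2}\,3!\,\epsilon\,E\xi_i^2 = (1-1.5\lambda\epsilon)\,\lambda E\xi_i^2,$$
from which it follows that
$$\sum_{i=1}^n E\xi_i e^{\lambda\xi_i} \ge (1-1.5\lambda\epsilon)\,\lambda\sigma^2. \tag{3.6}$$
Combining (3.5) and (3.6), we obtain the following lower bound of $B_n(\lambda)$: for all $0 \le \lambda < \epsilon^{-1}$,
$$B_n(\lambda) = \sum_{i=1}^n\frac{E\xi_i e^{\lambda\xi_i}}{Ee^{\lambda\xi_i}} \ge \frac{(1-1.5\lambda\epsilon)(1-\lambda\epsilon)}{1-\lambda\epsilon+6\lambda^2\epsilon^2}\,\lambda\sigma^2 \ge (1-2.4\lambda\epsilon)\,\lambda\sigma^2. \tag{3.7}$$
This completes the proof of Lemma 3.1.

We now consider the cumulant function
$$\Psi_n(\lambda) = \sum_{i=1}^n \log Ee^{\lambda\xi_i}, \quad 0 \le \lambda < \epsilon^{-1}. \tag{3.8}$$
We have the following elementary bound for $\Psi_n(\lambda)$.

Lemma 3.2. For all $0 \le \lambda < \epsilon^{-1}$,
$$\Psi_n(\lambda) \le n\log\left(1 + \frac{\lambda^2\sigma^2}{2n(1-\lambda\epsilon)}\right) \le \frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)}$$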

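and
$$-\lambda B_n(\lambda) + \Psi_n(\lambda) \ge -\frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)^6}.$$

Proof. By Bernstein's condition (1.1), it is easy to see that, for all $0 \le \lambda < \epsilon^{-1}$,
$$Ee^{\lambda\xi_i} \le 1 + \sum_{k=2}^{+\infty}\frac{\lambda^k}{k!}\left|E\xi_i^k\right| \le 1 + \frac{\lambda^2 E\xi_i^2}{2}\sum_{k=2}^{+\infty}(\lambda\epsilon)^{k-2} = 1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)}.$$
Then, we have
$$\Psi_n(\lambda) \le \sum_{i=1}^n\log\left(1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)}\right) = n\log\left\{\prod_{i=1}^n\left(1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)}\right)\right\}^{1/n}. \tag{3.9}$$
Since the geometric mean does not exceed the arithmetic mean, we get
$$\left\{\prod_{i=1}^n\left(1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)}\right)\right\}^{1/n} \le \frac{1}{n}\sum_{i=1}^n\left(1 + \frac{\lambda^2 E\xi_i^2}{2(1-\lambda\epsilon)}\right) = 1 + \frac{\lambda^2\sigma^2}{2n(1-\lambda\epsilon)}. \tag{3.10}$$
Using (3.10) and the inequality $\log(1+t) \le t$, $t \ge 0$, we obtain the first assertion of the lemma. Since $\Psi_n(0) = 0$ and $\Psi_n'(\lambda) = B_n(\lambda)$, by Lemma 3.1, for all $0 \le \lambda < \epsilon^{-1}$,
$$\Psi_n(\lambda) = \int_0^\lambda B_n(t)\,dt \ge \int_0^\lambda t\sigma^2(1-2.4\,t\epsilon)\,dt = \frac{\lambda^2\sigma^2}{2}(1-1.6\lambda\epsilon).$$
Therefore, using again Lemma 3.1, we see that
$$-\lambda B_n(\lambda) + \Psi_n(\lambda) \ge -\frac{1-0.5\lambda\epsilon}{(1-\lambda\epsilon)^2}\,\lambda^2\sigma^2 + \frac{\lambda^2\sigma^2}{2}(1-1.6\lambda\epsilon) \ge -\frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)^6},$$
which completes the proof of the second assertion of the lemma.

Denote $\bar{\sigma}^2(\lambda) = E_\lambda Y_n^2(\lambda)$. By the relation between $E$ and $E_\lambda$, we have
$$\bar{\sigma}^2(\lambda) = \sum_{i=1}^n\left(\frac{E\xi_i^2 e^{\lambda\xi_i}}{Ee^{\lambda\xi_i}} - \frac{\left(E\xi_i e^{\lambda\xi_i}\right)^2}{\left(Ee^{\lambda\xi_i}\right)^2}\right), \quad 0 \le \lambda < \epsilon^{-1}.$$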

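Lemma 3.3. For all $0 \le \lambda < \epsilon^{-1}$,
$$\frac{(1-\lambda\epsilon)^2(1-3\lambda\epsilon)}{(1-\lambda\epsilon+6\lambda^2\epsilon^2)^2}\,\sigma^2 \le \bar{\sigma}^2(\lambda) \le \frac{\sigma^2}{(1-\lambda\epsilon)^3}. \tag{3.11}$$

Proof. Denote $f(\lambda) = E\xi_i^2 e^{\lambda\xi_i}\,Ee^{\lambda\xi_i} - \left(E\xi_i e^{\lambda\xi_i}\right)^2$. Then,
$$f'(0) = E\xi_i^3 \quad \text{and} \quad f''(\lambda) = E\xi_i^4 e^{\lambda\xi_i}\,Ee^{\lambda\xi_i} - \left(E\xi_i^2 e^{\lambda\xi_i}\right)^2 \ge 0.$$
Thus,
$$f(\lambda) \ge f(0) + f'(0)\lambda = E\xi_i^2 + \lambda E\xi_i^3. \tag{3.12}$$
Using (3.12), (3.5) and Bernstein's condition (1.1), we have, for all $0 \le \lambda < \epsilon^{-1}$,
$$E_\lambda\eta_i^2 = \frac{E\xi_i^2 e^{\lambda\xi_i}\,Ee^{\lambda\xi_i} - \left(E\xi_i e^{\lambda\xi_i}\right)^2}{\left(Ee^{\lambda\xi_i}\right)^2} \ge \frac{E\xi_i^2 + \lambda E\xi_i^3}{\left(Ee^{\lambda\xi_i}\right)^2} \ge \left(\frac{1-\lambda\epsilon}{1-\lambda\epsilon+6\lambda^2\epsilon^2}\right)^2\left(E\xi_i^2 + \lambda E\xi_i^3\right) \ge \frac{(1-\lambda\epsilon)^2(1-3\lambda\epsilon)}{(1-\lambda\epsilon+6\lambda^2\epsilon^2)^2}\,E\xi_i^2.$$
Therefore
$$\bar{\sigma}^2(\lambda) \ge \frac{(1-\lambda\epsilon)^2(1-3\lambda\epsilon)}{(1-\lambda\epsilon+6\lambda^2\epsilon^2)^2}\,\sigma^2.$$
Using Taylor's expansion of $e^x$ and Bernstein's condition (1.1) again, we obtain
$$\bar{\sigma}^2(\lambda) \le \sum_{i=1}^n E\xi_i^2 e^{\lambda\xi_i} \le \frac{\sigma^2}{(1-\lambda\epsilon)^3}.$$
This completes the proof of Lemma 3.3.

For the random variable $Y_n(\lambda)$ with $0 \le \lambda < \epsilon^{-1}$, we have the following result on the rate of convergence to the standard normal law.

Lemma 3.4. For all $0 \le \lambda < \epsilon^{-1}$,
$$\sup_{y \in \mathbb{R}}\left|P_\lambda\left(\frac{Y_n(\lambda)}{\bar{\sigma}(\lambda)} \le y\right) - \Phi(y)\right| \le 13.44\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\lambda)(1-\lambda\epsilon)^4}.$$

Proof. Since $Y_n(\lambda) = \sum_{i=1}^n \eta_i(\lambda)$ is the sum of independent and centered (with respect to $P_\lambda$) random variables $\eta_i(\lambda)$, by the rate of convergence in the central limit theorem (cf. e.g. Petrov [13], p. 115) we get, for $0 \le \lambda < \epsilon^{-1}$,
$$\sup_{y \in \mathbb{R}}\left|P_\lambda\left(\frac{Y_n(\lambda)}{\bar{\sigma}(\lambda)} \le y\right) - \Phi(y)\right| \le \frac{C}{\bar{\sigma}^3(\lambda)}\sum_{i=1}^n E_\lambda|\eta_i|^3,$$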

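where $C > 0$ is an absolute constant. For $0 \le \lambda < \epsilon^{-1}$, using Bernstein's condition, we have
$$\sum_{i=1}^n E_\lambda|\eta_i|^3 \le 4\sum_{i=1}^n E_\lambda\left(|\xi_i|^3 + \left(E_\lambda|\xi_i|\right)^3\right) \le 8\sum_{i=1}^n E_\lambda|\xi_i|^3 \le 8\sum_{i=1}^n E|\xi_i|^3\exp\{|\lambda\xi_i|\} \le 8\sum_{i=1}^n E\sum_{j=0}^{\infty}\frac{\lambda^j}{j!}|\xi_i|^{3+j} \le 4\sigma^2\epsilon\sum_{j=0}^{\infty}(j+3)(j+2)(j+1)(\lambda\epsilon)^j.$$
As
$$\sum_{j=0}^{\infty}(j+3)(j+2)(j+1)x^j = \frac{d^3}{dx^3}\sum_{j=0}^{\infty}x^j = \frac{6}{(1-x)^4}, \quad |x| < 1,$$
we obtain, for $0 \le \lambda < \epsilon^{-1}$,
$$\sum_{i=1}^n E_\lambda|\eta_i|^3 \le \frac{24\,\sigma^2\epsilon}{(1-\lambda\epsilon)^4}.$$
Therefore, we have, for $0 \le \lambda < \epsilon^{-1}$,
$$\sup_{y \in \mathbb{R}}\left|P_\lambda\left(\frac{Y_n(\lambda)}{\bar{\sigma}(\lambda)} \le y\right) - \Phi(y)\right| \le 24C\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\lambda)(1-\lambda\epsilon)^4} \le 13.44\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\lambda)(1-\lambda\epsilon)^4},$$
where the last step holds as $C \le 0.56$ (cf. Shevtsova [19]). This completes the proof of Lemma 3.4.

Using Lemma 3.4, we easily obtain the following lemma.

Lemma 3.5. For all $0 \le \lambda \le 0.1\,\epsilon^{-1}$,
$$\sup_{y \in \mathbb{R}}\left|P_\lambda\left(Y_n(\lambda) \le y\sigma\right) - \Phi(y)\right| \le 1.07\,\lambda\epsilon + 42.45\,\frac{\epsilon}{\sigma}.$$

Proof. Using Lemma 3.3, we have, for all $0 \le \lambda < \frac{1}{3}\epsilon^{-1}$,
$$(1-\lambda\epsilon)^{3/2} \le \frac{\sigma}{\bar{\sigma}(\lambda)} \le \frac{1-\lambda\epsilon+6\lambda^2\epsilon^2}{(1-\lambda\epsilon)\sqrt{1-3\lambda\epsilon}}. \tag{3.13}$$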

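It is easy to see that
$$\left|P_\lambda\left(Y_n(\lambda) \le y\sigma\right) - \Phi(y)\right| \le \left|P_\lambda\left(\frac{Y_n(\lambda)}{\bar{\sigma}(\lambda)} \le \frac{y\sigma}{\bar{\sigma}(\lambda)}\right) - \Phi\left(\frac{y\sigma}{\bar{\sigma}(\lambda)}\right)\right| + \left|\Phi\left(\frac{y\sigma}{\bar{\sigma}(\lambda)}\right) - \Phi(y)\right| =: I_1 + I_2.$$
By Lemma 3.4 and (3.13), we get, for all $0 \le \lambda < \frac{1}{3}\epsilon^{-1}$,
$$I_1 \le 13.44\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\lambda)(1-\lambda\epsilon)^4} \le 13.44\,R(\lambda\epsilon)\,\frac{\epsilon}{\sigma},$$
where $R(t)$ is defined by (2.9). Using Taylor's expansion and (3.13), we obtain, for all $0 \le \lambda < \frac{1}{3}\epsilon^{-1}$, an explicit bound on $I_2$ in terms of $\lambda\epsilon$ […]. By simple calculations, we obtain, for all $0 \le \lambda \le 0.1\,\epsilon^{-1}$,
$$\left|P_\lambda\left(Y_n(\lambda) \le y\sigma\right) - \Phi(y)\right| \le 1.07\,\lambda\epsilon + 42.45\,\frac{\epsilon}{\sigma}.$$
This completes the proof of Lemma 3.5.

4. Proofs of Theorems 2.1-2.3

In this section, we give upper bounds for $P(S_n > x\sigma)$. For all $x \ge 0$ and $0 \le \lambda < \epsilon^{-1}$, by (3.1) and (3.2), we have:
$$P(S_n > x\sigma) = E_\lambda Z_n(\lambda)^{-1}\mathbf{1}_{\{S_n > x\sigma\}} = E_\lambda e^{-\lambda S_n + \Psi_n(\lambda)}\mathbf{1}_{\{S_n > x\sigma\}} = E_\lambda e^{-\lambda B_n(\lambda) + \Psi_n(\lambda) - \lambda Y_n(\lambda)}\mathbf{1}_{\{Y_n(\lambda) + B_n(\lambda) - x\sigma > 0\}}. \tag{4.1}$$
Setting $U_n(\lambda) = \lambda\left(Y_n(\lambda) + B_n(\lambda) - x\sigma\right)$, we get
$$P(S_n > x\sigma) = e^{-\lambda x\sigma + \Psi_n(\lambda)}\,E_\lambda e^{-U_n(\lambda)}\mathbf{1}_{\{U_n(\lambda) > 0\}}.$$
Since, by Fubini's theorem, for any real random variable $U$,
$$Ee^{-U}\mathbf{1}_{\{U > 0\}} = \int_0^\infty e^{-t}P(0 < U \le t)\,dt,$$
we deduce, for all $x \ge 0$ and $0 \le \lambda < \epsilon^{-1}$,
$$P(S_n > x\sigma) = e^{-\lambda x\sigma + \Psi_n(\lambda)}\int_0^\infty e^{-t}P_\lambda\left(0 < U_n(\lambda) \le t\right)dt. \tag{4.2}$$
In the following, $N(0,1)$ denotes a standard normal random variable.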
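The Fubini identity $Ee^{-U}\mathbf{1}_{\{U>0\}} = \int_0^\infty e^{-t}P(0 < U \le t)\,dt$ used above can be checked numerically in the Gaussian case $U = xN$ with $N$ standard normal and $x > 0$, where the left-hand side equals $M(x) = (1-\Phi(x))e^{x^2/2}$. The sketch below is only a sanity check under that assumption; the function names, the truncation point and the step count are illustrative choices, not part of the paper.

```python
import math

SQRT2 = math.sqrt(2.0)


def phi_cdf(y: float) -> float:
    """Standard normal distribution function Phi(y)."""
    return 0.5 * (1.0 + math.erf(y / SQRT2))


def mills_m(x: float) -> float:
    """M(x) = (1 - Phi(x)) * exp(x^2 / 2)."""
    return 0.5 * math.erfc(x / SQRT2) * math.exp(0.5 * x * x)


def fubini_side(x: float, upper: float = 40.0, steps: int = 100_000) -> float:
    """Trapezoidal approximation of int_0^upper e^{-t} P(0 < x N <= t) dt."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        # P(0 < x N <= t) = Phi(t / x) - 1/2 for x > 0.
        value = math.exp(-t) * (phi_cdf(t / x) - 0.5)
        total += value if 0 < k < steps else 0.5 * value
    return total * h


if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0):
        print(x, mills_m(x), fubini_side(x))
```

The two columns printed for each $x$ should agree to several decimal places, which is exactly the step that turns the tail probability into an integral against the law of $U_n(\lambda)$.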

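4.1. Proof of Theorem 2.1

From (4.2), using Lemma 3.2, we obtain, for all $x \ge 0$ and $0 \le \lambda < \epsilon^{-1}$,
$$P(S_n > x\sigma) \le e^{-\lambda x\sigma + \frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)}}\int_0^\infty e^{-t}P_\lambda\left(0 < U_n(\lambda) \le t\right)dt. \tag{4.3}$$
For any $x \ge 0$ and $\beta \in [0,0.5)$, let $\bar{\lambda} = \bar{\lambda}(x) \in [0,\epsilon^{-1})$ be the unique solution of the equation
$$\frac{\lambda - \beta\lambda^2\epsilon}{(1-\lambda\epsilon)^2}\,\sigma^2 = x\sigma.$$
This definition and Lemma 3.1 imply that
$$\bar{\lambda} = \frac{2x/\sigma}{1 + 2x\epsilon/\sigma + \sqrt{1+4(1-\beta)x\epsilon/\sigma}} \quad \text{and} \quad B_n(\bar{\lambda}) \le x\sigma. \tag{4.4}$$
Using (4.3) with $\lambda = \bar{\lambda}$, we get
$$P(S_n > x\sigma) \le e^{-\left(1+(1-2\beta)\bar{\lambda}\epsilon\right)\tilde{x}^2/2}\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt, \tag{4.5}$$
where $\tilde{x} = \frac{\bar{\lambda}\sigma}{1-\bar{\lambda}\epsilon}$. By (4.4) and Lemma 3.5, we have, for $0 \le \bar{\lambda} \le 0.1\,\epsilon^{-1}$,
$$\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt = \int_0^\infty e^{-y\tilde{x}}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le y\tilde{x}\right)\tilde{x}\,dy$$
$$\le \int_0^\infty e^{-y\tilde{x}}P\left(0 < N(0,1) \le y\right)\tilde{x}\,dy + 2\left(1.07\,\bar{\lambda}\epsilon + 42.45\,\frac{\epsilon}{\sigma}\right)$$
$$= \int_0^\infty e^{-y\tilde{x}}\,d\Phi(y) + 2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma} = M(\tilde{x}) + 2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}. \tag{4.6}$$
Since $\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt \le 1$ and $1-\Phi(t) \ge \frac{1}{\sqrt{2\pi}(1+t)}\exp\left\{-\frac{t^2}{2}\right\}$ (cf. (2.16)), combining (4.5) and (4.6), we deduce, for all $x \ge 0$,
$$P(S_n > x\sigma) \le e^{-\left(1+(1-2\beta)\bar{\lambda}\epsilon\right)\tilde{x}^2/2}\,\mathbf{1}_{\{\bar{\lambda}\epsilon > 0.1\}} + e^{-(1-2\beta)\bar{\lambda}\epsilon\,\tilde{x}^2/2}\left[1-\Phi(\tilde{x}) + e^{-\tilde{x}^2/2}\left(2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right)\right]\mathbf{1}_{\{\bar{\lambda}\epsilon \le 0.1\}} \le \left(1-\Phi(\tilde{x})\right)(I_1 + I_2), \tag{4.7}$$
with
$$I_1 = \exp\left\{-(1-2\beta)\bar{\lambda}\epsilon\,\frac{\tilde{x}^2}{2}\right\}\left[1 + \sqrt{2\pi}\,(1+\tilde{x})\right]\mathbf{1}_{\{\bar{\lambda}\epsilon > 0.1\}} \tag{4.8}$$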

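and
$$I_2 = e^{-(1-2\beta)\bar{\lambda}\epsilon\,\tilde{x}^2/2}\left[1 + \sqrt{2\pi}\,(1+\tilde{x})\left(2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right)\right]\mathbf{1}_{\{\bar{\lambda}\epsilon \le 0.1\}}.$$
Now we shall give estimates for $I_1$ and $I_2$. If $\bar{\lambda}\epsilon > 0.1$, then $I_2 = 0$ and
$$I_1 \le \exp\left\{-0.1\,(1-2\beta)\frac{\tilde{x}^2}{2}\right\}\left[1 + \sqrt{2\pi}\,(1+\tilde{x})\right]. \tag{4.9}$$
By a simple calculation, $I_1 \le 1$ provided that $\tilde{x} \ge \frac{8}{1-2\beta}$ (note that $\beta \in [0,0.5)$). For $0 \le \tilde{x} < \frac{8}{1-2\beta}$, we get $\bar{\lambda} = \frac{\tilde{x}}{\sigma}(1-\bar{\lambda}\epsilon) < \frac{8(1-0.1)}{(1-2\beta)\sigma} = \frac{7.2}{(1-2\beta)\sigma}$. Then, using $10\,\bar{\lambda}\epsilon > 1$, we obtain
$$I_1 \le 1 + \sqrt{2\pi}\,(1+\tilde{x}) \le 1 + 10\sqrt{2\pi}\,(1+\tilde{x})\,\bar{\lambda}\epsilon < 1 + \frac{72\sqrt{2\pi}}{1-2\beta}\,(1+\tilde{x})\frac{\epsilon}{\sigma} \le 1 + \frac{180.48}{1-2\beta}\,(1+\tilde{x})\frac{\epsilon}{\sigma}.$$
If $\bar{\lambda}\epsilon \le 0.1$, we have $I_1 = 0$. Since
$$1 + \sqrt{2\pi}\,(1+\tilde{x})\left(2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right) \le \left(1 + 2.14\sqrt{2\pi}\,(1+\tilde{x})\,\bar{\lambda}\epsilon\right)\left(1 + 84.9\sqrt{2\pi}\,(1+\tilde{x})\frac{\epsilon}{\sigma}\right) =: J_1 J_2,$$
it follows that $I_2 \le \exp\left\{-(1-2\beta)\bar{\lambda}\epsilon\,\frac{\tilde{x}^2}{2}\right\}J_1 J_2$. Using the inequality $1+x \le e^x$, we deduce
$$I_2 \le \exp\left\{-\bar{\lambda}\epsilon\left((1-2\beta)\frac{\tilde{x}^2}{2} - 2.14\sqrt{2\pi}\,(1+\tilde{x})\right)\right\}J_2.$$
If $\tilde{x} \ge \frac{11.65}{1-2\beta}$, we see that $(1-2\beta)\frac{\tilde{x}^2}{2} - 2.14\sqrt{2\pi}\,(1+\tilde{x}) \ge 0$, so $I_2 \le J_2$. For $0 \le \tilde{x} < \frac{11.65}{1-2\beta}$, we get $\bar{\lambda} = \frac{\tilde{x}}{\sigma}(1-\bar{\lambda}\epsilon) < \frac{11.65}{(1-2\beta)\sigma}$. Then
$$I_2 \le 1 + \sqrt{2\pi}\,(1+\tilde{x})\left(2.14\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right) < 1 + \sqrt{2\pi}\,(1+\tilde{x})\left(\frac{2.14 \times 11.65}{1-2\beta} + 84.9\right)\frac{\epsilon}{\sigma} \le 1 + \left(\frac{62.493}{1-2\beta} + 212.823\right)(1+\tilde{x})\frac{\epsilon}{\sigma}.$$
Hence, whenever $0 \le \bar{\lambda} < \epsilon^{-1}$, we have
$$I_1 + I_2 \le 1 + \left[\left(\frac{62.493}{1-2\beta} + 212.823\right) \vee \frac{180.48}{1-2\beta}\right](1+\tilde{x})\frac{\epsilon}{\sigma}. \tag{4.10}$$
Therefore, substituting $\bar{\lambda}$ from (4.4) in the expression $\tilde{x} = \frac{\bar{\lambda}\sigma}{1-\bar{\lambda}\epsilon}$ and replacing $1-2\beta$ by $\delta$, we obtain, from (4.7) and (4.10), inequality (2.1) in Theorem 2.1.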
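The numerical constants in this proof come from a handful of elementary computations, which the following sketch reproduces for the case $1-2\beta = 1$. It assumes the values $2.14$, $84.9$, $8$ and $11.65$ as they can be read from this text; the helper name `threshold_root` is hypothetical.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def threshold_root(c: float) -> float:
    """Positive root of t^2 / 2 = c * (1 + t), i.e. t = c + sqrt(c^2 + 2c)."""
    return c + math.sqrt(c * c + 2.0 * c)


def i1_at(x_tilde: float) -> float:
    """exp(-0.1 * x^2 / 2) * (1 + sqrt(2 pi) * (1 + x)), the bound (4.9) with 1 - 2*beta = 1."""
    return math.exp(-0.05 * x_tilde ** 2) * (1.0 + SQRT_2PI * (1.0 + x_tilde))


if __name__ == "__main__":
    # Threshold beyond which x^2 / 2 >= 2.14 * sqrt(2 pi) * (1 + x): about 11.649.
    print(threshold_root(2.14 * SQRT_2PI))
    # Constants entering C_delta.
    print(2.14 * 11.65 * SQRT_2PI)  # about 62.493
    print(84.9 * SQRT_2PI)          # about 212.81
    print(72.0 * SQRT_2PI)          # about 180.48
    # The bound (4.9) drops below 1 at x_tilde = 8.
    print(i1_at(8.0))
```

The output is consistent with the thresholds $8$ and $11.65$ and with the constants $62.493$ and $180.48$ appearing in $C_\delta$.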

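4.2. Proof of Theorem 2.2

For any $x \ge 0$, let $\bar{\lambda} = \bar{\lambda}(x) \in [0,\epsilon^{-1})$ be the unique solution of the equation
$$\frac{\lambda - 0.5\,\lambda^2\epsilon}{(1-\lambda\epsilon)^2}\,\sigma^2 = x\sigma. \tag{4.11}$$
By Lemma 3.1, it follows that
$$\bar{\lambda} = \frac{2x/\sigma}{1 + 2x\epsilon/\sigma + \sqrt{1+2x\epsilon/\sigma}} \quad \text{and} \quad B_n(\bar{\lambda}) \le x\sigma. \tag{4.12}$$
Employing (4.3) with $\lambda = \bar{\lambda}$, we get
$$P(S_n > x\sigma) \le \exp\left\{-\frac{\hat{x}^2}{2}\right\}\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt, \tag{4.13}$$
where $\hat{x} = \frac{\bar{\lambda}\sigma}{1-\bar{\lambda}\epsilon}$. Using Lemma 3.5 and $B_n(\bar{\lambda}) \le x\sigma$ (cf. (4.12)), we have
$$\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt = \int_0^\infty e^{-y\hat{x}}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le y\hat{x}\right)\hat{x}\,dy$$
$$\le \int_0^\infty e^{-y\hat{x}}P\left(0 < N(0,1) \le y\right)\hat{x}\,dy + 2\left(1.07\,\bar{\lambda}\epsilon + 42.45\,\frac{\epsilon}{\sigma}\right)\mathbf{1}_{\{\bar{\lambda}\epsilon \le 0.1\}} + \mathbf{1}_{\{\bar{\lambda}\epsilon > 0.1\}}$$
$$\le \int_0^\infty e^{-y\hat{x}}\,d\Phi(y) + 10\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma} = M(\hat{x}) + 10\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}. \tag{4.14}$$
Combining (4.13) and (4.14), we obtain, for all $x \ge 0$,
$$P(S_n > x\sigma) \le 1-\Phi(\hat{x}) + \exp\left\{-\frac{\hat{x}^2}{2}\right\}\left(10\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right) = \left(1-\Phi(\hat{x})\right)\left[1 + \frac{1}{M(\hat{x})}\left(10\,\bar{\lambda}\epsilon + 84.9\,\frac{\epsilon}{\sigma}\right)\right].$$
Substituting $\bar{\lambda}$ from (4.12) in the expression $\hat{x} = \frac{\bar{\lambda}\sigma}{1-\bar{\lambda}\epsilon}$, we get, for all $x \ge 0$,
$$P(S_n > x\sigma) \le \left(1-\Phi(\hat{x})\right)\left[1 + A\left(x,\frac{\epsilon}{\sigma}\right)\right], \tag{4.15}$$
where
$$\hat{x} = \frac{2x}{1+\sqrt{1+2x\epsilon/\sigma}} \quad \text{and} \quad A\left(x,\frac{\epsilon}{\sigma}\right) = \left(\frac{10x}{1+x\epsilon/\sigma} + 84.9\right)\frac{\epsilon}{\sigma\,M(\hat{x})}.$$
This completes the proof of Theorem 2.2.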

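4.3. Proof of Theorem 2.3

Let $\bar{\lambda}$ be defined by (4.11). Using Lemma 3.4 and $B_n(\bar{\lambda}) \le x\sigma$, we have, for all $0 \le \bar{\lambda} < \epsilon^{-1}$,
$$\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt = \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le y\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right)\bar{\lambda}\bar{\sigma}(\bar{\lambda})\,dy$$
$$\le \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}P\left(0 < N(0,1) \le y\right)\bar{\lambda}\bar{\sigma}(\bar{\lambda})\,dy + 2 \cdot 13.44\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}$$
$$= \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}\,d\Phi(y) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} = M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}. \tag{4.16}$$
Using $\lambda = \bar{\lambda}$ and $\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt \le 1$, from (4.2) and (4.16), we obtain
$$P(S_n > x\sigma) \le \exp\left\{-\bar{\lambda}x\sigma + \Psi_n(\bar{\lambda})\right\}\left[\left(M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}\right) \wedge 1\right]. \tag{4.17}$$
By Lemma 3.2, inequality (4.17) implies that
$$P(S_n > x\sigma) \le \exp\left\{-\bar{\lambda}x\sigma + n\log\left(1 + \frac{\bar{\lambda}^2\sigma^2}{2n(1-\bar{\lambda}\epsilon)}\right)\right\}\left[\left(M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}\right) \wedge 1\right].$$
Substituting $\bar{\lambda}$ from (4.12) in the previous exponential function, we get
$$P(S_n > x\sigma) \le B_n\left(x,\frac{\epsilon}{\sigma}\right)\left[\left(M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}\right) \wedge 1\right]. \tag{4.18}$$
Since $M(t)$ is decreasing in $t \ge 0$ and $|M'(t)| \le \frac{1}{\sqrt{\pi}\,t^2}$, $t > 0$, it follows that
$$M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) - M(x) \le \frac{\left(x - \bar{\lambda}\bar{\sigma}(\bar{\lambda})\right)^+}{\sqrt{\pi}\,\bar{\lambda}^2\bar{\sigma}^2(\bar{\lambda})}.$$
Using Lemma 3.3, we deduce
$$M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) - M(x) \le \frac{\bar{\lambda}\sigma}{\sqrt{\pi}\,\bar{\lambda}^2\bar{\sigma}^2(\bar{\lambda})}\left(\frac{1-0.5\,\bar{\lambda}\epsilon}{(1-\bar{\lambda}\epsilon)^2} - \frac{(1-\bar{\lambda}\epsilon)\sqrt{1-3\bar{\lambda}\epsilon}}{1-\bar{\lambda}\epsilon+6\bar{\lambda}^2\epsilon^2}\right)^+ \le 1.11\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}. \tag{4.19}$$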
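Several constants quoted in the statements (such as $42.45$, $88.41$, $221.63$, $682.89$ and $1753.23$) are obtained by evaluating the function $R(t) = \frac{(1-t+6t^2)^3}{(1-3t)^{3/2}(1-t)^7}$ at specific points. The sketch below recomputes them, assuming the prefactors $13.44$, $27.99$, $70.17$ and $67.38$ as they can be read from this text; the function name `r_func` is hypothetical.

```python
import math


def r_func(t: float) -> float:
    """R(t) = (1 - t + 6 t^2)^3 / ((1 - 3t)^(3/2) * (1 - t)^7) for 0 <= t < 1/3, else infinity."""
    if t < 0.0:
        raise ValueError("R(t) is only defined for t >= 0")
    if t >= 1.0 / 3.0:
        return math.inf
    return (1.0 - t + 6.0 * t * t) ** 3 / ((1.0 - 3.0 * t) ** 1.5 * (1.0 - t) ** 7)


if __name__ == "__main__":
    print(24 * 0.56)                  # Berry-Esseen step of Lemma 3.4: 13.44
    print(13.44 * r_func(0.1))        # about 42.45 (Lemma 3.5)
    print(27.99 * r_func(0.1))        # about 88.40 (Theorems 2.3 and 2.5)
    print(70.17 * r_func(0.1))        # about 221.63 (Corollary 2.1)
    print(67.38 * r_func(1.0 / 6.0))  # about 682.88 (Theorem 2.4 with alpha = 0.1)
    print(67.38 * r_func(1.0 / 4.8))  # about 1753.2 (Theorem 2.4, worst case)
```

Since $R$ is increasing on $[0, \frac{1}{3})$, evaluating it at the right endpoint of the admissible range of $\lambda\epsilon$ gives the uniform constants used in the statements.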

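By Lemma 3.3, it is easy to see that
$$26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} \le 26.88\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}. \tag{4.20}$$
Hence,
$$M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} \le M(x) + 27.99\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}. \tag{4.21}$$
Implementing (4.21) into (4.18) and using $\bar{\lambda} \le \frac{x}{\sigma}$, we obtain inequality (2.7).

5. Proof of Theorem 2.4

In this section, we give a lower bound for $P(S_n > x\sigma)$. From Lemma 3.2 and (4.1), it follows that, for all $0 \le \lambda < \epsilon^{-1}$,
$$P(S_n > x\sigma) \ge \exp\left\{-\frac{\lambda^2\sigma^2}{2(1-\lambda\epsilon)^6}\right\}E_\lambda e^{-\lambda Y_n(\lambda)}\mathbf{1}_{\{Y_n(\lambda) + B_n(\lambda) - x\sigma > 0\}}.$$
Let $\underline{\lambda} = \underline{\lambda}(x) \in [0,\epsilon^{-1}/4.8]$ be the unique solution of the equation
$$\lambda(1-2.4\,\lambda\epsilon)\,\sigma^2 = x\sigma. \tag{5.1}$$
This definition and Lemma 3.1 imply that, for all $0 \le x \le \sigma/(9.6\,\epsilon)$,
$$\underline{\lambda} = \frac{2x/\sigma}{1+\sqrt{1-9.6\,x\epsilon/\sigma}} \quad \text{and} \quad x\sigma \le B_n(\underline{\lambda}). \tag{5.2}$$
Therefore,
$$P(S_n > x\sigma) \ge \exp\left\{-\frac{\underline{\lambda}^2\sigma^2}{2(1-\underline{\lambda}\epsilon)^6}\right\}E_{\underline{\lambda}}\,e^{-\underline{\lambda}Y_n(\underline{\lambda})}\mathbf{1}_{\{Y_n(\underline{\lambda}) > 0\}}.$$
Setting $V_n(\underline{\lambda}) = \underline{\lambda}Y_n(\underline{\lambda})$, we get
$$P(S_n > x\sigma) \ge \exp\left\{-\frac{\check{x}^2}{2}\right\}\int_0^\infty e^{-t}P_{\underline{\lambda}}\left(0 < V_n(\underline{\lambda}) \le t\right)dt, \tag{5.3}$$
where $\check{x} = \frac{\underline{\lambda}\sigma}{(1-\underline{\lambda}\epsilon)^3}$. By Lemma 3.4, it is easy to see that
$$\int_0^\infty e^{-t}P_{\underline{\lambda}}\left(0 < V_n(\underline{\lambda}) \le t\right)dt = \int_0^\infty e^{-y\underline{\lambda}\bar{\sigma}(\underline{\lambda})}P_{\underline{\lambda}}\left(0 < V_n(\underline{\lambda}) \le y\underline{\lambda}\bar{\sigma}(\underline{\lambda})\right)\underline{\lambda}\bar{\sigma}(\underline{\lambda})\,dy$$
$$\ge \int_0^\infty e^{-y\underline{\lambda}\bar{\sigma}(\underline{\lambda})}P\left(0 < N(0,1) \le y\right)\underline{\lambda}\bar{\sigma}(\underline{\lambda})\,dy - 2 \cdot 13.44\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4}$$
$$= \int_0^\infty e^{-y\underline{\lambda}\bar{\sigma}(\underline{\lambda})}\,d\Phi(y) - 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4} = M\left(\underline{\lambda}\bar{\sigma}(\underline{\lambda})\right) - 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4}.$$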

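Since $M(t)$ is decreasing in $t \ge 0$ and $\bar{\sigma}(\underline{\lambda}) \le \frac{\sigma}{(1-\underline{\lambda}\epsilon)^3}$ (cf. Lemma 3.3), it follows that
$$\int_0^\infty e^{-t}P_{\underline{\lambda}}\left(0 < V_n(\underline{\lambda}) \le t\right)dt \ge M(\check{x}) - 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4}.$$
Returning to (5.3), we obtain
$$P(S_n > x\sigma) \ge 1-\Phi(\check{x}) - 26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4}\exp\left\{-\frac{\check{x}^2}{2}\right\}.$$
Using Lemma 3.3, for all $0 \le x \le \sigma/(9.6\,\epsilon)$, we have $0 \le \underline{\lambda}\epsilon \le 1/4.8$ and
$$\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\underline{\lambda})(1-\underline{\lambda}\epsilon)^4} \le \frac{(1-\underline{\lambda}\epsilon+6\underline{\lambda}^2\epsilon^2)^3}{(1-3\underline{\lambda}\epsilon)^{3/2}(1-\underline{\lambda}\epsilon)^7}\,\frac{\epsilon}{\sigma} = R(\underline{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}.$$
Therefore, for all $0 \le x \le \sigma/(9.6\,\epsilon)$,
$$P(S_n > x\sigma) \ge 1-\Phi(\check{x}) - 26.88\,R(\underline{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}\exp\left\{-\frac{\check{x}^2}{2}\right\}.$$
Using the inequality $1-\Phi(t) \ge \frac{1}{\sqrt{2\pi}(1+t)}\exp\left\{-\frac{t^2}{2}\right\}$ for $t \ge 0$, we get, for all $0 \le x \le \sigma/(9.6\,\epsilon)$,
$$P(S_n > x\sigma) \ge \left(1-\Phi(\check{x})\right)\left[1 - 67.38\,R(\underline{\lambda}\epsilon)\,(1+\check{x})\frac{\epsilon}{\sigma}\right].$$
In particular, for all $0 \le x \le \alpha\sigma/\epsilon$ with $0 \le \alpha \le 1/9.6$, a simple calculation shows that
$$0 \le \underline{\lambda}\epsilon \le \frac{2\alpha}{1+\sqrt{1-9.6\alpha}} \le \frac{1}{4.8}$$
and
$$67.38\,R(\underline{\lambda}\epsilon) \le 67.38\,R\left(\frac{2\alpha}{1+\sqrt{1-9.6\alpha}}\right) \le 67.38\,R\left(\frac{1}{4.8}\right) \le 1753.23.$$
This completes the proof of Theorem 2.4.

6. Proof of Theorem 2.5

We will use (4.2). Notice that $\Psi_n'(\lambda) \in [0,\infty)$ is increasing in $\lambda$. Let $\bar{\lambda} = \bar{\lambda}(x) \ge 0$ be the unique solution of the equation $x\sigma = \Psi_n'(\lambda)$. This definition implies that $B_n(\bar{\lambda}) = \Psi_n'(\bar{\lambda}) = x\sigma$, $U_n(\bar{\lambda}) = \bar{\lambda}Y_n(\bar{\lambda})$ and
$$e^{-\bar{\lambda}x\sigma + \Psi_n(\bar{\lambda})} = \inf_{\lambda \ge 0} e^{-\lambda x\sigma + \Psi_n(\lambda)} = \inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}. \tag{6.1}$$
Using Lemma 3.4 with $\lambda = \bar{\lambda}$, we have
$$\int_0^\infty e^{-t}P_{\bar{\lambda}}\left(0 < U_n(\bar{\lambda}) \le t\right)dt = \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}P_{\bar{\lambda}}\left(0 < Y_n(\bar{\lambda}) \le y\bar{\sigma}(\bar{\lambda})\right)\bar{\lambda}\bar{\sigma}(\bar{\lambda})\,dy$$
$$= \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}P\left(0 < N(0,1) \le y\right)\bar{\lambda}\bar{\sigma}(\bar{\lambda})\,dy + 2 \cdot 13.44\,\theta\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}$$
$$= \int_0^\infty e^{-y\bar{\lambda}\bar{\sigma}(\bar{\lambda})}\,d\Phi(y) + 26.88\,\theta\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} = M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\theta\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4},$$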

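where $|\theta| \le 1$. Therefore by (4.2), we obtain
$$P(S_n > x\sigma) = \left(M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\theta\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4}\right)\inf_{\lambda \ge 0} Ee^{\lambda(S_n - x\sigma)}. \tag{6.2}$$
Since $M(t)$ is decreasing in $t \ge 0$ and $|M'(t)| \le \frac{1}{\sqrt{\pi}\,t^2}$, $t > 0$, it follows that
$$\left|M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) - M(x)\right| \le \frac{\left|x - \bar{\lambda}\bar{\sigma}(\bar{\lambda})\right|}{\sqrt{\pi}\left(\bar{\lambda}^2\bar{\sigma}^2(\bar{\lambda}) \wedge x^2\right)}. \tag{6.3}$$
By Lemma 3.1, we have the following two-sided bound:
$$\frac{(1-1.5\,\bar{\lambda}\epsilon)(1-\bar{\lambda}\epsilon)}{1-\bar{\lambda}\epsilon+6\bar{\lambda}^2\epsilon^2}\,\bar{\lambda}\sigma \le \frac{B_n(\bar{\lambda})}{\sigma} = x \le \frac{1-0.5\,\bar{\lambda}\epsilon}{(1-\bar{\lambda}\epsilon)^2}\,\bar{\lambda}\sigma. \tag{6.4}$$
Using the two-sided bound in Lemma 3.3 and (6.4), by a simple calculation, we deduce
$$\bar{\lambda}^2\bar{\sigma}^2(\bar{\lambda}) \wedge x^2 \ge \frac{(1-\bar{\lambda}\epsilon)^2(1-3\bar{\lambda}\epsilon)}{(1-\bar{\lambda}\epsilon+6\bar{\lambda}^2\epsilon^2)^2}\,\bar{\lambda}^2\sigma^2 \tag{6.5}$$
and
$$\left|x - \bar{\lambda}\bar{\sigma}(\bar{\lambda})\right| \le \bar{\lambda}\sigma\left(\frac{1-0.5\,\bar{\lambda}\epsilon}{(1-\bar{\lambda}\epsilon)^2} - \frac{(1-\bar{\lambda}\epsilon)\sqrt{1-3\bar{\lambda}\epsilon}}{1-\bar{\lambda}\epsilon+6\bar{\lambda}^2\epsilon^2}\right)^+. \tag{6.6}$$
From (6.3), (6.5), (6.6) and Lemma 3.3, we easily obtain
$$\left|M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) - M(x)\right| \le 1.11\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}. \tag{6.7}$$
By Lemma 3.3, it is easy to see that
$$26.88\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} \le 26.88\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}. \tag{6.8}$$
Combining (6.7) and (6.8), we get, for all $0 \le \bar{\lambda} < \frac{1}{3}\epsilon^{-1}$,
$$M\left(\bar{\lambda}\bar{\sigma}(\bar{\lambda})\right) + 26.88\,\theta\,\frac{\sigma^2\epsilon}{\bar{\sigma}^3(\bar{\lambda})(1-\bar{\lambda}\epsilon)^4} = M(x) + 27.99\,\theta_1\,R(\bar{\lambda}\epsilon)\,\frac{\epsilon}{\sigma}, \tag{6.9}$$
where $|\theta_1| \le 1$. Implementing (6.9) into (6.2) and using $\bar{\lambda} \le \frac{x}{\sigma}$, we obtain equality (2.13) of Theorem 2.5.

References

[1] Bennett, G. (1962). Probability inequalities for the sum of independent random variables. J. Amer. Statist. Assoc. 57, 33–45.
[2] Bentkus, V. (2004). On Hoeffding's inequality. Ann. Probab. 32, 1650–1673.
[3] Bernstein, S. N. (1946). The Theory of Probabilities. Moscow, Leningrad.
[4] Cramér, H. (1938). Sur un nouveau théorème-limite de la théorie des probabilités. Actualités Sci. Indust. 736, 5–23.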

[5] Dedecker, J. and Prieur, C. (2004). Coupling for τ-dependent sequences and applications. J. Theoret. Probab. 17, 861–885.
[6] Doukhan, P. and Neumann, M. H. (2007). Probability and moment inequalities for sums of weakly dependent random variables, with applications. Stochastic Process. Appl. 117, 878–903.
[7] Eaton, M. L. (1974). A probability inequality for linear combinations of bounded random variables. Ann. Statist. 2, No. 3, 609–614.
[8] Fan, X., Grama, I. and Liu, Q. (2012). Hoeffding's inequality for supermartingales. Stochastic Process. Appl., accepted.
[9] Feller, W. (1971). An Introduction to Probability Theory and Its Applications. J. Wiley and Sons.
[10] Grama, I. and Haeusler, E. (2000). Large deviations for martingales via Cramér's method. Stochastic Process. Appl. 85, 279–293.
[11] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13–30.
[12] Nagaev, S. V. (1979). Large deviations of sums of independent random variables. Ann. Probab. 7, 745–789.
[13] Petrov, V. V. (1975). Sums of Independent Random Variables. Springer-Verlag, Berlin.
[14] Petrov, V. V. (1995). Limit Theorems of Probability Theory. Oxford University Press, Oxford.
[15] Pinelis, I. (1994). Extremal probabilistic problems and Hotelling's $T^2$ test under a symmetry condition. Ann. Statist. 22, 357–368.
[16] Pinelis, I. (2006). Binomial upper bounds on generalized moments and tail probabilities of (super)martingales with differences bounded from above. High Dimensional Probab. 51, 33–52.
[17] Rio, E. (2002). A Bennett type inequality for maxima of empirical processes. Ann. Inst. H. Poincaré Probab. Statist. 38, 1053–1057.
[18] Rio, E. (2012). About the rate function in Talagrand's inequality for empirical processes. C. R. Acad. Sci. Paris, Ser. I. To appear.
[19] Shevtsova, I. G. (2010). An improvement of convergence rate estimates in the Lyapunov theorem. Doklady Math. 82, 862–864.
[20] Statulevičius, V. A. (1966). On large deviations. Probab. Theory Relat. Fields 6, 133–144.
[21] Talagrand, M. (1995). The missing factor in Hoeffding's inequalities. Ann. Inst. H. Poincaré Probab. Statist. 31, 689–702.
[22] Talagrand, M. (1996). A new look at independence. Ann. Probab. 24, 1–34.