A square bias transformation: properties and applications


Irina Shevtsova

arXiv:1112.6775v4 [math.PR] 13 Dec 2013

Abstract

The properties of the square bias transformation are studied: in particular, a precise moment-type estimate for the L_1-metric between the transformed and the original distributions is proved, and a relation between their characteristic functions is found. As a corollary, some new moment-type estimates are proved for the proximity of an arbitrary characteristic function with zero mean and finite third moment to the normal one with zero mean and the same variance, involving double integrals of the square- and zero-bias transformations.

Key words and phrases: probability transformation, zero bias transformation, size bias transformation, square bias transformation, characteristic function, L_1-metric

AMS 2010 Mathematics Subject Classification: 60E10, 60E15

1 Introduction

Let X be a random variable (r.v.) with the distribution function (d.f.) F(x) = P(X < x), x ∈ R, and the characteristic function (ch.f.) f(t) = Ee^{itX} = ∫ e^{itx} dF(x), t ∈ R, which is the Fourier-Stieltjes transform of the d.f. F(x). As is well known, if X is nonnegative with 0 < EX < ∞, then

    f'(t)/f'(0),  t ∈ R,    (1)

is a ch.f., and if 0 < EX^2 < ∞, then

    f''(t)/f''(0)  and  (f'(t) - f'(0)) / (t f''(0))

Research supported by the Russian Foundation for Basic Research (projects 11-01-00515a, 11-07-00112a, 11-01-12026-ofi-m) and by the grant of the President of Russia (MK-2256.2012.1). Lomonosov Moscow State University, Faculty of Computational Mathematics and Cybernetics, Leninskie Gory, GSP-1, Moscow, 119991, Russia; Institute for Informatics Problems of the Russian Academy of Sciences; e-mail: ishevtsova@cs.msu.su

are ch.f.'s as well (see, e.g., [15, Theorem 12.2.5]). The probability transformation given by (1) is called the X-size bias transformation. By a transformation of a random variable we mean that of its distribution. The X-size bias transformation was introduced by Goldstein and Rinott [7] for the purpose of estimating the accuracy of the multivariate normal approximation to nonnegative random vectors under conditions of local dependence by Stein's method. Namely, in [7], for an almost surely nonnegative r.v. X with 0 < EX < ∞, a r.v. X^s is said to have the X-size biased distribution if

    dP(X^s < x) = x dF(x) / EX,  x ∈ R.    (2)

It is easy to see that the distribution given by (2) has the ch.f. given by (1); hence, by virtue of the uniqueness theorem, definitions (1) and (2) are equivalent. In the same paper Goldstein and Rinott also noticed that the distribution of X^s may be characterized by the relation EXG(X) = EX · EG(X^s), which should hold for all functions G such that E|XG(X)| < ∞.

As regards the second transformation, if EX = 0, then the distribution given by the ch.f.

    (f'(t) - f'(0)) / (t f''(0)) = -f'(t) / (σ^2 t),  t ≠ 0,

where σ^2 = EX^2 > 0, is called the X-zero biased distribution. This definition was introduced by Goldstein and Reinert in [5] in an equivalent form for the purpose of generalizing the size bias transformation to r.v.'s taking both positive and negative values, and was inspired by the characteristic property of the mean zero normal distribution as the unique fixed point of the zero bias transformation. Namely, in [5] a r.v. X^(z) is said to have the X-zero biased distribution if EX = 0 and

    EXG(X) = σ^2 EG'(X^(z))

for all absolutely continuous functions G for which EXG(X) exists. The zero bias transformation possesses the following elementary properties (most of them are noticed/proved in [5]):

1. The zero biased distribution is absolutely continuous and unimodal about zero with the probability density function p(x) = σ^{-2} EXI(X > x), x ∈ R, and the ch.f.

    Ee^{itX^(z)} = -f'(t) / (σ^2 t),  t ≠ 0.

2. X^(z) =_d X if and only if X has the normal distribution with zero mean [1, 5].

3. The zero bias transformation preserves symmetry.

4. σ^2 E(X^(z))^n = EX^{n+2}/(n+1) for n ∈ N; in particular, σ^2 EX^(z) = 0.5 EX^3.

5. If X = Y_1 + ... + Y_n, where Y_1, ..., Y_n are independent r.v.'s with zero means and EY_j^2 = σ_j^2 > 0 so that σ_1^2 + ... + σ_n^2 = σ^2, then X^(z) = X_I + Y_I^(z), where I is a random index independent of Y_1, ..., Y_n with the distribution P(I = i) = σ_i^2/σ^2, i = 1, ..., n, and X_i = X - Y_i = Σ_{j≠i} Y_j.

6. The following non-trivial estimate was proved in 2009 independently by Goldstein [4] and Tyurin [23, 22]:

    L_1(X, X^(z)) ≤ E|X|^3 / (2σ^2),    (3)

L_1(X, Y) being the L_1-distance between the r.v.'s X and Y:

    L_1(X, Y) = inf { E|X' - Y'| : X' =_d X, Y' =_d Y },  E|X| < ∞, E|Y| < ∞.

As regards the third transformation, the transformation given by the characteristic function

    f''(t) / f''(0) = -f''(t) / σ^2,  t ∈ R,    (4)

where σ^2 = EX^2 ∈ (0, ∞), is called the X-square bias transformation. It is easy to see that a r.v. X^□ has the ch.f. (4) if and only if

    EX^2 G(X) = σ^2 EG(X^□)    (5)

for all functions G such that E|X^2 G(X)| < ∞. In 2007 L. Goldstein [3] called the distribution of a r.v. X^□ satisfying (5) the X-square biased distribution. In 2011 L. Chen, L. Goldstein and Q.-M. Shao [1, Proposition 2.3] proved the following relation between the zero biased X^(z) and the square biased X^□ distributions of a symmetric r.v. X:

    X^(z) =_d UX^□,

where the r.v.'s U, X^□ are independent, U having the uniform distribution on [-1, 1]. Taking into account that the square bias transformation preserves symmetry (see below), the latter relation remains valid with U having the uniform distribution on [0, 1].

Later in [19] it was noticed that X^□ =_d (X^s)^s, and the distribution of X^□ obtained a second name: the X-double size bias distribution. In the same paper the following characterization of the modulus of a normally distributed r.v. was proved: X =_d U_1 X^□, where U_1 has the uniform distribution on [0, 1] and is independent of X^□, if and only if X =_d |Z|, where Z has the standard normal distribution.

It is easy to see that the square bias transformation possesses the following elementary properties.

1. A r.v. X^□ has the X-square biased distribution if and only if its d.f. F^□(x) satisfies

    dF^□(x) = x^2 dF(x) / σ^2,  x ∈ R.    (6)

2. X^□ =_d X if and only if P(|X| = σ) = 1, i.e. any Bernoulli distribution with symmetric atoms ±σ is a fixed point of the square bias transformation. This can be verified by noticing that the solution of the corresponding second-order linear homogeneous differential equation f''(t) + σ^2 f(t) = 0 with the initial condition f(0) = 1 has the form f(t) = pe^{iσt} + (1 - p)e^{-iσt}, p ∈ R, which is a ch.f. if and only if p ∈ [0, 1].

3. The square bias transformation preserves symmetry. Indeed, if the r.v. X has a symmetric distribution, then its ch.f. f(t) is even, and hence f^□(-t) = f''(-t)/f''(0) = f''(t)/f''(0) = f^□(t), i.e. the distribution of X^□ is symmetric as well.

4. (X^□)^2 =_d (X^2)^s, where (X^2)^s has the X^2-size biased distribution.

5. (cX)^□ = cX^□ for any constant c ∈ R.

6. σ^2 E(X^□)^n = EX^{n+2}, n ∈ N, and σ^2 E|X^□|^r = E|X|^{r+2}, r > 0; in particular, σ^2 EX^□ = EX^3 and σ^2 E|X^□| = E|X|^3.

Moreover, the following estimate for the L_1-distance between the distributions of X and X^□ will be proved in this paper.

Theorem 1. If EX = 0, EX^2 = 1, and E|X|^3 < ∞, then

    L_1(X, X^□) ≤ E|X|^3;

moreover, for any ε > 0 there exists a distribution of a r.v. X concentrated in two points such that EX = 0, EX^2 = 1, E|X|^3 < ∞, and L_1(X, X^□) > (1 - ε)E|X|^3.

The existence of the square bias transformation follows from the earlier result of [15] mentioned above. Moreover, in 2005 Goldstein and Reinert [6] proved the existence of a class of transformations of probability distributions that are characterized by equations like (5). Namely, the authors described a class of measurable functions T: R → R that provide the existence and uniqueness of the distribution of a random variable X^(T) such that

    ET(X)G(X) = EG^(m)(X^(T)) · E[X^m T(X)] / m!

for all m times differentiable functions G: R → R with E|T(X)G(X)| < ∞. The authors of [6] also noticed that this class includes the zero- and size-bias transformations (respectively, with m = 1, T(x) = x and m = 0, T(x) = x_+). L. Goldstein [3] noticed that this class also includes the square bias transformation (with m = 0, T(x) = x^2). However, up till now the properties of the square bias transformation have not been studied; in particular, the characteristic function of the square biased distribution and the estimate for L_1(X, X^□) are established in this paper for the first time.

2 Motivation and applications

The zero bias transformation gives an opportunity to construct an integral estimate for the proximity of a ch.f. with zero mean to the normal one with the same variance in terms of the proximity of the corresponding zero biased distribution to the original one, which might be sharper than non-integral estimates based on the Taylor formula in the neighborhood of zero.
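This sharpening effect can be illustrated numerically. The sketch below (assuming numpy is available; the value β_3 = 3 and the grid of t-values are arbitrary illustrative choices) compares a once-integrated sine-type bound of the form (9) derived below with the Taylor-type bound β_3 t^3/6:

```python
import numpy as np

def integral_bound(t, beta3, n=200001):
    # Once-integrated bound of the form (9):
    #   r(t) <= 2 e^{-t^2/2} * int_0^t sin(min(beta3*u/4, pi/2)) * u * e^{u^2/2} du
    u = np.linspace(0.0, t, n)
    g = 2.0 * np.sin(np.minimum(beta3 * u / 4.0, np.pi / 2.0)) * u * np.exp(u * u / 2.0)
    # trapezoidal rule on a fine grid
    integral = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(u))
    return np.exp(-t * t / 2.0) * integral

beta3 = 3.0  # illustrative third absolute moment, beta3 >= 1
for t in (0.5, 1.0, 1.5, 2.0, 3.0):
    taylor = beta3 * t**3 / 6.0          # classical Taylor-type bound
    integ = integral_bound(t, beta3)
    print(f"t={t:4.1f}  integral bound={integ:.6f}  beta3*t^3/6={taylor:.6f}")
    assert integ < taylor                 # integral form is sharper here
```

As t → 0 the two bounds coincide asymptotically (both behave like β_3 t^3/6), while for moderate β_3 t the integral form is strictly smaller.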
Namely, for the sake of convenience put σ = 1, implying β_3 := E|X|^3 ≥ 1 by the Lyapounov inequality. Then, using the elementary relations

    f(t) - e^{-t^2/2} = e^{-t^2/2} ∫_0^t ( f(u)e^{u^2/2} )' du = e^{-t^2/2} ∫_0^t ( f'(u) + uf(u) ) e^{u^2/2} du =
    = e^{-t^2/2} ∫_0^t ( Ee^{iuX} - Ee^{iuX^(z)} ) u e^{u^2/2} du,    (7)

and the estimate for the difference of arbitrary ch.f.'s with finite first moments due to Korolev and Shevtsova [12]:

    |Ee^{itX} - Ee^{itY}| ≤ 2 sin( (|t| L_1(X,Y)/2) ∧ (π/2) ),  E|X|, E|Y| < ∞,  t ∈ R,    (8)

where a ∧ b := min{a, b}, a, b ∈ R, it is not difficult to conclude that

    r(t) := |f(t) - e^{-t^2/2}| ≤ e^{-t^2/2} ∫_0^{|t|} |Ee^{iuX} - Ee^{iuX^(z)}| u e^{u^2/2} du ≤
    ≤ 2 e^{-t^2/2} ∫_0^{|t|} sin( (u L_1(X, X^(z))/2) ∧ (π/2) ) u e^{u^2/2} du,  t ∈ R.

Finally, applying inequality (3) to estimate L_1(X, X^(z)), one obtains

    r(t) ≤ 2 e^{-t^2/2} ∫_0^{|t|} sin( (β_3 u/4) ∧ (π/2) ) u e^{u^2/2} du,  t ∈ R,    (9)

for any r.v. X with EX = 0, EX^2 = 1, E|X|^3 = β_3 < ∞. Estimate (9) is exact as t → 0 since, as is well known, r(t) ≤ β_3|t|^3/6, and (9) implies that for all β_3 ≥ 1 and t such that β_3|t| ≤ 2π we have

    2 ∫_0^{|t|} u sin(β_3 u/4) du = (32/β_3^2) ( sin(β_3|t|/4) - (β_3|t|/4) cos(β_3|t|/4) ) < β_3|t|^3/6,

with the least possible factor 1/6. However, estimate (9) is always sharper than the power-type estimate r(t) ≤ β_3|t|^3/6, especially for moderate (separated from zero) values of β_3|t|. Note that β_3|t| can be separated from zero for large enough values of β_3 ≥ 1 even if t is small. Thus, estimates for r(t) of the integral (9)-type form play an important role in the construction of the least possible upper moment-type bounds for the accuracy of the normal approximation which should be uniform in some classes of distributions, especially if the extremal distributions in these classes have large third absolute moments. This situation is typical, for example, for the problem of optimization of the absolute constants in the Berry-Esseen-type inequalities with an improved structure (see [10, 11, 12, 14, 20], where a smoothing inequality is applied with the subsequent estimation of the difference |f_n(t) - e^{-t^2/2}|, f_n(t) being the ch.f. of the normalized sum of independent random variables, in terms of the difference |f(t) - e^{-t^2/2}|, f(t) being the ch.f. of a single r.v.) and in its non-uniform analogues for sums of independent r.v.'s that use the Berry-Esseen inequality with an improved structure (see [2, 18, 8]), as well as in the moment-type estimates of the rate of convergence in limit theorems for compound and mixed compound Poisson distributions (where β_3 may be arbitrarily large; see [13, 10, 17]), which use the Berry-Esseen inequality with an improved structure as well.

The above reasoning suggests that for moderate values of β_3|t|, estimates for r(t) in the twice-integrated form might be even sharper than estimates in the once-integrated form like (9). Since the ch.f. f(t) is supposed to be differentiable at least twice, it is possible to continue (7) as

    f(t) - e^{-t^2/2} = e^{-t^2/2} ∫_0^t e^{u^2/2} ∫_0^u ( f'(s) + sf(s) )' ds du =
    = e^{-t^2/2} ∫_0^t e^{u^2/2} ∫_0^u ( f''(s) + f(s) + sf'(s) ) ds du =
    = e^{-t^2/2} ∫_0^t e^{u^2/2} ∫_0^u ( Ee^{isX} - Ee^{isX^□} + sf'(s) ) ds du,

or as

    f(t) - e^{-t^2/2} = e^{-t^2/2} ∫_0^t ∫_0^u ( f(s)e^{s^2/2} )'' ds du =
    = e^{-t^2/2} ∫_0^t ∫_0^u ( f''(s) + f(s) + sf'(s) + s( f'(s) + sf(s) ) ) e^{s^2/2} ds du =
    = e^{-t^2/2} ∫_0^t ∫_0^u ( Ee^{isX} - Ee^{isX^□} + sf'(s) + s^2 ( Ee^{isX} - Ee^{isX^(z)} ) ) e^{s^2/2} ds du.

Note that the second representation contains the additional term s^2(Ee^{isX} - Ee^{isX^(z)}), but the factor e^{s^2/2} does not exceed the analogous factor e^{u^2/2} in the first one. However, for all |s| ≤ |t| this additional term satisfies

    g_3(s) := s^2 |Ee^{isX} - Ee^{isX^(z)}| ≤ 2s^2 ≤ 2t^2 = O(t^2),  t → 0

(actually, an even sharper estimate can be obtained if inequalities (8) and (3) are used). If EX = 0, EX^2 = 1, then for all |s| ≤ |t| we have

    g_2(s) := |s f'(s)| = |s| |EXe^{isX} - EX| = |s| |EX(e^{isX} - 1)| ≤ s^2 EX^2 = s^2 ≤ t^2 = O(t^2),

so that

    sup_{|s| ≤ |t|} ( g_2(s) + g_3(s) ) = O(t^2),  t → 0,

while the first term

    g_1(s) := |Ee^{isX} - Ee^{isX^□}| = |f(s) + f''(s)|

should be bounded by a quantity equivalent to β_3|s| as s → 0+ in order that the final integrated estimate have the exact order β_3|t|^3/6 as t → 0. Thus, it is g_1(s) that determines the behavior of the final integral estimate for small values of s, and the problem of constructing the least possible bound for g_1(s) is very important. Theorem 1 gives an opportunity to construct such a bound. Namely, the following corollaries hold.

Corollary 1. Let X be a r.v. with the ch.f. f(t) and EX = 0, EX^2 = 1, β_3 := E|X|^3 < ∞. Then for all t ∈ R

    |Ee^{itX} - Ee^{itX^□}| = |f(t) + f''(t)| ≤ 2 sin( (β_3|t|/2) ∧ (π/2) ).

Corollary 2. Let X be a r.v. with the ch.f. f(t) and EX = 0, EX^2 = 1, β_3 := E|X|^3 < ∞. Then for all t ∈ R

    |f(t) - e^{-t^2/2}| ≤ e^{-t^2/2} min{ ∫_0^{|t|} ( 2 ∫_0^u sin( (β_3 s/2) ∧ (π/2) ) ds + u^3/3 ) e^{u^2/2} du,
    ∫_0^{|t|} ∫_0^u ( 2 sin( (β_3 s/2) ∧ (π/2) ) + 2s^2 sin( (β_3 s/4) ∧ (π/2) ) + s^2 ) e^{s^2/2} ds du }.
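Corollary 1 can be sanity-checked numerically. The sketch below (assuming numpy is available; the parameter p = 0.3 and the grid of t-values are illustrative choices) verifies the bound for an asymmetric two-point r.v. with EX = 0, EX^2 = 1 — the same family that turns out to be extremal in the proof of Theorem 1:

```python
import numpy as np

# Two-point r.v. with EX = 0, EX^2 = 1:
#   P(X = -sqrt(p/q)) = q,  P(X = sqrt(q/p)) = p,  q = 1 - p
p, q = 0.3, 0.7
a, b = -np.sqrt(p / q), np.sqrt(q / p)
beta3 = (p * p + q * q) / np.sqrt(p * q)   # E|X|^3 = (p^2 + q^2)/sqrt(pq)

t = np.linspace(0.0, 10.0, 4001)
f = q * np.exp(1j * t * a) + p * np.exp(1j * t * b)                      # ch.f. f(t)
f2 = -(q * a * a * np.exp(1j * t * a) + p * b * b * np.exp(1j * t * b))  # f''(t)

lhs = np.abs(f + f2)                                          # |Ee^{itX} - Ee^{itX□}|
rhs = 2.0 * np.sin(np.minimum(beta3 * t / 2.0, np.pi / 2.0))  # bound of Corollary 1
assert np.all(lhs <= rhs + 1e-12)
print("max slack of the bound:", np.max(rhs - lhs))
```

Here |f(t) + f''(t)| = |Ee^{itX} - Ee^{itX^□}| because, for σ = 1, the square-biased ch.f. equals -f''(t).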

Note that, as t → 0+, the right-hand sides of the inequalities presented in Corollary 2 are equivalent to

    2 ∫_0^t ∫_0^u sin(β_3 s/2) ds du = (8/β_3^2) ( β_3 t/2 - sin(β_3 t/2) ) <
    < (32/β_3^2) ( sin(β_3 t/4) - (β_3 t/4) cos(β_3 t/4) ) = 2 ∫_0^t u sin(β_3 u/4) du,

provided that β_3 t ≤ 2π. Thus, the estimates including the square bias transformation, which are presented in Corollary 2 in the twice-integrated form, are sharper as t → 0 than the estimates in the once-integrated form, which include the zero bias transformation only. So, Corollary 2 plays an important role in the estimation of the rate of convergence in the limit theorems for sums of independent random variables mentioned above. However, particular applications of Corollary 2 are the subject of a separate investigation and will be published elsewhere.

3 Proof of Theorem 1

As is known (see, e.g., [24, Theorem 1.3.1]), the L_1-metric can be represented in terms of the ζ_1-metric as

    L_1(X, Y) = ζ_1(X, Y) := sup{ |Eg(X) - Eg(Y)| : g ∈ F_1 },  E|X| < ∞, E|Y| < ∞,

where F_1 is the set of all real-valued functions g on R such that sup_{x ≠ y} |g(x) - g(y)|/|x - y| ≤ 1. Since for any function g ∈ F_1 we also have (-g) ∈ F_1, we conclude that the modulus in the definition of ζ_1(X, Y) can be omitted:

    L_1(X, X^□) = ζ_1(X, X^□) = sup{ Eg(X) - Eg(X^□) : g ∈ F_1 }.

Let X be a r.v. with the d.f. F(x) and EX = 0, EX^2 = 1, E|X|^3 < ∞, and let X^□ have the X-square biased distribution, i.e. the d.f. F^□(x) of the r.v. X^□ satisfy the relation dF^□(x) = x^2 dF(x), x ∈ R. Then

    Eg(X) - Eg(X^□) = ∫ (1 - x^2) g(x) dF(x).

For g ∈ F_1 denote

    J(F, g) := E( (1 - X^2)g(X) - |X|^3 ) = ∫ ( (1 - x^2) g(x) - |x|^3 ) dF(x).

Then

    L_1(X, X^□) - E|X|^3 = sup_{g ∈ F_1} J(F, g),

and the statement of the theorem is equivalent to

    sup_F sup_{g ∈ F_1} J(F, g) = 0,

where the supremum sup_F is taken over all d.f.'s F of the r.v. X satisfying the two moment-type conditions EX = 0, EX^2 = 1. As follows from the results of [9, 16], the supremum of a linear

(with respect to the d.f. F(x)) functional J(F, g) under two linear equality-type conditions EX = 0, EX^2 = 1 is attained at distributions concentrated in at most three points. Before passing to checking three- and two-point distributions, recall that for the L_1-metric the following representation in terms of the mean metric holds as well (see, e.g., [24, §1.3]):

    L_1(X, Y) = κ(X, Y) := ∫ |P(X < u) - P(Y < u)| du,  E|X| < ∞, E|Y| < ∞,

and hence,

    L_1(X, X^□) = ∫ |F(u) - F^□(u)| du = ∫ |F(u) - EX^2 I(X < u)| du.

Let the r.v. X take two values and satisfy the conditions EX = 0, EX^2 = 1. Then its distribution necessarily has the form

    P(X = √(q/p)) = p = 1 - P(X = -√(p/q)),  q = 1 - p ∈ (0, 1).

It is easy to see that

    EX^3 = (q - p)/√(pq),  E|X|^3 = (p^2 + q^2)/√(pq).

Then

    EX^2 I(X < u) = { 0, u ≤ -√(p/q);  p, -√(p/q) < u ≤ √(q/p);  1, √(q/p) < u },

and hence

    |EX^2 I(X < u) - F(u)| = |p - q| I( -√(p/q) < u ≤ √(q/p) ),

    L_1(X, X^□) = ∫ |p - q| I( -√(p/q) < u ≤ √(q/p) ) du = |p - q|/√(pq) = |EX^3| ≤ E|X|^3

by virtue of the Jensen inequality; thus, the statement of the theorem holds. Moreover, for any ε > 0,

    L_1(X, X^□)/E|X|^3 = |q - p|/(p^2 + q^2) = (1 - 2p)/(1 - 2p + 2p^2) > 1 - ε

for all 0 < p < ε/2. Now consider a r.v. X taking exactly three values. Note that

    sup{ L_1(X, X^□)/E|X|^3 : EX = 0, EX^2 = 1 } = sup{ L_1(X, X^□)σ^2/E|X|^3 : σ > 0, EX = 0, EX^2 = σ^2 },

where the supremums are taken over three-point distributions of the r.v. X. Let X take values x, y, z with probabilities p, q, r > 0 respectively, p + q + r = 1. Without loss of generality one can assume that x < y < z. From the conditions EX = 0, EX^2 = σ^2 we find that

    p = (σ^2 + yz)/((z - x)(y - x)),  q = -(σ^2 + xz)/((z - y)(y - x)),  r = (σ^2 + xy)/((z - x)(z - y)),  -yz < σ^2 < -xz.

For all u ∈ R we have

    EX^2 I(X < u) = { 0, u ≤ x;  px^2, x < u ≤ y;  px^2 + qy^2, y < u ≤ z;  σ^2, z < u },

    σ^{-2} EX^2 I(X < u) - F(u) = { 0, u ≤ x;  p(x^2/σ^2 - 1), x < u ≤ y;  (px^2 + qy^2)/σ^2 - p - q, y < u ≤ z;  0, z < u }.

Noticing that (px^2 + qy^2)/σ^2 - p - q = (σ^2 - rz^2)/σ^2 - 1 + r = r(1 - z^2/σ^2), we obtain

    L_1(X, X^□) = ∫ |F(u) - σ^{-2} EX^2 I(X < u)| du = p |1 - x^2/σ^2| (y - x) + r |1 - z^2/σ^2| (z - y).

Consider the function

    g(x, y, z, σ^2) := L_1(X, X^□)σ^2 - E|X|^3 = p(y - x)|x^2 - σ^2| + r(z - y)|z^2 - σ^2| + px^3 - q|y|^3 - rz^3.

The statement of the theorem is equivalent to sup g(x, y, z, σ^2) = 0, where the supremum is taken over all σ^2 > 0 and x < y < z such that -yz < σ^2 < -xz. Note that it suffices to consider only σ^2 < max{x^2, z^2}, since the opposite inequality (together with q > 0) implies that EX^2 < σ^2. So, there are only three possibilities: (1) 0 < σ^2 < min{x^2, z^2}; (2) z^2 ≤ σ^2 < x^2; (3) x^2 ≤ σ^2 < z^2.

Opening the moduli, we notice that g(x, y, z, σ^2) is a parabola with respect to σ^2 on each of the intervals specified above. Consider the behavior of g(x, y, z, σ^2) on each of these intervals.

1. Let 0 < σ^2 < min{x^2, z^2}; then necessarily σ^2 < -xz. On this interval the coefficient of g at σ^4 equals -2/(z - x) < 0, thus the branches of this parabola (with respect to σ^2) point downward, and the maximum value of the function g(x, y, z, σ^2) within the interval 0 < σ^2 < min{x^2, z^2} is attained either at the vertex σ_0^2 of the parabola, if σ_0^2 > -yz, or at the point σ^2 → -yz+, if σ_0^2 ≤ -yz. A direct computation shows that

    σ_0^2 + yz ≤ 0

for all x < y < z, with the equality attained if and only if y = 0. Thus, the supremum is attained as σ^2 → -yz+, which implies p → 0 and reduces the problem to checking the two-point distributions considered above.

2. Let z^2 ≤ σ^2 < x^2; then, by virtue of the conditions r > 0, z > 0,

    g(x, y, z, σ^2) = -2z^3 (σ^2 + xy)/((z - x)(z - y)) = -2rz^3 < 0.

3. Let x^2 ≤ σ^2 < z^2; then the function

    g(x, y, z, σ^2) = ( -σ^2 ( x^2(z - y) + y^2(z - x) + 2xyz ) + xyz(xy - xz - yz) ) / ((z - x)(z - y))

is linear and decreases monotonically in σ^2, since x^2(z - y) + y^2(z - x) + 2xyz > 0. Thus, if x^2 ≤ -yz, then the supremum of g(x, y, z, σ^2) is supplied by σ^2 → -yz+, which reduces the problem to checking the two-point distributions considered above. If x^2 > -yz, then the supremum of g(x, y, z, σ^2) is attained at σ^2 = x^2. With this value of σ^2 a direct computation shows that

    ∂g/∂y (x, y, z, x^2) < 0,

since x^2 < z^2 and x < y < z. Thus, the supremum of g(x, y, z, x^2) over all y such that x < y and -yz < x^2 = σ^2 is supplied by y → max{x, -x^2/z}+ = -x^2/z+, i.e. by p → 0, which again reduces the problem to checking the two-point distributions considered above.

Thus, the theorem is completely proved.

References

[1] L. H. Y. Chen, L. Goldstein, Q.-M. Shao. Normal approximation by Stein's method. Springer, Berlin-Heidelberg, 2011.

[2] S. V. Gavrilenko. An improvement of the nonuniform estimates of the convergence rate of distributions of Poisson random sums to the normal law. Informatics and its Applications (in Russian), 5(1):12-24, 2011.

[3] L. Goldstein. L_1 bounds in normal approximation. Ann. Probab., 35(5):1888-1930, 2007.

[4] L. Goldstein. Bounds on the constant in the mean central limit theorem. Ann. Probab., 38(4):1672-1689, 2010. arXiv:0912.0726, 2009.

[5] L. Goldstein, G. Reinert. Stein's method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab., 7(4):935-952, 1997.

[6] L. Goldstein, G. Reinert. Distributional transformations, orthogonal polynomials, and Stein characterizations. J. Theor. Probab., 18(1):237-260, 2005.

[7] L. Goldstein, Y. Rinott. On multivariate normal approximations by Stein's method and size bias couplings. J. Appl. Probab., 33:1-17, 1996.

[8] M. E. Grigorieva, S. V. Popov. An upper bound for the absolute constant in the nonuniform version of the Berry-Esseen inequalities for nonidentically distributed summands. Dokl. Math., 86(1):524-526, 2012.

[9] W. Hoeffding. The extrema of the expected value of a function of independent random variables. Ann. Math. Statist., 26(2):268-275, 1955.

[10] V. Korolev, I. Shevtsova. An improvement of the Berry-Esseen inequality with applications to Poisson and mixed Poisson random sums. Scand. Actuar. J., 2012(2):81-105, 2012. Available online since 4 June 2010.

[11] V. Yu. Korolev, I. G. Shevtsova. An improvement of the Berry-Esseen inequalities. Dokl. Math., 81(1):119-123, 2010.

[12] V. Yu. Korolev, I. G. Shevtsova. On the upper bound for the absolute constant in the Berry-Esseen inequality. Theory Probab. Appl., 54(4):638-658, 2010.

[13] V. Yu. Korolev, I. G. Shevtsova. Sharpened upper bounds for the absolute constant in the Berry-Esseen inequality for mixed Poisson random sums. Dokl. Math., 81(2):180-182, 2010.

[14] V. Yu. Korolev, I. G. Shevtsova. A new moment-type estimate of convergence rate in the Lyapunov theorem. Theory Probab. Appl., 55(3):505-509, 2011.

[15] E. Lukacs. Characteristic Functions. Griffin, London, 2nd edition, 1970.

[16] H. P. Mulholland, C. A. Rogers. Representation theorems for distribution functions. Proc. London Math. Soc., (3)8:177-223, 1958.

[17] Yu. S. Nefedova, I. G. Shevtsova. Structural improvement of nonuniform estimates for the rate of convergence in the central limit theorem with applications to Poisson random sums. Dokl. Math., 84(2):675-680, 2011.

[18] Yu. S. Nefedova, I. G. Shevtsova. On non-uniform convergence rate estimates in the central limit theorem. Theory Probab. Appl. (in Russian), 57(1):62-97, 2012.

[19] E. Peköz, A. Röllin, N. Ross. Degree asymptotics with rates for preferential attachment random graphs. Ann. Appl. Probab., 23(3):1188-1218, 2013.

[20] I. Shevtsova. On the absolute constants in the Berry-Esseen type inequalities for identically distributed summands. arXiv:1111.6554, 2011.
[21] C. Stein. Estimation of the mean of a multivariate normal distribution. Ann. Statist., 9(6):1135-1151, 1981.

[22] I. Tyurin. New estimates of the convergence rate in the Lyapunov theorem. arXiv:0912.0726, 2009.

[23] I. S. Tyurin. On the accuracy of the Gaussian approximation. Dokl. Math., 80(3):840-843, 2009.

[24] V. M. Zolotarev. Modern Theory of Summation of Random Variables. VSP, Utrecht, The Netherlands, 1997.