Compound binomial approximations
AISM (2006) 58. DOI 10.1007/s…

Vydas Čekanavičius · Bero Roos

Compound binomial approximations

Received: 3 August 2004 / Accepted: 7 January 2005 / Published online: 8 March 2006
© The Institute of Statistical Mathematics, Tokyo 2006

Abstract  We consider the approximation of the convolution product of not necessarily identical probability distributions q_j I + p_j F (j = 1, …, n), where, for all j, p_j = 1 − q_j ∈ [0, 1], I is the Dirac measure at point zero, and F is a probability distribution on the real line. As an approximation, we use a compound binomial distribution, which is defined in a one-parametric way: the number of trials remains the same, but the p_j are replaced with their mean or, more generally, with an arbitrary success probability p. We also consider approximations by finite signed measures derived from an expansion based on Krawtchouk polynomials. Bounds for the approximation error in different metrics are presented. If F is a symmetric distribution about zero or a suitably shifted distribution, the bounds have a better order than in the case of a general F. Asymptotically sharp bounds are given in the case when F is symmetric and concentrated on two points.

Keywords  Compound binomial distribution · Kolmogorov norm · Krawtchouk expansion · Concentration norm · One-parametric approximation · Sharp constants · Shifted distributions · Symmetric distributions

V. Čekanavičius: Department of Mathematics and Informatics, Vilnius University, Naugarduko 24, Vilnius 03225, Lithuania. E-mail: vydas.cekanavicius@maf.vu.lt
B. Roos: Department of Mathematics, SPST, University of Hamburg, Bundesstr. 55, 20146 Hamburg, Germany. E-mail: roos@math.uni-hamburg.de
1 Introduction

1.1 The aim of the paper

Binomial approximations are usually applied to sums of independent or dependent indicators (see Ehm 1991; Barbour et al. 1992, Sect. 9.2; Soon 1996; Roos 2000; Čekanavičius and Vaitkus 2001; Choi and Xia 2002). The present paper is devoted to a more general situation of a generalized Poisson binomial distribution, which is defined as the convolution product of not necessarily identical probability distributions q_j I + p_j F (j ∈ {1, …, n}, n ∈ ℕ = {1, 2, …}) on the set of real numbers ℝ. Here, for all j, p_j = 1 − q_j ∈ [0, 1], I_u is the Dirac measure at point u ∈ ℝ, I = I_0, and F is assumed to be a probability distribution on ℝ. As approximations, we choose a compound binomial distribution, which is defined in a one-parametric way: the number n of trials remains the same, but the p_j are replaced with their mean or, more generally, with an arbitrary success probability p. We also consider the approximation with a suitable finite signed measure, which can be derived from a related expansion based on Krawtchouk polynomials. Bounds for the distance in several metrics are given. It turns out that, if F is a symmetric distribution about zero or a suitably shifted distribution, i.e. the convolution of a distribution G on ℝ with an adequate Dirac measure I_u at point u ∈ ℝ, the accuracy of approximation will be increased. In the compound Poisson approximation, similar investigations were made by Le Cam (1965) and Arak and Zaĭtsev (1988). In the case of a symmetric distribution F concentrated on two points, we present bounds containing asymptotically sharp constants.

Remarkably, the main body of research in compound approximations is restricted to compound Poisson laws or finite signed measures derived from related expansions. It seems that the compound binomial approximation has hardly attracted any attention. It is evident that, in contrast to approximations by (compound) Poisson laws, the ones by (compound) binomial distributions are exact when p_j = p for all j.
As can be seen below, the bounds for the distances given in this paper reflect this fact. It should be noted that the one-parametric binomial approximation is not the only one in this context. In Barbour et al. (1992, Sect. 9.2), it was proposed to use the two-parametric binomial approximation, matching two moments of the approximated law. The two-parametric approach can also be extended to the compound case and is discussed in a separate paper (see Čekanavičius and Roos 2004). In this paper, a combination of different techniques is used. The main arguments are the Krawtchouk expansion from Roos (2000) and several norm estimates. It should be mentioned that the proofs of some of these norm estimates are rather complicated and require the deep Arak–Zaĭtsev method (cf. Čekanavičius 1995; Arak and Zaĭtsev 1988).

The structure of the paper is the following. In the next subsection, we proceed with some notation. In Sect. 2, we discuss important known facts. Section 3 is devoted to the main results, the proofs of which are given in Sect. 4. In the Appendix, we collect some important facts on the Krawtchouk expansion.

1.2 Some notation

For our purposes, it is more convenient to formulate all results in terms of distributions or signed measures rather than in terms of random variables. Let ℱ (resp. 𝒮,
ℳ) denote the set of all probability distributions (resp. symmetric probability distributions about zero, resp. finite signed measures) on ℝ. All products and powers of finite signed measures in ℳ are defined in the convolution sense; for W ∈ ℳ, set W^0 = I. The exponential of W is defined by the finite signed measure exp{W} = ∑_{m=0}^∞ W^m/m!. Let W = W^+ − W^− denote the Hahn–Jordan decomposition of W. The total variation norm ‖W‖, the Kolmogorov norm ‖W‖_K, and the Lévy concentration norm ‖W‖_h of W ∈ ℳ are defined by

  ‖W‖ = W^+(ℝ) + W^−(ℝ),  ‖W‖_K = sup_{x∈ℝ} |W((−∞, x])|,  ‖W‖_h = sup_{x∈ℝ} |W|([x, x + h])  (h ∈ [0, ∞)),

respectively. Note that the total variation distance 2^{−1}‖F − G‖ between F, G ∈ ℱ is equal to sup_A |F(A) − G(A)|, where the supremum is over all Borel measurable sets A ⊆ ℝ. It should be mentioned that ‖·‖_0 is only a seminorm on ℳ, i.e. it may happen that, for non-zero W ∈ ℳ, ‖W‖_0 = 0. But if we restrict ourselves to finite signed measures concentrated on the set of all integers ℤ, then ‖·‖_0 is indeed a norm, the so-called local norm, which coincides with the ℓ_∞-norm of the counting density of the signed measure under consideration.

We denote by C positive absolute constants, which may differ from line to line. Similarly, by C(·) we denote constants depending on the indicated argument only. By a condition of the type |f(x)| ≤ C < 1 for a real-valued function f(x) of some values x, we mean that a positive absolute constant C < 1 exists such that, for all x, |f(x)| ≤ C. In other words, |f(x)| is bounded away from 1 uniformly in x. For x ∈ ℝ, let ⌊x⌋ be the largest integer not exceeding x. We always let n ∈ ℕ, p_j ∈ [0, 1], q_j = 1 − p_j (j ∈ {1, …, n}), p = (p_1, …, p_n), and

  p̄ = (1/n) ∑_{j=1}^n p_j,  p_max = max_{1≤j≤n} p_j,  p_min = min_{1≤j≤n} p_j,  δ = p_max − p_min,
  q̄ = 1 − p̄,  λ = np̄,  p ∈ [0, 1],  q = 1 − p,
  γ_k(p) = ∑_{j=1}^n (p_j − p)^k,  γ_k = γ_k(p̄)  (k ∈ ℕ),
  η(p) = γ_2(p) + (γ_1(p))²,  θ(p) = η(p)/(npq),  θ = θ(p̄) = γ_2/(np̄q̄),
  GPB(n, p, F) = ∏_{j=1}^n (q_j I + p_j F),  Bi(n, p, F) = (qI + pF)^n  (F ∈ ℱ).

Note that γ_k(p), η(p), and θ(p) not only depend on p but also on p.
For brevity, this dependence will not be explicitly indicated. The binomial distribution with parameters n and p is defined by Bi(n, p) = Bi(n, p, I_1).
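To make the notation concrete: for F = I_1, both GPB(n, p, I_1) and Bi(n, p̄) = Bi(n, p̄, I_1) are lattice laws that can be computed by direct convolution of their factors. The following minimal Python sketch (not part of the paper; all helper names are ours) computes their counting densities together with γ_2 and θ:

```python
def convolve(a, b):
    """Convolution of two counting densities given as lists indexed from 0."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def gpb_pmf(p):
    """Counting density of GPB(n, p, I_1), the convolution of the q_j I + p_j I_1."""
    pmf = [1.0]
    for pj in p:
        pmf = convolve(pmf, [1.0 - pj, pj])
    return pmf

p = [0.1, 0.2, 0.3]                         # hypothetical success probabilities
n = len(p)
pbar = sum(p) / n
gamma2 = sum((pj - pbar) ** 2 for pj in p)  # gamma_2 = gamma_2(pbar)
theta = gamma2 / (n * pbar * (1.0 - pbar))  # theta = gamma_2 / (n pbar qbar)
# total variation norm ||GPB(n, p, I_1) - Bi(n, pbar)||
d1 = sum(abs(x - y) for x, y in zip(gpb_pmf(p), gpb_pmf([pbar] * n)))
```

For this choice of p one gets d1 = 0.028 and θ = 1/24 ≈ 0.0417, in line with the two-sided comparison of d_1 and min{θ, γ_2} discussed in Sect. 2.1.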
Using the above notation, the goal of the present paper can be summarized as follows: give bounds for the accuracy of approximation of the generalized Poisson binomial distribution GPB(n, p, F) (F ∈ ℱ) by the compound binomial law Bi(n, p, F) and by related finite signed measures, which are defined in Subsect. 2.2 below.

2 Known facts

2.1 Ehm's result

In what follows, we discuss some known results on the one-parametric binomial approximation to the Poisson binomial distribution GPB(n, p, I_1). By using Stein's method, Ehm (1991) proved that the total variation distance d_1 = ‖GPB(n, p, I_1) − Bi(n, p̄)‖ between GPB(n, p, I_1) and the binomial distribution Bi(n, p̄) can be estimated in the following way:

  (1/62) min{θ, γ_2} ≤ d_1 ≤ 2 min{θ, γ_2}.  (1)

From (1), we see that d_1 and min{θ, γ_2} have the same order. In estimating d_1, the quantities θ and γ_2 play different roles. First note that, as has been shown in Roos (2000), we have

  θ ≤ δ min{1, δ/(4p̄q̄)}.  (2)

In particular, we have θ ≤ 1. Since, moreover, γ_2 ≤ np̄q̄ (cf. Roos 2000), we obtain θ²/2 ≤ min{γ_2, θ} ≤ θ, which implies that the distance d_1 is small if and only if θ is small, or, since θ = 1 − (np̄q̄)^{−1} ∑_{j=1}^n p_j q_j, if and only if the quotient of the variances of the involved distributions is approximately equal to one (see also Ehm 1991, Corollary 2). Therefore, looking at (1), the upper bound θ is much more important than γ_2. To say it using the terminology of Barbour et al. (1992), the factor θ/γ_2 = (np̄q̄)^{−1} is a magic factor.

2.2 Approximations using the Krawtchouk expansion

In Roos (2000), the same binomial approximation problem as in Subsect. 2.1 was investigated. In fact, by using generating functions, an expansion based on Krawtchouk polynomials was constructed. In what follows, we collect some basic facts about this expansion needed later on. Further information can be found in the Appendix below.
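The identity θ = 1 − (np̄q̄)^{−1} ∑_j p_j q_j used in Subsect. 2.1 above is equivalent to γ_2 = np̄q̄ − ∑_j p_j q_j and is easy to confirm numerically; a short sketch (ours, with hypothetical values of p):

```python
p = [0.05, 0.1, 0.2, 0.25]                     # hypothetical success probabilities
n = len(p)
pbar = sum(p) / n
npq = n * pbar * (1.0 - pbar)                  # n pbar qbar
gamma2 = sum((pj - pbar) ** 2 for pj in p)
theta = gamma2 / npq                           # definition of theta
# variance-quotient form: 1 minus Var(GPB)/Var(Bi(n, pbar))
theta_alt = 1.0 - sum(pj * (1.0 - pj) for pj in p) / npq
```

So θ measures exactly how far the quotient of the variances ∑_j p_j q_j and np̄q̄ is from one.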
In Theorem 1 of the above mentioned paper, it was shown that, for arbitrary p ∈ [0, 1] and F = I_1, the identity

  GPB(n, p, F) = ∑_{ν=0}^n a_ν(p) (F − I)^ν (qI + pF)^{n−ν}  (3)
holds, where the Krawtchouk coefficients a_ν(p) are given by

  a_0(p) = 1,  a_1(p) = γ_1(p),  a_2(p) = (1/2)((γ_1(p))² − γ_2(p)),
  a_3(p) = (1/6)(γ_1(p))³ − (1/2)γ_1(p)γ_2(p) + (1/3)γ_3(p),

and, for ν ∈ {1, …, n},

  a_ν(p) = (1/ν) ∑_{k=1}^ν (−1)^{k−1} a_{ν−k}(p) γ_k(p).  (4)

Note that the coefficients a_ν(p) not only depend on p but also on p. Alternatively, by (4), a_ν(p) can be considered as a function of (γ_1(p), …, γ_ν(p)). It is evident that (3) also holds for a general distribution F ∈ ℱ. Taking into account (3), as an approximation of GPB(n, p, F), it is useful to choose the finite signed measure

  Bi(n, p, F; s) = ∑_{ν=0}^s a_ν(p) (F − I)^ν (qI + pF)^{n−ν}  (F ∈ ℱ)  (5)

with s ∈ {0, …, n} being fixed. Note that Bi(n, p, I_1; 0) = Bi(n, p) and that, for s = 1 and p = p̄, we have Bi(n, p̄, I_1; 1) = Bi(n, p̄). It should be mentioned that, in the remaining cases, Bi(n, p, F; s) also depends on p. In Theorem 2 of Roos (2000), it was shown that

  ‖GPB(n, p, I_1) − Bi(n, p, I_1; s)‖ ≤ C_1(s) (min{θ(p), η(p)})^{(s+1)/2},  (6)
  ‖GPB(n, p, I_1) − Bi(n, p, I_1; s)‖_0 ≤ C_2(s) (θ(p))^{(s+1)/2} / √(npq),  (7)

where, for (7), we have to assume that θ(p) ≤ C < 1. For p = p̄ and s = 1, (6) has the same order as Ehm's upper bound [see (1)]. Note that the constants C_1(s) and C_2(s) can be given explicitly (see also Roos 2001a, Corollary 1). In Roos (2000, Theorem 3), it was proved that, if γ_2 > 0, then, for W = GPB(n, p, I_1) − Bi(n, p̄),

  | ‖W‖ − √(2/(πe)) θ | ≤ C θ v,  (8)
  | ‖W‖_0 − θ/(2√(2π np̄q̄)) | ≤ C (θ/√(np̄q̄)) v,  (9)

with

  v = min{ 1, |γ_3|/(γ_2 np̄q̄) + 1/√(np̄q̄) + θ },

where, for (9), we have to assume that θ ≤ C < 1. For example, from (8), it follows that ‖W‖ ∼ √(2/(πe)) θ as θ → 0 and np̄q̄ → ∞.
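Recursion (4) is Newton's identity for the elementary symmetric functions of the numbers p_j − p, since ∑_ν a_ν(p) z^ν = ∏_{k=1}^n (1 + (p_k − p)z) (see the Appendix). A Python sketch (helper names ours) that computes the a_ν(p) and, as a consistency check, reproduces GPB(n, p, I_1) exactly from expansion (3) when s = n:

```python
def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def kraw_coeffs(p, p0):
    """a_0, ..., a_n via nu*a_nu = sum_{k=1..nu} (-1)^(k-1) a_{nu-k} gamma_k(p0)."""
    n = len(p)
    gamma = [sum((pj - p0) ** k for pj in p) for k in range(n + 1)]
    a = [1.0] + [0.0] * n
    for nu in range(1, n + 1):
        a[nu] = sum((-1) ** (k - 1) * a[nu - k] * gamma[k]
                    for k in range(1, nu + 1)) / nu
    return a

def bi_signed(p, p0, s):
    """Counting density of the signed measure Bi(n, p0, I_1; s) from (5)."""
    n = len(p)
    a = kraw_coeffs(p, p0)
    total = [0.0] * (n + 1)
    for nu in range(s + 1):
        term = [1.0]
        for _ in range(nu):
            term = convolve(term, [-1.0, 1.0])      # factor (I_1 - I)
        for _ in range(n - nu):
            term = convolve(term, [1.0 - p0, p0])   # factor (q0 I + p0 I_1)
        for m in range(len(term)):
            total[m] += a[nu] * term[m]
    return total

p, p0 = [0.1, 0.2, 0.3, 0.4], 0.2                   # hypothetical example
gpb = [1.0]
for pj in p:
    gpb = convolve(gpb, [1.0 - pj, pj])
full = bi_signed(p, p0, len(p))                     # s = n reproduces GPB exactly
```

For s = 1 and p0 = p̄, bi_signed returns the counting density of Bi(n, p̄), since a_1(p̄) = γ_1(p̄) = 0.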
2.3 The Le Cam–Michel trick and its generalizations

As explained above, the purpose of the present paper is the investigation of the changes in the accuracy of approximation when, in GPB(n, p, I_1) and Bi(n, p, I_1; s) from Subsect. 2.2, I_1 is replaced with a more general distribution F ∈ ℱ. Thus we deal with compound distributions, which, however, retain structures similar to the ones presented above. In fact, by the properties of the total variation norm,

  sup_{F∈ℱ} ‖GPB(n, p, F) − Bi(n, p, F; s)‖ = ‖GPB(n, p, I_1) − Bi(n, p, I_1; s)‖.  (10)

Consequently, the supremum is achieved when F = I_1. In the compound Poisson approximation, a similar property has been observed by Le Cam (1965, p. 187) and later rediscovered by Michel (1987, p. 167). For the Kolmogorov norm and the concentration seminorm with h = 0, similar assertions hold. Moreover, generalizations with respect to arbitrary finite signed measures are possible. Indeed, if W ∈ ℳ is concentrated on ℤ_+ = {0, 1, 2, …} and if F_1, F_2, F_3 ∈ ℱ, where F_2 and F_3 are assumed to be concentrated on [0, ∞) and on (0, ∞), respectively, then, as is easily shown,

  ‖ ∑_{m=0}^∞ W({m}) F_1^m ‖ ≤ ‖W‖,  (11)
  ‖ ∑_{m=0}^∞ W({m}) F_2^m ‖_K ≤ ‖W‖_K,  (12)
  ‖ ∑_{m=0}^∞ W({m}) F_3^m ‖_0 ≤ ‖W‖_0.  (13)

If, in (11)–(13), we set W = GPB(n, p, I_1) − Bi(n, p, I_1; s), we arrive at (10) and the respective equalities for the Kolmogorov and local norms. In view of these inequalities, one can expect some improvement in approximation accuracy only under additional assumptions on F. Below, we will show that such improvements are possible when F is either suitably shifted or symmetric. Note that, with a few exceptions only, we do not require the finiteness of moments of F.

2.4 Compound Poisson approximations by Le Cam, Arak and Zaĭtsev

In this paper, we consider shifted and symmetric distributions F ∈ ℱ. By a shifted distribution F ∈ ℱ, we mean F = I_u G, where G ∈ ℱ and u ∈ ℝ.
Then, minimizing the distance with respect to u, one can expect some improvement of the accuracy of approximation. Shifted distributions play an important role in compound Poisson approximations; see, for example, Le Cam (1965) or Čekanavičius (2000). Indeed, Le Cam (1965) proved that

  sup_{G∈ℱ} inf_{u∈ℝ} ‖ (I_u G)^n − exp{n(I_u G − I)} ‖ ≤ C/n^{1/3}.  (14)
Note that, as is easily shown, sup_{G∈ℱ} ‖G^n − exp{n(G − I)}‖ is bounded away from zero, which stresses the advantage of shifted distributions in this context. Similar estimates hold for symmetric distributions. More precisely, if, in (14), we replace I_u G with an arbitrary symmetric distribution in 𝒮, then the accuracy of compound Poisson approximation will be Cn^{−1/2}. If we replace I_u G by a symmetric distribution with non-negative characteristic function, then the accuracy of compound Poisson approximation will be Cn^{−1}. In general, both results are of the right order; see Arak and Zaĭtsev (1988, Chap. 5). It is noteworthy that, in a similar context, Barbour and Choi (2004) used Stein's method for the approximation of the sum of independent integer valued random variables by a translated Poisson distribution.

3 Main results

In what follows, let Bi(n, p, F; s) (F ∈ ℱ) be defined as in (5). Further, if not stated otherwise, we assume that

  p ∈ [0, 0.3].  (15)

Theorem 3.1  Let s ∈ {0, …, n} and h ∈ [0, ∞). Let us assume that

  4θ(p)/(3(1 − p)) ≤ C < 1.  (16)

Then the following inequalities hold. For G ∈ ℱ,

  inf_{u∈ℝ} ‖GPB(n, p, I_u G) − Bi(n, p, I_u G; s)‖_K ≤ C(s) (η(p))^{(s+1)/2} / (np)^{(s+1)/2 + (s+1)/(2s+4)}.  (17)

For F ∈ 𝒮,

  ‖GPB(n, p, F) − Bi(n, p, F; s)‖_K ≤ C(s) (η(p))^{(s+1)/2} / (np)^{s+1},  (18)
  ‖GPB(n, p, F) − Bi(n, p, F; s)‖_h ≤ C(s) ((η(p))^{(s+1)/2} / (np)^{s+1}) Q_h^{1/(2s+3)} (|ln Q_h| + 1)^{6(s+1)(s+2)/(2s+3)},  (19)

where we write Q_h := Q_{h,np,F} := ‖exp{3^{−1} np (F − I)}‖_h.

Remark 1  (i) All approximations (17)–(19) are exact if p_j = p for all j. In fact, in this case, the upper bounds vanish, since η(p) = 0. In particular, here, the conditions (15) and (16) are superfluous (see also Remark 1(ii)).
(ii) From (2), it follows that (16) is satisfied when p = p̄ ≤ 0.3 and δ ≤ 0.3, since, in this case, we have

  4θ/(3(1 − p̄)) ≤ (0.4/(1 − p̄)) min{1, 0.3/(4p̄q̄)} < 1.

In particular, if p_max ≤ 0.3 and p = p̄, then (16) is valid. In Theorem 3.1, the condition p ∈ [0, 0.3] seems to be superfluous. It may suffice to assume that θ(p) ≤ C for a suitable constant C.
But we could not prove it.
(iii) The bounds (17)–(19) have a better order than (θ(p))^{(s+1)/2}, which, in turn, appears as the interesting part in the bounds from (6) and (10). Generally, for the total variation distance, there seems to be no hope for upper bounds similar to (17) and (18). Good upper bounds for the concentration norm in the context of shifted distributions remain an open question.
(iv) In contrast to (19), the estimates (17) and (18) are uniform in G ∈ ℱ and F ∈ 𝒮, respectively. However, due to the method of proof, we cannot say much about the constants C(s). It should be mentioned that, in order to obtain the explicit conditions (15) and (16), in the proofs we often deal with explicit constants. But, since our goal was to obtain a weak condition, the leading constants in the estimates turned out to be quite large.
(v) It is easily shown that (18) follows from (19). However, (19) leads to estimates of a better order than (18) if we use a Le Cam-type bound for the concentration function of compound Poisson distributions; for example, see Roos (2005, Proposition 3), where it was shown that, for t ∈ (0, ∞), h ∈ [0, ∞), and an arbitrary distribution F ∈ ℱ,

  ‖exp{t(F − I)}‖_h ≤ (2et max{F((−∞, −h)), F((h, ∞))})^{−1/2}.  (20)

If, in Theorem 3.1, we set p = p̄ and s = 1, we obtain the results with respect to the compound binomial approximation.

Corollary 3.1  Let h ∈ [0, ∞). Let us assume that

  p̄ ∈ [0, 0.3] and 4θ/(3(1 − p̄)) ≤ C < 1.  (21)

Then the following inequalities hold. For G ∈ ℱ,

  inf_{u∈ℝ} ‖GPB(n, p, I_u G) − Bi(n, p̄, I_u G)‖_K ≤ C γ_2/λ^{4/3}.

For F ∈ 𝒮,

  ‖GPB(n, p, F) − Bi(n, p̄, F)‖_K ≤ C γ_2/λ²,
  ‖GPB(n, p, F) − Bi(n, p̄, F)‖_h ≤ C (γ_2/λ²) Q_h^{1/5} (|ln Q_h| + 1)^{36/5},

where Q_h is defined as in Theorem 3.1 (with np = λ).

For symmetric distributions concentrated on ℤ∖{0}, alternative estimates can be shown. In particular, in this case, it is possible to derive a bound for the total variation norm which is comparable with (18) for the weaker Kolmogorov norm.

Theorem 3.2  Let the assumptions of Theorem 3.1 be valid.
If F ∈ 𝒮 is concentrated on the set ℤ∖{0}, then

  ‖GPB(n, p, F) − Bi(n, p, F; s)‖ ≤ C(s) σ (η(p))^{(s+1)/2} / (np)^{s+1},  (22)
  ‖GPB(n, p, F) − Bi(n, p, F; s)‖_h ≤ C(s) (h + 1) (η(p))^{(s+1)/2} / (np)^{s+3/2},  (23)

where, for (22), we assume that F has finite variance σ².
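The improvement for symmetric F over the worst case F = I_1 in (10) can be observed numerically. A sketch (helper names ours) comparing the total variation errors for F = (1/2)(I_1 + I_{−1}), which is covered by Theorem 3.2, and for F = I_1:

```python
def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def tv_error(p, factor):
    """||GPB(n, p, F) - Bi(n, pbar, F)|| for a lattice factor q_j I + p_j F.

    factor(pj) returns the counting density of q_j I + p_j F on a fixed grid,
    so both products live on the same (shifted) lattice and can be compared
    index by index.
    """
    pbar = sum(p) / len(p)
    gpb, comp = [1.0], [1.0]
    for pj in p:
        gpb = convolve(gpb, factor(pj))
        comp = convolve(comp, factor(pbar))
    return sum(abs(x - y) for x, y in zip(gpb, comp))

p = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30] * 5            # n = 30, hypothetical
err_sym = tv_error(p, lambda t: [t / 2, 1 - t, t / 2])  # F = (I_{-1} + I_1)/2
err_i1  = tv_error(p, lambda t: [1 - t, t])             # F = I_1
```

As the theorems predict, err_sym is of the smaller order γ_2/λ², while err_i1 is of order θ.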
Remark 2  (i) The total variation bound (22) is slightly worse than (18); indeed, the variance σ² of F ∈ 𝒮 concentrated on ℤ∖{0} cannot be smaller than one.
(ii) Under the assumptions of Theorem 3.2, an upper bound for the Kolmogorov norm can be shown by using Tsaregradskii's (1958) inequality. Unexpectedly, the resulting bound is of worse order than (22) and is therefore omitted. To be more precise, here σ² appears instead of σ from (22).
(iii) Inequality (23) exhibits a better order than the bound which can be derived from (19) and (20).

If, in Theorem 3.2, we set p = p̄ and s = 1, we obtain the results regarding the compound binomial approximation.

Corollary 3.2  Let the assumptions of Corollary 3.1 be valid. If F ∈ 𝒮 is concentrated on the set ℤ∖{0}, then

  ‖GPB(n, p, F) − Bi(n, p̄, F)‖ ≤ C σ γ_2/λ²,  (24)
  ‖GPB(n, p, F) − Bi(n, p̄, F)‖_h ≤ C (h + 1) γ_2/λ^{5/2},

where, for (24), we assume that F has finite variance σ².

In the previous results, the method of proof does not allow us to get reasonable estimates of absolute constants. However, in the special case when F is a symmetric distribution concentrated on two points, we are able to obtain asymptotically sharp constants.

Theorem 3.3  Let α ∈ (0, ∞) and F = (1/2)(I_α + I_{−α}). Let c^{(1)}, c^{(2)}, and c^{(3)} be the constants defined in Lemma 4.7 below. If the conditions in (21) are satisfied, then

  | ‖GPB(n, p, F) − Bi(n, p̄, F)‖ − c^{(1)} γ_2/λ² | ≤ C γ_2/λ^{5/2},  (25)
  | ‖GPB(n, p, F) − Bi(n, p̄, F)‖_K − c^{(2)} γ_2/λ² | ≤ C γ_2/λ^{5/2},  (26)
  | ‖GPB(n, p, F) − Bi(n, p̄, F)‖_0 − c^{(3)} γ_2/λ^{5/2} | ≤ C γ_2/λ³.  (27)

Remark 3  (i) In view of (22), one may ask why, in (25), the variance σ² = α² of F does not occur. The answer is simply that we may assume that α = 1, since the norms involved are invariant under the scaling x ↦ αx.
(ii) From (25), it follows that, under the assumptions of Theorem 3.3,

  ‖GPB(n, p, F) − Bi(n, p̄, F)‖ ∼ c^{(1)} γ_2/λ²  (28)

as λ → ∞. Here, as usual, ∼ means that the quotient of both sides tends to one. In particular, (28) is valid if we assume that p̄ ≤ 0.3, θ → 0, and λ → ∞. Similar relations hold for the Kolmogorov and local norms.
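A rough numerical illustration of Theorem 3.3 (helper names ours; we do not compute the constant c^{(1)} here): for F = (1/2)(I_1 + I_{−1}), the scaled total variation error λ² ‖GPB(n, p, F) − Bi(n, p̄, F)‖ / γ_2 should stabilize near c^{(1)} as λ grows.

```python
def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def tv_ratio(reps):
    """lambda^2 / gamma_2 times ||GPB(n,p,F) - Bi(n,pbar,F)|| for F = (I_1 + I_{-1})/2."""
    p = [0.1, 0.2, 0.3] * reps            # hypothetical p-pattern, n = 3*reps
    n = len(p)
    pbar = sum(p) / n
    gamma2 = sum((pj - pbar) ** 2 for pj in p)
    lam = n * pbar
    gpb, comp = [1.0], [1.0]
    for pj in p:
        gpb = convolve(gpb, [pj / 2, 1 - pj, pj / 2])
        comp = convolve(comp, [pbar / 2, 1 - pbar, pbar / 2])
    tv = sum(abs(x - y) for x, y in zip(gpb, comp))
    return tv * lam ** 2 / gamma2

r20, r40 = tv_ratio(20), tv_ratio(40)     # lambda = 12 and lambda = 24
```

By (25), the two ratios differ only by terms of order λ^{−1/2}, so they should already be of comparable size.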
10 196 V. Čekanavičius, B. oos 4 Proofs 4.1 Norm estimates For several proofs below, we need the following well-known relations VW V W, VW V W, VW h V h W, (9) W W, W h W, where V,W M, h [, ). Note that, if W() =, then max{ W, W h } 1 W. As usual, for m Z + and complex valued x C, let ( x m) = m k=1 [(x k + 1)/k]. Lemma 4.1 If F F, t (, ), N, n Z +, p (, 1), then (F I) exp{t(f I)} 3 te, (3)! (F I) exp{t(f I)}, (31) t / ( ) n + 1/ (F I) (qi + pf) n (pq) / (3) e 1/4( n ) n/ ( ) /. (33) n + (n + )pq For a proof of (3) and (31), see oos (1b, Lemma 3) and oos (3, Lemma 4), respectively. For (3) and (33), see oos (, Lemma 4). Lemma 4. Let n N, p [, 1], r (, 1), t = rnp, and g(p) = (p) k (k 1) = ep (e p 1 + p). k! (p) k= If pg(p)<(1 r)/, then ( sup (I + p(f I)) n exp{ t(f I)} 1 F F Proof Let F F and y = np t = (1 r)np >. Then with p g(p) ) 1. 1 r T := (I + p(f I)) n exp{ t(f I)} = [(I + p(f I)) exp{ p(f I)}] n exp{y(f I)} = [ I + p (F I) ] n exp{y(f I)}, = k= ( p) k (1 k) (F I) k, g(p). k!
11 Compound binomial approximations 197 Therefore, by using (31), ( ) n { T p (F y } I) exp (F I) = = n! ( (n )! y (p g(p)) 1 p g(p) ) 1. 1 r The lemma is proved. Lemma 4.3 Let G F, t (, ), and N. Then inf (I ug I) C() exp{t(i u G I)}. (34) u t /+/(+) The estimate (34) was proved in Čekanavičius (1995, Theorem 3.1). For the results with respect to symmetric distributions, we need the following result. Lemma 4.4 Let F S, t (, ), N, and h [, ). Then (F I) exp{t(f I)} C(), (35) t (F I) exp{t(f I)} h C() t Q 1/(+1) h ( ln Q h +1) 6(+1)/(+1), (36) where Q h := Q h,t,f := exp{4 1 t(f I)} h. Proof For h>, inequality (36) follows from the more general Theorem 1.1 in Čekanavičius (1995). Note that, in this theorem, there is a misprint in the power of the last factor (compare the statement of the theorem in the paper with its Eq. (4.5)). For h =, (36) is valid as well, since, for W M and h [, ), W h lim inf r h W r and h Q h is continuous from the right; see Hengartner and Theodorescu (1973, Theorem 1.1.4). The estimate (35) follows from (36). In what follows, we need the Fourier transform Ŵ(x) = eixy dw(y), (x ) of a finite signed measure W M. Here, i denotes the complex unit. It is easy to check that, for V,W M and a,x, exp{w}(x) = exp{ŵ(x)}, V W (x) = V(x)Ŵ(x), Î a (x) = e ixa, Î(x) = 1. Lemma 4.5 Let W M be concentrated on Z satisfying k Z k W({k}) <. Then, for all a and b (, ), Further, W 1 + bπ π π π ( Ŵ(x) + 1 d ( e ixa Ŵ(x) ) ) dx. (37) b dx W 1 π π π Ŵ(x) dx. (38)
12 198 V. Čekanavičius, B. oos Proof The proof of (37) can be found, for example, in Presman (1985, Lemma on p. 419). As was pointed out by Presman, this inequality would be equivalent to a corresponding lemma by Esseen in the lattice case (cf. Ibragimov and Linnik 1971, p. 9). Inequality (38) is an immediate consequence of the well-known inversion formula W({k}) = (π) 1 π π Ŵ(x)e ikx dx, (k Z). Lemma 4.6 Let Z + and t (, ). IfF S is concentrated on the set Z \{}, then (F I) exp{t(f I)} 3.6 1/4 ( ) 1 + σ, ( ), (39) te ( + 1/ ) +1/, (F I) exp{t(f I)} (4) te where, for (39), we assume that F has finite variance σ.iff = 1 (I α + I α ) for some α (, ), then (F I) exp{t(f I)}! t. (41) Proof The proof of (41) is easily done by using (31) and the fact that, under the present assumptions, F I = 1 (I α I)(I α I). (4) We now prove (4) by using (38). Let F S be concentrated on Z \{} and W = (F I) exp{t(f I)}. Then, for x, F(x) = F({k}) cos(kx), 1 F(x) ( = 4 F({k}) sin kx ), k=1 Ŵ(x) = ( F(x) 1) exp{t( F(x) 1)}. It is easy to check that, for each G F concentrated on Z, wehave 1 π π π Ĝ(x) dx = k= k=1 ( G({k}) ) G. Therefore, for arbitrary A>, applying (), we obtain π π A } exp{a( F(x) 1)} dx π exp{ (F I) π Ae. (43)
13 Compound binomial approximations 199 Hence, using (38), W 1 π = 1 π π π π π (1 F(x)) exp{t( F(x) 1)} dx { (1 F(x)) exp 1 ( { π sup x exp x ) +1/. t + 1/ ( F(x) 1) tx + 1/ }) π π } { t } exp + 1 ( F(x) 1) dx { t } exp + 1 ( F(x) 1) dx ( + 1/ te Inequality (4) is shown. We now prove (39) by using (37) with a =. The parameter b will be chosen later. By the same arguments as above, we derive π π k=1 ( + 1/ ) +1/. Ŵ(x) dx 4π (44) te Note that d dx F(x) ( kx ) ( kx ) = 4 kf({k}) sin cos k=1 ( ( 4 k F({k}) F({}) sin x )) 1/ σ(1 F(x)) 1/. =1 Hence, letting B = , we obtain d dx Ŵ(x) d dx F(x) (1 F(x)) t(1 F(x)) exp{t( F(x) 1)} tσ ( ) sup ( y 1 (1 y) exp{ By} ) exp{t(1 B)( F(x) 1)}. t y By elementary calculus, we see that the sup-term is bounded by.69/(e ). With the help of (43), we therefore get π π d dx Ŵ(x) 173 ) π dx t( 15 σ te π { 9 t } exp 5 ( F(x) 1) dx 61 ) +1/. t( 4 πσ (45) te
14 V. Čekanavičius, B. oos If t.18, then, since σ 1, (.18 ) W 4 1 (1 + σ) ( (3.6) ) (1 + σ). t te If t.18, then letting b = σ t π.18 and taking into account (44) and (45), we obtain W 1 + bπ [ ( + 1/4 ) +1/ 61 4π + π te 4 (3.6) ( ) (1 + σ), te πσ t ( ) +1/ ] b te where we used the simple inequality (1 + 1/(4)) +1/ (5/4) 5/, ( N). The proof of (39) is completed. emark 4 By using Tsaregradskii s (1958) inequality, it would be possible to derive an estimate for (F I) exp{t(f I)}, ( N) under the same assumptions as for (39). However, the resulting bound would have been of worse order than the one in (39); see also emark (ii). 4. Asymptotically sharp norm estimates In what follows, let H (z) be the Hermite polynomial of degree Z +, satisfying, for z, / H (z) =! ( 1) m (z) m ( m)! m!, h (z) := 1 d / = ( 1) e z/ ( z π dz e z (+1)/ π H ). (46) Lemma 4.7 Let Z +, t,α (, ), and F = 1 (I α + I α ). Then (F I) exp{t(f I)} c(1) t (F I) exp{t(f I)} c() t (F I) exp{t(f I)} c(3) C(), ( ), (47) t +1/ C(), ( ), (48) t +1/ C(), (49) t +1/ t +1
15 Compound binomial approximations 1 where c (1) = 1 +1 h (x) dx, c () = 1 sup h +1 1 (x), (5) x c (3) = 1 sup h +1 (x). x In particular, we have c (1) = 1 3 { π exp 3 6 }( e ) 6, (51) c () = 1 3 { 8 π exp 3 6 } 3 6, (5) c (3) = 3 8 π. (53) The constants given in (51) (53) are important for Theorem 3.3. For the proof of Lemma 4.7, we need the following result, which is a slight but trivial improvement of Proposition 3 in oos (1999). Lemma 4.8 Let Z +, S be a set, and b : (, ) S be a bounded function. Then, for t (, ), sup x S sup z [ (1 + z ) t (+1)/ po( t + z ] t +b(t, z, x), t) ( 1) h (z) C() t, where po(, t)= po(, t)denotes the counting density of the Poisson distribution with mean t and, for m Z and N, po(m, t) = 1 po(m 1, t) 1 po(m, t). Proof of Lemma 4.7 We may assume that t 1. In view of (4) and the simple relation (F I) exp{t(f I)}= po(m, t) F m, (F F) (cf. oos 1999, Lemma 1), we see that, letting t = t/, (F I) exp{t(f I)} = 1 { t t } ( ) (I α I) exp (I α I)} (I α I) exp{ (I α I) = 1 po(m ( ) 1,t ) po(m,t )I (m m 1 )α = 1 ( ) m 1 = m = k Z po(m, t ) po(k + m, t )I kα.
16 V. Čekanavičius, B. oos This gives T (1) := (F I) exp{t(f I)} = 1 po(m, t ) po(k + m, t ). k Z By using simple transformations, we obtain T (1) t = po( t + y 1 t,t ) po( t + z dy t + b,t ) dy 1, where b = y y ( 1, ] and z = y 1 +y / t. From Lemma 4.8, it follows that po( t + y 1 t,t ) = ( 1) h (y 1 ) t (+1)/ po( t + z t + b,t ) = ( 1) h (z) t (+1)/ + b 1 + (1 + y1 b )t (+)/ (1 + z )t (+)/ where b 1 and b are functions of (, t,y 1 ) and (, t,z,b ), resp., with max{ b 1, b } C(). Combining this with the above, we arrive at T (1) t = ( + 1 ), where h (y 1 )h (z) dy h (y 1 )h (y 1 + y ) dy = t +1 dy 1 = t +1/ dy 1 and 1 is a quantity satisfying b h (y 1 ) 1 (1 + z )t +3/ b 1 b + = , say. It is easily shown that C() t +1 dy 1 dy + (1 + y 1 )(1 + z )t +, 3 C() t +1 dy 1 dy b 1 h (z) (1 + y1 +3/ )t, 4 C(), t +3/ giving 1 C()t (+1), since t = t/ 1/. This implies that T (1) c(1) C() t 1 C() t, +1/ t,, dy 1 dy
17 Compound binomial approximations 3 where c (1) = 1 h (y 1 )h (y 1 + y ) dy 1 dy. Similarly, T () := (F I) exp{t(f I)} = 1 sup po(m, t ) 1 po(k + m, t ) k Z t = sup po( t + y 1 t,t ) 1 po( t + z, t + b,t ) dy 1 y T (3) := (F I) exp{t(f I)} = 1 sup po(m, t ) po(k + m, t ) k Z t = sup po( t + y 1 t,t ) po( t + z, t + b,t ) dy 1 y and where T () c() C() t t, +1/ c () T (3) c (3) C() t +1/ t, +1 = 1 sup, h (y 1 )h 1 (y 1 + y ) dy 1 y. h (y 1 )h (y 1 + y ) dy 1 c (3) = 1 sup y By partial integration, (46), and some simple substitutions, we see that, for y, k, l Z + with k l, and m = k + l, h k (y 1 )h l (y 1 + y ) dy 1 = ( 1) l h m (y 1 )h (y 1 + y ) dy 1 = ( 1)k e y /4 m/+1 π e x H m ( x y 3/ ) dx. Using the well-known summation theorem for the Hermite polynomials H m (x 1 + x ) = 1 m/ m r= ( ) m H m r (x 1 )Hr (x ), r (x1,x ),
18 4 V. Čekanavičius, B. oos the orthogonality relation e x H r1 (x) H r (x) dx = { π r 1 r 1!, if r 1 = r,, if r 1 r, (r 1,r Z + ) (see Szegö 1975, formulas (5.5.1) and (5.5.11), p ), and (46), we obtain h k (y 1 )h l (y 1 + y ) dy 1 = ( 1)k e y /4 ( m+1 π H m y ) ( = ( 1)l h (m+1)/ m y ). From the above, the equalities in (5) follow. The identities (51) (53) are shown by using straight forward calculus. This completes the proof of the lemma. Lemma 4.9 Let Z +, n N, < p C < 1/, α (, ), F = 1 (I α + I α ), and c (1), c (), and c (3) be defined as in Lemma 4.7. Then (F I) (I + p(f I)) n c(1) (np) (F I) (I + p(f I)) n c() (F I) (I + p(f I)) n (np) c(3) (np) +1/ Proof We may assume that np 1. We have where C(), ( ), (54) (np) +1/ C(), ( ), (55) (np) +1/ C(). (56) (np) +1 (F I) (I + p(f I)) n = (F I) exp{np(f I)} + (1) 1, (1) 1 (F I) [(I + p(f I)) n exp{np(f I)}] = ([(I + p(f I))exp{ p(f I)}] n I)(F I) exp{np(f I)} = ([I + ] n I)(F I) exp{np(f I)} and = (I + p(f I)) exp{ p(f I)} I = p (F I) k= Therefore, letting g(p) be defined as in Lemma 4., ( p(f I)) k (k 1). k! ( ) n (1) 1 (p g(p)) r (F I) +r exp{np(f I)}. r r=1
19 Compound binomial approximations 5 The latter norm term can be estimated by using (41). In fact, for r N, (F I) +r exp{np(f I)} (F I) + exp{βnp(f I)} F I r 1 (F I) r 1 exp{(1 β)np(f I)} ( + )!r 1 (r 1)! (βnp) + ((1 β)np), r 1 where β (, 1) is arbitrary. This gives (1) + )! p g(p) 1 ( β + (np) +1 ( p g(p) 1 β r=1 ) r 1. Since g(1/) = 1 and p C<1/, we can choose a suitable β (, 1) such that p g(p) 1 β C<1. Hence (1) 1 C()p(np) (+1). By (47), we obtain (F I) (I + p(f I)) n c(1) (1) C() (np) 1 + (np) C() +1/ (np). +1/ Hence, (54) follows. Similarly, by using (4), (F I) (I + p(f I)) n = (F I) exp{np(f I)} + () 1, (F I) (I + p(f I)) n = (F I) exp{np(f I)} + (3) 1, where () 1 C()p(np) (+1) and (3) 1 C()p(np) (+3/). By (48) and (49), we obtain (55) and (56). The lemma is shown. 4.3 Proofs of the theorems Proof of Theorems 3.1 and 3. We first prove (17). Let G F, x = n/8, and y = 3/4. By using (3), (9), and (11), we obtain T := inf GPB(n, p, I u G) Bi(n, p, I u G; s) u = inf a (p) (I u G I) ( qi + pi u G ) n u =s+1 = inf (I u G I) s+1 exp{xp(i u G I)} u (I + p(i u G I)) n yn exp{ xp(i u G I)} a (p)(i u G I) s 1 (I + p(i u G I)) ( yn +1) (+1) T 1 T =s+1 =s+1 a (p) (1 p) +1 (I 1 I) s 1 (I + p(i 1 I)) yn +1,
20 6 V. Čekanavičius, B. oos where denotes convolution, T 1 := inf (I u G I) s+1 exp{xp(i u G I)} u by Lemma 4.3, and T := (I + p(i 1 I)) n yn exp{ xp(i 1 I)} C(s) (np) (s+1)/+(s+1)/(s+4) g(.3) 1.4 by Lemma 4. and the assumption p.3. In oos (, Lemma 1) it was shown that, for {1,...,n}, Therefore, ( η(p) ) / n (n )/ ( η(p) e ) /. a (p) (n ) (n )/ T C(s)(η(p))(s+1)/ (np) (s+1)/+(s+1)/(s+4) 1+ ( η(p) e ) ( s 1)/ 1 (1 p) (I 1 I) s 1 (I+p(I / 1 I)) yn +1, =s+ where, in view of (33), we see that the term in brackets is bounded by C ( η(p) ) / C. (1 p) ( yn +1)pq = Inequality (17) is shown. The proofs of (18), (19), (), and (3) are quite similar to the above one. The main difference is that we have to replace T 1 with T (1) 1 = (F I) s+1 exp{xp(f I)} C(s), (57) (np) s+1 T () 1 = (F I) s+1 exp{xp(f I)} h C(s) Q1/(s+3) (np) s+1 h ( ln Q h +1) κ, (58) T (3) 1 = (F I) s+1 exp{xp(f I)} C(s) σ, (59) (np) s+1 T (4) 1 = (F I) s+1 C(s) h + 1 exp{xp(f I)} h, (6) (np) s+3/ respectively, where κ = 6(s + 1)(s + )/(s + 3). Note that it is assumed that, for (57) and (58), we have F S, and that, for (59) and (6), F S is concentrated on Z \{}. Further, Lemmas 4.4 and 4.6 are used. The theorem is proved.
21 Compound binomial approximations 7 Proof of Theorem 3.3 We may assume that α = 1 and that, in view of (6) and (7), λ 1 and n 3. Then, by (3), GPB(n, p, F) ( qi + pf ) n γ = (F I) (I + p(f I)) n + (1), where, by using Lemma 4.6 and Theorem 3., (1) γ (F I) [(I + p(f I)) n (I + p(f I)) n ] + γ 3 3 (F I)3 (I +p(f I)) n 3 + GPB(n, p, F) Bi(n, p, F; 3) ( pγ C λ + γ 3 3 λ + γ ) 3 λ 4 C γ λ. 3 The proof of (5) is easily completed by the help of Lemma 4.9. Inequalities (6) and (7) are similarly shown. Acknowledgements This work was mostly done during the first author s stay at the Department of Mathematics at the University of Hamburg, Germany, in June/July 3, and also the second author s stay at the Mathematical Institute of the University of Cologne, Germany, in summer term 4. The first author would like to thank the members of the Department of Mathematics in Hamburg for their hospitality. The second author would like to thank Professor J. Steinebach for the invitation to Cologne and the Mathematical Institute for its hospitality. Finally, we thank the referees for helpful comments. Appendix 1: The Krawtchouk expansion We now give a review of some facts from oos (). In Takeuchi and Takemura (1987a), the reader can find further information concerning the Krawt-chouk expansion of the distribution of the sum of possibly dependent Bernoulli random variables. Multivariate generalizations were derived in Takeuchi and Takemura (1987b) and oos (1a). Let S n be the sum of n N independent Bernoulli random variables X 1,...,X n with success probabilities p 1,...,p n, that is, the distribution of S n is given by L(S n ) = GPB(n, p, I 1 ). Then (3) with F = I 1 is equivalent to P(S n = m) = a (p) bi(m, n, p), (61) = where m Z +, p [, 1] is arbitrary, ( ) k bi(m, k, p) = p bi(m, k, p) = m q k m, for m, k Z +,m k, m, otherwise,
and

Δ^j bi(m, k, p) = Δ^{j−1} bi(m − 1, k, p) − Δ^{j−1} bi(m, k, p) for j ∈ N.

In fact, (61) and (3) can be derived from each other by means of the simple equality

Σ_{m=0}^{k+j} Δ^j bi(m, k, p) z^m = (1 + p(z − 1))^k (z − 1)^j for j, k ∈ Z_+, z ∈ C.

The right hand side of (61), resp. of (3), is called the Krawtchouk expansion of L(S_n) with parameter p, and a_0(p), ..., a_n(p) are called the corresponding Krawtchouk coefficients. With our assumptions, relation (3.5) of Takeuchi and Takemura (1987a) is similar to (61). For n, m ∈ Z_+ and j ∈ {0, ..., n}, we have

(d^j/dp^j) bi(m, n, p) = n^{[j]} Δ^j bi(m, n − j, p),

where n^{[j]} = n!/(n − j)!, and hence, in view of (61), we see that

P(S_n = m) = Σ_{j=0}^{n} (a_j(p)/n^{[j]}) (d^j/dp^j) bi(m, n, p),  (m ∈ Z_+).

As was mentioned in Subsect. 2.2, we have a_0(p) = 1. In what follows, we give some alternative formulae for a_1(p), ..., a_n(p). For this, we need the Krawtchouk polynomials Kr(j; x, n, p), polynomials in x, which are orthogonal with respect to the binomial distribution. They are defined by (see Szegö 1975, formula (2.82.2), p. 36)

Kr(j; x, n, p) = Σ_{k=0}^{j} ((n − x) choose k) (x choose (j − k)) (−p)^k q^{j−k},  (n, j ∈ Z_+, x ∈ C).  (62)

Then, for j ∈ {1, ..., n},

a_j(p) = Σ_{1 ≤ k(1) < ... < k(j) ≤ n} Π_{r=1}^{j} (p_{k(r)} − p),  (63)

a_j(p) = Σ_{m=0}^{n} P(S_n = m) Kr(j; m, n, p),  (64)

a_j(p) = Σ_{k=0}^{j} ((n − k) choose (j − k)) (1/k!) (−p)^{j−k} µ(k),  (65)

a_j(p) = (1/(2π α^j)) ∫_{−π}^{π} e^{−ijx} Π_{k=1}^{n} (1 + (p_k − p) α e^{ix}) dx,  (66)
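The expansion (61) with the coefficients (63) can be checked numerically. The following sketch is ours, not part of the paper: it computes the exact generalized Poisson binomial counting density by convolution and compares it with the Krawtchouk expansion, writing the summation index as j and taking p as the mean of the p_k; all function names are ours.

```python
import itertools
from math import comb, prod

import numpy as np

def gpb_pmf(ps):
    # exact generalized Poisson binomial pmf via sequential convolution
    pmf = np.array([1.0])
    for p in ps:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def bi(m, k, p):
    # binomial counting density bi(m, k, p) = C(k, m) p^m q^(k-m), zero off-support
    return comb(k, m) * p**m * (1.0 - p)**(k - m) if 0 <= m <= k else 0.0

def delta_bi(j, m, k, p):
    # finite differences: Delta^j bi(m,k,p) = Delta^(j-1) bi(m-1,k,p) - Delta^(j-1) bi(m,k,p)
    if j == 0:
        return bi(m, k, p)
    return delta_bi(j - 1, m - 1, k, p) - delta_bi(j - 1, m, k, p)

def krawtchouk_coefficients(ps, p):
    # a_j(p): elementary symmetric polynomials of the differences p_k - p; a_0(p) = 1
    n = len(ps)
    return [sum(prod(ps[k] - p for k in idx)
                for idx in itertools.combinations(range(n), j))
            for j in range(n + 1)]

ps = [0.1, 0.3, 0.25, 0.4]            # the p_k of the Bernoulli summands
n, p = len(ps), sum(ps) / len(ps)     # one-parametric choice: p = mean of the p_k
a = krawtchouk_coefficients(ps, p)
exact = gpb_pmf(ps)
expansion = [sum(a[j] * delta_bi(j, m, n - j, p) for j in range(n + 1))
             for m in range(n + 1)]
print(np.allclose(exact, expansion))  # prints True
```

Truncating the outer sum at j ≤ s instead of j = n yields the counting density of the signed measure Bi(n, p, I_1; s).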
where α ∈ (0, ∞) is arbitrary and, for k ∈ {0, ..., n},

µ(k) = Σ_{m=k}^{∞} m^{[k]} P(S_n = m)

denotes the kth factorial moment of S_n. The equalities (4) and (63)–(66) can be derived with the help of the generating functions

Σ_{j=0}^{n} a_j(p) z^j = Π_{k=1}^{n} (1 + (p_k − p) z),  (z ∈ C)

and

Σ_{j=0}^{n} Kr(j; m, n, p) z^j = (1 + qz)^m (1 − pz)^{n−m},  (67)

for n, m ∈ Z_+, n ≥ m, and z ∈ C. For (67), see Szegö (1975, formula (2.82.4), p. 36). In view of (64), we see that the Krawtchouk coefficients can be defined by using the Krawtchouk polynomials. But the same is true for the differences Δ^j bi(m, k, p). Indeed, for m, k, j ∈ Z_+ and p ∈ [0, 1], we have

((k + j) choose j) (pq)^j Δ^j bi(m, k, p) = Kr(j; m, k + j, p) bi(m, k + j, p).  (68)

Using (4), (62), and (68), the counting densities of the signed measures Bi(n, p, I_1; s) [see (5)] can be derived. It turns out that, for m ∈ Z_+, we have

Bi(n, p, I_1; 1)({m}) = bi(m, n, p) (1 + γ_1(p)(m − np)/(npq))

and, if n ∈ {2, 3, ...},

Bi(n, p, I_1; 2)({m}) = bi(m, n, p) (1 + γ_1(p)(m − np)/(npq) + (((γ_1(p))^2 − γ_2(p))/2) [m^2 − (1 + 2(n − 1)p)m + n(n − 1)p^2]/(n(n − 1)(pq)^2)).

It is worth mentioning that, as one can expect, the first s moments of L(S_n) and Bi(n, p, I_1; s) coincide. This follows from the fact that, for s ∈ {0, ..., n}, k ∈ {0, ..., s}, and µ(k) as above,

Σ_{m=k}^{∞} m^{[k]} Bi(n, p, I_1; s)({m}) = µ(k).

Finally, note that the connection between Bi(n, p, F; s), for F ∈ F, and Bi(n, p, I_1; s) is given by

Bi(n, p, F; s) = Σ_{m=0}^{n} Bi(n, p, I_1; s)({m}) F^{∗m}.
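Formula (64) expresses each Krawtchouk coefficient as an expectation, a_j(p) = E Kr(j; S_n, n, p). The sketch below is again ours: it takes Kr(j; x, n, p) to be the coefficient of z^j in the generating function (67) and checks (64) against the product formula (63) for an arbitrary parameter p, writing the index as j.

```python
import itertools
from math import comb, prod

import numpy as np

def kr(j, x, n, p):
    # Kr(j; x, n, p): coefficient of z^j in (1 + q z)^x (1 - p z)^(n - x), cf. (67)
    q = 1.0 - p
    return sum(comb(x, k) * q**k * comb(n - x, j - k) * (-p)**(j - k)
               for k in range(j + 1))

def gpb_pmf(ps):
    # exact pmf of S_n via sequential convolution
    pmf = np.array([1.0])
    for p in ps:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

ps = [0.15, 0.3, 0.45, 0.2, 0.35]
n, p = len(ps), 0.25                  # an arbitrary success probability p
pmf = gpb_pmf(ps)

for j in range(1, n + 1):
    # (63): elementary symmetric polynomial of the differences p_k - p
    a_sym = sum(prod(ps[k] - p for k in idx)
                for idx in itertools.combinations(range(n), j))
    # (64): a_j(p) = sum over m of P(S_n = m) Kr(j; m, n, p)
    a_kr = sum(pmf[m] * kr(j, m, n, p) for m in range(n + 1))
    assert abs(a_sym - a_kr) < 1e-12
print("Krawtchouk coefficient formulas agree")
```

The identity rests on the orthogonality of the Kr(j; ·, n, p) with respect to bi(·, n, p), which is why an arbitrary p works here.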
References

Arak, T.V., Zaĭtsev, A.Yu. (1988). Uniform limit theorems for sums of independent random variables. (Translated from Russian). In Proceedings of the Steklov Institute of Mathematics, 174(1). Providence: American Mathematical Society.
Barbour, A.D., Holst, L., Janson, S. (1992). Poisson approximation. Oxford: Clarendon Press.
Barbour, A.D., Choi, K.P. (2004). A non-uniform bound for translated Poisson approximation. Electronic Journal of Probability, 9.
Čekanavičius, V. (1995). On smoothing properties of compound Poisson distributions. Lithuanian Mathematical Journal, 35.
Čekanavičius, V. (2002). On multivariate compound distributions. Teoriya Veroyatnostei i ee Primeneniya, 47. (English translation in Theory of Probability and its Applications, 47.)
Čekanavičius, V., Roos, B. (2004). Two-parametric compound binomial approximations. Lietuvos Matematikos Rinkinys, 44.
Čekanavičius, V., Vaitkus, P. (2001). Centered Poisson approximation via Stein's method. Lithuanian Mathematical Journal, 41.
Choi, K.P., Xia, A. (2002). Approximating the number of successes in independent trials: Binomial versus Poisson. Annals of Applied Probability, 12.
Ehm, W. (1991). Binomial approximation to the Poisson binomial distribution. Statistics & Probability Letters, 11.
Hengartner, W., Theodorescu, R. (1973). Concentration functions. New York: Academic Press.
Ibragimov, I.A., Linnik, Yu.V. (1971). Independent and stationary sequences of random variables. Groningen: Wolters-Noordhoff.
Le Cam, L. (1965). On the distribution of sums of independent random variables. In: J. Neyman, L. Le Cam (Eds.), Bernoulli Bayes Laplace anniversary volume (pp. 179–202). Berlin Heidelberg New York: Springer.
Michel, R. (1987). An improved error bound for the compound Poisson approximation of a nearly homogeneous portfolio. ASTIN Bulletin, 17.
Presman, É.L. (1985). Approximation in variation of the distribution of a sum of independent Bernoulli variables with a Poisson law. Teoriya Veroyatnostei i ee Primeneniya, 30. (English translation in Theory of Probability and its Applications, 30.)
Roos, B. (1999). Asymptotics and sharp bounds in the Poisson approximation to the Poisson binomial distribution. Bernoulli, 5.
Roos, B. (2000). Binomial approximation to the Poisson binomial distribution: The Krawtchouk expansion. Teoriya Veroyatnostei i ee Primeneniya, 45. (See also in Theory of Probability and its Applications, 45.)
Roos, B. (2001a). Multinomial and Krawtchouk approximations to the generalized multinomial distribution. Teoriya Veroyatnostei i ee Primeneniya, 46. (See also in Theory of Probability and its Applications, 46.)
Roos, B. (2001b). Sharp constants in the Poisson approximation. Statistics & Probability Letters, 52.
Roos, B. (2003). Improvements in the Poisson approximation of mixed Poisson distributions. Journal of Statistical Planning and Inference, 113.
Roos, B. (2005). On Hipp's compound Poisson approximations via concentration functions. Bernoulli, 11.
Soon, S.Y.T. (1996). Binomial approximation for dependent indicators. Statistica Sinica, 6.
Szegö, G. (1975). Orthogonal polynomials (4th ed.). Providence: American Mathematical Society.
Takeuchi, K., Takemura, A. (1987a). On sum of 0–1 random variables. I. Univariate case. Annals of the Institute of Statistical Mathematics, 39.
Takeuchi, K., Takemura, A. (1987b). On sum of 0–1 random variables. II. Multivariate case. Annals of the Institute of Statistical Mathematics, 39.
Tsaregradskii, I.P. (1958). On uniform approximation of the binomial distribution by infinitely divisible laws. Teoriya Veroyatnostei i ee Primeneniya, 3. (English translation in Theory of Probability and its Applications, 3.)
AISM, DOI 10.1007/s

ERRATUM

Compound binomial approximations

Vydas Čekanavičius · Bero Roos

The Institute of Statistical Mathematics, Tokyo 2006

Erratum to: AISM 58, DOI 10.1007/s

The original version of the history unfortunately contained a mistake. The correct approval history is given here.

Received: 3 August 2004 / Revised: 7 January 2005

The online version of the original article can be found at
UNIVERSITÄT HAMBURG

Compound binomial approximations

Vydas Čekanavičius and Bero Roos

Preprint No. 2005-02, March 2005

FACHBEREICH MATHEMATIK
SCHWERPUNKT MATHEMATISCHE STATISTIK UND STOCHASTISCHE PROZESSE
More informationStatistics 3657 : Moment Generating Functions
Statistics 3657 : Moment Generating Functions A useful tool for studying sums of independent random variables is generating functions. course we consider moment generating functions. In this Definition
More informationFinal Exam May 4, 2016
1 Math 425 / AMCS 525 Dr. DeTurck Final Exam May 4, 2016 You may use your book and notes on this exam. Show your work in the exam book. Work only the problems that correspond to the section that you prepared.
More informationTAYLOR AND MACLAURIN SERIES
TAYLOR AND MACLAURIN SERIES. Introduction Last time, we were able to represent a certain restricted class of functions as power series. This leads us to the question: can we represent more general functions
More informationReal Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi
Real Analysis Math 3AH Rudin, Chapter # Dominique Abdi.. If r is rational (r 0) and x is irrational, prove that r + x and rx are irrational. Solution. Assume the contrary, that r+x and rx are rational.
More informationWorst-Case Bounds for Gaussian Process Models
Worst-Case Bounds for Gaussian Process Models Sham M. Kakade University of Pennsylvania Matthias W. Seeger UC Berkeley Abstract Dean P. Foster University of Pennsylvania We present a competitive analysis
More informationStein s method and weak convergence on Wiener space
Stein s method and weak convergence on Wiener space Giovanni PECCATI (LSTA Paris VI) January 14, 2008 Main subject: two joint papers with I. Nourdin (Paris VI) Stein s method on Wiener chaos (ArXiv, December
More informationModule 3. Function of a Random Variable and its distribution
Module 3 Function of a Random Variable and its distribution 1. Function of a Random Variable Let Ω, F, be a probability space and let be random variable defined on Ω, F,. Further let h: R R be a given
More information