On Approximating a Generalized Binomial by Binomial and Poisson Distributions

International Journal of Statistics and Systems. ISSN 0973-2675, Volume 3, Number 2 (2008), pp. 113-124. Research India Publications, http://www.ripublication.com/ijss.htm

P. Wongkasem and K. Teerapabolarn*
Department of Mathematics, Faculty of Science, Burapha University, Chonburi 20131, Thailand

R. Gulasirima
Applied Statistics Program, Faculty of Science and Technology, Suan Dusit Rajabhat University, Bangkok 10300, Thailand

*Corresponding author. E-mail: kanint@buu.ac.th

Abstract

The aim of this paper is to approximate a generalized binomial distribution by binomial and Poisson distributions. Using the ω-function and the two Stein identities, the main result of each approximation is obtained in terms of the total variation distance between two distributions and its upper bound. Moreover, some numerical examples are given to show applications of the results concerning the three special cases of the generalized binomial distribution (binomial, hypergeometric and Pólya distributions).

2000 Mathematics Subject Classification: Primary 60F05.

Keywords: A generalized binomial distribution; binomial approximation; binomial distribution; hypergeometric distribution; Pólya distribution; Poisson approximation; Poisson distribution; Stein identity; ω-function.

1. Introduction

In 1979, Dwass [5] introduced a generalized discrete distribution that covers the binomial, hypergeometric and Pólya distributions, and he called it a generalized binomial distribution. The distribution depends on four parameters, $A$, $B$, $n$ and $\alpha$, where $A$ and $B$ are positive, $n$ is a positive integer, and $\alpha$ is an arbitrary real number satisfying $(n-1)\alpha \le A+B$ such that $A^{(i)}$ and $B^{(i)}$ are never negative for $i = 1, \ldots, n$, where

$$x^{(i)} = x(x-\alpha)\cdots(x-(i-1)\alpha).$$

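The distribution of the generalized binomial random variable is defined by

$$p(k) = \binom{n}{k}\frac{A^{(k)}B^{(n-k)}}{(A+B)^{(n)}}, \quad k = 0, 1, \ldots, n, \tag{1.1}$$

that is,

$$p(k) = \binom{n}{k}\frac{[A(A-\alpha)\cdots(A-(k-1)\alpha)]\,[B(B-\alpha)\cdots(B-(n-k-1)\alpha)]}{(A+B)(A+B-\alpha)\cdots(A+B-(n-1)\alpha)}, \quad k = 0, 1, \ldots, n. \tag{1.2}$$

The form of (1.2) is similar to the distribution form with regard to the Pólya urn model in Johnson et al. [7] on pp. 58. Dwass [5] pointed out the three special cases of the distribution in (1.1), according to the value of $\alpha$, as follows:

(i). If $\alpha = 0$, then the result is the binomial distribution with parameters $n$ and $\frac{A}{A+B}$, that is,

$$p(k) = \binom{n}{k}\left(\frac{A}{A+B}\right)^{k}\left(\frac{B}{A+B}\right)^{n-k}, \quad k = 0, 1, \ldots, n. \tag{1.3}$$

(ii). If $\alpha > 0$, then the outcome is the hypergeometric distribution with parameters $A$, $B$, $n$ and $\alpha$ and, when $A/\alpha$ and $B/\alpha$ are integers, the distribution

$$p(k) = \frac{\binom{A/\alpha}{k}\binom{B/\alpha}{n-k}}{\binom{A/\alpha + B/\alpha}{n}}, \quad k = 0, 1, \ldots, \min\{n, A/\alpha\}, \tag{1.4}$$

is the well known classical hypergeometric distribution.

(iii). If $\alpha < 0$, then the result of (1.1) is the Pólya distribution with parameters $A$, $B$, $n$ and $\alpha$ given by

$$p(k) = \frac{\binom{-A/\alpha + k - 1}{k}\binom{-B/\alpha + n - k - 1}{n-k}}{\binom{-A/\alpha - B/\alpha + n - 1}{n}}, \quad k = 0, 1, \ldots, n. \tag{1.5}$$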
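The definition above can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the helper names `falling` and `gb_pmf` are hypothetical, and the parameter values are chosen only for illustration. It evaluates the product form of $p(k)$ and compares it with the binomial ($\alpha = 0$), classical hypergeometric ($\alpha = 1$) and Pólya ($\alpha = -1$) forms of cases (i)-(iii).

```python
from math import comb

def falling(x, i, alpha):
    """Generalized factorial x^(i) = x (x - alpha) ... (x - (i-1) alpha); x^(0) = 1."""
    out = 1.0
    for j in range(i):
        out *= x - j * alpha
    return out

def gb_pmf(A, B, n, alpha):
    """Return [p(0), ..., p(n)] for the generalized binomial distribution."""
    denom = falling(A + B, n, alpha)
    return [comb(n, k) * falling(A, k, alpha) * falling(B, n - k, alpha) / denom
            for k in range(n + 1)]

# Illustrative parameters (an assumption of this sketch).
n, A, B = 5, 10, 990
p = A / (A + B)

# Special cases written out directly from their classical formulas.
binom = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
hyper = [comb(A, k) * comb(B, n - k) / comb(A + B, n) for k in range(n + 1)]
polya = [comb(A + k - 1, k) * comb(B + n - k - 1, n - k) / comb(A + B + n - 1, n)
         for k in range(n + 1)]

def max_diff(u, v):
    """Largest pointwise difference between two probability vectors."""
    return max(abs(x - y) for x, y in zip(u, v))

print(max_diff(gb_pmf(A, B, n, 0), binom))
print(max_diff(gb_pmf(A, B, n, 1), hyper))
print(max_diff(gb_pmf(A, B, n, -1), polya))
```

All three printed differences should be at the level of floating-point rounding, which is a quick way to confirm that one formula covers the three families.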

Note that in the special case where $\alpha = -1$ the distribution is the beta-binomial distribution with parameters $A$, $B$ and $n$. If $A$, $B$, $n$ and $-\alpha$ are all positive integers, then it is the same distribution as in Brown and Phillips [3], and it is the negative hypergeometric distribution when $\alpha = -1$.

It is well known that the hypergeometric distribution can be approximated by the binomial distribution, and that both the binomial and hypergeometric distributions can be approximated by the Poisson distribution under certain conditions on their parameters. Similarly, if the corresponding conditions on the parameters of the generalized binomial, binomial and Poisson distributions are satisfied, then the generalized binomial distribution can also be approximated by the binomial distribution or the Poisson distribution. In this paper, the generalized binomial distribution is approximated by binomial and Poisson distributions, and the accuracy of each approximation is measured in terms of the total variation distance and its upper bound. The total variation distance between the generalized binomial and binomial distributions and the total variation distance between the generalized binomial and Poisson distributions are defined by

$$d_{TV}(GB(A,B,n,\alpha), B(n,p)) = \sup_{E}\left|GB(A,B,n,\alpha)(E) - B(n,p)(E)\right| \tag{1.6}$$

and

$$d_{TV}(GB(A,B,n,\alpha), P(\lambda)) = \sup_{E}\left|GB(A,B,n,\alpha)(E) - P(\lambda)(E)\right|, \tag{1.7}$$

where $E$ in (1.6) is a subset of $\{0, \ldots, n\}$, $E$ in (1.7) is a subset of $\mathbb{N}\cup\{0\}$, and $GB(A,B,n,\alpha)$, $B(n,p)$ and $P(\lambda)$ are the generalized binomial, binomial and Poisson distributions, respectively. The main tools for determining an upper bound for each total variation distance are the ω-function associated with the generalized binomial random variable and the Stein identities for the binomial and Poisson distributions.

In 1998, Majsnerowska [8] adapted the relation of the ω-function associated with a non-negative integer-valued random variable $X$ (Cacoullos and Papathanasiou [4]) to be

$$\omega(k+1) = \frac{p(k)}{p(k+1)}\,\omega(k) + \frac{\mu - (k+1)}{\sigma^2} \ge 0, \quad k = 0, 1, \ldots, \tag{1.8}$$

where $\omega(0) = \frac{\mu}{\sigma^2}$, $p(k) > 0$ for all $k \ge 0$, and $\mu$ and $\sigma^2$ are the mean and variance of $X$. The recurrence relation (1.8) together with the two Stein identities is used to derive the two main results in Section 2. In Section 3, some numerical examples are given to show applications of the results concerning the binomial, hypergeometric and Pólya distributions, which are the special cases of the generalized binomial distribution. The conclusions of this study are presented in the last section.
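The recurrence for the ω-function is easy to run on a concrete distribution. The sketch below is illustrative only: the function name `omega_by_recurrence` is hypothetical, and it assumes the recurrence $\omega(k+1) = \frac{p(k)}{p(k+1)}\omega(k) + \frac{\mu-(k+1)}{\sigma^2}$ with $\omega(0) = \mu/\sigma^2$ on a finite support $\{0,\ldots,n\}$ where every $p(k) > 0$. It is applied to a binomial distribution, for which the ω-function is known to be $(n-k)/(nq)$.

```python
from math import comb

def omega_by_recurrence(pmf):
    """omega(0), ..., omega(n) for a pmf on {0, ..., n} with all p(k) > 0."""
    ks = range(len(pmf))
    mu = sum(k * pk for k, pk in zip(ks, pmf))
    var = sum((k - mu) ** 2 * pk for k, pk in zip(ks, pmf))
    w = [mu / var]                      # omega(0) = mu / sigma^2
    for k in range(len(pmf) - 1):
        # omega(k+1) = p(k)/p(k+1) * omega(k) + (mu - (k+1)) / sigma^2
        w.append(pmf[k] / pmf[k + 1] * w[k] + (mu - (k + 1)) / var)
    return w

# Binomial(6, 0.3) as a small test case (parameters chosen for illustration).
n, p = 6, 0.3
q = 1 - p
pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
w = omega_by_recurrence(pmf)

# Expected value of omega(X); it should equal 1 up to rounding.
mean_omega = sum(wk * pk for wk, pk in zip(w, pmf))
print(w)
print(mean_omega)
```

The last value, $E[\omega(X)]$, should be 1 up to rounding; this is the identity obtained from the covariance relation with $g(x) = x$.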

2. Main results

This section uses the relation of the ω-function in (1.8) and the two Stein identities for the binomial and Poisson distributions to derive the main results associated with (1.6) and (1.7). Let $X$ be the generalized binomial random variable with probability distribution defined as in (1.1). Its mean and variance are

$$\mu = \frac{nA}{A+B} \quad\text{and}\quad \sigma^2 = \frac{nAB(A+B-n\alpha)}{(A+B)^2(A+B-\alpha)},$$

respectively, and its associated ω-function is given in the following proposition.

Proposition 2.1. Let $\omega(k)$ be the ω-function associated with the generalized binomial random variable $X$ and $p(k) > 0$ for every $0 \le k \le n$. Then

$$\omega(k) = \frac{(n-k)(A-k\alpha)}{(A+B)\sigma^2}, \quad k = 0, 1, \ldots, n, \tag{2.1}$$

where $\sigma^2 = \frac{nAB(A+B-n\alpha)}{(A+B)^2(A+B-\alpha)}$.

Proof. Following Dwass [5] and (1.8), the recurrence relation of the ω-function associated with the random variable $X$ can be expressed in the form

$$\omega(k) = \frac{\mu - k}{\sigma^2} + \frac{k[B-(n-k)\alpha]}{(n-k+1)[A-(k-1)\alpha]}\,\omega(k-1), \quad k = 1, \ldots, n,$$

where $\omega(0) = \frac{nA}{(A+B)\sigma^2}$. Therefore, we have

$$\omega(1) = \frac{(n-1)(A-\alpha)}{(A+B)\sigma^2}, \quad \omega(2) = \frac{(n-2)(A-2\alpha)}{(A+B)\sigma^2}, \quad \ldots, \quad \omega(n) = \frac{(n-n)(A-n\alpha)}{(A+B)\sigma^2},$$

which gives (2.1).

Remark 2.1.

1. If $\alpha = 0$, then (2.1) reduces to

$$\omega(k) = \frac{(n-k)A}{(A+B)\sigma^2}, \quad k = 0, 1, \ldots, n, \tag{2.2}$$

with $\sigma^2 = \frac{nAB}{(A+B)^2}$, which is the ω-function associated with the binomial random variable and the distribution in (1.3).

2. If $\alpha > 0$ and $A/\alpha$ and $B/\alpha$ are integers, then the ω-function in (2.1) can be written as

$$\omega(k) = \frac{(n-k)(A/\alpha - k)}{(A/\alpha + B/\alpha)\sigma^2}, \quad k = 0, 1, \ldots, \min\{n, A/\alpha\}, \tag{2.3}$$

with the same variance as in (2.1). This is the ω-function associated with the classical hypergeometric random variable and the distribution in (1.4).

3. If $\alpha < 0$, then (2.1) can be rewritten as

$$\omega(k) = \frac{(n-k)(-A/\alpha + k)}{(-A/\alpha - B/\alpha)\sigma^2}, \quad k = 0, 1, \ldots, n, \tag{2.4}$$

where $\sigma^2 = \frac{nAB(A+B-n\alpha)}{(A+B)^2(A+B-\alpha)}$. It is the ω-function associated with the Pólya random variable and the distribution in (1.5).

Before giving the main result for each approximation, we recall the following relation of the ω-function, stated by Cacoullos and Papathanasiou [4]: if a function $g$ satisfies $E|\omega(X)\Delta g(X)| < \infty$ and $\sigma^2 = Var(X) < \infty$, then

$$Cov(X, g(X)) = \sigma^2 E[\omega(X)\Delta g(X)], \tag{2.5}$$

where $\Delta g(x) = g(x+1) - g(x)$. By taking $g(x) = x$, $E[\omega(X)] = 1$ is obtained as a consequence of (2.5).

2.1 A result of the binomial approximation

The theorem below gives an estimate of the total variation distance between $GB(A,B,n,\alpha)$ and $B(n,p)$. To obtain the estimate, the Stein identity for the binomial setting (see Barbour et al. [1] on pp. 88-89 or Barbour and Chen [2] on pp. 03) is applied for fixed parameters $n$ and $p = 1 - q \in (0,1)$, every subset $E$ of $\{0, \ldots, n\}$ and the bounded real valued function $f = f_E : \mathbb{N}\cup\{0\} \to \mathbb{R}$ (defined as in the references just cited):

$$GB(A,B,n,\alpha)(E) - B(n,p)(E) = E[(n-X)pf(X+1) - qXf(X)]. \tag{2.6}$$

For any subset $E$ of $\{0, \ldots, n\}$, Ehm [6] showed that

$$\sup_{k,E}|\Delta f(k)| = \sup_{k,E}|f(k+1) - f(k)| \le \frac{1 - p^{n+1} - q^{n+1}}{(n+1)pq}, \tag{2.7}$$

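where $f(0) = f(1)$ and $f(k) = f(n)$ for $k \ge n$.

Theorem 2.1. Let $X$ be the generalized binomial random variable and $p = 1 - q = \frac{A}{A+B} > 0$. Then the following inequality holds:

$$d_{TV}\left(GB(A,B,n,\alpha), B\left(n, \tfrac{A}{A+B}\right)\right) \le \left\{1 - \left(\frac{A}{A+B}\right)^{n+1} - \left(\frac{B}{A+B}\right)^{n+1}\right\}\frac{n(n-1)\alpha\,\delta(\alpha)}{(n+1)(A+B-\alpha)}, \tag{2.8}$$

where $\delta(\alpha) = 1$ if $\alpha \ge 0$ and $\delta(\alpha) = -1$ if $\alpha < 0$.

Proof. It can be seen that

$$E[(n-X)pf(X+1) - qXf(X)] = E[npf(X+1) - pX\Delta f(X) - Xf(X)]$$

and, since $np = \mu$, by (1.6), (2.6) and (2.7) we have

$$\begin{aligned}
\left|GB(A,B,n,\alpha)(E) - B(n,p)(E)\right| &= \left|E[\mu f(X+1)] - pE[X\Delta f(X)] - E[Xf(X)]\right| \\
&= \left|E[\mu f(X+1)] - pE[X\Delta f(X)] - Cov(X, f(X)) - E[\mu f(X)]\right| \\
&= \left|E[\mu\Delta f(X)] - pE[X\Delta f(X)] - \sigma^2 E[\omega(X)\Delta f(X)]\right| \quad \text{(by (2.5))} \\
&= \left|E\{[\mu - pX - \sigma^2\omega(X)]\Delta f(X)\}\right|,
\end{aligned}$$

so that

$$d_{TV}(GB(A,B,n,\alpha), B(n,p)) \le \frac{1 - p^{n+1} - q^{n+1}}{(n+1)pq}\,E\left|\mu - pX - \sigma^2\omega(X)\right|. \tag{2.9}$$

By Proposition 2.1,

$$\mu - pk - \sigma^2\omega(k) = \frac{nA - kA - (n-k)(A-k\alpha)}{A+B} = \frac{(n-k)k\alpha}{A+B},$$

which implies that $\mu - pk - \sigma^2\omega(k) \ge 0$ if $\alpha \ge 0$ and $\mu - pk - \sigma^2\omega(k) \le 0$ if $\alpha < 0$. For $\alpha \ge 0$,

$$E\left|\mu - pX - \sigma^2\omega(X)\right| = \mu - pE(X) - \sigma^2E[\omega(X)] = \mu q - \sigma^2 = \frac{pqn(n-1)\alpha}{A+B-\alpha},$$

and

$$E\left|\mu - pX - \sigma^2\omega(X)\right| = -\frac{pqn(n-1)\alpha}{A+B-\alpha} \quad \text{for } \alpha < 0.$$

Therefore, from both cases, we obtain

$$E\left|\mu - pX - \sigma^2\omega(X)\right| = \frac{pqn(n-1)\alpha\,\delta(\alpha)}{A+B-\alpha}, \tag{2.10}$$

where $\delta(\alpha) = 1$ if $\alpha \ge 0$ and $\delta(\alpha) = -1$ if $\alpha < 0$. Substituting (2.10) into the right hand side of (2.9) and replacing $p$ and $q$ by $\frac{A}{A+B}$ and $\frac{B}{A+B}$, respectively, gives the inequality (2.8).

The following two corollaries are consequences of Theorem 2.1 with regard to approximating the hypergeometric and Pólya distributions by the binomial distribution.

Corollary 2.1. If $\alpha > 0$ and $A/\alpha$ and $B/\alpha$ are integers, then the following inequality holds:

$$d_{TV}\left(H(A,B,n,\alpha), B\left(n, \tfrac{A}{A+B}\right)\right) \le \left\{1 - \left(\frac{A}{A+B}\right)^{n+1} - \left(\frac{B}{A+B}\right)^{n+1}\right\}\frac{n(n-1)\alpha}{(n+1)(A+B-\alpha)}. \tag{2.11}$$

If additionally $p = \frac{n}{A+B}$ and $A$ is an integer, then

$$d_{TV}\left(H(A,B,n,\alpha), B\left(A, \tfrac{n}{A+B}\right)\right) \le \left\{1 - \left(\frac{n}{A+B}\right)^{A+1} - \left(1 - \frac{n}{A+B}\right)^{A+1}\right\}\frac{A(A-1)\alpha}{(A+1)(A+B-\alpha)}, \tag{2.12}$$

where $H(A,B,n,\alpha)$ is the hypergeometric distribution.

Corollary 2.2. For $\alpha < 0$,

$$d_{TV}\left(PY(A,B,n,\alpha), B\left(n, \tfrac{A}{A+B}\right)\right) \le \left\{1 - \left(\frac{A}{A+B}\right)^{n+1} - \left(\frac{B}{A+B}\right)^{n+1}\right\}\frac{n(n-1)(-\alpha)}{(n+1)(A+B-\alpha)}, \tag{2.13}$$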

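where $PY(A,B,n,\alpha)$ is the Pólya distribution.

Remark 2.2. It should be noted that the result of Theorem 2.1, or each result of Corollaries 2.1 and 2.2, yields a good binomial approximation provided that $\frac{(n-1)\alpha}{A+B-\alpha}$ is small, that is, $\alpha$ is small and $A+B-\alpha$ is large.

2.2 A result of the Poisson approximation

Similar to the binomial approximation, the Stein identity for the Poisson case (see Barbour et al. [1] on pp. 6-7 or Barbour and Chen [2] on pp. 65) is applied for every positive constant $\lambda$, every subset $E$ of $\mathbb{N}\cup\{0\}$ and the bounded real valued function $f = f_E : \mathbb{N}\cup\{0\} \to \mathbb{R}$ (defined as in the references just cited):

$$GB(A,B,n,\alpha)(E) - P(\lambda)(E) = E[\lambda f(X+1) - Xf(X)]. \tag{2.14}$$

For any subset $E$ of $\mathbb{N}\cup\{0\}$, Barbour et al. [1] proved that

$$\sup_{k,E}|\Delta f(k)| = \sup_{k,E}|f(k+1) - f(k)| \le \lambda^{-1}(1 - e^{-\lambda}). \tag{2.15}$$

The following theorem presents a Poisson estimate of the total variation distance between $GB(A,B,n,\alpha)$ and $P(\lambda)$ in terms of the four parameters $A$, $B$, $n$ and $\alpha$.

Theorem 2.2. Let $X$ be the generalized binomial random variable and $\lambda = \frac{nA}{A+B} > 0$, and let $A \ge (n-1)(-\alpha)$ for $\alpha < 0$. Then

$$d_{TV}\left(GB(A,B,n,\alpha), P\left(\tfrac{nA}{A+B}\right)\right) \le \left(1 - e^{-\frac{nA}{A+B}}\right)\frac{(n-1)B\alpha + A(A+B-\alpha)}{(A+B)(A+B-\alpha)}. \tag{2.16}$$

Proof. From (2.14) and $\lambda = \mu$, we have

$$\begin{aligned}
d_{TV}(GB(A,B,n,\alpha), P(\lambda)) &= \sup_{E}\left|\lambda E[f(X+1)] - E[Xf(X)]\right| \\
&= \sup_{E}\left|\lambda E[f(X+1)] - Cov(X, f(X)) - \mu E[f(X)]\right| \\
&= \sup_{E}\left|\lambda E[\Delta f(X)] - Cov(X, f(X))\right| \\
&= \sup_{E}\left|\lambda E[\Delta f(X)] - \sigma^2E[\omega(X)\Delta f(X)]\right| \quad \text{(by (2.5))} \\
&= \sup_{E}\left|E\{[\lambda - \sigma^2\omega(X)]\Delta f(X)\}\right|,
\end{aligned}$$

which, by (2.15), gives

$$d_{TV}(GB(A,B,n,\alpha), P(\lambda)) \le \lambda^{-1}(1 - e^{-\lambda})\,E\left|\lambda - \sigma^2\omega(X)\right|. \tag{2.17}$$

By Proposition 2.1,

$$\lambda - \sigma^2\omega(k) = \frac{nA}{A+B} - \frac{(n-k)(A-k\alpha)}{A+B} = \frac{[(n-k)\alpha + A]k}{A+B} \ge 0,$$

and thus

$$E\left|\lambda - \sigma^2\omega(X)\right| = \lambda - \sigma^2E[\omega(X)] = \lambda - \sigma^2 = \lambda\,\frac{(n-1)B\alpha + A(A+B-\alpha)}{(A+B)(A+B-\alpha)}.$$

Substituting this result into (2.17) and putting $\lambda = \frac{nA}{A+B}$ proves the theorem.

Immediately from (2.16), additional results of the Poisson approximation are obtained for the three distributions as follows.

Corollary 2.3. For $\alpha = 0$,

$$d_{TV}\left(B\left(n, \tfrac{A}{A+B}\right), P\left(\tfrac{nA}{A+B}\right)\right) \le \left(1 - e^{-\frac{nA}{A+B}}\right)\frac{A}{A+B}. \tag{2.18}$$

Corollary 2.4. For $\alpha > 0$, if $A/\alpha$ and $B/\alpha$ are integers, then

$$d_{TV}\left(H(A,B,n,\alpha), P\left(\tfrac{nA}{A+B}\right)\right) \le \left(1 - e^{-\frac{nA}{A+B}}\right)\frac{(n-1)B\alpha + A(A+B-\alpha)}{(A+B)(A+B-\alpha)}. \tag{2.19}$$

Corollary 2.5. For $\alpha < 0$, if $A \ge (n-1)(-\alpha)$, then

$$d_{TV}\left(PY(A,B,n,\alpha), P\left(\tfrac{nA}{A+B}\right)\right) \le \left(1 - e^{-\frac{nA}{A+B}}\right)\frac{(n-1)B\alpha + A(A+B-\alpha)}{(A+B)(A+B-\alpha)}. \tag{2.20}$$

Remark 2.3.

1. If $\frac{A}{A+B}$ and $\alpha$ are small, then the result of (2.16) is a good Poisson estimate.

2. It is noted that, for $\alpha > 0$ and $A+B-\alpha > 0$,

$$\frac{(n-1)B(-\alpha) + A(A+B+\alpha)}{(A+B)(A+B+\alpha)} < \frac{A}{A+B} < \frac{(n-1)B\alpha + A(A+B-\alpha)}{(A+B)(A+B-\alpha)},$$

where the left-hand side is the factor in (2.20) evaluated at $-\alpha$. Hence, the bound in (2.20) is always less than the bounds in (2.18) and (2.19).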
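The two upper bounds are simple closed-form expressions, so they can be evaluated directly. The following Python sketch is an illustration under stated assumptions rather than the authors' code: the function names `binomial_bound` and `poisson_bound` are hypothetical, the binomial bound is taken as $\{1 - p^{n+1} - q^{n+1}\}\,n(n-1)|\alpha| / ((n+1)(A+B-\alpha))$ with $p = A/(A+B)$, and the Poisson bound as $(1 - e^{-\lambda})\,[(n-1)B\alpha + A(A+B-\alpha)] / ((A+B)(A+B-\alpha))$ with $\lambda = nA/(A+B)$.

```python
from math import exp

def binomial_bound(A, B, n, alpha):
    """Upper bound of Theorem 2.1 on d_TV between GB(A,B,n,alpha) and B(n, A/(A+B))."""
    p = A / (A + B)
    q = 1.0 - p
    # alpha * delta(alpha) equals |alpha|
    return (1 - p ** (n + 1) - q ** (n + 1)) * n * (n - 1) * abs(alpha) / (
        (n + 1) * (A + B - alpha))

def poisson_bound(A, B, n, alpha):
    """Upper bound of Theorem 2.2 on d_TV between GB(A,B,n,alpha) and P(nA/(A+B))."""
    if alpha < 0 and A < (n - 1) * (-alpha):
        # The theorem assumes A >= (n-1)(-alpha) when alpha < 0.
        raise ValueError("condition A >= (n-1)(-alpha) is not satisfied")
    lam = n * A / (A + B)
    return (1 - exp(-lam)) * ((n - 1) * B * alpha + A * (A + B - alpha)) / (
        (A + B) * (A + B - alpha))

# Example with A + B = 1000, as in the numerical section of the paper.
print(round(binomial_bound(10, 990, 10, 1), 5))   # hypergeometric case
print(round(binomial_bound(10, 990, 20, -1), 5))  # Polya case
print(round(poisson_bound(10, 990, 5, 0), 5))     # binomial case
print(round(poisson_bound(10, 990, 5, 1), 5))     # hypergeometric case
```

With $A + B = 1000$ these calls give values close to 0.00086, 0.00344, 0.00049 and 0.00068. Calling `poisson_bound(10, 990, 20, -1)` raises an error because $A \ge (n-1)(-\alpha)$ fails there, so the Poisson bound for the Pólya case is not available for that parameter choice.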

3. Numerical examples

The following numerical examples are given to illustrate how well the binomial and Poisson distributions approximate the generalized binomial distribution, and to see how tight the upper bound for each total variation distance between two distributions is, by using the results in Theorems 2.1 and 2.2, or Corollaries 2.1-2.5.

Table 1: Sample values of each total variation distance ($d_{TV}$) between generalized binomial and binomial distributions and its upper bound (U.B.)

| n | A | $d_{TV}$ (2.11) | U.B. (2.11) | $d_{TV}$ (2.12) | U.B. (2.12) | $d_{TV}$ (2.13) | U.B. (2.13) |
|---|---|---|---|---|---|---|---|
| 5 | 10 | 0.0009 | 0.0000 | 0.00043 | 0.00044 | 0.0009 | 0.0000 |
| 10 | 10 | 0.00080 | 0.00086 | 0.00080 | 0.00086 | 0.00078 | 0.00086 |
| 20 | 10 | 0.009 | 0.00345 | 0.0039 | 0.0063 | 0.0080 | 0.00344 |
| 5 | 25 | 0.00044 | 0.00047 | 0.0057 | 0.008 | 0.00043 | 0.00047 |
| 10 | 25 | 0.006 | 0.0099 | 0.0046 | 0.0053 | 0.0059 | 0.0099 |
| 20 | 25 | 0.00459 | 0.00747 | 0.00577 | 0.00944 | 0.00445 | 0.00746 |
| 5 | 50 | 0.00075 | 0.00088 | 0.0088 | 0.0085 | 0.00075 | 0.00088 |
| 10 | 50 | 0.006 | 0.00353 | 0.08 | 0.099 | 0.00 | 0.0035 |
| 20 | 50 | 0.0055 | 0.095 | 0.045 | 0.03093 | 0.00540 | 0.09 |

Table 1 provides sample values of each total variation distance between generalized binomial and binomial distributions and its upper bound for fixed $A + B = 1000$ and $\alpha = 1, -1$, corresponding to Corollaries 2.1 and 2.2, respectively.

Table 2: Sample values of each total variation distance ($d_{TV}$) between generalized binomial and Poisson distributions and its upper bound (U.B.)

| n | A | $d_{TV}$ (2.18) | U.B. (2.18) | $d_{TV}$ (2.19) | U.B. (2.19) | $d_{TV}$ (2.20) | U.B. (2.20) |
|---|---|---|---|---|---|---|---|
| 5 | 10 | 0.00047 | 0.00049 | 0.00066 | 0.00068 | 0.0008 | 0.0009 |
| 10 | 10 | 0.00087 | 0.00095 | 0.0066 | 0.0080 | 0.00009 | 0.0000 |
| 20 | 10 | 0.0049 | 0.008 | 0.00439 | 0.0053 | 0.003 | ******* |
| 5 | 25 | 0.0065 | 0.0094 | 0.00309 | 0.00340 | 0.00 | 0.0048 |
| 10 | 25 | 0.00436 | 0.00553 | 0.00598 | 0.00747 | 0.0076 | 0.00359 |
| 20 | 25 | 0.0058 | 0.00984 | 0.0040 | 0.073 | 0.0036 | 0.0055 |
| 5 | 50 | 0.00893 | 0.006 | 0.00968 | 0.090 | 0.0088 | 0.00 |
| 10 | 50 | 0.086 | 0.0967 | 0.04 | 0.0304 | 0.00964 | 0.063 |
| 20 | 50 | 0.04 | 0.036 | 0.0973 | 0.04303 | 0.0088 | 0.00 |
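The exact total variation distances in such tables can be computed directly from the probability functions, since for distributions on the non-negative integers $\sup_E |P(E) - Q(E)| = \frac{1}{2}\sum_k |p(k) - q(k)|$. The sketch below is a hedged illustration: the helper names `gb_pmf`, `tv_to_binomial` and `tv_to_poisson` are hypothetical and not from the paper, and the Poisson mass beyond $n$ is added as a tail term because the generalized binomial distribution has no mass there.

```python
from math import comb, exp, factorial

def gb_pmf(A, B, n, alpha):
    """[p(0), ..., p(n)] of the generalized binomial distribution (product form)."""
    def falling(x, i):
        out = 1.0
        for j in range(i):
            out *= x - j * alpha
        return out
    denom = falling(A + B, n)
    return [comb(n, k) * falling(A, k) * falling(B, n - k) / denom
            for k in range(n + 1)]

def tv_to_binomial(A, B, n, alpha):
    """Exact d_TV between GB(A,B,n,alpha) and B(n, A/(A+B))."""
    p = A / (A + B)
    gb = gb_pmf(A, B, n, alpha)
    bi = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    return 0.5 * sum(abs(x - y) for x, y in zip(gb, bi))

def tv_to_poisson(A, B, n, alpha):
    """Exact d_TV between GB(A,B,n,alpha) and P(nA/(A+B))."""
    lam = n * A / (A + B)
    gb = gb_pmf(A, B, n, alpha)
    po = [exp(-lam) * lam**k / factorial(k) for k in range(n + 1)]
    tail = max(0.0, 1.0 - sum(po))  # Poisson mass on {n+1, n+2, ...}
    return 0.5 * (sum(abs(x - y) for x, y in zip(gb, po)) + tail)

tv_hyper = tv_to_binomial(10, 990, 10, 1)  # hypergeometric vs binomial
tv_binom = tv_to_poisson(10, 990, 5, 0)    # binomial vs Poisson
print(tv_hyper, tv_binom)
```

For $n = 10$, $A = 10$, $B = 990$, $\alpha = 1$, Table 1 lists a distance of 0.00080 against an upper bound of 0.00086, and for $n = 5$, $A = 10$, $\alpha = 0$, Table 2 lists 0.00047 against 0.00049; the two printed values can be compared with those entries.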

For the Poisson approximation, sample values of each total variation distance between generalized binomial and Poisson distributions and its upper bound are presented for fixed $A + B = 1000$ and $\alpha = 0, 1, -1$, corresponding to Corollaries 2.3, 2.4 and 2.5, respectively, as seen in Table 2.

The numerical results in the two tables suggest that the binomial and Poisson approximations to the generalized binomial distribution are quite efficient when $n$ and/or $A$ are small, that is, the estimate of the total variation distance between two distributions in each result is close to the true value of the distance provided that $n$ and/or $A$ are small and $A+B-\alpha$ is large. Comparing the numerical results for (2.11) and (2.13) in Table 1 with those for (2.19) and (2.20) in Table 2 indicates that the result in (2.11) is better than the result in (2.19) and, in the case where $n$ and $A$ differ more, the result in (2.13) is also better than the result in (2.20).

4. Conclusions

In this study, the upper bounds in (2.8) and (2.16) are estimates of the total variation distance between the generalized binomial and binomial distributions and of the total variation distance between the generalized binomial and Poisson distributions, respectively. Furthermore, each upper bound in (2.8) and (2.16) is also a criterion for measuring the accuracy of the corresponding approximation: if the bound is small, then a good binomial or Poisson approximation to the generalized binomial distribution is obtained. Conversely, if the bound is large, then the binomial or Poisson distribution is not appropriate for approximating the generalized binomial distribution. From the results in (2.8) and (2.16), it is found that the upper bound of the binomial approximation is small when $\alpha$ is small and $A+B-\alpha$ is large, and the upper bound of the Poisson approximation is small when both $\alpha$ and $\frac{A}{A+B}$ are small.

Acknowledgements.
The first two authors would like to thank the Faculty of Science, Burapha University, for financial support.

References

[1] Barbour, A. D., Holst, L., and Janson, S., 1992, Poisson approximation, Oxford Studies in Probability 2, Clarendon Press, Oxford.
[2] Barbour, A. D., and Chen, L. H. Y., 2005, An introduction to Stein's method, Singapore University Press, Singapore.
[3] Brown, T. C., and Phillips, M. J., 1999, Negative binomial approximation with Stein's method, Meth. Comp. Appl. Probab., 1(4), pp. 407-421.
[4] Cacoullos, T., and Papathanasiou, V., 1989, Characterization of distributions by variance bounds, Statist. Probab. Lett., 7(5), pp. 351-356.
[5] Dwass, M., 1979, A generalized binomial distribution, Amer. Statistician, 33(2), pp. 86-87.
[6] Ehm, W., 1991, Binomial approximation to the Poisson binomial distribution, Statist. Probab. Lett., 11(1), pp. 7-16.
[7] Johnson, N. L., Kotz, S., and Kemp, A. W., 2005, Univariate Discrete Distributions, 3rd edition, Wiley, New York.
[8] Majsnerowska, M., 1998, A note on Poisson approximation by ω-functions, Appl. Math., 25(3), pp. 387-392.