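B671-672 Supplemental Notes 2 (Fall 2004, supp2.tex)

Hypergeometric, Binomial, Poisson and Multinomial Random Variables and Borel Sets

1 Binomial Approximation to the Hypergeometric

Recall that the hypergeometric distribution is
\[
f(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \qquad x = \max(0,\, n+D-N), \ldots, \min(n, D),
\]
and $f(x) = 0$ elsewhere. Here $N$ is the population size, $D$ is the number of population members with the attribute of interest, and $n$ is the sample size. We may write this as
\[
f(x) = \binom{n}{x}\,
\frac{\prod_{i=0}^{x-1}(D-i)\;\prod_{i=0}^{n-x-1}(N-D-i)}{\prod_{i=0}^{n-1}(N-i)} .
\]
Since there are $n$ terms in the numerator and in the denominator, we may divide both by $N^n$ to obtain
\[
f(x) = \binom{n}{x}\,
\frac{\prod_{i=0}^{x-1}\left(\frac{D}{N}-\frac{i}{N}\right)\;\prod_{i=0}^{n-x-1}\left(1-\frac{D}{N}-\frac{i}{N}\right)}{\prod_{i=0}^{n-1}\left(1-\frac{i}{N}\right)} .
\]
Since
\[
\frac{D}{N}-\frac{x-1}{N} \le \frac{D}{N}-\frac{i}{N} \le \frac{D}{N} \qquad \text{for } i = 0,1,\ldots,x-1,
\]
\[
1-\frac{D}{N}-\frac{n-x-1}{N} \le 1-\frac{D}{N}-\frac{i}{N} \le 1-\frac{D}{N} \qquad \text{for } i = 0,1,\ldots,n-x-1,
\]
\[
1-\frac{n-1}{N} \le 1-\frac{i}{N} \le 1 \qquad \text{for } i = 0,1,\ldots,n-1,
\]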
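we have that
\[
\left(\frac{D}{N}-\frac{x-1}{N}\right)^{x} \le \prod_{i=0}^{x-1}\left(\frac{D}{N}-\frac{i}{N}\right) \le \left(\frac{D}{N}\right)^{x},
\]
\[
\left(1-\frac{D}{N}-\frac{n-x-1}{N}\right)^{n-x} \le \prod_{i=0}^{n-x-1}\left(1-\frac{D}{N}-\frac{i}{N}\right) \le \left(1-\frac{D}{N}\right)^{n-x},
\]
\[
\left(1-\frac{n-1}{N}\right)^{n} \le \prod_{i=0}^{n-1}\left(1-\frac{i}{N}\right) \le 1,
\]
and it follows that
\[
\binom{n}{x}\left(\frac{D}{N}-\frac{x-1}{N}\right)^{x}\left(1-\frac{D}{N}-\frac{n-x-1}{N}\right)^{n-x}
\le f(x) \le
\binom{n}{x}\left(\frac{D}{N}\right)^{x}\left(1-\frac{D}{N}\right)^{n-x}\left(1-\frac{n-1}{N}\right)^{-n} .
\]
If we assume that $\lim_{N\to\infty} D/N = p$ and that $n$ and $x$ are fixed, then
\[
\lim_{N\to\infty}\binom{n}{x}\left(\frac{D}{N}-\frac{x-1}{N}\right)^{x}\left(1-\frac{D}{N}-\frac{n-x-1}{N}\right)^{n-x} = \binom{n}{x}p^{x}(1-p)^{n-x}
\]
and
\[
\lim_{N\to\infty}\binom{n}{x}\left(\frac{D}{N}\right)^{x}\left(1-\frac{D}{N}\right)^{n-x}\left(1-\frac{n-1}{N}\right)^{-n} = \binom{n}{x}p^{x}(1-p)^{n-x} .
\]
It follows that
\[
\lim_{N\to\infty} f(x) = \binom{n}{x}p^{x}(1-p)^{n-x},
\]
so that the hypergeometric distribution can be approximated by the binomial with $p = D/N$.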
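The limit above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the choices $n = 5$, $p = 0.3$ are illustrative assumptions. It computes the largest pointwise gap between the hypergeometric and binomial probabilities as $N$ grows with $D/N$ held near $p$.

```python
from math import comb

def hypergeom_pmf(x, N, D, n):
    """P(X = x) for a sample of size n drawn without replacement
    from N items, D of which have the attribute of interest."""
    if x < max(0, n + D - N) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_abs_diff(N, p, n):
    """Largest pointwise gap between the hypergeometric pmf and
    the binomial pmf with success probability D/N."""
    D = round(p * N)  # illustrative choice: D/N is (close to) p
    return max(abs(hypergeom_pmf(x, N, D, n) - binom_pmf(x, n, D / N))
               for x in range(n + 1))

n, p = 5, 0.3  # hypothetical values chosen for the demonstration
for N in (20, 200, 2000, 20000):
    print(N, max_abs_diff(N, p, n))
```

The printed gap should shrink as $N$ increases with $n$ fixed, which is the regime in which the approximation is derived.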
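2 Poisson Approximation to the Binomial

The probability of $x$ successes in $n$ Bernoulli trials with probability of success $p$ on each trial is given by the binomial distribution, i.e.
\[
P(X = x) = \binom{n}{x}p^{x}(1-p)^{n-x} \qquad \text{for } x = 0,1,2,\ldots,n .
\]
Suppose now that $n \to \infty$ in such a way that
\[
\lim_{n\to\infty} np = \lim_{n\to\infty}\lambda_n = \lambda > 0, \qquad \text{where } \lambda_n = np .
\]
The binomial distribution can then be written as
\[
P(X = x) = \binom{n}{x}p^{x}(1-p)^{n-x}
= \frac{n(n-1)\cdots(n-x+1)}{x!}\left(\frac{\lambda_n}{n}\right)^{x}\left(1-\frac{\lambda_n}{n}\right)^{n-x}
= \frac{\lambda_n^{x}}{x!}\left[\prod_{i=0}^{x-1}\left(1-\frac{i}{n}\right)\right]\left(1-\frac{\lambda_n}{n}\right)^{n-x} .
\]
Each factor $1 - i/n$ lies between $1 - (x-1)/n$ and $1$, and $0 \le 1 - \lambda_n/n \le 1$. Hence
\[
\frac{\lambda_n^{x}}{x!}\left(1-\frac{x-1}{n}\right)^{x}\left(1-\frac{\lambda_n}{n}\right)^{n}
\le P(X = x) \le
\frac{\lambda_n^{x}}{x!}\left(1-\frac{\lambda_n}{n}\right)^{n}\left(1-\frac{\lambda_n}{n}\right)^{-x}
\]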
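and now, for fixed $x$, the lower bound $L_n(x)$ and the upper bound $U_n(x)$ satisfy
\[
L_n(x) = \frac{\lambda_n^{x}}{x!}\left(1-\frac{x-1}{n}\right)^{x}\left(1-\frac{\lambda_n}{n}\right)^{n} \longrightarrow \frac{\lambda^{x}}{x!}e^{-\lambda},
\]
\[
U_n(x) = \frac{\lambda_n^{x}}{x!}\left(1-\frac{\lambda_n}{n}\right)^{n}\left(1-\frac{\lambda_n}{n}\right)^{-x} \longrightarrow \frac{\lambda^{x}}{x!}e^{-\lambda},
\]
so that
\[
\lim_{n\to\infty} L_n(x) = \lim_{n\to\infty} U_n(x) = \frac{\lambda^{x}}{x!}e^{-\lambda} .
\]
Hence
\[
\lim_{n\to\infty} P(X = x) = \frac{\lambda^{x}}{x!}e^{-\lambda},
\]
i.e. the binomial distribution approaches the Poisson as $np \to \lambda$, and hence the Poisson can be used to approximate the binomial under this condition.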
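The squeeze argument can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original notes: it takes $\lambda_n = \lambda = 2$ for every $n$ (so $p = \lambda/n$), and the function names are hypothetical. It prints the lower bound, the binomial probability, the upper bound, and the Poisson limit for $x = 3$.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability lam^x e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

def lower_bound(x, n, lam):
    """L_n(x) = lam^x/x! * (1 - (x-1)/n)^x * (1 - lam/n)^n."""
    return lam**x / factorial(x) * (1 - (x - 1) / n)**x * (1 - lam / n)**n

def upper_bound(x, n, lam):
    """U_n(x) = lam^x/x! * (1 - lam/n)^n * (1 - lam/n)^(-x)."""
    return lam**x / factorial(x) * (1 - lam / n)**(n - x)

lam, x = 2.0, 3  # assumption: lambda_n is held equal to lam for all n
for n in (10, 100, 1000, 10000):
    p = lam / n
    print(n, lower_bound(x, n, lam), binom_pmf(x, n, p),
          upper_bound(x, n, lam), poisson_pmf(x, lam))
```

In each row the binomial value should sit between the two bounds, and all three should approach the Poisson value in the last column as $n$ grows.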
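3 Multivariate Hypergeometric and Multinomial Distributions

Consider a population of $N$ individuals, each classified into one of $k$ mutually exclusive categories $C_1, C_2, \ldots, C_k$. Suppose that there are $D_i$ individuals in category $C_i$ for $i = 1,2,\ldots,k$. Note that $\sum_{i=1}^{k} D_i = N$. If we draw a random sample of size $n$ without replacement from this population, the probability of $x_i$ in the sample from category $C_i$, $i = 1,2,\ldots,k$, is
\[
P(x_1, x_2, \ldots, x_k) = \frac{\binom{D_1}{x_1}\binom{D_2}{x_2}\cdots\binom{D_k}{x_k}}{\binom{N}{n}}, \qquad \sum_{i=1}^{k} x_i = n,
\]
which is called the multivariate hypergeometric distribution with parameters $D_1, D_2, \ldots, D_k$. It is not widely used since the multinomial distribution provides an excellent approximation. Rewrite the distribution as
\[
P(x_1, x_2, \ldots, x_k) = \frac{n!}{\prod_{i=1}^{k} x_i!}\;
\frac{\prod_{i=1}^{k}\prod_{j=0}^{x_i-1}(D_i - j)}{\prod_{l=0}^{n-1}(N - l)} .
\]
Dividing the numerator and denominator by $N^n$, as in the development of the binomial approximation to the hypergeometric, yields
\[
P(x_1, x_2, \ldots, x_k) = \frac{n!}{\prod_{i=1}^{k} x_i!}\;
\frac{\prod_{i=1}^{k}\prod_{j=0}^{x_i-1}\left(\frac{D_i}{N} - \frac{j}{N}\right)}{\prod_{l=0}^{n-1}\left(1 - \frac{l}{N}\right)} .
\]
Since
\[
\frac{D_i}{N} - \frac{x_i - 1}{N} \le \frac{D_i}{N} - \frac{j}{N} \le \frac{D_i}{N} \qquad \text{for } j = 0,1,\ldots,x_i-1, \quad i = 1,2,\ldots,k,
\]
\[
1 - \frac{n-1}{N} \le 1 - \frac{l}{N} \le 1 \qquad \text{for } l = 0,1,\ldots,n-1,
\]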
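we have that
\[
\prod_{i=1}^{k}\left(\frac{D_i}{N} - \frac{x_i - 1}{N}\right)^{x_i}
\le \prod_{i=1}^{k}\prod_{j=0}^{x_i-1}\left(\frac{D_i}{N} - \frac{j}{N}\right)
\le \prod_{i=1}^{k}\left(\frac{D_i}{N}\right)^{x_i}
\]
and
\[
\left(1 - \frac{n-1}{N}\right)^{n} \le \prod_{l=0}^{n-1}\left(1 - \frac{l}{N}\right) \le 1 .
\]
Thus the multivariate hypergeometric distribution is bounded below by
\[
\frac{n!}{x_1!\,x_2!\cdots x_k!}\prod_{i=1}^{k}\left(\frac{D_i}{N} - \frac{x_i - 1}{N}\right)^{x_i}
\]
and is bounded above by
\[
\frac{n!}{x_1!\,x_2!\cdots x_k!}\left(1 - \frac{n-1}{N}\right)^{-n}\prod_{i=1}^{k}\left(\frac{D_i}{N}\right)^{x_i} .
\]
Thus if $n$ is fixed and $\lim_{N\to\infty} D_i/N = p_i$ for each $i$, then
\[
\lim_{N\to\infty} P(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\,x_2!\cdots x_k!}\prod_{i=1}^{k} p_i^{x_i} .
\]
It follows that the multivariate hypergeometric distribution can be approximated by the multinomial distribution with $p_i = D_i/N$ for $i = 1,2,\ldots,k$.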
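As a numerical illustration of this limit, the sketch below compares the two probabilities for one fixed outcome while the population grows. It is a minimal example, not part of the original notes: the function names, the category proportions $(0.5, 0.3, 0.2)$, and the outcome $(2, 2, 1)$ are all assumptions chosen for the demonstration.

```python
from math import comb, factorial, prod

def mv_hypergeom_pmf(xs, Ds):
    """Probability of category counts xs when sum(xs) items are drawn
    without replacement from a population with category sizes Ds."""
    n, N = sum(xs), sum(Ds)
    return prod(comb(D, x) for D, x in zip(Ds, xs)) / comb(N, n)

def multinomial_pmf(xs, ps):
    """Multinomial probability of category counts xs with cell probabilities ps."""
    coef = factorial(sum(xs))
    for x in xs:
        coef //= factorial(x)  # exact integer division at every step
    return coef * prod(p**x for p, x in zip(ps, xs))

ps = (0.5, 0.3, 0.2)   # hypothetical category proportions
xs = (2, 2, 1)         # hypothetical sample outcome, n = 5
for N in (10, 100, 1000, 10000):
    Ds = [round(p * N) for p in ps]  # category sizes with D_i / N = p_i
    print(N, mv_hypergeom_pmf(xs, Ds), multinomial_pmf(xs, ps))
```

With $n$ fixed, the first printed probability should approach the second as $N$ increases.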
4 Borel Sets and Measurable Functions

4.1 Necessity for Borel Sets

Let $\Omega = [0,1)$ and let $P(E)$ be the length of $E$, i.e. the uniform probability measure. It is not possible for $P$ to be defined for all subsets of $\Omega$ in such a way that $P$ satisfies $0 \le P(E) \le 1$, $P(\Omega) = 1$, $P$ is countably additive, and $P(E)$ is equal to the length of $E$ (so that, in particular, $P$ is unchanged when a set is translated mod 1). To show this, a non-measurable set is constructed by the following procedure:

(1) Define an equivalence relation on $\Omega = [0,1)$ such that
\[
\Omega = [0,1) = \bigcup_{t \in T} E_t
\]
where the $E_t$ are disjoint. The equivalence relation is defined by
\[
x \sim y \iff \operatorname{mod}_1(x + r) = y \quad \text{where } r \in [0,1) \text{ is a rational number},
\]
where
\[
\operatorname{mod}_1(x_1 + x_2) =
\begin{cases}
x_1 + x_2 & \text{if } x_1 + x_2 < 1 \\
x_1 + x_2 - 1 & \text{if } x_1 + x_2 \ge 1
\end{cases}
\qquad \text{for } x_1, x_2 \in [0,1) .
\]

(2) Use the Axiom of Choice to select a set $F$ containing exactly one point from each of the $E_t$.

(3) Order the rational numbers in $[0,1)$ as
\[
r_0, r_1, r_2, \ldots \quad \text{where } r_0 = 0,
\]
and define the sets $F_i$ by
\[
F_i = \operatorname{mod}_1(F + r_i) .
\]
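You can show that the $F_i$ are mutually exclusive and that
\[
\bigcup_{i=0}^{\infty} F_i = \Omega = [0,1) .
\]

(4) If $F$ is measurable, so is each of the $F_i$, and because length is unchanged by translation mod 1,
\[
P(F_i) = P(F) \quad \text{for every } i .
\]
It then follows that
\[
1 = P(\Omega) = P([0,1)) = P\left(\bigcup_{i=0}^{\infty} F_i\right) = \sum_{i=0}^{\infty} P(F_i) = \sum_{i=0}^{\infty} P(F) .
\]
The last sum is $0$ if $P(F) = 0$ and $+\infty$ if $P(F) > 0$, so it cannot equal 1. Thus $F$ is not measurable, i.e. no value $P(F)$ can be assigned to $F$.

Reference: Romano and Siegel (1986), Counterexamples in Probability and Statistics, Wadsworth and Brooks/Cole.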
4.2 Measurable Functions and Random Variables

In the study of probability and its applications to statistics we need to have a collection of random variables (measurable functions) large enough to ensure that probabilities are well defined. Recall that most of classical analysis (calculus, etc.) deals with continuous functions and limits of sequences of continuous functions. Limits of sequences of continuous functions are not necessarily continuous, and a sequence of continuous functions may not tend to a finite limit. It is therefore convenient to extend slightly the definition of the functions under consideration by allowing functions to take the values $\pm\infty$, and to extend the notion of a Borel $\sigma$-field to be the smallest $\sigma$-field generated by the sets $\{-\infty\}$, $\{+\infty\}$ and $\mathcal{B}$, where $\mathcal{B}$ is the class of all Borel sets. With this extension the collection of measurable functions from $(\Omega, \mathcal{W})$ to $(R, \mathcal{B})$ is closed under the following usual operations: arithmetic operations (sums, products), infima and suprema, and pointwise limits of sequences of functions.
There are now two definitions of measurable functions:

Constructive: A measurable function is the limit of a convergent sequence of simple functions, where a simple function is a function of the form
\[
X = \sum_{j=1}^{n} x_j I_{E_j}
\]
where the $E_j$ are measurable sets, i.e. $E_j \in \mathcal{W}$.

Descriptive: A measurable function is a function such that the inverse image of any Borel set in $\mathcal{B}$ is a measurable set in $\mathcal{W}$.

Theorem 4.1 The constructive and descriptive definitions are equivalent. Moreover, the set of all measurable functions is closed under the usual operations of analysis.

Proof: See Loève, pages 107-108.
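The constructive definition can be made concrete with the standard dyadic approximation: truncate $X$ to $[-n, n]$ and round it down to a multiple of $2^{-n}$. The result takes finitely many values, each on the inverse image of an interval, so it is a simple function when $X$ is measurable. The sketch below is an illustration and not necessarily the construction used in Loève; the name `simple_approximation` and the example $X(\omega) = \omega^2$ on $\Omega = [0,1)$ are assumptions.

```python
import math

def simple_approximation(X, n):
    """Return the n-th dyadic approximation X_n of the function X.

    X_n takes only values k / 2**n with -n * 2**n <= k <= n * 2**n,
    and 0 <= X(w) - X_n(w) < 2**-n whenever -n <= X(w) < n.
    """
    def X_n(omega):
        v = X(omega)
        if v >= n:
            return float(n)    # truncate large values at n
        if v < -n:
            return float(-n)   # truncate small values at -n
        return math.floor(v * 2**n) / 2**n  # round down to a multiple of 2**-n
    return X_n

X = lambda omega: omega**2  # hypothetical random variable on [0, 1)
grid = [i / 100 for i in range(100)]
for n in (1, 2, 4, 8):
    X_n = simple_approximation(X, n)
    print(n, max(X(w) - X_n(w) for w in grid))
```

The printed maximum error over the grid should fall below $2^{-n}$, which is the sense in which the simple functions converge to $X$.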
Theorem 4.2 A continuous function of a measurable function is measurable. The smallest class of functions containing all continuous functions and closed under limits is called the class of Baire functions. It follows that Baire functions of measurable functions are also measurable.

Proof: See Loève, pages 108-109.

Reference: Loève, M. (1960), Probability Theory, Second Edition, Van Nostrand.

Theorem 4.3 A function $X = (X_1, X_2, \ldots, X_n)$ from $\Omega$ to $R^n$ is measurable if and only if each of the coordinates $X_1, X_2, \ldots, X_n$ is measurable.

Proof: See Loève, pages 108-109.

It follows that the class of measurable functions (random variables) is rich enough to include all random variables of potential interest, much as the class of Borel sets is rich enough to contain all events of interest while remaining compatible with the need for probabilities to be countably additive.
Example 4.1 Define the signum function by
\[
\operatorname{sgn}(x) =
\begin{cases}
+1 & \text{if } x > 0 \\
0 & \text{if } x = 0 \\
-1 & \text{if } x < 0
\end{cases}
\]
and note that the signum function is not continuous at $x = 0$. Define the function $f_n(x)$ by
\[
f_n(x) =
\begin{cases}
\min(1, nx) & x \ge 0 \\
\max(-1, nx) & x < 0
\end{cases}
\]
Note that $f_n(0) = 0$ and that $f_n(x)$ is continuous for every $n$. However,
\[
\lim_{n\to\infty} f_n(x) = \operatorname{sgn}(x),
\]
which is not continuous at $x = 0$, i.e. the limit of a sequence of continuous functions is not necessarily continuous.

Reference: Gelbaum and Olmsted (1964), Counterexamples in Analysis, Holden-Day Inc., page 77.
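The pointwise convergence in this example can be seen by evaluating $f_n$ at a few points. The following is a small sketch; the evaluation points and the values of $n$ are arbitrary choices for illustration.

```python
def sgn(x):
    """Signum function: +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def f(n, x):
    """f_n(x) = min(1, n x) for x >= 0 and max(-1, n x) for x < 0."""
    return min(1.0, n * x) if x >= 0 else max(-1.0, n * x)

points = (-2.0, -0.01, 0.0, 0.01, 2.0)  # arbitrary evaluation points
for n in (1, 10, 100, 1000):
    print(n, [f(n, x) for x in points])
print("sgn", [sgn(x) for x in points])
```

At every fixed point the values settle on $\operatorname{sgn}(x)$ once $n$ is large enough. The convergence is not uniform, however: at $x = 1/(2n)$ the value of $f_n$ is $1/2$ while $\operatorname{sgn}(x) = 1$.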
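Example 4.2 Define the function $f_n(x)$ by
\[
f_n(x) =
\begin{cases}
\min\left(n, \frac{1}{x}\right) & \text{if } 0 < x \le 1 \\
0 & \text{if } x = 0
\end{cases}
\]
Then $f_n(x)$ is bounded (by $n$) on the interval $[0,1]$. However, the function
\[
f(x) = \lim_{n\to\infty} f_n(x) =
\begin{cases}
\frac{1}{x} & \text{if } 0 < x \le 1 \\
0 & \text{if } x = 0
\end{cases}
\]
is unbounded. Thus the limit of a sequence of bounded functions is not necessarily bounded.

Reference: Gelbaum and Olmsted (1964), Counterexamples in Analysis, Holden-Day Inc., page 77.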
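A short numerical sketch of this example follows; the grid and the sample values of $n$ are illustrative assumptions. Each $f_n$ has maximum $n$ on $[0,1]$, while the limit function exceeds any given bound near $0$.

```python
def f(n, x):
    """f_n(x) = min(n, 1/x) for 0 < x <= 1 and 0 at x = 0."""
    return 0.0 if x == 0 else min(float(n), 1.0 / x)

def f_limit(x):
    """Pointwise limit: 1/x for 0 < x <= 1 and 0 at x = 0."""
    return 0.0 if x == 0 else 1.0 / x

grid = [k / 1000 for k in range(1001)]  # evaluation grid on [0, 1]
for n in (1, 5, 50, 500):
    print(n, max(f(n, x) for x in grid))        # maximum on the grid equals n
print([f_limit(10.0**-k) for k in range(1, 7)])  # grows without bound near 0
```

For any fixed $x > 0$, $f_n(x)$ equals $1/x$ as soon as $n \ge 1/x$, which is why the limit exists at every point even though it is unbounded.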