University of Groningen. Statistical Auditing and the AOQL-method Talens, Erik


IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2005

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Talens, E. (2005). Statistical Auditing and the AOQL-method. s.n.

Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure). For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Chapter 4

Hypergeometric Distribution

The hypergeometric distribution plays a key role in statistical auditing. This chapter describes some important properties of the hypergeometric distribution that we use in subsequent chapters. Section 4.1 gives some elementary properties of the hypergeometric probability. It also gives some properties of the hypergeometric distribution function and of quotients of hypergeometric distribution functions. These properties will be very helpful in Chapter 5. Section 4.2 gives exact and approximate confidence intervals for the probability that a certain characteristic is present in a population. Finally, Section 4.3 shows how we can calculate hypergeometric probabilities in an efficient and accurate way. That section is essential to the algorithms developed in the later chapters.

4.1 Properties of the hypergeometric distribution

Consider a population of $N$ elements. A number of these $N$ elements may have a certain characteristic that we are interested in, e.g. the number of travel declarations in a yearly population which were processed incorrectly. We will denote this number by $M$. In auditing applications this characteristic is often unwanted, and therefore the value of $M$ is relatively small. This number is not known to us in advance. To get more information about $M$, a random sample of size $n$ is taken. The sample contains $K$ elements that have the characteristic of interest. The number $K$ in the sample follows a hypergeometric distribution with parameters $n$, $M$, and $N$. We write $K \sim H(n, M, N)$. We use a well-known extended definition of the binomial coefficients, that

will be very convenient in our algebraic manipulations with the hypergeometric distribution. Recall that

\[ \binom{p}{q} = \frac{p!}{q!\,(p-q)!} \quad \text{for } q = 0, 1, \ldots, p;\ p = 0, 1, 2, \ldots, \]

where $0! = 1$ by definition. For other values of $p, q \in \mathbb{Z}$ it is defined as $\binom{p}{q} = 0$. Using this notation we do not have to incorporate the usual domain for $K$, namely $K = 0, \ldots, n$ with the restriction that $n - K \le N - M$ and $K \le M$. Thus for non-negative integers $k$ we have

\[ P\{K = k \mid n, M, N\} = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}. \tag{4.1.1} \]

We know that

\[ E(K) = \frac{nM}{N} \quad \text{and} \quad \mathrm{Var}(K) = \frac{nM(N-M)(N-n)}{N^2(N-1)}. \]

The following properties for the hypergeometric distribution hold. We refer to Lieberman and Owen (1961).

Property 4.1.1 The hypergeometric distribution has the following elementary properties:

\[ P\{K = k+1 \mid n, M, N\} = \frac{(M-k)(n-k)}{(k+1)(N-M-n+k+1)}\, P\{K = k \mid n, M, N\} \]

\[ P\{K = k \mid n+1, M, N\} = \frac{(n+1)(N-M-n+k)}{(n+1-k)(N-n)}\, P\{K = k \mid n, M, N\} \]

\[ P\{K = k \mid n, M+1, N\} = \frac{(M+1)(N-M-n+k)}{(M+1-k)(N-M)}\, P\{K = k \mid n, M, N\} \]

\[ P\{K = k \mid n, M, N+1\} = \frac{(N+1-n)(N-M+1)}{(N-M-n+k+1)(N+1)}\, P\{K = k \mid n, M, N\} \]

\[ P\{K = k \mid n, M, N\} = P\{K = n-k \mid n, N-M, N\} = P\{K = M-k \mid N-n, M, N\} = P\{K = N-M-n+k \mid N-n, N-M, N\} \]

\[ P\{K \le k \mid n, M, N\} = 1 - P\{K \le n-k-1 \mid n, N-M, N\} = 1 - P\{K \le M-k-1 \mid N-n, M, N\} = P\{K \le N-M-n+k \mid N-n, N-M, N\} \]

The following property is a very helpful tool. It shows that we are allowed to interchange $M$ and $n$ without affecting the hypergeometric probabilities. This property will frequently be used in this thesis.

Property 4.1.2 If the roles of $M$ and $n$ are interchanged, this does not affect the hypergeometric probabilities; i.e.

\[ P\{K = k \mid n, M, N\} = P\{K = k \mid M, n, N\}. \]

The proof of this property is simple. A probabilistic explanation for Property 4.1.2 is given in Davidson and Johnson (1993).

Notice that $P\{K = k \mid n, M, N\}$ is a unimodal function of $k$, see Johnson, Kotz and Kemp (1992). It takes on its maximum for the largest integer that does not exceed $\frac{(M+1)(n+1)}{N+2}$. If $\frac{(M+1)(n+1)}{N+2}$ is an integer, say $c$, then it takes on its maximum for this integer $c$, but also for $c - 1$.

4.1.1 Properties of $\Lambda(n, M, N)$

We introduce the following notation,

\[ \Lambda(n, M, N) = P\{K \le k_0 \mid n, M, N\} = \sum_{k=0}^{k_0} \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}. \tag{4.1.2} \]

This notation suppresses the dependence on $k_0$, because in most of our applications we will not allow the value of $k_0$ to vary. Unless stated otherwise $k_0$ will be considered fixed in the sequel. In fact we will be more interested in the behaviour of $\Lambda$ as a function of $n$, $M$, and $N$. We will discuss some properties of $\Lambda(n, M, N)$ that are especially useful in Chapter 6.
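As a minimal sketch, assuming Python 3.8 or later (for `math.comb`), the probability (4.1.1) and the distribution function (4.1.2) can be evaluated exactly with rational arithmetic. The names `binom`, `hyper_pmf` and `hyper_cdf` are illustrative and do not come from the thesis; direct summation is used here only for clarity, not as the efficient method of Section 4.3.

```python
from fractions import Fraction
from math import comb


def binom(p: int, q: int) -> int:
    """Extended binomial coefficient: zero outside 0 <= q <= p."""
    if p < 0 or q < 0 or q > p:
        return 0
    return comb(p, q)


def hyper_pmf(k: int, n: int, M: int, N: int) -> Fraction:
    """P{K = k | n, M, N} as in (4.1.1), assuming 0 <= n <= N and 0 <= M <= N."""
    return Fraction(binom(M, k) * binom(N - M, n - k), binom(N, n))


def hyper_cdf(k0: int, n: int, M: int, N: int) -> Fraction:
    """Lambda(n, M, N) = P{K <= k0 | n, M, N} as in (4.1.2)."""
    return sum((hyper_pmf(k, n, M, N) for k in range(k0 + 1)), Fraction(0))


if __name__ == "__main__":
    n, M, N = 5, 7, 20
    probs = [hyper_pmf(k, n, M, N) for k in range(n + 1)]
    # The probabilities sum to one and the mean equals n * M / N.
    print(sum(probs), sum(k * p for k, p in enumerate(probs)))
    # Interchanging n and M leaves the probability unchanged (Property 4.1.2).
    print(hyper_pmf(2, n, M, N) == hyper_pmf(2, M, n, N))
```

Because the extended binomial coefficient returns zero outside its usual domain, no explicit check on the support of $K$ is needed, which mirrors the convention adopted in the text.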

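Theorem 4.1.1 The following properties hold for $\Lambda(n, M, N)$:

1. $\Lambda(n, M, N) = \Lambda(M, n, N)$.

2. $\Lambda(n, M, N) = 1$ if and only if $M \le k_0$ or $n \le k_0$.

3. $\Lambda(n, M, N) = 0$ if and only if $M > N - n + k_0$.

4. Let $M \in \{0, \ldots, N-1\}$, then
\[ \Lambda(n, M+1, N) = \Lambda(n, M, N) - \frac{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}}{\binom{N}{n}}, \]
or, equivalently,
\[ \Lambda(n, M+1, N) = \Lambda(n, M, N) - \frac{n}{N}\, P\{K = k_0 \mid n-1, M, N-1\} \quad \text{for } n \in \{1, \ldots, N\}. \]

5. Let $M \in \{0, \ldots, N-1\}$, then $\Lambda(n, M+1, N) \le \Lambda(n, M, N)$. The inequality is strict if and only if $k_0 \le M \le N - n + k_0$ and $n > k_0$.

6. Let $n \in \{0, \ldots, N-1\}$, then $\Lambda(n+1, M, N) \le \Lambda(n, M, N)$. The inequality is strict if and only if $k_0 \le n \le N - M + k_0$ and $M > k_0$.

Proof. Parts 1, 2, and 3 immediately follow from Property 4.1.2, (4.1.2), and the definition of the hypergeometric distribution, respectively. Using Pascal's triangle we obtain

\[ \binom{N}{n} \Lambda(n, M+1, N) = \sum_{k=0}^{k_0} \binom{M+1}{k}\binom{N-M-1}{n-k} \]
\[ = \sum_{k=0}^{k_0} \left( \binom{M}{k} + \binom{M}{k-1} \right) \binom{N-M-1}{n-k} \]
\[ = \sum_{k=0}^{k_0} \binom{M}{k}\binom{N-M-1}{n-k} + \sum_{k=0}^{k_0-1} \binom{M}{k}\binom{N-M-1}{n-k-1} \]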

and hence

\[ \binom{N}{n} \Lambda(n, M, N) = \sum_{k=0}^{k_0} \binom{M}{k}\binom{N-M}{n-k} \]
\[ = \sum_{k=0}^{k_0} \binom{M}{k} \left( \binom{N-M-1}{n-k} + \binom{N-M-1}{n-k-1} \right) \]
\[ = \sum_{k=0}^{k_0} \binom{M}{k}\binom{N-M-1}{n-k} + \sum_{k=0}^{k_0} \binom{M}{k}\binom{N-M-1}{n-k-1} \]
\[ = \binom{N}{n} \Lambda(n, M+1, N) + \binom{M}{k_0}\binom{N-M-1}{n-k_0-1}. \]

Summations with empty index sets are equal to zero by definition. This proves the first result of part 4. Its second result is obvious. Part 5 follows immediately from part 4. Part 6 follows from part 5 by applying the result from part 1.

Theorem 4.1.1, part 5 shows that the probability of accepting the population is decreasing in $M$. Part 6 shows that this probability also decreases if a larger sample is taken. These facts are in accordance with intuition.

4.1.2 Properties of $\lambda(n, M, N)$

This subsection will focus on the quotient of $\Lambda(n, M+1, N)$ and $\Lambda(n, M, N)$, which plays a key role in proving some of the properties in Chapter 5. This quotient is defined by

\[ \lambda(n, M, N) = \begin{cases} \dfrac{\Lambda(n, M+1, N)}{\Lambda(n, M, N)} > 0 & \text{if } M < N - n + k_0, \\[1ex] 0 & \text{if } N - n + k_0 \le M \le N - 1. \end{cases} \tag{4.1.3} \]

According to Theorem 4.1.1, part 3 this ratio is well-defined for $M \le N - n + k_0$, and in the special case $M = N - n + k_0$ it is equal to zero. Obviously $0 \le \lambda \le 1$, according to Theorem 4.1.1, part 4. A number of properties of $\lambda$ are collected in the following theorem.

Theorem 4.1.2 The following properties hold for $\lambda(n, M, N) \in [0, 1]$.

1. $\lambda(n, M, N) = 1$ if and only if $M < k_0$ or $n \le k_0$.

2. $\lambda(n, M, N) = 0$ if and only if $M \ge N - n + k_0$.

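3. Let $n > k_0$. If $k_0 \le M \le N - n + k_0$, then it can be written as
\[ \lambda(n, M, N) = 1 - \frac{1}{g(n, M, N)}, \quad \text{where} \quad g(n, M, N) = \frac{\binom{N}{n} \Lambda(n, M, N)}{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}} > 0. \]

4. Let $M \in \{0, \ldots, N-1\}$, then $\lambda(n, M, N) \ge \lambda(n, M+1, N)$. The inequality is strict if and only if $\max(0, k_0 - 1) \le M \le N - n + k_0 - 1$ and $n > k_0$.

5. Let $n, M \in \{0, \ldots, N-1\}$, then $\lambda(n, M, N) \ge \lambda(n+1, M, N)$. The inequality is strict if and only if $k_0 \le n \le N - M + k_0 - 1$ and $M > k_0$.

6. If $M \ge k_0$, $n > k_0$ and $N \ge n + M - k_0$, then $\lambda(n, M, N) < \lambda(n, M, N+1)$.

Proof. Part 1 follows from (4.1.2) and Theorem 4.1.1, parts 2 and 5. Part 2 is obvious. Part 3 follows from Theorem 4.1.1, parts 3, 4, and 5. Now we prove part 4. For $n \le k_0$, part 4 follows trivially from part 1. Therefore, we assume $n > k_0$. Using part 3 we derive for $k_0 \le M \le N - n + k_0$ that

\[ g(n, M, N) = \frac{\binom{N}{n} \Lambda(n, M, N)}{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}} = \sum_{k=\max(0,\, n+M-N)}^{k_0} \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}} \]
\[ = \sum_{k=0}^{k_0} \frac{k_0!}{k!}\,(N-M) \prod_{h=k+1}^{k_0} \frac{N-M-n+h}{M-h+1} \prod_{j=k}^{k_0} \frac{1}{n-j}. \tag{4.1.4} \]

Notice that from Theorem 4.1.1, part 5 and the parts 1 and 2 just established it follows that $0 < \lambda(n, M, N) < 1$ for $k_0 \le M \le N - n + k_0 - 1$, and

\[ 1 = \lambda(n, 0, N) = \ldots = \lambda(n, k_0 - 1, N) > \lambda(n, k_0, N) \quad \text{for } k_0 \ge 1, \]

and

\[ \lambda(n, N-n+k_0-1, N) > \lambda(n, N-n+k_0, N) = \ldots = \lambda(n, N-1, N) = 0. \]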

For $k_0 = 0$ there is no $M$ such that $\lambda(n, M, N) = 1$. Now it remains to prove that $\lambda(n, M, N) > \lambda(n, M+1, N)$ or, equivalently, $g(n, M, N) > g(n, M+1, N)$, for $k_0 \le M \le N - n + k_0 - 2$. This follows from (4.1.4), because $g(n, M, N)$ is a decreasing function of $M$ on this interval. This concludes the proof of part 4.

Notice that for $M < k_0$, part 5 follows trivially from part 1. Therefore, we assume $M \ge k_0$. From Theorem 4.1.1, part 6 and the parts 1 and 2 just proved, it follows that $0 < \lambda(n, M, N) < 1$ for $k_0 + 1 \le n \le N - M + k_0 - 1$, and

\[ 1 = \lambda(0, M, N) = \ldots = \lambda(k_0, M, N) > \lambda(k_0 + 1, M, N), \]

and

\[ \lambda(N-M+k_0-1, M, N) > \lambda(N-M+k_0, M, N) = \ldots = \lambda(N-1, M, N) = 0. \]

To complete the proof of part 5 we have to prove that $g(n, M, N) > g(n+1, M, N)$ for $k_0 + 1 \le n \le N - M + k_0 - 2$. This follows from (4.1.4), because $g(n, M, N)$ is a decreasing function of $n$ on this interval. This concludes the proof of part 5.

From (4.1.4) we can see that $g(n, M, N) < g(n, M, N+1)$ and hence $\lambda(n, M, N) < \lambda(n, M, N+1)$ for $M \ge k_0$, $n > k_0$ and $N \ge n + M - k_0$. This concludes the proof of part 6.

Remark 4.1.1 Theorem 4.1.2, parts 4 and 5 imply logconcavity of the cumulative hypergeometric distribution function in the arguments $n$ and $M$ in all possible points, and even strict logconcavity on a relevant subset. Here, logconcavity of a function $f$ on the non-negative integers is defined as

\[ f(x+2)\, f(x) \le [f(x+1)]^2, \quad x = 0, 1, \ldots \]

Strictness occurs if the inequality is strict.

4.2 Confidence sets

The value of $M$ is not known to us, but after taking a random sample of size $n$ we can give a point estimate and construct a confidence interval for $M$. Suppose we observe $k$ items with the characteristic of interest in the sample. The maximum likelihood estimator for $M$ is then given by the largest integer not exceeding $\frac{K(N+1)}{n}$, i.e. $\left\lfloor \frac{K(N+1)}{n} \right\rfloor$. If $\frac{K(N+1)}{n}$ is an integer, then $\frac{K(N+1)}{n} - 1$ and $\frac{K(N+1)}{n}$ both maximize the likelihood. This is not an unbiased estimator. The unbiased

estimator is given by $\frac{KN}{n}$ and an unbiased estimator for its variance is

\[ \frac{N(N-n)}{n-1} \cdot \frac{K}{n} \left( 1 - \frac{K}{n} \right). \]

Only providing point estimators for $M$ will not suffice. To quantify the uncertainty, we would also like to give confidence interval estimators. We prefer to give exact confidence intervals. Here, with exact we mean that we use the underlying hypergeometric distribution and not some approximation of this distribution. Due to the discrete character of the hypergeometric distribution it is possible to construct confidence sets instead of confidence intervals. Although from a practical view we prefer confidence intervals, we cannot exclude the possibility of confidence sets that are not confidence intervals.

If we observe $K = k$, where $k \in \{0, \ldots, n\}$, we would like to find a way to associate to this value of $k$, for $\alpha \in (0, 1)$, a subset of possible values of $M \in \{0, \ldots, N\}$. We call this subset $M_C(k)$, and we want to state that $M_C(K)$ contains the true value of $M$ with probability of at least $1 - \alpha$, or, in symbols,

\[ P\{M_C(K) \ni M \mid M\} \ge 1 - \alpha, \quad \text{for every } M \in \{0, \ldots, N\}. \tag{4.2.1} \]

The quantity $1 - \alpha$ is called the confidence level. The probability in equation (4.2.1) is called the coverage probability for $M$. Due to the discrete character of $M$ it is not possible to exactly attain the nominal confidence level $1 - \alpha$ without using randomized methods (Wright). These methods will always attain the exact nominal confidence level. We will not consider these methods. The methods discussed here are conservative, meaning that the confidence level will be at least $1 - \alpha$.

We have to construct $M_C(K)$, i.e. $M_C(0), \ldots, M_C(n)$, in such a way that (4.2.1) is satisfied for every $M \in \{0, \ldots, N\}$. We first notice that, given the true value of $M$, the probability that $M_C(K)$ contains $M$ is the same as the total probability of observing those values of $k$ for which $M_C(k)$ contains $M$. Let $R(M)$ be the set containing all these values of $k$, i.e. $R(M) = \{k \mid M_C(k) \ni M\}$, then we can rewrite the left-hand side of (4.2.1) in the following way

\[ P\{M_C(K) \ni M \mid M\} = \sum_{k \in R(M)} P\{K = k \mid n, M, N\}. \tag{4.2.2} \]

Remember that $K \sim H(n, M, N)$. Now, suppose we construct for every $M \in \{0, \ldots, N\}$ sets $R^*(M)$ with values of $k$ such that

\[ \sum_{k \in R^*(M)} P\{K = k \mid n, M, N\} \ge 1 - \alpha \]

and let $M_C^*(k)$ be the set of all values of $M$ for which $k \in R^*(M)$. It is obvious from (4.2.2) that by using $M_C(K) = M_C^*(K)$, and also $R(M) = R^*(M)$, we have found a way to define $M_C(K)$ such that equation (4.2.1) is satisfied for every $M \in \{0, \ldots, N\}$. There are various methods to define $R^*(M)$ to construct confidence sets. We will discuss two of these methods here.

4.2.1 Test-method

We call $M_C(k)$ a confidence set. In those cases where the confidence sets $M_C(K)$ actually turn out to be confidence intervals $[M_L(K), M_U(K)]$, we speak of a $100(1-\alpha)\%$ two-sided confidence interval with lower confidence bound $M_L(K)$ and upper confidence bound $M_U(K)$. Since we know that the hypergeometric distribution function is a unimodal function of $k$, we can construct $R(M)$ in the following way. For every $M \in \{0, \ldots, N\}$ the set $R(M)$ contains all values of $k$ for which

\[ P\{K \ge k \mid M\} > \beta \quad \text{and} \quad P\{K \le k \mid M\} > \gamma, \qquad \beta + \gamma = \alpha. \]

First, we will consider the case $\beta = \gamma = \frac{\alpha}{2}$. Note that $\min(R(M))$ and $\max(R(M))$ are non-decreasing functions of $M$. This ensures that the confidence set $M_C(K) = \{M \mid K \in R(M)\}$ will always be a confidence interval. If we observe $K = k$, then the lower and upper confidence interval limits are given by

\[ M_L(k) = \text{smallest integer } M \text{ s.t. } P\{K \ge k\} > \frac{\alpha}{2} \]

and

\[ M_U(k) = \text{largest integer } M \text{ s.t. } P\{K \le k\} > \frac{\alpha}{2}. \]

This method coincides with generating a confidence interval by inverting a family of hypothesis tests for $M$. That is why this method is called the test-method. It also appears to be the same method as described by Katz (1953), Konijn (1973) and Wright (1991).
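The construction of $M_L(k)$ and $M_U(k)$ above, together with the coverage probability (4.2.2), can be sketched as follows. This is a brute-force illustration under the assumption $\beta = \gamma = \alpha/2$, using floating-point tail sums; the function names are hypothetical and no attempt is made at the efficiency discussed in Section 4.3.

```python
from math import comb


def pmf(k, n, M, N):
    """Hypergeometric probability P{K = k | n, M, N}; zero outside the support."""
    if k < 0 or k > M or n - k < 0 or n - k > N - M:
        return 0.0
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def lower_tail(k, n, M, N):
    """P{K <= k | M}."""
    return sum(pmf(i, n, M, N) for i in range(k + 1))


def upper_tail(k, n, M, N):
    """P{K >= k | M}."""
    return sum(pmf(i, n, M, N) for i in range(k, n + 1))


def interval_by_test_method(k, n, N, alpha):
    """Return (M_L, M_U) for observed k with alpha / 2 in each tail."""
    half = alpha / 2
    m_low = min(M for M in range(N + 1) if upper_tail(k, n, M, N) > half)
    m_up = max(M for M in range(N + 1) if lower_tail(k, n, M, N) > half)
    return m_low, m_up


def coverage(M, n, N, alpha):
    """Coverage (4.2.2): total probability of the k whose interval contains M."""
    total = 0.0
    for k in range(n + 1):
        m_low, m_up = interval_by_test_method(k, n, N, alpha)
        if m_low <= M <= m_up:
            total += pmf(k, n, M, N)
    return total


if __name__ == "__main__":
    n, N, alpha = 5, 20, 0.10
    for k in range(n + 1):
        print(k, interval_by_test_method(k, n, N, alpha))
    print(min(coverage(M, n, N, alpha) for M in range(N + 1)))
```

The last line of the demo prints the smallest coverage probability over all $M$, which should not fall below $1 - \alpha$ because the method is conservative.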

Buonaccorsi (1987) showed that this method is always superior to the one described by Cochran (1977), in the sense that this method always delivers confidence intervals that are shorter than the confidence intervals that were suggested by Cochran. Cochran's intervals were the finite population analog of the method by Clopper and Pearson (1934) for the construction of confidence intervals for a binomial fraction.

Also other values of $\beta$ and $\gamma$ could be considered. A very interesting case is the case of $\beta = 0$ and $\gamma = \alpha$. This is the case of only giving an upper confidence bound. Bickel and Doksum (1977) showed that this bound will be uniformly most accurate, because if the inverse test method is used, then the corresponding tests are uniformly most powerful.

4.2.2 Likelihood-method

We could also construct $R(M)$ in the following way. For every $M \in \{0, \ldots, N\}$ we sort the values of $k$ according to the size of the accompanying probabilities. Therefore, $k_{(1)}$ has the largest probability, $k_{(2)}$ has the next largest and so forth. If ties occur between $k_{(i)}$ and $k_{(i+1)}$, then the ordering is not strict. We deal with this issue later. This means that

\[ P\{K = k_{(1)} \mid M\} \ge P\{K = k_{(2)} \mid M\} \ge \ldots \ge P\{K = k_{(n)} \mid M\}. \]

Now, for every $M \in \{0, \ldots, N\}$ we construct $R(M)$ in such a way that it consists of the smallest possible number of elements, say $k(M)$, such that

\[ \sum_{i=1}^{k(M)} P\{K = k_{(i)} \mid M\} \ge 1 - \alpha. \]

Because the elements are selected based on their likelihood, we call the confidence set $M_C(K) = \{M \mid K \in R(M)\}$ obtained in this way a likelihood confidence set. This method was first described by Wendell and Schmee (2001). We will call $\min M_C(K)$ the lower confidence bound $M_L(K)$ and $\max M_C(K)$ the upper confidence bound $M_U(K)$. Using this method it is possible that the confidence sets produced are not confidence intervals, because gaps can occur. A practical solution is to take the interval $[M_L(K), M_U(K)]$. Some theoretical solutions are suggested by Wendell and Schmee. They also show that the occurrence of

these gaps is rare. Using this method ties can occur. Ties occur when

\[ P\{K = k_{(k(M))} \mid M\} = P\{K = k_{(k(M)+1)} \mid M\}. \]

These ties often occur when the hypergeometric distribution is symmetric for lower and upper tail probabilities. Suppose $k_{(k(M))} < k_{(k(M)+1)}$, then if we choose $k_{(k(M))}$ to add to $R(M)$ this means that $M_U(k_{(k(M))})$ is less tight and that $M_L(k_{(k(M)+1)})$ is tighter compared to the choice of $k_{(k(M)+1)}$. Of course this choice has to be made before we start sampling.

Figure 4.1. Comparison of the 90%-confidence intervals of the test-method and the likelihood-method for $n = 5$ and $N = 20$. (Axes: $k$ against $M$; legend: test method, likelihood method.)

Wendell and Schmee also showed by simulation studies that this method performs well in comparison to the test-method. Figure 4.1 gives a comparison of the two methods for a 90%-confidence interval with $n = 5$ and $N = 20$. Notice

that in this case for $k = 1$ and $k = 4$ the confidence intervals are equally long. In all other cases the likelihood-method produces shorter intervals. It is also possible that the test-method produces shorter intervals, but the study of Wendell and Schmee shows that this will not occur very often.

4.2.3 Approximate confidence sets

Instead of using the exact hypergeometric distribution to obtain confidence sets for $M$, in certain cases approximations of this distribution can also be used. We use these approximations to find confidence intervals for $p = \frac{M}{N}$. Of course confidence intervals for $M$ can be obtained by multiplying with the population size $N$. We will describe three approximations, that is the approximation by the binomial distribution, the approximation by the Poisson distribution, and the approximation by the normal distribution.

The question arises when we are allowed to use a certain approximation. Text books give so-called rules of thumb. However, these rules differ among text books, and are almost always given without any quantitative assessment of the quality of such approximations. Therefore, we should not pay too much attention to rules of thumb. Schader and Schmid (1992) showed that two rules of thumb for approximating the binomial distribution by the normal distribution are of dubious quality in numerical accuracy. Leemis and Kishor (1996) investigated rules of thumb for normal and Poisson approximations of the binomial distribution. From their article we can see, especially when we look at it from an auditing point of view (in which the proportions are usually very small), that using rules of thumb without any quantitative assessment of the quality of the approximations should be avoided. Therefore, if possible we should use an exact approach.

We will apply these approximations to the test-method with $\beta = \gamma = \frac{\alpha}{2}$. Therefore, in terms of $p$ our problem focusses on solving the following equations to find the smallest integer value of $N p_L$ such that

\[ P\{K \ge k \mid p = p_L\} = \sum_{i=k}^{n} \frac{\binom{N p_L}{i}\binom{N(1-p_L)}{n-i}}{\binom{N}{n}} > \frac{\alpha}{2}, \]

and the largest integer value of $N p_U$ such that

\[ P\{K \le k \mid p = p_U\} = \sum_{i=0}^{k} \frac{\binom{N p_U}{i}\binom{N(1-p_U)}{n-i}}{\binom{N}{n}} > \frac{\alpha}{2}. \]

Note that $p_L$ and $p_U$ are elements of $\{0, 1/N, 2/N, \ldots, 1\}$. Our $(1-\alpha)$-confidence interval for $p$ becomes $[p_L, p_U]$. In certain cases we can approximate the hypergeometric distribution by another discrete or even continuous distribution.

Binomial approximation

For relatively small values of $p$ and large values of $N$ we can approximate the hypergeometric distribution by the binomial distribution. As a rule of thumb $p < 0.1$ and $N \ge 60$ is sometimes used. Now, $p_L$ and $p_U$ are elements of $[0, 1]$, and we have to solve the following problem. Find $p_L$ and $p_U$ such that

\[ P\{K \ge k \mid p = p_L\} = \sum_{i=k}^{n} \binom{n}{i} p_L^{\,i} (1-p_L)^{n-i} = \frac{\alpha}{2}, \]

and

\[ P\{K \le k \mid p = p_U\} = \sum_{i=0}^{k} \binom{n}{i} p_U^{\,i} (1-p_U)^{n-i} = \frac{\alpha}{2}. \]

This confidence interval is known as the Clopper-Pearson confidence interval for $p$ (Clopper and Pearson, 1934). The following relationship relates the tail of a binomial distribution with the tail of an F-distribution:

\[ \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} = P\left\{ Y \le \frac{(1-p)(k+1)}{p\,(n-k)} \right\}, \qquad Y \sim F(2(n-k),\, 2(k+1)). \]

A proof can be found in Leemis and Kishor (1996). Now, it follows immediately that

\[ p_L = \left[ 1 + \frac{n-k+1}{k}\, F_{1-\frac{\alpha}{2}}\big(2(n-k+1),\, 2k\big) \right]^{-1} \]

and

\[ p_U = \left[ 1 + \frac{n-k}{k+1}\, F_{\frac{\alpha}{2}}\big(2(n-k),\, 2(k+1)\big) \right]^{-1}, \]

where $F_{1-\frac{\alpha}{2}}(\cdot, \cdot)$ and $F_{\frac{\alpha}{2}}(\cdot, \cdot)$ denote the $100(1-\alpha/2)$th and the $100(\alpha/2)$th percentile of the F-distribution. Many statistical software packages provide the percentiles of the F-distribution. For large degrees of freedom numerical problems can occur, in which case the approximate methods could be used.

Vollset (1993) compared thirteen methods that produce two-sided confidence intervals for the binomial proportion. Newcombe (1998) further examined seven of these methods. The Clopper-Pearson method is known to be rather conservative, meaning that the coverage probabilities usually exceed $1 - \alpha$. Very often approximate methods such as adjusted Wald intervals or continuity corrected score intervals are suggested to tackle this problem (e.g. Vollset, 1993; Leemis and Kishor, 1996). Blyth and Still (1983) remark that the Clopper-Pearson method is only an approximation of the exact interval and consider procedures with correct confidence coefficient. These methods give numerical results that are very similar to the approach with the acceptability function of Blaker and Spjøtvoll (2000).

Poisson approximation

For small values of $p$ and extremely large values of $n$ the Poisson approximation can be used. As a rule of thumb $p < 0.01$ and $n \ge 1000$ is sometimes used. Now, $p_L$ and $p_U$ are elements of $[0, 1]$ again, and we have to solve the following problem. Find $p_L$ and $p_U$ such that

\[ P\{K \ge k \mid p = p_L\} = \sum_{i=k}^{\infty} \frac{e^{-n p_L} (n p_L)^i}{i!} = \frac{\alpha}{2}, \]

and

\[ P\{K \le k \mid p = p_U\} = \sum_{i=0}^{k} \frac{e^{-n p_U} (n p_U)^i}{i!} = \frac{\alpha}{2}. \]

The following relationship relates the tail of a Poisson distribution with the tail of a $\chi^2$-distribution:

\[ \sum_{i=0}^{k-1} \frac{e^{-np} (np)^i}{i!} = P\{Y > 2np\}, \qquad Y \sim \chi^2(2k). \]

A proof can be found in Johnson et al. (1992). Now, it follows immediately that

\[ p_L = \frac{1}{2n}\, \chi^2_{\frac{\alpha}{2}}(2k) \]

and

\[ p_U = \frac{1}{2n}\, \chi^2_{1-\frac{\alpha}{2}}\big(2(k+1)\big), \]

where $\chi^2_{\frac{\alpha}{2}}(\cdot)$ and $\chi^2_{1-\frac{\alpha}{2}}(\cdot)$ denote the $100(\alpha/2)$th and the $100(1-\alpha/2)$th percentile of the $\chi^2$-distribution. Also this confidence interval is conservative. It is possible to increase some of the lower endpoints and decrease some of the higher endpoints and still satisfy the coverage requirement. Examples can be found in Crow and Gardner (1959), Casella and Robert (1989), and Kabaila and Byrne (2001).

Normal approximation

We can also use the normal distribution to approximate the hypergeometric distribution. To do so the rule of thumb $np \ge 4$ is sometimes used. We can approximate the hypergeometric distribution by a normal distribution with mean and variance equal to the mean and variance of $K$. Therefore, $p_L$ and $p_U$ are elements of $[0, 1]$ again, and using continuity corrections we have to solve the following problem. Find $p_L$ and $p_U$ such that

\[ P\{K \ge k \mid p = p_L\} = 1 - \Phi\left( \frac{k - 0.5 - n p_L}{\sqrt{n p_L (1-p_L) \frac{N-n}{N-1}}} \right) = \frac{\alpha}{2}, \]

and

\[ P\{K \le k \mid p = p_U\} = \Phi\left( \frac{k + 0.5 - n p_U}{\sqrt{n p_U (1-p_U) \frac{N-n}{N-1}}} \right) = \frac{\alpha}{2}, \]

where $\Phi$ denotes the standard normal distribution function. Solving these equations gives the following confidence interval

\[ [p_L, p_U] = \frac{1}{2(n+u)} \left[ u + (2k \pm 1) \pm \sqrt{u^2 + 2u(2k \pm 1) - \frac{u}{n}(2k \pm 1)^2} \right], \]

where the minus signs give $p_L$ and the plus signs give $p_U$, $u = \frac{N-n}{N-1} Z^2_{1-\frac{\alpha}{2}}$, and $Z_{1-\frac{\alpha}{2}}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. More simplified versions of this approximation are also used.
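For the binomial approximation described earlier in this subsection, the defining equations for $p_L$ and $p_U$ can also be solved numerically instead of through F-percentiles. The following is a minimal sketch using only the Python standard library; bisection is a substitute technique chosen here for self-containment, not the percentile formulas of the text, and the names `binom_lower_tail`, `solve_decreasing` and `clopper_pearson` are illustrative.

```python
from math import comb


def binom_lower_tail(k, n, p):
    """P{K <= k} for K ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=200):
    """Bisection for f(x) = target when f is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha):
    """Two-sided binomial bounds (p_L, p_U) with alpha / 2 in each tail."""
    half = alpha / 2
    # P{K >= k | p} = alpha / 2 is equivalent to P{K <= k - 1 | p} = 1 - alpha / 2.
    p_low = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_lower_tail(k - 1, n, p), 1 - half)
    # P{K <= k | p} decreases in p, so the upper bound is a plain root search.
    p_up = 1.0 if k == n else solve_decreasing(
        lambda p: binom_lower_tail(k, n, p), half)
    return p_low, p_up


if __name__ == "__main__":
    print(clopper_pearson(3, 20, 0.05))
```

The edge cases $k = 0$ and $k = n$ are handled by setting the bound to 0 or 1, since the corresponding tail equation has no interior solution there. Multiplying the resulting bounds by $N$ gives approximate bounds for $M$, as noted at the start of this subsection.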

Ling and Pratt (1984) compared several normal approximations for the hypergeometric distribution. They show that especially the so-called Peizer approximations turn out to be very accurate. These complicated approximations originate from an unpublished paper by Peizer. However, these approximations are not invertible in closed form. Molenaar (1973) gave two relatively simpler normal approximations that are invertible in closed form, but still give very complicated solutions. These approximations will probably give more accurate bounds than the method described above.

A crude approximation can be obtained by using the approximate normality of $\hat{p}$ with mean equal to the unbiased estimator for $p$, i.e. $\frac{K}{n}$, and variance equal to the unbiased estimator for the variance of this estimator, i.e.

\[ \frac{N-n}{N(n-1)} \cdot \frac{K}{n} \left( 1 - \frac{K}{n} \right). \]

If we also correct for continuity, then we find the following confidence interval

\[ [p_L, p_U] = \left[ \frac{k}{n} \pm \left( Z_{1-\frac{\alpha}{2}} \sqrt{ \frac{N-n}{N(n-1)} \cdot \frac{k}{n} \left( 1 - \frac{k}{n} \right) } + \frac{1}{2n} \right) \right]. \]

4.3 Computing the hypergeometric distribution

Theorem 4.1.1, part 4 can be used to find some recursive properties that we will use in calculating the hypergeometric distribution. It shows that we can compute $\Lambda(n, M, N)$ from $\Lambda(n, M+1, N)$, by using the hypergeometric probability $P\{K = k_0 \mid n-1, M, N-1\}$. But suppose that we already calculated $\Lambda(n, M+1, N)$ from $\Lambda(n, M+2, N)$, then we can use this step to facilitate the computation of $P\{K = k_0 \mid n-1, M, N-1\}$. Property 4.3.1 gives a few examples of this.

Property 4.3.1 The following recursive properties facilitate the computation of the hypergeometric distribution.

1. If $k_0 \le M \le N - n + k_0 - 1$ and $k_0 + 1 \le n \le N - 1$, then
\[ \Lambda(n, M, N) = \Lambda(n, M+1, N) + C_1(n, M, N) \]

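with
\[ C_1(n, M, N) = \frac{M - k_0 + 1}{M + 1} \cdot \frac{N - M - 1}{N - M - n + k_0}\, C_1(n, M+1, N). \]
If $k_0 + 1 \le n \le N - 1$, then
\[ C_1(n, N-n+k_0, N) = \frac{n!\,(N-n+k_0)!}{k_0!\, N!}. \]

2. If $k_0 + 1 \le M \le N - n + k_0 + 1$ and $k_0 + 2 \le n \le N$, then
\[ \Lambda(n, M, N) = \Lambda(n-1, M, N) - C_2(n, M, N) \]
with
\[ C_2(n, M, N) = \frac{N - M}{N - n + 1} \cdot \frac{M - k_0}{n - k_0 - 1}\, C_1(n-1, M, N). \]

3. If $k_0 \le M \le N - n + k_0$ and $k_0 + 1 \le n \le N$, then
\[ \Lambda(n, M, N) = \Lambda(n, M+1, N) + C_1(n, M, N) \]
with
\[ C_1(n, M, N) = \frac{n}{M + 1}\, C_2(n, M+1, N). \]

4. If $k_0 + 1 \le M \le N - n + k_0$ and $k_0 \le n \le N - 1$, then
\[ \Lambda(n, M, N) = \Lambda(n+1, M-1, N) + C_3(n, M, N) \]
with
\[ C_3(n, M, N) = \frac{M - n - 1}{n + 1}\, C_1(n+1, M-1, N). \]

Proof. First we prove part 1. Using Theorem 4.1.1, part 4 we find
\[ \Lambda(n, M, N) = \Lambda(n, M+1, N) + C_1(n, M, N) \quad \text{with} \quad C_1(n, M, N) = \frac{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}}{\binom{N}{n}}, \]
and
\[ \Lambda(n, M+1, N) = \Lambda(n, M+2, N) + C_1(n, M+1, N) \]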

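with
\[ C_1(n, M+1, N) = \frac{\binom{M+1}{k_0}\binom{N-M-2}{n-k_0-1}}{\binom{N}{n}}. \]
From Theorem 4.1.1, part 4 we notice that $C_1(n, M, N) > 0$ and $C_1(n, M+1, N) > 0$ if $k_0 \le M \le N - n + k_0 - 1$ and $k_0 + 1 \le n \le N - 1$. Combining the expressions for $C_1(n, M, N)$ and $C_1(n, M+1, N)$ gives
\[ C_1(n, M, N) = \frac{M - k_0 + 1}{M + 1} \cdot \frac{N - M - 1}{N - M - n + k_0}\, C_1(n, M+1, N). \]
For $M = N - n + k_0$ and $k_0 + 1 \le n \le N - 1$ we find
\[ C_1(n, N-n+k_0, N) = \Lambda(n, N-n+k_0, N) - \Lambda(n, N-n+k_0+1, N) = \Lambda(n, N-n+k_0, N) = \frac{n!\,(N-n+k_0)!}{k_0!\, N!}. \]

In proving part 2 we again use Theorem 4.1.1, part 4 in combination with part 1. Using this theorem we find
\[ \Lambda(n, M, N) = \Lambda(n-1, M, N) - C_2(n, M, N) \quad \text{with} \quad C_2(n, M, N) = \frac{\binom{n-1}{k_0}\binom{N-n}{M-k_0-1}}{\binom{N}{M}}, \]
and
\[ \Lambda(n-1, M, N) = \Lambda(n-1, M+1, N) + C_1(n-1, M, N) \quad \text{with} \quad C_1(n-1, M, N) = \frac{\binom{M}{k_0}\binom{N-M-1}{n-k_0-2}}{\binom{N}{n-1}}. \]
Observe that $C_2(n, M, N) > 0$ and $C_1(n-1, M, N) > 0$ if $k_0 + 1 \le M \le N - n + k_0 + 1$ and $k_0 + 2 \le n \le N$. Combining the expressions for $C_2(n, M, N)$ and $C_1(n-1, M, N)$ gives
\[ C_2(n, M, N) = \frac{N - M}{N - n + 1} \cdot \frac{M - k_0}{n - k_0 - 1}\, C_1(n-1, M, N). \]

To prove part 3 we use the previous results
\[ \Lambda(n, M, N) = \Lambda(n, M+1, N) + C_1(n, M, N) \]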

with
\[ C_1(n, M, N) = \frac{\binom{M}{k_0}\binom{N-M-1}{n-k_0-1}}{\binom{N}{n}}, \]
and
\[ \Lambda(n, M+1, N) = \Lambda(n-1, M+1, N) - C_2(n, M+1, N) \quad \text{with} \quad C_2(n, M+1, N) = \frac{\binom{n-1}{k_0}\binom{N-n}{M-k_0}}{\binom{N}{M+1}}. \]
Note that $C_1(n, M, N) > 0$ and $C_2(n, M+1, N) > 0$ if $k_0 \le M \le N - n + k_0$ and $k_0 + 1 \le n \le N$. Combining the expressions of $C_1(n, M, N)$ and $C_2(n, M+1, N)$ gives
\[ C_1(n, M, N) = \frac{n}{M + 1}\, C_2(n, M+1, N). \]

In proving part 4 we use Theorem 4.1.1, part 4 in combination with part 1 and find
\[ \Lambda(n, M, N) = \Lambda(n+1, M, N) + C_2(n+1, M, N) \quad \text{with} \quad C_2(n+1, M, N) = \frac{\binom{n}{k_0}\binom{N-n-1}{M-k_0-1}}{\binom{N}{M}}. \]
We again use Theorem 4.1.1, part 4 to find
\[ \Lambda(n+1, M, N) = \Lambda(n+1, M-1, N) - C_1(n+1, M-1, N) \quad \text{with} \quad C_1(n+1, M-1, N) = \frac{\binom{M-1}{k_0}\binom{N-M}{n-k_0}}{\binom{N}{n+1}}. \]
Note that $C_1(n+1, M-1, N) > 0$ and $C_2(n+1, M, N) > 0$ if $k_0 + 1 \le M \le N - n + k_0$ and $k_0 \le n \le N - 1$. Combining the expressions of $C_1(n+1, M-1, N)$ and $C_2(n+1, M, N)$ gives
\[ C_2(n+1, M, N) = \frac{M}{n + 1}\, C_1(n+1, M-1, N). \]

Using the previous results we find
\[ \Lambda(n, M, N) = \Lambda(n+1, M, N) + C_2(n+1, M, N) \]
\[ = \Lambda(n+1, M, N) + \frac{M}{n+1}\, C_1(n+1, M-1, N) \]
\[ = \Lambda(n+1, M-1, N) - C_1(n+1, M-1, N) + \frac{M}{n+1}\, C_1(n+1, M-1, N) \]
\[ = \Lambda(n+1, M-1, N) + \frac{M - n - 1}{n + 1}\, C_1(n+1, M-1, N). \]

Table 4.1. The values of $\Lambda(n, M, 8)$ for $k_0 = 2$ (rows indexed by $M$, columns by $n$).

Table 4.1 gives the values of $\Lambda(n, M, 8)$ for all possible combinations of $n$ and $M$ with $k_0 = 2$. This table gives an illustration of Property 4.3.1. First we can use Theorem 4.1.1, parts 2 and 3. This gives us $\Lambda(n, M, 8) = 1$ if $n \le 2$ or $M \le 2$, and $\Lambda(n, M, 8) = 0$ for all combinations of $n$ and $M$ for which $n + M > 10$. We start computing this table with $\Lambda(3, 7, 8)$. Using Property 4.3.1, part 1 it immediately follows that
\[ \Lambda(3, 7, 8) = C_1(3, 7, 8) = \frac{3!\,(8-3+2)!}{2!\, 8!} = 3/8 = 0.375, \]
note that $\Lambda(3, 8, 8) = 0$. Again by using Property 4.3.1, part 1 we can calculate

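$\Lambda(3, 6, 8)$:
\[ \Lambda(3, 6, 8) = \Lambda(3, 7, 8) + C_1(3, 6, 8) = 3/8 + \frac{6-2+1}{6+1} \cdot \frac{8-6-1}{8-6-3+2} \cdot 3/8 = 3/8 + 5/7 \cdot 3/8 = 3/8 + 15/56 = 9/14 \approx 0.643. \]
We can repeat this procedure until we have found $\Lambda(3, 3, 8)$ and by then we have found $\Lambda(3, M, 8)$ for all possible values of $M$. We can calculate $\Lambda(4, 6, 8)$, with $\Lambda(4, M, 8) = 0$ for $M > 6$, by using Property 4.3.1, part 2:
\[ \Lambda(4, 6, 8) = \Lambda(3, 6, 8) - \frac{8-6}{8-4+1} \cdot \frac{6-2}{4-2-1}\, C_1(3, 6, 8) = 9/14 - 2/5 \cdot 4 \cdot 15/56 = 9/14 - 3/7 = 3/14 \approx 0.214. \]
Since $\Lambda(4, 7, 8) = 0$, it follows that $C_1(4, 6, 8) = 3/14$. Now we can apply Property 4.3.1, part 1 to find the remaining values of $\Lambda(4, M, 8)$. By repeating the procedure above the table can be completed.

Sometimes we have to use the terms of $\Lambda$ to find a recursive expression, for instance if we would like to calculate $\Lambda(n, M, N)$ from $\Lambda(n, M, N-1)$ or from $\Lambda(n, M-1, N-1)$. We introduce the following notation. We write $P(n, M, N)$ as a $(k_0+1)$-vector, with elements
\[ P_j(n, M, N) = P\{K = j\} = \frac{\binom{M}{j}\binom{N-M}{n-j}}{\binom{N}{n}}, \quad j = 0, \ldots, k_0, \]
and $\iota = (1, \ldots, 1)^{\top}$ a $(k_0+1)$-vector. Now, it follows that
\[ \Lambda(n, M, N) = \iota^{\top} P(n, M, N). \tag{4.3.1} \]
How we compute the probabilities $P_j(n, M, N)$ from $P_j(n, M, N-1)$ will be shown in the following property.

Property 4.3.2 If $M \ge k_0$, $n \ge k_0$ and $N > n$, then for $j = 0, \ldots, k_0$
\[ P_j(n, M, N) = \begin{cases} 0 & \text{if } j < n + M - N, \\[0.5ex] \dfrac{j+1}{N}\, P_{j+1}(n, M, N-1) & \text{if } j = n + M - N < k_0, \\[1.5ex] \dbinom{M}{j} \Big/ \dbinom{N}{n} & \text{if } j = n + M - N = k_0, \\[1.5ex] \dfrac{(N-n)(N-M)}{N\,(N-M-n+j)}\, P_j(n, M, N-1) & \text{if } j > n + M - N. \end{cases} \]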

Proof. The cases $j < n + M - N$, $j > n + M - N$ and $j = k_0 = n + M - N$ follow immediately from the definition of the hypergeometric probability. Note that if $M \ge k_0$, $n \ge k_0$ and $N > n$, then $P_j(n, M, N-1) > 0$ implies that $P_j(n, M, N) > 0$. For $j = n + M - N < k_0$ the probability $P_j(n, M, N-1)$ equals zero, but the probability $P_{j+1}(n, M, N-1)$ does have a positive value. It is not difficult to see that for $j = n + M - N$
\[ P_j(n, M, N) = \frac{\binom{M}{j}}{\binom{N}{n}} = \frac{j+1}{N} \cdot \frac{\binom{M}{j+1}}{\binom{N-1}{n}} = \frac{j+1}{N}\, P_{j+1}(n, M, N-1). \]

Notice that once $n + M - N \le 0$, all elements of $P(n, M, N)$ are positive. We can find a similar property if we would like to compute the probability $P_j(n, M, N)$ from the probability $P_j(n, M-1, N-1)$.

Property 4.3.3 If $M > k_0$, $n \ge k_0$ and $N > n$, then for $j = 0, \ldots, k_0$
\[ P_j(n, M, N) = \begin{cases} 0 & \text{if } j < n + M - N, \\[0.5ex] \dfrac{M\,(N-n)}{N\,(M-j)}\, P_j(n, M-1, N-1) & \text{if } j \ge n + M - N. \end{cases} \]

Proof. This follows immediately from the definition of the hypergeometric probability. If $M > k_0$, $n \ge k_0$ and $N > n$, then $P_j(n, M-1, N-1) > 0$ implies that $P_j(n, M, N) > 0$.

The properties we derived here will be essential in the development of the algorithms that we will describe in Chapters 5 and 6. These properties enable the algorithms to be efficient and accurate.
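The recursion of Property 4.3.1, part 1 can be sketched in a few lines. The code below fills one column of the table of $\Lambda(n, M, N)$, for fixed $n$ and all $M$, starting from the closed-form value at $M = N - n + k_0$ and stepping downwards in $M$. It is an illustration under the assumption $k_0 < n < N$, written with exact fractions so that it can be compared with the worked values $3/8$, $9/14$ and $3/14$; the function names are hypothetical and this is not the algorithm of the later chapters.

```python
from fractions import Fraction
from math import comb, factorial


def cdf_direct(k0, n, M, N):
    """Lambda(n, M, N) by direct summation of (4.1.2); used only as a check."""
    total = sum(comb(M, k) * comb(N - M, n - k)
                for k in range(max(0, n + M - N), min(k0, n, M) + 1))
    return Fraction(total, comb(N, n))


def cdf_column(k0, n, N):
    """Lambda(n, M, N) for M = 0..N via Property 4.3.1, part 1."""
    if not k0 < n < N:
        raise ValueError("this sketch assumes k0 < n < N")
    lam = [Fraction(0)] * (N + 1)          # Lambda = 0 for M > N - n + k0
    for M in range(k0 + 1):
        lam[M] = Fraction(1)               # Lambda = 1 for M <= k0
    top = N - n + k0
    # Starting value C_1(n, N - n + k0, N) = n! (N - n + k0)! / (k0! N!).
    c = Fraction(factorial(n) * factorial(top), factorial(k0) * factorial(N))
    lam[top] = c
    for M in range(top - 1, k0, -1):
        # C_1(n, M, N) obtained from C_1(n, M + 1, N).
        c *= Fraction(M - k0 + 1, M + 1) * Fraction(N - M - 1, N - M - n + k0)
        lam[M] = lam[M + 1] + c
    return lam


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, [str(x) for x in cdf_column(2, n, 8)])
```

Each step costs a constant number of multiplications, whereas direct summation recomputes up to $k_0 + 1$ products of binomial coefficients for every entry.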


More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Stat 421-SP2012 Interval Estimation Section

Stat 421-SP2012 Interval Estimation Section Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

Access to the published version may require journal subscription. Published with permission from: Elsevier.

Access to the published version may require journal subscription. Published with permission from: Elsevier. This is a author produced versio of a paper published i Statistics ad Probability Letters. This paper has bee peer-reviewed, it does ot iclude the joural pagiatio. Citatio for the published paper: Forkma,

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Ma 530 Introduction to Power Series

Ma 530 Introduction to Power Series Ma 530 Itroductio to Power Series Please ote that there is material o power series at Visual Calculus. Some of this material was used as part of the presetatio of the topics that follow. What is a Power

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

MATH/STAT 352: Lecture 15

MATH/STAT 352: Lecture 15 MATH/STAT 352: Lecture 15 Sectios 5.2 ad 5.3. Large sample CI for a proportio ad small sample CI for a mea. 1 5.2: Cofidece Iterval for a Proportio Estimatig proportio of successes i a biomial experimet

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

CHAPTER 10 INFINITE SEQUENCES AND SERIES

CHAPTER 10 INFINITE SEQUENCES AND SERIES CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

GUIDELINES ON REPRESENTATIVE SAMPLING

GUIDELINES ON REPRESENTATIVE SAMPLING DRUGS WORKING GROUP VALIDATION OF THE GUIDELINES ON REPRESENTATIVE SAMPLING DOCUMENT TYPE : REF. CODE: ISSUE NO: ISSUE DATE: VALIDATION REPORT DWG-SGL-001 002 08 DECEMBER 2012 Ref code: DWG-SGL-001 Issue

More information

ENGI Series Page 6-01

ENGI Series Page 6-01 ENGI 3425 6 Series Page 6-01 6. Series Cotets: 6.01 Sequeces; geeral term, limits, covergece 6.02 Series; summatio otatio, covergece, divergece test 6.03 Stadard Series; telescopig series, geometric series,

More information

ON POINTWISE BINOMIAL APPROXIMATION

ON POINTWISE BINOMIAL APPROXIMATION Iteratioal Joural of Pure ad Applied Mathematics Volume 71 No. 1 2011, 57-66 ON POINTWISE BINOMIAL APPROXIMATION BY w-functions K. Teerapabolar 1, P. Wogkasem 2 Departmet of Mathematics Faculty of Sciece

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Binomial Distribution

Binomial Distribution 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible

More information

Measure and Measurable Functions

Measure and Measurable Functions 3 Measure ad Measurable Fuctios 3.1 Measure o a Arbitrary σ-algebra Recall from Chapter 2 that the set M of all Lebesgue measurable sets has the followig properties: R M, E M implies E c M, E M for N implies

More information

Output Analysis and Run-Length Control

Output Analysis and Run-Length Control IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%

More information

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M. MATH1005 Statistics Lecture 24 M. Stewart School of Mathematics ad Statistics Uiversity of Sydey Outlie Cofidece itervals summary Coservative ad approximate cofidece itervals for a biomial p The aïve iterval

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense,

3. Z Transform. Recall that the Fourier transform (FT) of a DT signal xn [ ] is ( ) [ ] = In order for the FT to exist in the finite magnitude sense, 3. Z Trasform Referece: Etire Chapter 3 of text. Recall that the Fourier trasform (FT) of a DT sigal x [ ] is ω ( ) [ ] X e = j jω k = xe I order for the FT to exist i the fiite magitude sese, S = x [

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

7 Sequences of real numbers

7 Sequences of real numbers 40 7 Sequeces of real umbers 7. Defiitios ad examples Defiitio 7... A sequece of real umbers is a real fuctio whose domai is the set N of atural umbers. Let s : N R be a sequece. The the values of s are

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

Week 5-6: The Binomial Coefficients

Week 5-6: The Binomial Coefficients Wee 5-6: The Biomial Coefficiets March 6, 2018 1 Pascal Formula Theorem 11 (Pascal s Formula For itegers ad such that 1, ( ( ( 1 1 + 1 The umbers ( 2 ( 1 2 ( 2 are triagle umbers, that is, The petago umbers

More information

Bertrand s Postulate

Bertrand s Postulate Bertrad s Postulate Lola Thompso Ross Program July 3, 2009 Lola Thompso (Ross Program Bertrad s Postulate July 3, 2009 1 / 33 Bertrad s Postulate I ve said it oce ad I ll say it agai: There s always a

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

A LARGER SAMPLE SIZE IS NOT ALWAYS BETTER!!!

A LARGER SAMPLE SIZE IS NOT ALWAYS BETTER!!! A LARGER SAMLE SIZE IS NOT ALWAYS BETTER!!! Nagaraj K. Neerchal Departmet of Mathematics ad Statistics Uiversity of Marylad Baltimore Couty, Baltimore, MD 2250 Herbert Lacayo ad Barry D. Nussbaum Uited

More information

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes. Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1 Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight) Tests of Hypotheses Based o a Sigle Sample Devore Chapter Eight MATH-252-01: Probability ad Statistics II Sprig 2018 Cotets 1 Hypothesis Tests illustrated with z-tests 1 1.1 Overview of Hypothesis Testig..........

More information

Monte Carlo Integration

Monte Carlo Integration Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS AAEC/ECON 5126 FINAL EXAM: SOLUTIONS SPRING 2015 / INSTRUCTOR: KLAUS MOELTNER This exam is ope-book, ope-otes, but please work strictly o your ow. Please make sure your ame is o every sheet you re hadig

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS 18th Feb, 016 Defiitio (Lipschitz fuctio). A fuctio f : R R is said to be Lipschitz if there exists a positive real umber c such that for ay x, y i the domai

More information

INEQUALITIES BJORN POONEN

INEQUALITIES BJORN POONEN INEQUALITIES BJORN POONEN 1 The AM-GM iequality The most basic arithmetic mea-geometric mea (AM-GM) iequality states simply that if x ad y are oegative real umbers, the (x + y)/2 xy, with equality if ad

More information

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes.

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes. Term Test October 3, 003 Name Math 56 Studet Number Directio: This test is worth 50 poits. You are required to complete this test withi 50 miutes. I order to receive full credit, aswer each problem completely

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

Application to Random Graphs

Application to Random Graphs A Applicatio to Radom Graphs Brachig processes have a umber of iterestig ad importat applicatios. We shall cosider oe of the most famous of them, the Erdős-Réyi radom graph theory. 1 Defiitio A.1. Let

More information

x = Pr ( X (n) βx ) =

x = Pr ( X (n) βx ) = Exercise 93 / page 45 The desity of a variable X i i 1 is fx α α a For α kow let say equal to α α > fx α α x α Pr X i x < x < Usig a Pivotal Quatity: x α 1 < x < α > x α 1 ad We solve i a similar way as

More information

Mathematical Induction

Mathematical Induction Mathematical Iductio Itroductio Mathematical iductio, or just iductio, is a proof techique. Suppose that for every atural umber, P() is a statemet. We wish to show that all statemets P() are true. I a

More information

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.

TMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences. Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

MATHEMATICS. The assessment objectives of the Compulsory Part are to test the candidates :

MATHEMATICS. The assessment objectives of the Compulsory Part are to test the candidates : MATHEMATICS INTRODUCTION The public assessmet of this subject is based o the Curriculum ad Assessmet Guide (Secodary 4 6) Mathematics joitly prepared by the Curriculum Developmet Coucil ad the Hog Kog

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam will cover.-.9. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for you

More information

Machine Learning for Data Science (CS 4786)

Machine Learning for Data Science (CS 4786) Machie Learig for Data Sciece CS 4786) Lecture & 3: Pricipal Compoet Aalysis The text i black outlies high level ideas. The text i blue provides simple mathematical details to derive or get to the algorithm

More information

The Growth of Functions. Theoretical Supplement

The Growth of Functions. Theoretical Supplement The Growth of Fuctios Theoretical Supplemet The Triagle Iequality The triagle iequality is a algebraic tool that is ofte useful i maipulatig absolute values of fuctios. The triagle iequality says that

More information

Math 257: Finite difference methods

Math 257: Finite difference methods Math 257: Fiite differece methods 1 Fiite Differeces Remember the defiitio of a derivative f f(x + ) f(x) (x) = lim 0 Also recall Taylor s formula: (1) f(x + ) = f(x) + f (x) + 2 f (x) + 3 f (3) (x) +...

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information

Solutions to Tutorial 3 (Week 4)

Solutions to Tutorial 3 (Week 4) The Uiversity of Sydey School of Mathematics ad Statistics Solutios to Tutorial Week 4 MATH2962: Real ad Complex Aalysis Advaced Semester 1, 2017 Web Page: http://www.maths.usyd.edu.au/u/ug/im/math2962/

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam 4 will cover.-., 0. ad 0.. Note that eve though. was tested i exam, questios from that sectios may also be o this exam. For practice problems o., refer to the last review. This

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Lesson 10: Limits and Continuity

Lesson 10: Limits and Continuity www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals

More information

Lecture 2. The Lovász Local Lemma

Lecture 2. The Lovász Local Lemma Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio

More information

Chi-Squared Tests Math 6070, Spring 2006

Chi-Squared Tests Math 6070, Spring 2006 Chi-Squared Tests Math 6070, Sprig 2006 Davar Khoshevisa Uiversity of Utah February XXX, 2006 Cotets MLE for Goodess-of Fit 2 2 The Multiomial Distributio 3 3 Applicatio to Goodess-of-Fit 6 3 Testig for

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3 Itroductio to Ecoometrics (3 rd Updated Editio) by James H. Stock ad Mark W. Watso Solutios to Odd- Numbered Ed- of- Chapter Exercises: Chapter 3 (This versio August 17, 014) 015 Pearso Educatio, Ic. Stock/Watso

More information

Exercise 4.3 Use the Continuity Theorem to prove the Cramér-Wold Theorem, Theorem. (1) φ a X(1).

Exercise 4.3 Use the Continuity Theorem to prove the Cramér-Wold Theorem, Theorem. (1) φ a X(1). Assigmet 7 Exercise 4.3 Use the Cotiuity Theorem to prove the Cramér-Wold Theorem, Theorem 4.12. Hit: a X d a X implies that φ a X (1) φ a X(1). Sketch of solutio: As we poited out i class, the oly tricky

More information