Statistical Intervals Based on a Single Sample (Devore Chapter Seven)

Statitical Iterval Baed o a Sigle Sample Devore Chapter Seve MATH-252-01: robability ad Statitic II Sprig 2018 Cotet 0 Itroductio 1 0.1 Motivatio...................... 1 0.2 Remider of Notatio................ 2 1 Cofidece Iterval Uig Stadard Normal ercetile 2 1.1 Cofidece Iterval for the Mea of a Normal opulatio with Kow Variace......... 2 1.2 Aide: Bayeia Statitic ad robability Theory 4 1.3 Sample Size Determiatio............. 4 1.4 Oe-ided Iterval................. 5 1.5 Geeral Formalim................. 5 1.6 Cofidece Iterval for the opulatio roportio 6 2 Cofidece Iterval Whe the Variace i Etimated 6 2.1 Large-Sample Iterval............... 6 2.2 The t-ditributio................. 7 2.3 redictio Iterval................. 8 Copyright 2018, Joh T. Whela, ad all that 2.4 Tolerace Iterval................. 9 3 Cofidece Iterval for the Variace of a Normal opulatio 10 4 Summary of Cofidece Iterval 12 4.1 For Normal Radom Sample........... 12 4.2 Approximate Cofidece Iterval for Large Sample.......................... 12 Tueday 23 Jauary 2018 0 Itroductio 0.1 Motivatio We cotiue our tudy of iferetial tatitic, which allow u to defie procedure where the cotet of a radom ample tell u omethig about the ukow parameter of the uderlyig probability ditributio. For example, if we kow that a radom ample i draw from a ormal ditributio with a kow tadard deviatio ad a ukow mea µ, we ca make a 1

rule that ay for thi radom ample, cotruct a iterval of a pecified width cetered o the ample mea, ad thi iterval will have a 95% probability of cotaiig the true populatio mea µ. Thi i kow a a cofidece iterval for µ. A a pecific example where thi might be ueful, whe GS wa itroduced back i the 20th cetury, the govermet added ome radom error ito the reported poitio, for atioal ecurity purpoe. So for example, if you put a GS receiver o top of your houe ad attempted to meaure the elevatio above ea level, the GS would retur the true value plu a radom error with a tadard deviatio of about 50 meter. Thi radom error varied from day to day, o you could get a more accurate meauremet by averagig together multiple meauremet. If you took, ay, 25 meauremet, ad ued their ample mea to cotruct a 95% cofidece iterval o your elevatio, there would be a 95% chace your iterval icluded the true elevatio, a 2.5% chace the whole iterval wa above the true elevatio, ad a 2.5% chace that it wa below. 0.2 Remider of Notatio If Z i a tadard ormal radom variable, o Z > z α = α 0.1 1 α Z z α = Φz α 0.2 Note that becaue the tadard ormal ditributio i ymmetric, Φ z α = 1 Φz α = α 0.3 1 Cofidece Iterval Uig Stadard Normal ercetile 1.1 Cofidece Iterval for the Mea of a Normal opulatio with Kow Variace Suppoe {X i } i a radom ample of ize draw from a ditributio of mea µ ad tadard deviatio. I.e., for each i, EX i = µ ad V X i = 2. You howed i MATH 251 that the ample mea X = 1 X i 1.1 ha mea EX = µ ad variace V X = 2 /. You alo aw that it i ormally ditributed uder the followig coditio: 2

1. Exactly, if the uderlyig probability ditributio from which each X i i draw i a ormal ditributio. 2. Approximately, for large, by the Cetral Limit Theorem. If oe of thoe coditio hold, we ca defie the approximately, i the latter cae tadard ormal radom variable Z = X µ / 1.2 Give a value α betwee 0 ad 1, we ca cotruct a iterval that Z ha a probability 1 α of ladig i. If we put the boudarie of the iterval at ±z α/2 we fid Z < z α/2 = Φ z α/2 = α/2 1.3a z α/2 < Z < z α/2 = Φz α/2 Φ z α/2 = 1 α α 2 2 = 1 α 1.3b z α/2 < Z = 1 Φz α/2 = 1 1 α = α/2 1.3c 2 For example, takig α = 0.05, ice Φ1.96 0.9750 0.975 ad therefore z 0.025 1.96, Z ha a 2.5% chace of lyig below 1.96 Z ha a 95% chace of lyig betwee 1.96 ad 1.96 Z ha a 2.5% chace of lyig above 1.96 Thi ha a iteretig applicatio to the ituatio where we kow but ot µ. We already kow that X ca be ued to etimate µ, but ow we ca cotruct a iterval which ha a good chace of cotaiig µ a follow: 1 α z α/2 < Z < z α/2 z α/2 < X µ / < z α/2 z α/2 < X µ < z α/2 X z α/2 < µ < X + z α/2 X + z α/2 > µ > X z α/2 X z α/2 < µ < X + z α/2 1.4 z α/2 The iterpretatio of thi i a bit tricky: it temptig to look at X < µ < X + z α/2 = 1 α 1.5 or, pecifically X 1.96 < µ < X + 1.96 0.95 1.6 3

ad thik that if I take a ample, fid x, ad cotruct the iterval x 1.96, x + 1.96, that I ca ay there a 95% probability that µ lie i that iterval. Thi i wrog, though, ice we cotructed the iterval i a frequetit picture i which X i a radom variable, ot µ. From thi perpective, the value of µ, while it may ot be kow to u, i ot radom, ad we re really ot talkig about the probability for µ to take o a particular value. Rather, we re ayig that whatever the true, ukow value of µ i, if we collect a ample {x i } of ize ad cotruct a iterval x 1.96, x + 1.96, it ha a 95% chace of bracketig that true value. 1.2 Aide: Bayeia Statitic ad robability Theory A a aide, thi i ot the oly poible way to do thig. There are ituatio e.g., obervatioal rather tha experimetal ciece where you ca t ecearily repeat the collectio of a ample ad take aother hot at your cofidece iterval. I that cae, there really i oe et of value {x i } ad you would like to tate omethig about your degree of belief i differet poible value of µ. We ca till cat all of thi i the laguage of radom variable if we wat, a follow. Suppoe we ca aig a probability ditributio to the value of µ, ad repreet it by a radom variable M M i a uppercae µ. Give a value µ for M, we cotruct a radom ample {X i } of ize uig a ormal ditributio with that µ, ad the cotruct X from that ample o it doubly-radom i that cae. We ve ee that X z α/2 < µ < X + z α/2 M = µ = 1 α 1.7 We could ak itead, uppoe we do the whole experimet ad fid that X equal ome pecific value x. We could the ak for a iterval Ix uch that 1 M Ix X = x = 1 α 1.8 The detail are beyod the cope of thi coure, but you ca get a flavor of it by thikig about a impler cae where X ad M are dicrete radom variable. The, the uual frequetit probabilitie are writte i term of ome pecific value µ ad correpod to X = x M = µ 1.9 but if we ve doe the experimet ad have a pecific x, we d really like to talk about M = µ X = x 1.10 But thi i jut the ort of ituatio that Baye theorem wa meat to hadle. We ca write M = µ X = x = X = x M = µ M = µ X = x 1.11 Thi i the bread ad butter of Bayeia probability which Devore rather dimiively refer to a ubjective probability. There are complicatio, otably i decidig what to aig a the prior probability ditributio M = µ, but it doe allow you to actually talk about the probabilitie of thig you re itereted i. 1.3 Sample Size Determiatio Give a ample of ize, we ee that the width of the iterval we cotruct at cofidece level 1 α will be w = 2z α/2 1.12 1 Actually, ice you wat to ue all the iformatio i the data, what you re lookig for i M I{x i } {X i = x i } = 1 α. 4

Ofte, though, we ll be i a poitio of beig able to collect a bigger ample poibly at greater expee ad therefore we d like to kow how big the ample hould be to correpod to a certai width. If we kow, ad our deired cofidece level, we require i.e., or = = 2zα/2 w 1.4 Oe-ided Iterval w 2z α/2 1.13 1.14 2 = 2z α/2 1.15 w The cofidece iterval X z α/2 < µ < X + z α/2 = 1 α 1.16 i called a two-ided iterval becaue the probability α for µ to lie iide the iterval i plit up, with a probability of α/2 that it lie below ad α/2 that it lie above. It alo poible to defie a oe-ided iterval with all of the probability o oe ide. For example, µ < X + z α = 1 α 1.17 i a upper limit at cofidece level 1 α ad X z α < µ = 1 α 1.18 i a lower limit at cofidece level 1 α. 1.5 Geeral Formalim A bit of otatio i eeded to geeralize thi precriptio. We aid that X could be ued to etimate µ. It i ometime called a etimator ˆµ where the hat mea etimator ad we write it i blue to emphaize that it a radom variable. If we wat to talk about a geeral parameter, the covetio i to call that parameter θ. I thi cae θ i jut µ. Give a radom ample {X i }, we cotruct a radom variable tatitic h{x i }; θ whoe probability ditributio doe t deped o µ. I thi cae h{x i }; θ i h{x i }; µ = X µ /, which obey a tadard ormal ditributio idepedet of the value of µ. The ed of the cofidece iterval o h{x i }; θ are defied by o that h{x i }; θ < a = α/2 b < h{x i }; θ = α/2 1.19a 1.19b a < h{x i }; θ < b = 1 α 1.20 I thi cae, a i z α/2 ad b i z α/2. The ext tep i to covert if poible the boud o h{x i }; θ to boud o θ itelf, θ < l{x i } = α/2 u{x i } < θ = α/2 1.21a 1.21b o that the lower ad upper boud l{x i } ad u{x i } have a probability α of bracketig the true parameter value θ. I thi cae, l{x i } i X z α/2 ad u{x i } i X + z α/2. ractice roblem 7.1, 7.5, 7.7, 7.15, 7.19, 7.23, 7.26 5

Thurday 25 Jauary 2018 1.6 Cofidece Iterval for the opulatio roportio We ca apply thi method to etimatio of the proportio of a large populatio atifyig ome property, i.e., the parameter p of a biomial ditributio. If X Bi, p, we ca alo thik of X a beig the um X = B i of Beroulli radom variable {B i } obeyig B i = 1 = p; B i = 0 = 1 p q. We kow that X ha mea EX = p ad variace V X = pq ad that for p 10 ad q 10, the biomial ditributio aociated with X i reaoably approximated by a ormal ditributio. Thi mea that if we cotruct the radom variable the X p pq 1.22 1 α z α/2 < X p < z α/2 pq 1.23 ˆp p z α/2 < < z α/2 p1 p/ where we ve defied the etimator ˆp = X/. To get a iterval for the parameter p, we wat to olve for p at the edpoit. Squarig the relatio at either ed of the iterval give ˆp p 2 p1 p/ = z2 α/2 1.24 which ca be coverted ito the quadratic equatio 1 z2 α/2 p 2 2 ˆp + z2 α/2 + ˆp2 = 0 1.25 Uig the quadratic formula ad a bit of algebra, we ca fid the root z2 α/2 ˆp + 2 p ± = ± z ˆp1 ˆp α/2 + z2 α/2 4 2 1.26 1 + z2 α/2 which give the cofidece iterval o the parameter p the populatio proportio p < p < p + 1 α. 1.27 You might have aked, if we re approximatig the biomial ditributio by a ormal ditributio ayway, why ca t we jut take ˆp ± z α/2 ˆp1 ˆp/ a the CI limit? Ufortuately, that require to be rather large, ad o it bet to ue the more complicated limit give i 1.26. You do t eed to memorize the form of thi, but you hould write it o your formula heet for the exam. 2 Cofidece Iterval Whe the Variace i Etimated 2.1 Large-Sample Iterval Recall that lat time we aw that give a radom ample {X i } with kow variace 2 = V X i, we could cotruct the radom variable Z = X µ / ; 2.1 If we ca tate that thi i a tadard ormal radom variable, either becaue the uderlyig {X i } are kow to be ormallyditributed radom variable, or approximately by virtue of the 6

Cetral Limit Theorem, we ca ue it to et a cofidece iterval o the ukow value of µ: 1 α z α/2 < Z < z α/2 X z α/2 < µ < X + z α/2 2.2 What do we do if i alo ukow? Well, we kow that the ample variace S 2 = 1 1 X i X 2 2.3 i a etimator for the variace 2, o what if we cotruct X µ S/? 2.4 If we re relyig o the cetral limit theorem to make thi approximately Gauia for large, it all baically work a before, ad you ca et a cofidece iterval X z α/2 S, X+z α/2 S which ha a 1 α chace of cotaiig µ. The catch i that, ice we re addig radome by uig S itead of, we require 40 rather tha 30. 2.2 The t-ditributio If i ot large, but {X i } are iid ormally-ditributed radom variable with ukow µ ad, we ca cotruct a radom variable T = X µ S/ 2.5 Thi i ot a tadard ormal radom variable; it obey omethig called a t-ditributio, alo kow a a Studet 2 t- 2 Studet wa the pe ame of tatiticia William Sealy Goet. ditributio, with 1 degree of freedom. Devore ever actually write dow the probability ditributio fuctio; you do t eed to memorize it, but i cae you wat to play aroud with plottig it, it f T t; 1 /2 1 + t2 2.6 1 If you really wat to kow, with the proportioality cotat writte out it f T t; ν = [ν+1]/2 Γ[ν + 1]/2 1 + t2 2.7 νπγν/2 ν Note that the umber of degree of freedom df i ν = 1; thi ort of make ee, becaue whe = 1, we ca t defie S, ad the whole thig goe haywire. Note alo that a ν or equivaletly become large, the ditributio doe ted toward a tadard ormal ditributio: lim f T t; 1 lim 1 + t2 = lim 1 [ ] 1/2 1 + t2 = /2 e t2 1/2 = e t 2 /2 2.8 Here what the t ditributio look like for variou value of ν = 1: 7

Thi mea that 1 α t α/2, 1 < T < t α/2, 1 t α/2, 1 < X µ S/ < t α/2, 1 2.9 By aalogy to the defiitio of z α, we defie t α,ν kow a the t critical value by T > t α,ν = 1 F T t α,ν ; ν = α ad by the ame maipulatio a before S S X t α/2, 1 < µ < X + t α/2, 1 = 1 α 2.10 2.3 redictio Iterval So far we ve ued cofidece iterval to make tatemet about the parameter of the uderlyig ditributio, give propertie of a ample draw from that ditributio. Itead, we could try to cotruct a iterval with, ay, a 95% chace of cotaiig aother value draw from that ditributio. Specifically, if 8

X 1,..., X, X +1 are iid radom variable, we ca try to make predictio about the + 1t variable baed o the firt. Of coure, the bet etimator for X +1 which we ca cotruct from X 1,..., X i X = 1 X i 2.11 The error we make with that gue ha mea EX X +1 = EX EX +1 = µ µ = 0 2.12 ad variace V X X +1 = V X + V X +1 = 2 / + 2 = 1 + 1 2 2.13 If the uderlyig ditributio for each X i i ormal, the X X +1, beig a liear combiatio of ormal radom variable, i ormal, ad Z = X X +1 0 2.14 1 + 1 i a tadard ormal radom variable. If we do t kow ahead of time, we ca ue the ample variace S 2 = 1 X i X 2 2.15 1 calculated from the firt radom variable ad cotruct degree of freedom, ad therefore we ca ay X t α/2, 1 S 1 + 1 < X +1 < X + t α/2, 1 S 1 + 1 = 1 α 2.17 ad, give a ample {x i } of ize, cotruct the predictio iterval x ± t α/2, 1 1 + 1 2.18 for x +1 at predictio level 1 α 100%. Note that predictio iterval are wider tha cofidece iterval i geeral; i particular whe become large, the width of a cofidece iterval goe to zero, but the width of the predictio iterval goe to 2z α/2. 2.4 Tolerace Iterval Oe more type of iterval to coider i a tolerace iterval. Thi i cotructed, baed o a ample with mea x ad tadard deviatio, to have a 1 α 100% chace of cotaiig k% of the area of the uderlyig probability ditributio. Thi i a complicated cotructio, ad for our purpoe, it imply treated a a black box with a aociated table i the appedix. It motly a matter of udertadig the defiitio. T = X X +1 S 1 + 1. 2.16 ractice roblem Thi tur out to oce agai obey a t ditributio with ν = 1 7.29, 7.35, 7.37, 7.41 9

Tueday 30 Jauary 2018 3 Cofidece Iterval for the Variace of a Normal opulatio So far, we ve coidered cofidece iterval o the mea or i oe cae, the proportio of a ditributio. Fially, let coider how to et a cofidece iterval o the variace of the ormal ditributio from which we ve draw a ample. We kow that the ample variace S 2 = 1 1 X i X 2 3.1 ca be ued to etimate the variace, ice ES 2 = 2 3.2 We ca et up a cofidece rage for 2 by coiderig the ratio S 2 = 1 2 1 X i X 2 3.3 2 Now, ice X i µ 3.4 i a tadard ormal radom variable, we kow X i µ 2 3.5 2 obey a chi-quare ditributio with degree of freedom df. If we ue the ample mea X itead of the populatio mea µ, we get X i X 2 2 = 1S2 2 3.6 which tur out to obey a χ 2 ditributio with ν = 1 df. So we eed to et iterval for a chi-quare radom variable. Jut a we defied z α ad t α,ν a the left edge of a area α uder the pdf for a tadard ormal rv ad a t-ditributio with ν df, repectively, we defie χ 2 α,ν a the correpodig quatity for a chi-quare ditributio with ν degree of freedom: Γ[ν + 1]/2 νπγν/2 t α,ν 1 e z2 /2 dz = 1 Φz α = α 3.7a 2π 1 + t2 ν z α [ν+1]/2 dt = 1 F T t α,ν; ν = α 3.7b 1 χ 2 x ν/2 1 e x/2 α,ν dx = 1 F 2 ν/2 Γν/2 χ 2 ; ν = α 2 2 α,ν 3.7c 10

where we ve writte the cdf for the chi-quare ditributio i term of the tadard Gamma cdf F x; α. I practice each of thee three value z α, t α,ν, χ 2 α,ν i tabulated. So, havig defied the chi-quared critical value χ 2 α,ν, we wat to plit up the area uder the chi-quared ditributio ito the firt α/2, the middle 1 α, ad the lat α/2. The oe complicatio i that, ice the chi-quared ditributio i ot ymmetric like the tadard ormal ad t ditributio were, we have to hadle the lower limit more carefully. If we wat a value o that α/2 of the area uder the curve lie to the left of it, the 1 α/2 lie to the right, ad we have χ 2 1 α/2,ν ad χ2 α/2,ν a our α 100th ad 1 α 100th ad percetile. The lack of ymmetry mea χ 2 1 α/2,ν χ 2 α/2,ν 3.8 I fact, both χ 2 1 α/2,ν ad χ2 α/2,ν mut be poitive. The iterval cotaiig the middle 1 α of probability i thu 1 α χ 2 X i X 2 1 α/2, 1 < < χ 2 α/2, 1 χ 21 α/2, 1 < 1S 2 < 2 < χ 2 α/2, 1 2 1S2 2 < χ 2 α/2, 1 1S2 χ 2 1 α/2, 1 3.9 ad, if we prefer to write a cofidece iterval for the tadard deviatio, 1S 2 1S < < 2 = 1 α 3.10 χ 2 α/2, 1 χ 2 1 α/2, 1 ractice roblem 7.43, 7.45, 7.55 11

4 Summary of Cofidece Iterval 4.1 For Normal Radom Sample At cofidece level 1 α 100%: kow ukow variable lower upper µ x z α/2 µ x z α x + z α/2 µ x + z α µ x t α/2, 1 µ x t α, 1 x + t α/2, 1 µ x + t α, 1 µ µ 1 χ 2 α/2, 1 1 χ 2 α, 1 µ 0 4.2 Approximate Cofidece Iterval for Large Sample At approximate level 1 α 100%: 1 χ 2 1 α/2, 1 1 χ 2 1 α, 1 Ditributio variable lower upper geeral µ x z α/2 geeral µ x z α x + z α/2 geeral µ x + z α Beroulli/Biomial p ˆp+ z2 α/2 2 z α/2 1+ z2 α/2 ˆp1 ˆp + z2 α/2 4 2 ˆp+ z2 α/2 2 +z α/2 1+ z2 α/2 ˆp1 ˆp + z2 α/2 4 2 12