Topic 6 Sampling, hypothesis testing, and the central limit theorem

CSE 103: Probability ad statistics Fall 2010 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem 61 The biomial distributio Let X be the umberofheadswhe acoiofbiaspistossedtimes The distributio ofx is sofudametal to probability ad statistics that it merits a special ame, the biomial(, p) distributio Here it is, precisely: Σ = {0,1,,} ( ) Pr(X = k) = p k (1 p) k k The figure below shows the biomial(100, 06) distributio 009 008 007 006 Pr(k) 005 004 003 002 001 0 0 10 20 30 40 50 60 70 80 90 100 k The key property of this distributio is that it is tightly cocetrated aroud p = 60 This is the mea of X, as ca be see by writig it i the form X = X 1 + +X, where each X i B(p); i words, X i is 1 if the ith coi toss comes up heads How spread out is the distributio? To quatify this, we eed to compute the variace of X Oce agai, the represetatio of X as the sum of X i s comes i hady, coupled with the followig useful fact Fact 1 If Z 1,,Z are idepedet, the var(z 1 + +Z ) = var(z 1 )+ +var(z ) Each X i has variace p(1 p), so X has variace p(1 p) ad thus stadard deviatio p(1 p) Fact 2 If X biomial(,p), the: E(X) = p var(x) = p(1 p) stddev(x) = p(1 p) 6-1

CSE 103 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem Fall 2010 There s aother very useful fact about the biomial that comes up over ad over agai: about 95% of the distributio lies withi two stadard deviatios of the mea That is to say, ( Pr p 2 p(1 p) X p+2 ) p(1 p) 095 (61) We ll see how this approximatio arises a little later o As a example of its use, cosider the figure show above for biomial(100,06) The stadard deviatio is σ = 100(06)(04) 5 Ad ideed almost all the distributio lies i the rage p±2σ = 60±10 The remarkable thig about (61) is that although X ca potetially take o + 1 possible values, it effectively stays withi a rage of size just O( ); this is miiscule compared to +1 for large 62 Hypothesis testig 621 Testig a vaccie Suppose there is a certai disease that cattle ca cotract; ad that ay give cow has a 25% chace of cotractig it withi the course of a year A ew serum is proposed, ad we wat to test it to see how well it works So we choose cows at radom, iject them with the serum, ad the keep a eye o them over the ext year Let s say K of them remai healthy Oe possibility is that the serum has o effect whatsoever; call this hypothesis H 0 If this hypothesis is true, the the chace of ifectio is exactly the same for the cows that were ijected as it is for the cow populatio at large, ad thus K biomial(, 075) We hope that the experimet yields a large value of K; this will provide evidece agaist hypothesis H 0 But how large does K eed to be? It depeds o how much statistical cofidece we wat i our assertio that H 0 is false A commo goal is a 95% cofidece level Suppose we have a sample of = 100 cows The biomial(100, 075) distributio has mea 75 ad stadard deviatio σ = 100(025)(075) 43 Usig equatio (61), we ca deduce that if H 0 were true, we would expect (with 95% cofidece) a value of K i the rage 75 ± 86 If K exceeds this, that is if K > 83, the we ca reject H 0 with a 95% cofidece level 622 A blood pressure drug Suppose a ew blood pressure drug is proposed, ad to test it, a sample of patiets is selected Their blood pressure is measured with ad without the drug Perso BP with drug BP without drug 1 x 1 x 1 2 x 2 x 2 x x The ith trial is called a success if x i < x i ; let K be the total umber of successes How should the evidece be evaluated? Let H 0 be the hypothesis that the drug is useless Uder this hypothesis, K biomial(,05) Thus K has expected value 05 ad stadard deviatio (05)(05) = 05 As we oted above, with 95% cofidece, we d expect K to be withi two stadard deviatios of its mea, that is, K = 05± So if K exceeds 05+, we ca declare that H 0 is ivalidated with 95% cofidece, implyig that the drug is ot etirely useless 6-2

CSE 103 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem Fall 2010 63 Samplig We live i a age where the govermet ad the mass media are cotiuously pollig the public They wish to kow the fractio of people who like sushi, thik Obama is doig a good job, support Tiger Woods, thik God exists, feel pessimistic about the future, smoke, thik the war i Iraq is uecessary, etc This costat pollig affects decisio-makig at every level It advises politicias o what to say ad do It helps compaies decide how to most effectively advertise their products It helps ivestors decide where to etrust their moey Ad so o I each such case, there is a ukow probability p that is sought: the fractio of people who like sushi, for istace Determiig this fractio exactly would require askig everyoe, which is prohibitively expesive So istead a sample of people is chose at radom, ad they are each asked whether they like sushi Let K be the umber of positive resposes The K has the biomial(, p) distributio, with expected value p ad variace p(1 p) A estimate of the fractio of sushi-lovers is K/ Notice that ( ) K E = p ( ) K var ( ) K stddev = var(k) = 2 p(1 p) = p(1 p) A stadard rule of thumb, give i equatio (61) is that the estimate K/ will lie withi two stadard deviatios of its expected value with 95% probability Thus we ca assert with 95% cofidece that the true fractio of people who like sushi is K p(1 p) ±2 But this does t make sese, sice we do t kow what p is A quick fix is to otice that p(1 p) 1/4 (the maximizatio ca be doe by calculus, for istace), ad thus a valid 95% cofidece iterval is K ± 1 Thus a sample size of 2,500 gives a estimate that is accurate to withi 1/ = 002 (at 95% cofidece), whereas a sample size of 10,000 gives a accuracy of 001 What if we wat a higher level ofcofidece, like 99%, or eve 999%? Equatio (61) is o logerhelpful; we eed similar approximatios for other cofidece levels It turs out that these are easy to obtai, for ay desired cofidece level, usig the ormal approximatio to the biomial distributio 64 The ormal distributio You are probably readig these otes i a library or cafeteria Take a look at the 100 or so people earest to you If you were to plot the heights of all the me, it would look kid of like this: 6-3

CSE 103 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem Fall 2010 If you were to plot the blood pressure of all the wome, it would look much the same Or if you plotted the velocities of the molecules i the air, oce agai, same thig This oe distributio is everywhere For that reaso it is called the ormal distributio The ormal distributio with mea µ ad variace σ 2 is deoted N(µ,σ 2 ) Ulike most of the examples we ve see so far, it is a cotiuous distributio, a desity over the real lie The desity at ay poit x is 1 σ 2 2π e (x µ) /2σ 2 Do t worry too much about this specific fuctioal form Istead, let s look at pictures of three differet ormal distributios 04 035 N(0,1) N(2,1) N(0,4) 03 025 Pr(x) 02 015 01 005 0 6 4 2 0 2 4 6 x The red curve (or, if you re readig this i black ad white, the tall curve i the middle) is N(0,1): the ormal distributio with mea 0 ad variace 1 This is sometimes called the stadard ormal All other ormal distributios are simply scaled ad traslated versios of it For istace, the blue curve (the oe o the right) is N(2,1), which is obtaied by shiftig the stadard ormal two uits to the right The black curve (the wide oe i the middle), o the other had, is N(0,4), obtaied by scalig the stadard ormal by a factor of two Here s a summary of the relatioship betwee differet ormal distributios Fact 3 If X N(µ,σ 2 ) the ax +b N(aµ+b,a 2 σ 2 ) 6-4

CSE 103 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem Fall 2010 641 Sums of idepedet radom variables Here s a iterestig fact about the ormal distributio Fact 4 If X,Y are idepedet, with X N(µ 1,σ 2 1) ad Y N(µ 2,σ 2 2), the X+Y N(µ 1 +µ 2,σ 2 1+σ 2 2) I words, if a buch of idepedet radom variables are each ormally distributed, ad you add them up, the result still has a ormal distributio I fact, somethig much more spectacular is true: eve if the idepedet radom variables are ot ormally distributed, their sum looks like a ormal distributio! Fact 5 (Cetral limit theorem) Suppose X 1,,X are idepedet with mea µ ad variace σ 2 The for sufficietly large, the distributio of (X 1 + +X )/ is approximately N(µ,σ 2 /) Equivaletly, the distributio of X 1 + +X is approximately N(µ,σ 2 ) This astoishig fact helps explai why the ormal distributio is observed all over the place 642 Tails of the ormal I samplig ad hypothesis testig, we ultimately ed up addig idepedet radom variables; more o this below Thus the result is well-approximated by the ormal distributio, ad by aalyzig the tails of this distributio, we ca get itervals for ay desired level of cofidece Suppose X N(µ,σ 2 ) If we wat a 95% cofidece iterval, we seek a value of z for which Pr(µ z X µ+z) = 095 The aswer, it turs out, is z = 2σ If we wat a 99% cofidece iterval, we seek a value of z for which Pr(µ z X µ+z) = 099 I this case, z = 3σ Here is a summary of some key facts about tails of the ormal Fact 6 If X N(µ,σ 2 ), the: With 66% probability, X lies betwee µ σ ad µ+σ With 95% probability, X lies betwee µ 2σ ad µ+2σ With 99% probability, X lies betwee µ 3σ ad µ+3σ 65 Samplig revisited 651 Polls with yes/o aswers From the populatio at large people are chose at radom ad are asked a particular questio ( Do you approve of the corporate bailouts? ) The goal is to estimate the fractio p of the uderlyig populatio that would aswer yes Let s say that K of the sampled people say yes; the K biomial(,p), ad thus a reasoable estimate of p is K/ To aalyze the situatio further, it helps to approximate the biomial by a ormal distributio Notice that K = X 1 +X 2 + +X 6-5

CSE 103 Topic 6 Samplig, hypothesis testig, ad the cetral limit theorem Fall 2010 where the X i are idepedet B(p) radom variables ( did the ith perso say yes? ) By the cetral limit theorem, ( ) K p(1 p) is distributed like N p, p(1 p) Writig σ = 1 2, we ca coclude from Fact 6 that the estimate K/ is accurate withi 1 2 with at least 66% probability 1 with at least 95% probability 3 2 with at least 99% probability Ad other cofidece itervals are also easy to obtai, from stadard tables of the ormal distributio 652 Polls with umeric aswers May polls demad umeric rather tha yes/o aswers How may glasses of wie do you drik daily? What is your salary? How old are you? Ad so o Suppose, for istace, that we wish to assess the overall educatioal levels (umber of years of schoolig) of residets of Sa Diego If we were to cosider all Sa Diegas of age 25, this would be a distributio with some ukow mea µ ad stadard deviatio σ We would like to estimate µ, the average educatioal level So we radomly pick people of age 25 ad ask them how may years they have spet i school Suppose their aswers are X 1,,X The empirical mea (the mea of the samples) is ad the empirical stadard deviatio is M = X 1 + +X (X1 M) S = 2 + +(X M) 2 It is reasoable to use M as a estimate of µ What kid of cofidece iterval ca we give for it? This is where the cetral limit theorem oce agai comes i Sice the X i are idepedet, we ca assert that ) M is distributed like N (µ, σ2 Thus we immediately have the followig cofidece itervals, from Fact 6 With 95% probability, M is lies betwee µ 2σ ad µ+ 2σ With 99% probability, M lies betwee µ 3σ ad µ+ 3σ Thus, for istace, whe usig M to estimate µ, we ca be 95% certai that it is accurate withi ± 2σ But wait: we do t kow σ So istead, it is customary to use the empirical stadard deviatio istead, ad to assert that M is accurate withi ± 2S, with 95% cofidece Returig to the schoolig example, suppose we poll = 400 people, ad fid that the empirical average ad stadard deviatio are M = 116 ad S = 41, respectively The we ca assert with 95% cofidece that the average umber of years of schoolig of Sa Diegas is 116± 2 41 400 = 116±041 6-6