A Confidence Interval for μ

INFERENCES ABOUT μ

One of the major objectives of statistics is to make inferences about the distribution of the elements in a population based on information contained in a sample. Numerical summaries that characterize the population distribution are called parameters. The population mean μ and the population variance σ² are two important parameters. Others are the median, range, mode, etc.

Methods for making inferences are basically designed to answer one of two types of questions: (a) Approximately what is the value of the parameter? or (b) Is the value of the parameter less than (say) 6? Statisticians answer the first question by estimating the parameter using the sample. The second case might require a test of a hypothesis.

Point and Interval Estimation of μ when σ is known and n is large

Point estimation of μ does not require that σ be known or that n be large. The point estimate of μ is the sample mean ȳ. Point estimates by themselves do not tell how much ȳ might differ from μ, that is, the accuracy or precision of the estimate. A measure of accuracy is the difference between the sample mean ȳ and the population mean μ, called the sampling error.

A Confidence Interval for μ

An interval estimate, called a confidence interval, incorporates information about the amount of sampling error in ȳ. A confidence interval for μ takes the form (ȳ − E, ȳ + E), for a number E. An associated number called the confidence coefficient helps assess how likely it is for μ to be in the interval.

To derive the specific form of the confidence interval for μ, for the case when σ is known and n is large, the CLT result must be used. By the CLT, $Z = (\bar Y - \mu)/\sigma_{\bar Y}$, where $\sigma_{\bar Y} = \sigma/\sqrt{n}$, is approximately N(0, 1). Let Z have a N(0, 1) distribution (exactly). Let $z_{\alpha/2}$ denote the $1 - \alpha/2$ quantile of the standard normal distribution, for a given number α, 0 < α < 1. Then the following probability statement is true:

$$P\left(-z_{\alpha/2} \le \frac{\bar Y - \mu}{\sigma_{\bar Y}} \le z_{\alpha/2}\right) = 1 - \alpha$$

Manipulating the inequalities, without changing values, we have

$$P\left(-\sigma_{\bar Y}\, z_{\alpha/2} \le \mu - \bar Y \le \sigma_{\bar Y}\, z_{\alpha/2}\right) = 1 - \alpha$$

$$P\left(\bar Y - \sigma_{\bar Y}\, z_{\alpha/2} \le \mu \le \bar Y + \sigma_{\bar Y}\, z_{\alpha/2}\right) = 1 - \alpha$$

If, for example, α = 0.05, then

$$P\left(\bar Y - \frac{\sigma}{\sqrt n}\, z_{.025} \le \mu \le \bar Y + \frac{\sigma}{\sqrt n}\, z_{.025}\right) = 0.95$$

and, since $z_{0.025} = 1.96$,

$$P\left(\bar Y - 1.96\,\frac{\sigma}{\sqrt n} \le \mu \le \bar Y + 1.96\,\frac{\sigma}{\sqrt n}\right) = 0.95$$

Using this statement, a (1 − α)100% confidence interval for μ, with confidence coefficient 1 − α, can be calculated using a random sample of data $y_1, y_2, \ldots, y_n$. It is usually written in the form

$$\left(\bar y - \frac{\sigma}{\sqrt n}\, z_{\alpha/2},\ \ \bar y + \frac{\sigma}{\sqrt n}\, z_{\alpha/2}\right).$$

Example: Suppose n = 36, σ = 12, ȳ = 24.8, and for α = 0.05, $z_{\alpha/2} = z_{0.025} = 1.96$. Thus a 95% C.I. for μ is

$$\left(24.8 - 1.96\,\frac{12}{\sqrt{36}},\ \ 24.8 + 1.96\,\frac{12}{\sqrt{36}}\right)$$

The interval for μ is (20.88, 28.72), with confidence coefficient 0.95. We might say that we are 95% confident that the population mean μ is between 20.88 and 28.72. But what do we actually mean when we say that we are 95% confident?
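Before turning to that question, the arithmetic above is easy to check directly. The following is a minimal sketch of the computation, assuming Python with SciPy (tools the notes themselves do not use):

```python
# Large-sample z confidence interval from the example above
# (n = 36, sigma = 12, ybar = 24.8, alpha = 0.05).
from scipy.stats import norm

n, sigma, ybar, alpha = 36, 12, 24.8, 0.05
z = norm.ppf(1 - alpha / 2)      # z_{alpha/2} = 1.96
E = z * sigma / n ** 0.5         # margin of error, 3.92
print((ybar - E, ybar + E))      # approximately (20.88, 28.72)
```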

Interpretation of a Confidence Interval

Before the sample is drawn, the probability is (1 − α) that the random interval $\left(\bar Y - \frac{\sigma}{\sqrt n} z_{\alpha/2},\ \bar Y + \frac{\sigma}{\sqrt n} z_{\alpha/2}\right)$ will contain μ (because $\bar Y$ is a random variable). However, once the sample is drawn and ȳ is calculated, the interval ceases to be random. It is a numerical interval $\left(\bar y - \frac{\sigma}{\sqrt n} z_{\alpha/2},\ \bar y + \frac{\sigma}{\sqrt n} z_{\alpha/2}\right)$, calculated specifically for the drawn sample; thus we cannot associate a probability with it. We may say that the process which led us to this interval will, on the average, produce an interval containing μ 100(1 − α)% of the time.

It is not correct to say that the interval $\left(\bar y - \frac{\sigma}{\sqrt n} z_{\alpha/2},\ \bar y + \frac{\sigma}{\sqrt n} z_{\alpha/2}\right)$ contains μ with a specified probability. If the process of sampling is repeated many times and a 100(1 − α)% confidence interval is calculated from each sample, then we are confident that 100(1 − α)% of those intervals will contain μ. This idea is illustrated in a picture of repeated intervals (not reproduced here).
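The repeated-sampling interpretation can also be illustrated with a small simulation. The sketch below uses made-up population values (a normal population with μ = 25, σ = 12, and samples of size n = 36, none of which come from the text) and counts how often the random interval covers μ:

```python
# Coverage simulation: roughly 95% of the intervals ybar +/- 1.96*sigma/sqrt(n)
# should contain the true mean mu (hypothetical population values below).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, z = 25.0, 12.0, 36, 1.96
E = z * sigma / np.sqrt(n)
covered = 0
for _ in range(10_000):
    ybar = rng.normal(mu, sigma, n).mean()
    covered += (ybar - E <= mu <= ybar + E)
print(covered / 10_000)   # close to 0.95
```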

Notes: In the textbook, Example 5.1 uses s, the sample standard deviation, in place of σ. Of course, s is a point estimate of σ; thus it contains sampling error. However, when n is large, σ is sometimes approximated by s because σ is not known. As the confidence coefficient increases, the confidence interval becomes wider. A wider interval estimates μ less precisely. Thus a 99% confidence interval is less accurate than a 95% confidence interval, but one has more confidence that μ is in the first interval. Also note that increasing the sample size results in a narrower interval (a more accurate estimate) for the same confidence coefficient.

Exercise 5.8: The caffeine content (in mg) was examined for a random sample of 50 cups of black coffee dispensed by a new machine. The mean and standard deviation were 110 mg and 7.1 mg, respectively. Use these data to construct a 98% confidence interval for μ, the mean caffeine content for cups dispensed by the machine.

As the sample size is large, we can use the CLT and also approximate σ, the standard deviation of the population, by s = 7.1. Here n = 50, ȳ = 110, σ ≈ 7.1, α = .02, $z_{0.01} = 2.33$, so

$$\left(\bar y - z_{0.01}\,\frac{\sigma}{\sqrt n},\ \ \bar y + z_{0.01}\,\frac{\sigma}{\sqrt n}\right) = \left(110 - (2.33)\,\frac{7.1}{\sqrt{50}},\ \ 110 + (2.33)\,\frac{7.1}{\sqrt{50}}\right) = (107.66,\ 112.34)$$

We are 98% confident that the mean caffeine content for cups dispensed by the machine is between 107.66 and 112.34 mg.

Choosing the Sample Size

The width of a confidence interval is $2\, z_{\alpha/2}\,\sigma/\sqrt n$. It can be made smaller by changing to a larger α or by increasing the sample size. Let us consider selection of n to achieve a desired width 2E for a fixed α. We want E to be at least equal to $z_{\alpha/2}\,\sigma/\sqrt n$. Thus we need to select a sample size n so that

$$n \ge \frac{z_{\alpha/2}^2\,\sigma^2}{E^2}$$

This will result in an interval no wider than (ȳ − E, ȳ + E).

Example: Given σ = 12, α = 0.05, $z_{0.025} = 1.96$, what sample size will give an interval no wider than 5.6? We set E = 2.8, so

$$n \ge \frac{(1.96)^2\,(144)}{(2.8)^2} = 70.56$$

Thus the experimenter must choose a sample size of at least n = 71.

Choosing the Sample Size (continued)

Would the sample size n = 71 ensure the width to be 5.6 if the population has a larger variance? Would n = 71 be enough? The answer is NO, since the formula $n \ge z_{\alpha/2}^2\,\sigma^2/E^2$ involves the population variance σ². Precision in estimation depends on both α and σ². If the variance of the population elements is very small, i.e., the elements are tightly clustered about the population mean μ, then only a small sample is needed for the estimate ȳ to be very near μ.
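A hedged sketch of the sample-size calculation, using the same values as the example (σ = 12, desired width 5.6, so E = 2.8) and assuming SciPy for the normal quantile:

```python
# Sample size for a z interval: smallest n with n >= z_{alpha/2}^2 * sigma^2 / E^2.
import math
from scipy.stats import norm

sigma, E, alpha = 12, 2.8, 0.05
z = norm.ppf(1 - alpha / 2)                   # 1.96
n = math.ceil(z ** 2 * sigma ** 2 / E ** 2)   # 70.56 rounded up
print(n)                                      # 71
```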

Statistical Tests for μ

Estimation (either point or interval estimation) was used to help answer a question like "Approximately what is the value of μ?" The other kind of question mentioned earlier is "Is it likely that μ is less than (or greater than) the value μ0 (a predetermined value)?" A Test of Hypothesis is used to answer this kind of question. As in estimation, the sample mean ȳ of a random sample of n elements from the population is used to answer this question.

Every test of hypothesis features (a) a Null Hypothesis H0, which describes a characteristic of the population as it is believed to currently exist, and (b) a Research Hypothesis (or Alternative Hypothesis) Ha, which is a proposal about this characteristic made by the person(s) conducting the statistical study. The idea is that the null hypothesis is presumed to hold unless there is overwhelming evidence in the data in support of the research hypothesis.

Example: To determine whether the mean yield per acre (in bushels), μ, of a variety of soybeans increased in the current year over the last two years, when μ is believed to be 520 bushels per acre, the following might be tested:

H0: μ ≤ 520 vs. Ha: μ > 520

The fact that μ may equal a specific value is always included in the null hypothesis. The decision to state whether or not the data support the research hypothesis is based on a quantity computed from the data called the test statistic.

If ȳ is in the rejection region, then reject H0 and say the evidence favors Ha. Basically, the test amounts to computing ȳ and looking at its value relative to μ0 and the μ values in Ha. For example, in the above example, if ȳ < 520 we will say there is not sufficient evidence to reject H0. Even if ȳ > 520, we might still say there is not sufficient evidence to reject H0. This is OK so long as ȳ is not too much greater than 520. How much is too large? To decide this, use the probability distribution of $\bar Y$. Begin by picking a small probability α, like α = 0.05, α = 0.01, or α = 0.001.

Then reject H0 only when the probability of obtaining a value of $\bar Y$ larger than the observed value ȳ is at most α when μ is indeed 520, i.e., when H0 is true. That is, if H0 is true, the chance of observing a sample that results in a ȳ as large as the one calculated should be very small.

How to determine the rejection region

By the CLT, $\bar Y$ is approximately a normal random variable. We will ignore the approximation and assume $\bar Y \sim N(\mu, \sigma^2/n)$. Then, for a given α,

$$P\left(\frac{\bar Y - \mu}{\sigma/\sqrt n} > z_\alpha\right) = \alpha$$

where $z_\alpha$ is the $1 - \alpha$ quantile of the standard normal distribution. So when H0: μ ≤ μ0 is true,

$$P\left(\frac{\bar Y - \mu_0}{\sigma/\sqrt n} > z_\alpha\right) \le \alpha$$

That is, $P\left(\bar Y > \mu_0 + z_\alpha\,\sigma/\sqrt n\right) \le \alpha$. This suggests that we reject H0: μ ≤ μ0 in favor of, say, Ha: μ > μ0 when $\bar y > \mu_0 + z_\alpha\,\sigma/\sqrt n$ (i.e., ȳ is "too much greater" when ȳ exceeds the number $\mu_0 + z_\alpha\,\sigma/\sqrt n$). All possible ȳ values satisfying $\bar y > \mu_0 + z_\alpha\,\sigma/\sqrt n$ constitute the rejection region. Instead of comparing ȳ to $\mu_0 + z_\alpha\,\sigma/\sqrt n$, it is easier to first calculate

$$z_c = \frac{\bar y - \mu_0}{\sigma/\sqrt n}$$

We see that comparing ȳ to $\mu_0 + z_\alpha\,\sigma/\sqrt n$ is the same as comparing $z_c$ to $z_\alpha$. That is, we reject the null hypothesis if $z_c > z_\alpha$, which is the same thing as doing so if $\bar y > \mu_0 + z_\alpha\,\sigma/\sqrt n$. This quantity z is called the test statistic, and its value can be calculated using the data and the value μ0 specified in the null hypothesis H0. The notation $z_c$ is used to denote the computed value of the test statistic, that is, the numerical value for z obtained when we plug in the observed value of ȳ.

Type I Error

The α corresponds to the probability that the null hypothesis is rejected when actually it is true, and is called the probability of committing a Type I error. Since the experimenter selects the value of α used in the test procedure, she is able to specify or control the Type I error rate, i.e., how much of this type of error is permitted in the testing procedure.

Example 5.5: Suppose from a sample of 36 one-acre plots, the yield of corn this year was measured and ȳ = 573 and s = 124 were calculated. Can we conclude that the mean yield of corn for all farms exceeded 520 bushels/acre this year? Here we are going to assume that σ can be approximated by s. Use α = .025.

Solution: Set up the five parts of the testing procedure:

1. H0: μ ≤ 520
2. Ha: μ > 520
3. T.S.: $z = \dfrac{\bar y - \mu_0}{\sigma/\sqrt n}$
4. R.R.: Reject H0 in favor of Ha if $z > z_{.025}$, i.e., the R.R. is z > 1.96, since $z_{.025}$ = 1.96.
5. Compute the observed value of the test statistic:

$$z_c = \frac{573 - 520}{124/\sqrt{36}} = 2.56$$

Decision: Since $z_c$ > 1.96, $z_c$ is in the R.R. Thus H0: μ ≤ 520 is rejected in favor of the research hypothesis Ha: μ > 520. It is concluded that the mean yield this year exceeds 520 bushels/acre. Note that the above test procedure is equivalent to determining that the observed value of ȳ lies more than 1.96 standard deviations above the mean μ0 = 520.
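The five-step procedure in Example 5.5 can be mirrored in a few lines of code; this is a sketch of that calculation (assuming SciPy), not part of the original notes:

```python
# Right-tailed z test from Example 5.5: H0: mu <= 520 vs Ha: mu > 520,
# with n = 36, ybar = 573, s = 124 used in place of sigma, alpha = 0.025.
from scipy.stats import norm

n, ybar, s, mu0, alpha = 36, 573, 124, 520, 0.025
zc = (ybar - mu0) / (s / n ** 0.5)   # about 2.56
z_crit = norm.ppf(1 - alpha)         # z_{.025} = 1.96
print(zc, zc > z_crit)               # True -> reject H0
```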

Summary of Test Procedures (assume n large and σ known)

Hypotheses:
Case 1: H0: μ ≤ μ0 vs. Ha: μ > μ0 (right-tailed test)
Case 2: H0: μ ≥ μ0 vs. Ha: μ < μ0 (left-tailed test)
Case 3: H0: μ = μ0 vs. Ha: μ ≠ μ0 (two-tailed test)

T.S.: $z_c = \dfrac{\bar y - \mu_0}{\sigma/\sqrt n}$

R.R.: For a Type I error probability of α:
Case 1: Reject H0 if $z \ge z_\alpha$
Case 2: Reject H0 if $z \le -z_\alpha$
Case 3: Reject H0 if $|z| \ge z_{\alpha/2}$

Decision: If the computed value of the test statistic, $z_c$, is in the R.R., then we reject the null hypothesis H0 at the specified α value. Otherwise, we say we fail to reject H0 at the specified α value.

Example 5.6: A corporation maintains a large fleet of company cars for its salespeople. To check the average number of miles driven per month per car, a random sample of n = 40 cars is examined. The mean and standard deviation for the sample are 2,752 miles and 350 miles, respectively. Records for previous years indicate that the average number of miles driven per car per month was 2,600. Use the sample data to test the research hypothesis that the current mean μ differs from 2,600. Set α = .05 and assume that σ can be replaced by s.

Solution: The null hypothesis for this statistical test is H0: μ = 2,600 and the research hypothesis is Ha: μ ≠ 2,600. Using α = .05, the two-tailed rejection region for this test is $|z| > z_{.025}$, or |z| > 1.96. To determine how many standard errors our test statistic ȳ lies away from μ0 = 2,600, compute

$$z_c = \frac{\bar y - \mu_0}{\sigma/\sqrt n} = \frac{2{,}752 - 2{,}600}{350/\sqrt{40}} = 2.75.$$

Thus $z_c$ = 2.75 > 1.96, and therefore we reject H0: μ = 2,600 at α = .05. It follows that the observed value of ȳ lies more than 1.96 standard errors above the mean μ0 = 2,600, so we reject the null hypothesis in favor of the alternative Ha: μ ≠ 2,600. Since ȳ > 2,600, we conclude that the mean number of miles driven is greater than 2,600.

Level of Significance, or the p-value of a Statistical Test

As an alternative to the formal test, where one uses a rejection region based on a specified Type I error rate α, many researchers compute and report the level of significance, or p-value, for the test. This is the probability, when H0 is true, of observing a statistic as extreme as the one actually observed. Here "extreme" means large or small according to the alternative hypothesis Ha.

Specifically, for a given σ² and μ0, we find the p-value by:

1. First computing the test statistic $z_c = (\bar y - \mu_0)/(\sigma/\sqrt n)$.
2. Then:
(a) If Ha: μ > μ0, then p = P(Z > $z_c$).
(b) If Ha: μ < μ0, then p = P(Z < $z_c$).
(c) If Ha: μ ≠ μ0, then p = 2 P(Z > |$z_c$|).

A p-value smaller than the pre-specified α value is evidence in favor of rejecting H0.

Example 5.12: In Example 5.7 we tested H0: μ ≤ 380 vs. Ha: μ > 380. Calculating the z statistic,

$$z_c = \frac{\bar y - 380}{\sigma/\sqrt n} = \frac{390 - 380}{35.2/\sqrt{50}} = 2.01$$

The level of significance, or p-value, for this test is p = P(Z > 2.01) = 1 − P(Z < 2.01) = 0.0222. We fail to reject H0: μ ≤ 380 at α = .01, since the p-value is not less than .01.
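The p-value in Example 5.12 can be reproduced with the standard normal CDF; a minimal sketch (SciPy assumed):

```python
# p-value for the right-tailed test in Example 5.12:
# H0: mu <= 380 vs Ha: mu > 380, with ybar = 390, s = 35.2, n = 50.
from scipy.stats import norm

n, ybar, s, mu0 = 50, 390.0, 35.2, 380.0
zc = (ybar - mu0) / (s / n ** 0.5)   # about 2.01
p = 1 - norm.cdf(zc)                 # P(Z > zc), about 0.022
print(zc, p)                         # p is not below alpha = .01, so fail to reject
```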

Inferences About μ when σ is unknown

For large sample sizes, it follows from the CLT that $\bar Y$ is approximately Normally distributed, i.e., $\bar Y \sim N(\mu, \sigma^2/n)$. It follows that the random variable

$$T_{n-1} = \frac{\bar Y - \mu}{S/\sqrt n}$$

has approximately the Student's t distribution with n − 1 degrees of freedom. Here $Y_1, Y_2, \ldots, Y_n$ are sampling random variables, and $\bar Y = \frac{1}{n}\sum_{i=1}^{n} Y_i$ is thus a random variable. The denominator of $T_{n-1}$ is the random variable

$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar Y)^2}.$$

If sampling from a Normal distribution, the sample mean $\bar Y$ has a Normal distribution (exactly), and therefore $(\bar Y - \mu)/(S/\sqrt n)$ will have a Student's t distribution (exactly), regardless of sample size.

In Chapter 5 until now, we have assumed that n is large and σ is known, or that n is large enough also to use s as an approximation to σ. Since we used the CLT to derive confidence limits and tests of hypotheses, the exact nature of the sampled population was not required to be specified. Now we require the population distribution to be Normal. We do not need to know σ or have a large sample, i.e., n need not be large. In this situation, for any n, $(\bar Y - \mu)/(S/\sqrt n)$ will have a Student's t distribution with d.f. = n − 1.

Using this fact, we can obtain confidence intervals and conduct tests of hypotheses even though σ is unknown. Note that the difference between Z and $T_{n-1}$ is that the parameter σ is in the denominator of Z, whereas the point estimator of σ, namely S, is in the denominator of $T_{n-1}$.

Properties of the Student's t distribution

There are many t distributions, each specified by a single parameter called the degrees of freedom (df). Like the standard normal distribution, each t distribution is symmetric about 0 and has mean equal to 0. The t distribution has variance df/(df − 2) (for df > 2), and hence is more variable than the standard normal distribution, which has variance equal to 1. We say that the t distribution has heavier tails than the standard normal distribution. As the degrees of freedom df increase, the t distribution approaches the standard normal distribution.

Thus, as the sample size n increases, the distribution of the $T_{n-1}$ random variable approaches the standard normal distribution.
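This convergence can be seen numerically. The sketch below (SciPy assumed) prints upper-tail t quantiles for several df next to the corresponding z quantile, in the spirit of the comparison table that follows:

```python
# Upper-tail quantiles t_alpha for several df, next to z_alpha, showing the
# t distribution approaching the standard normal as df grows.
from scipy.stats import norm, t

for alpha in (0.10, 0.05, 0.01):
    z = norm.ppf(1 - alpha)
    ts = [round(t.ppf(1 - alpha, df), 2) for df in (10, 20, 30, 40, 60, 240)]
    print(alpha, round(z, 2), ts)
```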

Table 2, page 1093, gives the 0.90, 0.95, 0.975, 0.99, 0.995, and 0.999 quantiles for t distributions with selected df.

Examples: For df = 10,
P(T10 > 1.372) = 0.10
P(T10 < 1.372) = 0.90
P(T10 > 1.812) = 0.05
If α = 0.05, then $t_\alpha$ satisfies P(T10 > $t_{.05}$) = 0.05, and from Table 2, $t_{.05}$ = 1.812. If α = 0.05, then $t_{\alpha/2}$ = $t_{0.025}$ = 2.228.

Comparison of Normal and t Quantiles ($t_\alpha$ with indicated df):

 α      zα     df=10   df=20   df=30   df=40   df=60   df=240
 0.10   1.28   1.37    1.32    1.31    1.30    1.30    1.28
 0.05   1.65   1.81    1.72    1.70    1.68    1.67    1.65
 0.01   2.33   2.76    2.53    2.46    2.42    2.39    2.34

Confidence Interval for μ based on the t distribution

Let the sample size be n, and let α be a specified value, e.g., .05. A (1 − α)100% confidence interval for μ is given by

$$\left(\bar y - t_{\alpha/2}\,\frac{s}{\sqrt n},\ \ \bar y + t_{\alpha/2}\,\frac{s}{\sqrt n}\right)$$

This may also be written as $(\bar y - t_{\alpha/2}\, s_{\bar y},\ \bar y + t_{\alpha/2}\, s_{\bar y})$. Here, $t_{\alpha/2}$ is the $1 - \alpha/2$ quantile of the t distribution with n − 1 degrees of freedom, and $s_{\bar y} = s/\sqrt n$ is the standard error of the mean.
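A minimal sketch of this t-based interval, with made-up sample summaries (not from the text) and SciPy assumed for the t quantile:

```python
# t-based confidence interval: ybar +/- t_{alpha/2, n-1} * s / sqrt(n).
from scipy.stats import t

n, ybar, s, alpha = 16, 52.3, 8.4, 0.05   # hypothetical sample summaries
tq = t.ppf(1 - alpha / 2, df=n - 1)       # t_{0.025, 15}, about 2.131
E = tq * s / n ** 0.5                     # half-width of the interval
print((ybar - E, ybar + E))
```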

Test Procedures based on the t distribution

Hypotheses:
Case 1: H0: μ ≤ μ0 vs. Ha: μ > μ0 (right-tailed test)
Case 2: H0: μ ≥ μ0 vs. Ha: μ < μ0 (left-tailed test)
Case 3: H0: μ = μ0 vs. Ha: μ ≠ μ0 (two-tailed test)

T.S.: $t_c = \dfrac{\bar y - \mu_0}{s/\sqrt n}$

R.R.: For a Type I error probability of α:
Case 1: Reject H0 if $t \ge t_{\alpha,\,(n-1)}$
Case 2: Reject H0 if $t \le -t_{\alpha,\,(n-1)}$
Case 3: Reject H0 if $|t| \ge t_{\alpha/2,\,(n-1)}$

Level of Significance (p-value):
Case 1: p = P($T_{n-1}$ > $t_c$).
Case 2: p = P($T_{n-1}$ < $t_c$).
Case 3: p = 2 P($T_{n-1}$ > |$t_c$|).

Exercise 5.15: A massive multistate outbreak of food-borne illness was attributed to Salmonella enteritidis. Epidemiologists determined that the source of the illness was ice cream. They sampled nine production runs from the company that produced the ice cream to determine the level of Salmonella enteritidis in the ice cream. These levels (MPN/g) are as follows:

.593  .142  .329  .691  .231  .793  .519  .392  .418

Use the data to determine whether the mean level of Salmonella enteritidis in the ice cream is greater than .3 MPN/g, with α = .01.

Solution: We need to test H0: μ ≤ .3 vs. Ha: μ > .3. Because of the small sample size, we need to examine whether the data have been sampled from a normal distribution. To do this, a normal probability plot is a good tool.

From the data, ȳ = .456 and s = .2128 are computed, giving

$$t_c = \frac{\bar y - \mu_0}{s/\sqrt n} = \frac{.456 - .3}{.2128/\sqrt 9} = 2.21$$

Because this is a one-tailed test, we need $t_{\alpha,\,(n-1)}$; we look up $t_{.01}$ with df = 9 − 1 = 8. It is 2.896. Thus the rejection region is t > 2.896.

Since $t_c$ = 2.21 does not exceed 2.896, it is not in the R.R. Thus, there is insufficient evidence in the data to reject H0, i.e., to say that the mean level of Salmonella enteritidis exceeds the dangerous level of .3 MPN/g.

The p-value to be computed is P(T8 > 2.21). To calculate this exactly using the t table is not possible, since it is tabulated for only a few values of α. However, we can bound the p-value by noting that, for df = 8, 2.21 lies between 1.86 and 2.306. This gives .025 < p-value < .05, showing that the p-value is not less than our α of .01. Thus we fail to reject H0.
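For comparison, the same test can be run directly on the nine observations; the sketch below assumes SciPy 1.6 or later (where scipy.stats.ttest_1samp accepts a one-sided alternative) and reproduces the statistic, with an exact p-value in the .025 to .05 range found above:

```python
# One-sample t test for the Salmonella data: H0: mu <= 0.3 vs Ha: mu > 0.3.
import numpy as np
from scipy.stats import ttest_1samp

y = np.array([0.593, 0.142, 0.329, 0.691, 0.231, 0.793, 0.519, 0.392, 0.418])
res = ttest_1samp(y, popmean=0.3, alternative='greater')
print(res.statistic, res.pvalue)   # t_c about 2.21, p about 0.03 -> fail to reject at alpha = .01
```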

OC Curve and the Power of a Test

The probabilities of the four possible outcomes of a statistical test are:

                  Null Hypothesis True        Null Hypothesis False
 Reject H0        Type I Error (α)            Correct Decision (1 − β)
 Accept H0        Correct Decision (1 − α)    Type II Error (β)

The probabilities of Type I and Type II error are α and β, respectively.

The experimenter can control only the Type I error probability. We do this by specifying an α for an experiment (before the data values are measured). When we are testing a hypothesis about μ using α for the test, the value of the Type II error probability β depends on the actual value of μ, which is not known. This is because β is the probability of incorrectly accepting H0 when Ha is true. When Ha is true, the actual value of μ may be any value under Ha, i.e., a value not specified under H0. Since this value of μ is unknown, β cannot be calculated.

The implication of this is that when, based on a test, we find that we cannot reject H0, we will not say that we accept H0. If we were to say we accept H0 and β turned out to be large, then the probability would be large that we are committing a Type II error. Thus the correct way to state the decision is to say that we fail to reject H0. In practice, β probabilities for several choices of μ (call them β(μ)) are calculated and plotted in a graph called the OC curve. The OC curve can be used to read off the Type II error probability for a specified sample size and α.

Figure 5.12 shows the OC curve for the test of H0: μ ≤ 84, Ha: μ > 84 for a population with σ = 1.4. For example, for α = .05 and n = 10, it can be seen that β(84.8) ≈ .4, and that β(84.8) decreases as the sample size goes from 10 to 25. A conclusion that can be made about this test from the OC curve is that the Type II error probability would be < .1 for an actual μ > 84.8 when n = 25.

Figure 5.11 in the textbook (not reproduced here) shows how the Type II error probability varies with the value of μ under the alternative (denoted by $\mu_a$). Examples 5.8 and 5.10 show the calculation of β for particular values of $\mu_a$ and use the formulas given on page 241 to calculate β. These illustrations use the experiments described in Examples 5.7 and 5.9 (read pp. 238-243 for full details).

Another quantity that may be calculated for a test procedure, for a specified value of μ, is called the Power of the test and is defined as 1 − β(μ). The corresponding plot of power against a set of μ values is called the power curve. By definition, the power of a test is the probability of rejecting H0 for a specified value of μ under Ha. In practice, tests are designed to have large power for the μ values of interest so that they have small Type II error probabilities. We can relate to this idea by thinking of a test as having very good power if it has a very good chance of detecting whether a change in μ has actually occurred. This is usually done by selecting the sample size to be used for the experiment so that the desired power is achieved for a specified μ and α.

Using Type II Error Probability (β) Curves

Consider the Salmonella example again. We have n = 9 and α = .01; thus df = 8, and we estimate σ ≈ .25. We can compute the values of d for several values of $\mu_a$ and then read β for those values of d from the graph in Table 3 in the Appendix (the curves for α = .01; not reproduced here). As an example, for $\mu_a$ = .45,

$$d = \frac{\mu_a - \mu_0}{\sigma} = \frac{.45 - .3}{.25} = .6$$

Corresponding to d = .6 on the horizontal axis, using the curve for df = 8, we see that β(.45) ≈ .79. Similarly, for $\mu_a$ = .55, d = 1.0, and thus β(.55) ≈ .43. We can construct a table of β values in this way (shown on the next slide; not reproduced here).
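One way to compute β values like those read from the curves is through the noncentral t distribution; the sketch below assumes SciPy's nct distribution and the same rough guess σ ≈ .25, so the numbers are approximations rather than the chart readings themselves:

```python
# Type II error beta(mu_a) for the one-sided t test in the Salmonella example
# (n = 9, alpha = .01, df = 8):
#   beta(mu_a) = P( T' <= t_{alpha, n-1} ),
# where T' is noncentral t with df = n - 1 and
# noncentrality delta = (mu_a - mu0) / (sigma / sqrt(n)).
from scipy.stats import t, nct

mu0, sigma, n, alpha = 0.3, 0.25, 9, 0.01
t_crit = t.ppf(1 - alpha, df=n - 1)          # about 2.896
for mu_a in (0.45, 0.55):
    delta = (mu_a - mu0) / (sigma / n ** 0.5)
    beta = nct.cdf(t_crit, df=n - 1, nc=delta)
    print(mu_a, round(beta, 2))              # roughly .79 and .43
```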

Departures from Normality

When $\bar Y$ is a Normal random variable, i.e., when sampling from a Normal population with mean μ0, $(\bar Y - \mu_0)/(S/\sqrt n)$ is a $T_{n-1}$ random variable. This is a theoretical fact. In practice, however, we never sample exactly from a Normal population, so $(\bar Y - \mu_0)/(S/\sqrt n)$ will be only approximately $T_{n-1}$. How much effect can this have on the C.I.'s and tests we construct? For symmetrically distributed populations and n not too small, there is little to worry about.

For highly skewed population distributions the approximation can be terrible, especially for small n. It is recommended that one look at a boxplot, Normal plot, and/or other graphics to see whether severe skewness of the sampled population is indicated. If not, proceed to use the t distribution. If yes, one can use a nonparametric procedure or use a transformation. We will look at some of these later.

Using Confidence Intervals to Test Hypotheses

We can always look at a (1 − α)100% confidence interval and see what the result of a test would be if we were to carry out the test. For example, consider the (1 − α)100% confidence interval for μ

$$\left(\bar y - t_{\alpha/2}\,\frac{s}{\sqrt n},\ \ \bar y + t_{\alpha/2}\,\frac{s}{\sqrt n}\right)$$

Suppose that μ0 is not included in the above confidence interval because the entire interval lies above μ0, that is,

$$\bar y - t_{\alpha/2}\,\frac{s}{\sqrt n} > \mu_0$$

By rearranging this, we see that this is equivalent to $t_c$ being in the rejection region, i.e.,

$$\frac{\bar y - \mu_0}{s/\sqrt n} > t_{\alpha/2,\,(n-1)}$$

Observe carefully that this is the same rejection region as for the test of H0: μ ≤ μ0 vs. Ha: μ > μ0 at level α/2. That is, we will be rejecting H0 at level α/2 if

$$\frac{\bar y - \mu_0}{s/\sqrt n} > t_{\alpha/2,\,(n-1)}$$

This is equivalent to saying that if μ0 was not included in a 100(1 − α)% interval, then H0 will be rejected if the test is carried out at the α/2 level.

For a two-tailed test, the confidence interval should be based on the same α as the test to make this inference. In summary:

To test H0: μ ≤ μ0 vs. Ha: μ > μ0 at level α/2, use a 100(1 − α)% confidence interval (reject H0 if μ0 falls below the interval).
To test H0: μ ≥ μ0 vs. Ha: μ < μ0 at level α/2, use a 100(1 − α)% confidence interval (reject H0 if μ0 falls above the interval).
To test H0: μ = μ0 vs. Ha: μ ≠ μ0 at level α, use a 100(1 − α)% confidence interval (reject H0 if μ0 falls outside the interval).
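As an illustrative check of this correspondence, the sketch below (made-up data, SciPy assumed) verifies that μ0 falls outside a 95% t interval exactly when the two-tailed t test rejects H0: μ = μ0 at α = .05:

```python
# mu0 outside the 100(1-alpha)% t interval  <=>  two-tailed t test rejects at alpha.
import numpy as np
from scipy.stats import t, ttest_1samp

y = np.array([4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.7, 5.1])   # hypothetical sample
mu0, alpha = 4.0, 0.05
n, ybar, s = len(y), y.mean(), y.std(ddof=1)

E = t.ppf(1 - alpha / 2, df=n - 1) * s / np.sqrt(n)       # half-width of the interval
outside_ci = not (ybar - E <= mu0 <= ybar + E)
rejects = ttest_1samp(y, popmean=mu0).pvalue < alpha
print(outside_ci, rejects)                                 # the two answers agree
```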