Frequentist Inference

The topics of the next three sections are useful applications of the Central Limit Theorem (CLT). Without knowing anything about the underlying distribution of a sequence of random variables $\{X_i\}$, the CLT gives a statement about the sample means for large sample sizes. For example, if $Y$ is a $N(0, 1)$ random variable, and the $\{X_i\}$ are distributed iid with mean $\mu$ and variance $\sigma^2$, then

$$P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \in A\right) \approx P(Y \in A)$$

In particular, if we want an interval in which $Y$ lands with probability 0.95, we look online or in a book for a z table, which will tell us that for a $N(0, 1)$ random variable $Y$,

$$P(Y \in (-1.96, 1.96)) = P(-1.96 \le Y \le 1.96) = 0.95$$

Since $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is nearly $N(0, 1)$ distributed, this means

$$P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) \approx 0.95$$

From the above statement we can make statements about experiments in order to quantify confidence and accept or reject hypotheses.

Confidence Intervals

Suppose that during the presidential election, we were interested in the proportion $p$ of the population that preferred Hillary Clinton to Donald Trump. It wouldn't be feasible to call every single person in the country and write down who they prefer. Instead, we can take a bunch of samples, $X_1, \ldots, X_n$, where

$$X_i = \begin{cases} 1 & \text{if person } i \text{ prefers Hillary} \\ 0 & \text{otherwise} \end{cases}$$
Then the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the proportion of our sample that prefers Hillary. Let $p$ be the true proportion that prefers Hillary ($p$ is not known). Note that $E[\bar{X}] = p$, since each $X_i$ is 1 with probability $p$ and 0 with probability $1 - p$. Then by the CLT,

$$\frac{\bar{X} - p}{\sigma/\sqrt{n}} \approx N(0, 1)$$

Since we don't know the true value of $\sigma$, we estimate it using the sample variance, defined

$$S^2 \doteq \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

This is a consistent estimator for $\sigma^2$, so for large $n$, the probability that it differs greatly from the true variance $\sigma^2$ is small. Hence we can replace $\sigma$ in our expression with $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$. Since $\frac{\bar{X} - p}{S/\sqrt{n}}$ is approximately $N(0, 1)$ distributed, we have

$$P\left(-1.96 \le \frac{\bar{X} - p}{S/\sqrt{n}} \le 1.96\right) \approx 0.95$$

Rearranging the expression for $p$, we have

$$P\left(-1.96\frac{S}{\sqrt{n}} \le \bar{X} - p \le 1.96\frac{S}{\sqrt{n}}\right) \approx 0.95$$

$$P\left(-1.96\frac{S}{\sqrt{n}} - \bar{X} \le -p \le 1.96\frac{S}{\sqrt{n}} - \bar{X}\right) \approx 0.95$$

$$P\left(\bar{X} + 1.96\frac{S}{\sqrt{n}} \ge p \ge \bar{X} - 1.96\frac{S}{\sqrt{n}}\right) \approx 0.95$$

Even though we do not know the true value of $p$, we can conclude from the above expression that with probability approximately 0.95, $p$ is contained in the interval

$$\left(\bar{X} - 1.96\frac{S}{\sqrt{n}},\ \bar{X} + 1.96\frac{S}{\sqrt{n}}\right)$$

This is called a 95% confidence interval for the parameter $p$. The approximation improves as $n$ grows; a rule of thumb is to make sure $n > 30$ before using it.

On the Seeing Theory website, there is a confidence interval visualization. Try selecting the Uniform distribution to sample from. Choosing a sample size of $n = 30$ will cause batches of 30 samples to be picked, their sample means computed, and their resulting confidence intervals displayed on the right. Depending on the confidence level picked (the above example uses $\alpha = 0.05$, so $1 - \alpha = 0.95$), each generated confidence interval will contain the true mean $\mu$ with probability approximately $1 - \alpha$.
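The interval derived above can be computed directly from a sample. The following is a minimal sketch using only the Python standard library. The function names `confidence_interval` and `coverage`, the simulated poll, and the value `assumed_p = 0.52` are illustrative assumptions, not part of the original text. The second function imitates the visualization's experiment: it draws many batches of 30 Uniform(0, 1) samples and reports how often the interval contains the true mean 0.5.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """CLT-based interval X_bar +/- z * S / sqrt(n) for the mean of `samples`."""
    n = len(samples)
    x_bar = mean(samples)
    s = stdev(samples)  # sample standard deviation (1/(n-1) convention)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for level=0.95
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


def coverage(n=30, trials=2000, level=0.95, seed=0):
    """Fraction of intervals built from Uniform(0, 1) batches that contain mu = 0.5."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        batch = [rng.random() for _ in range(n)]
        low, high = confidence_interval(batch, level)
        hits += low <= 0.5 <= high
    return hits / trials


# Hypothetical poll: 1 means the respondent prefers Hillary, 0 otherwise.
rng = random.Random(1)
assumed_p = 0.52  # used only to generate fake data; unknown in a real poll
poll = [1 if rng.random() < assumed_p else 0 for _ in range(1000)]

low, high = confidence_interval(poll)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
print(f"Empirical coverage for n=30: {coverage():.3f}")
```

The empirical coverage should land near, but not exactly at, 0.95, since both the CLT and the substitution of $S$ for $\sigma$ are approximations at $n = 30$.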
Hypothesis Testing

Let's return to the example of determining voter preference in the 2016 presidential election. Suppose we suspect that the proportion of voters who prefer Hillary Clinton is greater than $\frac{1}{2}$, and that we take $n$ samples, denoted $\{X_i\}_{i=1}^{n}$, from the U.S. population. Based on these samples, can we support or reject our hypothesis that Hillary Clinton is more popular? And how confident are we in our conclusion? Hypothesis testing is the perfect tool to help answer these questions.

Constructing a Test

A hypothesis in this context is a statement about a parameter of interest. In the presidential election example, the parameter of interest was $p$, the proportion of the population who supported Hillary Clinton. A hypothesis could then be that $p > 0.5$, i.e. that more than half of the population supports Hillary. There are four major components to a hypothesis test.

1. The alternative hypothesis, denoted $H_a$, is a claim we would like to support. In our previous example, the alternative hypothesis was $p > 0.5$.
2. The null hypothesis, denoted $H_0$, is the opposite of the alternative hypothesis. In this case, the null hypothesis is $p \le 0.5$, i.e. that at most half of the population supports Hillary.
3. The test statistic is a function of the sample observations. Based on the test statistic, we will either accept or reject the null hypothesis. In the previous example, the test statistic was the sample mean $\bar{X}$. The sample mean is the test statistic for many hypothesis tests.
4. The rejection region is a subset of our sample space $\Omega$ that determines whether or not to reject the null hypothesis. If the test statistic falls in the rejection region, then we reject the null hypothesis. Otherwise, we accept it. In the presidential election example, the rejection region would be

$$RR: \{(x_1, \ldots, x_n) : \bar{X} > k\}$$

This notation means we reject if $\bar{X}$ falls in the interval $(k, \infty)$, where $k$ is some number which we must determine. $k$ is determined by the Type I error, which is defined in the next section.
Once $k$ is computed, we reject or accept the null hypothesis depending on the value of our test statistic, and our test is complete.
Types of Error

There are two fundamental types of errors in hypothesis testing. They are called Type I and Type II errors.

Definition 0.0.16. A Type I error is made when we reject $H_0$ when it is in fact true. The probability of Type I error is typically denoted as $\alpha$. In other words, $\alpha$ is the probability of a false positive.

Definition 0.0.17. A Type II error is made when we accept $H_0$ when it is in fact false. The probability of Type II error is typically denoted as $\beta$. In other words, $\beta$ is the probability of a false negative.

In the context of hypothesis testing, $\alpha$ will determine the rejection region. If we restrict the probability of a false positive to be at most 0.05, then we have

$$P(\bar{X} \in RR \mid H_0) \le 0.05$$

i.e. our test statistic falls in the rejection region (meaning we reject $H_0$), given that $H_0$ is true, with probability at most 0.05. Continuing along our example of the presidential election, the rejection region was of the form $\bar{X} > k$, and the null hypothesis was that $p \le 0.5$. Our above expression then becomes

$$P(\bar{X} > k \mid p \le 0.5) \le 0.05$$

If $n > 30$, we can apply the CLT to say,

$$P\left(\frac{\bar{X} - p}{S/\sqrt{n}} > \frac{k - p}{S/\sqrt{n}} \,\middle|\, p \le 0.5\right) \approx P\left(Y > \frac{k - p}{S/\sqrt{n}} \,\middle|\, p \le 0.5\right)$$

where $Y$ is a $N(0, 1)$ random variable. Since $p \le 0.5$ implies $\frac{k - p}{S/\sqrt{n}} \ge \frac{k - 0.5}{S/\sqrt{n}}$, we must also have

$$Y > \frac{k - p}{S/\sqrt{n}} \implies Y > \frac{k - 0.5}{S/\sqrt{n}}$$

Hence,

$$P\left(Y > \frac{k - p}{S/\sqrt{n}} \,\middle|\, p \le 0.5\right) \le P\left(Y > \frac{k - 0.5}{S/\sqrt{n}}\right)$$

So if we bound the probability on the right side of the inequality by 0.05, then we also bound the probability on the left (the Type I error, $\alpha$) by 0.05. Since $Y$ is distributed $N(0, 1)$, we can look up a z table to find that $z_{0.05} \approx 1.64$, so

$$P(Y > 1.64) = P(Y < -1.64) = 0.05$$
Letting $\frac{k - 0.5}{S/\sqrt{n}} = 1.64$, we can solve for $k$ to determine our rejection region.

$$k = 0.5 + 1.64\frac{S}{\sqrt{n}}$$

Since our rejection region was of the form $\bar{X} > k$, we simply check whether $\bar{X} > 0.5 + 1.64\frac{S}{\sqrt{n}}$. If this is true, then we reject the null, and conclude that more than half the population favors Hillary Clinton. Since we set $\alpha = 0.05$, this procedure rejects a true null hypothesis with probability at most 0.05.

In the above example, we determined the rejection region by plugging in 0.5 for $p$, even though the null hypothesis was $p \le 0.5$. It is almost as though our null hypothesis was $H_0: p = 0.5$ instead of $H_0: p \le 0.5$. In general, we can simplify $H_0$ and assume the border case ($p = 0.5$ in this case) when we are determining the rejection region.
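The test constructed above reduces to computing the threshold $k$ and comparing it with $\bar{X}$. Here is a minimal sketch under stated assumptions: the function name `one_sided_test` and the two polls of 1000 respondents are hypothetical, and the critical value comes from the standard library's `NormalDist` rather than a printed z table (it gives roughly 1.645 where the text rounds to 1.64).

```python
import math
from statistics import NormalDist, mean, stdev


def one_sided_test(samples, p0=0.5, alpha=0.05):
    """Test H0: p <= p0 against Ha: p > p0 with rejection region X_bar > k."""
    n = len(samples)
    x_bar = mean(samples)
    s = stdev(samples)  # sample standard deviation S
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    k = p0 + z_alpha * s / math.sqrt(n)  # border case p = p0, as in the text
    return x_bar, k, x_bar > k


# Hypothetical polls of 1000 people each (1 = prefers Hillary, 0 = otherwise).
strong_poll = [1] * 560 + [0] * 440
close_poll = [1] * 510 + [0] * 490

for name, poll in [("strong", strong_poll), ("close", close_poll)]:
    x_bar, k, reject = one_sided_test(poll)
    print(f"{name}: X_bar={x_bar:.3f}, k={k:.4f}, reject H0: {reject}")
```

With these made-up counts, a sample proportion of 0.56 clears the threshold while 0.51 does not, even though both exceed 0.5. The threshold sits above 0.5 by a margin that shrinks like $1/\sqrt{n}$.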
p-values

As we saw in the previous section, a selected $\alpha$ determined the rejection region so that the probability of a false positive was at most $\alpha$. Now suppose we observe some test statistic, say, the sample proportion of voters $\bar{X}$ who prefer Hillary Clinton. We then ask the following question. Given $\bar{X}$, what is the smallest value of $\alpha$ such that we still reject the null hypothesis? This leads us to the following definition.

Definition 0.0.18. The p-value, denoted $p$, is defined

$$p = \min\{\alpha \in (0, 1) : \text{Reject } H_0 \text{ using an } \alpha \text{ level test}\}$$

i.e. the smallest value of $\alpha$ for which we still reject the null hypothesis.

This definition isn't that useful for computing p-values. In fact, there is a more intuitive way of thinking about them. Suppose we observe some sample mean $\bar{X}_1$. Now suppose we draw a new sample mean, $\bar{X}_2$. The p-value is just the probability that our new sample mean is more extreme than the one we first observed, assuming the null hypothesis is true. By "extreme" we mean more different from our null hypothesis.

Below we go through an example which verifies that the intuitive definition given above agrees with Definition 0.0.18.

Example 0.0.10. Suppose that we sampled $n$ people and asked which candidate they preferred. As we did before, we can represent each person as an indicator function,

$$X_i = \begin{cases} 1 & \text{if person } i \text{ prefers Hillary} \\ 0 & \text{otherwise} \end{cases}$$

Then $\bar{X}$ is the proportion of the sample that prefers Hillary. After taking the $n$ samples, suppose we observe that $\bar{X} = 0.7$. If we were to set up a hypothesis test, our hypotheses, test statistic, and rejection region would be

$H_0$: $q \le 0.5$
$H_a$: $q > 0.5$
Test statistic: $\bar{X}$
RR: $\{(x_1, \ldots, x_n) : \bar{X} > k\}$

where $q$ is the true proportion of the entire U.S. population that favors Hillary. Using the intuitive definition, the p-value is the probability that we observe something more extreme than 0.7. Since the null hypothesis is that $q \le 0.5$, "more extreme" in this case means "bigger than 0.7". So the p-value is the probability that, given a new sample, we observe that the new $\bar{X}$ is greater than 0.7, assuming the null, i.e. that $q \le 0.5$. Normalizing $\bar{X}$ (using the border case $q = 0.5$), we have

$$P(\bar{X} > 0.7 \mid H_0) = P\left(\frac{\bar{X} - 0.5}{S/\sqrt{n}} > \frac{0.7 - 0.5}{S/\sqrt{n}} \,\middle|\, H_0\right) \approx P\left(Y > \frac{0.7 - 0.5}{S/\sqrt{n}}\right) \doteq p \tag{1}$$

where $Y \sim N(0, 1)$. We would then compute the value $z_p = \frac{0.7 - 0.5}{S/\sqrt{n}}$ by plugging in the sample standard deviation, $S$, and the number of samples we took, $n$. We would then look up a z table and find the probability corresponding to $z_p$, denoted $p$ (this is our p-value).

We now claim that this $p$ is equal to the smallest $\alpha$ for which we reject the null hypothesis, i.e. that our intuitive definition of a p-value agrees with Definition 0.0.18. To show that

$$p = \min\{\alpha \in (0, 1) : \text{Reject } H_0 \text{ using an } \alpha \text{ level test}\},$$

we need to show that for any $\alpha < p$, we accept the null hypothesis. We also need to show that for any $\alpha \ge p$, we reject the null hypothesis.

Case 1: Suppose $\alpha < p$. We need to show that the test statistic $\bar{X} = 0.7$ falls in the acceptance region determined by $\alpha$. Using a z table, we could find $z_\alpha$ such that

$$\alpha = P(Y > z_\alpha) \approx P\left(\frac{\bar{X} - 0.5}{S/\sqrt{n}} > z_\alpha \,\middle|\, H_0\right) = P\left(\bar{X} > z_\alpha\frac{S}{\sqrt{n}} + 0.5 \,\middle|\, H_0\right)$$

Since the RHS of the above expression is the probability of Type I error, the rejection region is determined by

$$\bar{X} > k_\alpha \doteq z_\alpha\frac{S}{\sqrt{n}} + 0.5$$

Since $\alpha < p$, the corresponding $z_p$ such that $p = P(Y > z_p)$ satisfies $z_p < z_\alpha$. By the RHS of expression (1),

$$p = P\left(Y > \frac{0.7 - 0.5}{S/\sqrt{n}}\right)$$

which implies

$$z_p = \frac{0.7 - 0.5}{S/\sqrt{n}} \implies z_p\frac{S}{\sqrt{n}} + 0.5 = 0.7.$$

This implies that

$$0.7 = z_p\frac{S}{\sqrt{n}} + 0.5 < z_\alpha\frac{S}{\sqrt{n}} + 0.5 = k_\alpha$$

Therefore $\bar{X} = 0.7 < k_\alpha$, so $\bar{X} = 0.7$ is in the acceptance region determined by $\alpha$. Hence, we accept the null hypothesis for any $\alpha < p$.

Case 2: Suppose $\alpha \ge p$. We need to show that the test statistic $\bar{X} = 0.7$ falls in the rejection region determined by $\alpha$. By reasoning similar to that in Case 1, we would have $z_\alpha \le z_p$. This implies

$$k_\alpha \doteq z_\alpha\frac{S}{\sqrt{n}} + 0.5 \le z_p\frac{S}{\sqrt{n}} + 0.5 = 0.7$$

Hence $\bar{X} = 0.7 \ge k_\alpha$, so $\bar{X} = 0.7$ is in the rejection region determined by $\alpha$. (When $\alpha = p$ exactly, $\bar{X}$ sits on the boundary $k_\alpha$; because $Y$ is continuous, the boundary has probability zero, so including it in the rejection region does not change the Type I error.) Hence, we reject the null hypothesis for any $\alpha \ge p$.
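The two cases above can be checked numerically. The sketch below assumes a sample size of $n = 40$, which the example does not specify, and uses the identity $S^2 = \frac{n}{n-1}\bar{X}(1 - \bar{X})$, which holds for 0/1 data. The helper names `p_value` and `rejects` are illustrative. It computes $p$ from $z_p$, then confirms that a level slightly above $p$ rejects and a level slightly below $p$ does not.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # plays the role of the z table


def p_value(x_bar, s, n, q0=0.5):
    """P(Y > z_p) with z_p = (x_bar - q0) / (s / sqrt(n))."""
    z_p = (x_bar - q0) / (s / math.sqrt(n))
    return 1 - STD_NORMAL.cdf(z_p)


def rejects(x_bar, s, n, alpha, q0=0.5):
    """True if x_bar exceeds k_alpha = z_alpha * s / sqrt(n) + q0."""
    k_alpha = q0 + STD_NORMAL.inv_cdf(1 - alpha) * s / math.sqrt(n)
    return x_bar > k_alpha


n = 40       # assumed sample size; not given in the example
x_bar = 0.7  # observed sample proportion from the example
s = math.sqrt(n / (n - 1) * x_bar * (1 - x_bar))  # S for 0/1 data

p = p_value(x_bar, s, n)
print(f"p-value: {p:.4f}")
print("alpha slightly above p rejects:", rejects(x_bar, s, n, alpha=1.01 * p))
print("alpha slightly below p rejects:", rejects(x_bar, s, n, alpha=0.99 * p))
```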
Example 0.0.10 above justifies the intuitive definition of p-values, which gives an easy way to compute them. Given some observation of our test statistic $\bar{X}$, we compute the p-value by calculating the probability of seeing something more different or "extreme" than our observed $\bar{X}$, assuming $H_0$ is true. By the argument in Example 0.0.10, this value is the same as the smallest $\alpha$ level for which we reject $H_0$.
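The "more extreme" reading of the p-value can also be approximated by simulation rather than a z table. The following sketch makes several assumptions: a sample of $n = 40$ people of whom 28 prefer Hillary (so $\bar{X} = 0.7$), new samples drawn under the border case $q = 0.5$, and a hypothetical function name `simulated_p_value`. It counts how often a fresh sample is strictly more extreme than the observed one. Comparing integer counts avoids floating-point ties at exactly 0.7.

```python
import random


def simulated_p_value(observed_count, n, q0=0.5, trials=20000, seed=0):
    """Estimate P(new count > observed count) when each answer is 1 w.p. q0."""
    rng = random.Random(seed)
    more_extreme = 0
    for _ in range(trials):
        # One new sample of n people under the border case of H0.
        new_count = sum(rng.random() < q0 for _ in range(n))
        more_extreme += new_count > observed_count
    return more_extreme / trials


# Assumed data: 28 of 40 respondents prefer Hillary, i.e. X_bar = 0.7.
estimate = simulated_p_value(observed_count=28, n=40)
print(f"Simulated p-value: {estimate:.4f}")
```

The simulated value and the z-table value need not agree exactly, since the table relies on the normal approximation and on $S$ estimated from the observed sample, but they should be of similar size.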