1 Statistics for Applications. Chapter 5: Parametric hypothesis testing
2 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10-mile race that takes place every year in D.C. In 2009, the average running time was 103.5 minutes. Were runners faster in 2012? To answer this question, select n runners from the 2012 race at random and denote by X_1, ..., X_n their running times.
3 Cherry Blossom run (2) We can see from past data that the running time has a Gaussian distribution. The variance was 373.
4 Cherry Blossom run (3) We are given i.i.d. r.v. X_1, ..., X_n and we want to know if X_1 ~ N(103.5, 373). This is a hypothesis testing problem. There are many ways this could be false: 1. IE[X_1] ≠ 103.5; 2. var[X_1] ≠ 373; 3. X_1 may not even be Gaussian. We are interested in a very specific question: is IE[X_1] < 103.5?
5 Cherry Blossom run (4) We make the following assumptions: 1. var[X_1] = 373 (the variance is the same between 2009 and 2012); 2. X_1 is Gaussian. The only thing that we did not fix is IE[X_1] = µ. Now we want to test (only): is µ = 103.5 or is µ < 103.5? By making modeling assumptions, we have reduced the number of ways the hypothesis X_1 ~ N(103.5, 373) may be rejected. The only way it can be rejected is if X_1 ~ N(µ, 373) for some µ < 103.5. We compare an expected value to a fixed reference number (103.5).
6 Cherry Blossom run (5) Simple heuristic: if X̄_n < 103.5, then conclude µ < 103.5. This could go wrong if I randomly pick only fast runners in my sample X_1, ..., X_n. Better heuristic: if X̄_n < 103.5 − (something that → 0 as n → ∞), then conclude µ < 103.5. To make this intuition precise, we need to take the size of the random fluctuations of X̄_n into account!
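As a sanity check on the simple heuristic, a short simulation (hypothetical setup: 1,000 samples of n = 50 runners drawn under H0, i.e. from N(103.5, 373)) shows that X̄_n falls below 103.5 about half the time even when µ = 103.5, so X̄_n < 103.5 alone is not evidence:

```python
import random

random.seed(0)

MU0, VAR, N, REPS = 103.5, 373.0, 50, 1000
sigma = VAR ** 0.5

# Draw REPS samples of size N under H0: X_i ~ N(103.5, 373),
# and record how often the sample mean falls below 103.5.
below = 0
for _ in range(REPS):
    xbar = sum(random.gauss(MU0, sigma) for _ in range(N)) / N
    below += xbar < MU0

frac_below = below / REPS
print(frac_below)  # close to 0.5: the naive rule has type 1 error near 50%
```

This is why the "better heuristic" subtracts a term that shrinks with n: it calibrates the rule against these fluctuations.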
7 Clinical trials (1) Pharmaceutical companies use hypothesis testing to test if a new drug is effective. To do so, they administer the drug to a group of patients (test group) and a placebo to another group (control group). Assume that the drug is a cough syrup. Let µ_control denote the expected number of expectorations per hour after a patient has used the placebo, and let µ_drug denote the expected number of expectorations per hour after a patient has used the syrup. We want to know if µ_drug < µ_control. We compare two expected values; no reference number.
8 Clinical trials (2) Let X_1, ..., X_{n_drug} denote n_drug i.i.d. r.v. with distribution Poiss(µ_drug), and let Y_1, ..., Y_{n_control} denote n_control i.i.d. r.v. with distribution Poiss(µ_control). We want to test if µ_drug < µ_control. Heuristic: if X̄_drug < X̄_control − (something that → 0 as n_drug, n_control → ∞), then conclude that µ_drug < µ_control.
9 Heuristics (1) Example 1: A coin is tossed 80 times, and Heads are obtained 54 times. Can we conclude that the coin is significantly unfair? n = 80, X_1, ..., X_n iid ~ Ber(p); X̄_n = 54/80 = 0.675. If it were true that p = 0.5, then by the CLT and Slutsky's theorem, √n (X̄_n − 0.5)/√(0.5(1 − 0.5)) ≈ N(0, 1). Our data gives √n (X̄_n − 0.5)/√(0.5(1 − 0.5)) ≈ 3.13. Conclusion: it seems quite reasonable to reject the hypothesis p = 0.5.
10 Heuristics (2) Example 2: A coin is tossed 30 times, and Heads are obtained 13 times. Can we conclude that the coin is significantly unfair? n = 30, X_1, ..., X_n iid ~ Ber(p); X̄_n = 13/30 ≈ 0.43. If it were true that p = 0.5, then by the CLT and Slutsky's theorem, √n (X̄_n − 0.5)/√(0.5(1 − 0.5)) ≈ N(0, 1). Our data gives √n |X̄_n − 0.5|/√(0.5(1 − 0.5)) ≈ 0.73. The number 0.73 is a plausible realization of a random variable Z ~ N(0, 1). Conclusion: our data does not suggest that the coin is unfair.
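The two statistics above can be checked directly; a minimal sketch using only the standard library:

```python
from math import sqrt

def coin_stat(heads: int, n: int, p0: float = 0.5) -> float:
    """CLT/Slutsky statistic sqrt(n) * (p_hat - p0) / sqrt(p0 * (1 - p0))."""
    p_hat = heads / n
    return sqrt(n) * (p_hat - p0) / sqrt(p0 * (1.0 - p0))

t1 = coin_stat(54, 80)  # Example 1
t2 = coin_stat(13, 30)  # Example 2 (negative: fewer heads than expected)
print(round(t1, 2), round(t2, 2))  # 3.13 -0.73
```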
11 Statistical formulation (1) Consider a sample X_1, ..., X_n of i.i.d. random variables and a statistical model (E, (IP_θ)_{θ∈Θ}). Let Θ_0 and Θ_1 be disjoint subsets of Θ. Consider the two hypotheses: H_0: θ ∈ Θ_0 vs. H_1: θ ∈ Θ_1. H_0 is the null hypothesis, H_1 is the alternative hypothesis. If we believe that the true θ is either in Θ_0 or in Θ_1, we may want to test H_0 against H_1. We want to decide whether to reject H_0 (look for evidence against H_0 in the data).
12 Statistical formulation (2) H_0 and H_1 do not play a symmetric role: the data is only used to try to disprove H_0. In particular, lack of evidence does not mean that H_0 is true ("innocent until proven guilty"). A test is a statistic ψ ∈ {0, 1} such that: if ψ = 0, H_0 is not rejected; if ψ = 1, H_0 is rejected. Coin example: H_0: p = 1/2 vs. H_1: p ≠ 1/2. ψ = 1I{√n |X̄_n − 0.5|/√(0.5(1 − 0.5)) > C}, for some C > 0. How to choose the threshold C?
13 Statistical formulation (3) Rejection region of a test ψ: R_ψ = {x ∈ E^n : ψ(x) = 1}. Type 1 error of a test ψ (rejecting H_0 when it is actually true): α_ψ: Θ_0 → IR, θ ↦ IP_θ[ψ = 1]. Type 2 error of a test ψ (not rejecting H_0 although H_1 is actually true): β_ψ: Θ_1 → IR, θ ↦ IP_θ[ψ = 0]. Power of a test ψ: π_ψ = inf_{θ∈Θ_1} (1 − β_ψ(θ)).
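For a concrete instance of these definitions (an illustration, not from the slides), consider the one-sided Gaussian test ψ = 1I{√n X̄_n > q_α} for H_0: µ = 0 vs. H_1: µ > 0 with known σ = 1. Then the type 1 error is 1 − Φ(q_α) and the power at an alternative µ is 1 − Φ(q_α − √n µ):

```python
from math import erf, sqrt

def phi(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

q = 1.645        # (1 - alpha)-quantile of N(0, 1) for alpha = 5% (table value)
n, mu = 25, 0.5  # hypothetical sample size and alternative mean

type1 = 1.0 - phi(q)                 # type 1 error at mu = 0
power = 1.0 - phi(q - sqrt(n) * mu)  # power at mu = 0.5
print(round(type1, 3), round(power, 3))  # 0.05 0.804
```

Note the trade-off: lowering the type 1 error (raising q) necessarily lowers the power.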
14 Statistical formulation (4) A test ψ has level α if α_ψ(θ) ≤ α for all θ ∈ Θ_0. A test ψ has asymptotic level α if lim_{n→∞} α_ψ(θ) ≤ α for all θ ∈ Θ_0. In general, a test has the form ψ = 1I{T_n > c}, for some statistic T_n and threshold c ∈ IR. T_n is called the test statistic. The rejection region is R_ψ = {T_n > c}.
15 Example (1) Let X_1, ..., X_n iid ~ Ber(p), for some unknown p ∈ (0, 1). We want to test: H_0: p = 1/2 vs. H_1: p ≠ 1/2 with asymptotic level α ∈ (0, 1). Let T_n = √n |p̂_n − 0.5|/√(0.5(1 − 0.5)), where p̂_n is the MLE. If H_0 is true, then by the CLT and Slutsky's theorem, IP[T_n > q_{α/2}] → α as n → ∞, where q_{α/2} is the (1 − α/2)-quantile of N(0, 1). Let ψ_α = 1I{T_n > q_{α/2}}.
16 Example (2) Coming back to the two previous coin examples: for α = 5%, q_{α/2} = 1.96, so: in Example 1, H_0 is rejected at the asymptotic level 5% by the test ψ_{5%}; in Example 2, H_0 is not rejected at the asymptotic level 5% by the test ψ_{5%}. Question: in Example 1, for what level α would ψ_α not reject H_0? And in Example 2, at which level α would ψ_α reject H_0?
17 p-value Definition: The (asymptotic) p-value of a test ψ_α is the smallest (asymptotic) level α at which ψ_α rejects H_0. It is random: it depends on the sample. Golden rule: p-value ≤ α if and only if H_0 is rejected by ψ_α at the (asymptotic) level α. The smaller the p-value, the more confidently one can reject H_0. Example 1: p-value = IP[|Z| > 3.13] ≈ 0.002. Example 2: p-value = IP[|Z| > 0.73] ≈ 0.47.
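These p-values follow from the standard normal CDF; a sketch (which also answers the question on the previous slide, since the smallest level at which each example flips its decision is exactly its p-value):

```python
from math import erf, sqrt

def phi(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def two_sided_p(t: float) -> float:
    """p-value IP[|Z| > |t|] for Z ~ N(0, 1)."""
    return 2.0 * (1.0 - phi(abs(t)))

p1 = two_sided_p(3.13)  # Example 1 (80 tosses, 54 heads)
p2 = two_sided_p(0.73)  # Example 2 (30 tosses, 13 heads)

# Golden rule: reject H0 at level alpha iff p-value <= alpha.
print(round(p1, 3), p1 <= 0.05)  # tiny p-value: reject at 5%
print(round(p2, 3), p2 <= 0.05)  # large p-value: do not reject at 5%
```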
18 Neyman-Pearson's paradigm Idea: for given hypotheses, among all tests of level/asymptotic level α, is it possible to find one that has maximal power? Example: the trivial test ψ = 0 that never rejects H_0 has a perfect level (α = 0) but poor power (π_ψ = 0). Neyman-Pearson's theory provides the most powerful tests with given level. In 18.650, we only study several cases.
19 The χ² distributions Definition: For a positive integer d, the χ² (pronounced "kai-squared") distribution with d degrees of freedom is the law of the random variable Z_1² + Z_2² + ... + Z_d², where Z_1, ..., Z_d iid ~ N(0, 1). Examples: If Z ~ N_d(0, I_d), then ‖Z‖_2² ~ χ²_d. χ²_2 = Exp(1/2). Recall that the sample variance is given by S_n = (1/n) Σ_{i=1}^n (X_i − X̄_n)² = (1/n) Σ_{i=1}^n X_i² − (X̄_n)². Cochran's theorem implies that for X_1, ..., X_n iid ~ N(µ, σ²), if S_n is the sample variance, then nS_n/σ² ~ χ²_{n−1}.
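A quick Monte Carlo sanity check of the definition (a sketch, not from the slides): a χ²_d variable has mean d and variance 2d, both of which the simulated sums of squared normals should reproduce:

```python
import random

random.seed(1)

def chi2_sample(d: int) -> float:
    """One draw of Z_1^2 + ... + Z_d^2 with Z_i iid N(0, 1)."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(d))

d, reps = 5, 20000
draws = [chi2_sample(d) for _ in range(reps)]
mean = sum(draws) / reps
var = sum((x - mean) ** 2 for x in draws) / reps
print(round(mean, 1), round(var, 1))  # close to d = 5 and 2d = 10
```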
20 Student's T distributions Definition: For a positive integer d, the Student's T distribution with d degrees of freedom (denoted by t_d) is the law of the random variable Z/√(V/d), where Z ~ N(0, 1), V ~ χ²_d and Z ⊥ V (Z is independent of V). Example: Cochran's theorem implies that for X_1, ..., X_n iid ~ N(µ, σ²), if S_n is the sample variance, then √(n−1) (X̄_n − µ)/√S_n ~ t_{n−1}.
21 Wald's test (1) Consider an i.i.d. sample X_1, ..., X_n with statistical model (E, (IP_θ)_{θ∈Θ}), where Θ ⊆ IR^d (d ≥ 1), and let θ_0 ∈ Θ be fixed and given. Consider the following hypotheses: H_0: θ = θ_0 vs. H_1: θ ≠ θ_0. Let θ̂_n^{MLE} be the MLE. Assume the MLE technical conditions are satisfied. If H_0 is true, then √n I(θ̂_n^{MLE})^{1/2} (θ̂_n^{MLE} − θ_0) →(d) N_d(0, I_d) w.r.t. IP_{θ_0} as n → ∞.
22 Wald's test (2) Hence, T_n := n (θ̂_n^{MLE} − θ_0)ᵀ I(θ̂_n^{MLE}) (θ̂_n^{MLE} − θ_0) →(d) χ²_d w.r.t. IP_{θ_0} as n → ∞. Wald's test with asymptotic level α ∈ (0, 1): ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_d (see tables). Remark: Wald's test is also valid if H_1 has the form θ > θ_0, θ < θ_0, or θ ≠ θ_0.
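A one-dimensional sketch (Bernoulli, where the Fisher information is I(p) = 1/(p(1 − p))), applied to the coin of Example 1; 3.841 is the table value of the 95% quantile of χ²_1:

```python
def wald_bernoulli(heads: int, n: int, p0: float) -> float:
    """Wald statistic n * (p_hat - p0)^2 * I(p_hat) with I(p) = 1 / (p(1-p))."""
    p_hat = heads / n
    fisher = 1.0 / (p_hat * (1.0 - p_hat))
    return n * (p_hat - p0) ** 2 * fisher

q95_chi2_1 = 3.841  # (1 - alpha)-quantile of chi^2_1 for alpha = 5% (table value)

t = wald_bernoulli(54, 80, 0.5)
print(round(t, 2), t > q95_chi2_1)  # statistic ~11.17: reject H0 at 5%
```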
23 Likelihood ratio test (1) Consider an i.i.d. sample X_1, ..., X_n with statistical model (E, (IP_θ)_{θ∈Θ}), where Θ ⊆ IR^d (d ≥ 1). Suppose the null hypothesis has the form H_0: (θ_{r+1}, ..., θ_d) = (θ_{r+1}^{(0)}, ..., θ_d^{(0)}), for some fixed and given numbers θ_{r+1}^{(0)}, ..., θ_d^{(0)}. Let θ̂_n = argmax_{θ∈Θ} ℓ_n(θ) (MLE) and θ̂_n^c = argmax_{θ∈Θ_0} ℓ_n(θ) ("constrained MLE").
24 Likelihood ratio test (2) Test statistic: T_n = 2(ℓ_n(θ̂_n) − ℓ_n(θ̂_n^c)). Theorem: Assume H_0 is true and the MLE technical conditions are satisfied. Then T_n →(d) χ²_{d−r} w.r.t. IP_θ as n → ∞. Likelihood ratio test with asymptotic level α ∈ (0, 1): ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_{d−r} (see tables).
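A one-parameter sketch (Bernoulli with d = 1, r = 0, so the limit is χ²_1), again on the Example 1 coin; the log-likelihood is ℓ_n(p) = k log p + (n − k) log(1 − p) and the constrained MLE under H_0: p = p_0 is p_0 itself:

```python
from math import log

def loglik(p: float, heads: int, n: int) -> float:
    """Bernoulli log-likelihood ell_n(p) = k log p + (n - k) log(1 - p)."""
    return heads * log(p) + (n - heads) * log(1.0 - p)

heads, n, p0 = 54, 80, 0.5
p_hat = heads / n  # unconstrained MLE

t = 2.0 * (loglik(p_hat, heads, n) - loglik(p0, heads, n))
print(round(t, 2), t > 3.841)  # ~10.01 > chi^2_1 95% table quantile: reject at 5%
```

Note that the Wald and likelihood ratio statistics (11.17 vs. 10.01 here) differ in finite samples but share the same χ²_1 limit under H_0.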
25 Testing implicit hypotheses (1) Let X_1, ..., X_n be i.i.d. random variables and let θ ∈ IR^d be a parameter associated with the distribution of X_1 (e.g. a moment, the parameter of a statistical model, etc.). Let g: IR^d → IR^k be continuously differentiable (with k < d). Consider the following hypotheses: H_0: g(θ) = 0 vs. H_1: g(θ) ≠ 0. E.g. g(θ) = (θ_1, θ_2) (k = 2), or g(θ) = θ_1 − θ_2 (k = 1), or ...
26 Testing implicit hypotheses (2) Suppose an asymptotically normal estimator θ̂_n is available: √n (θ̂_n − θ) →(d) N_d(0, Σ(θ)). Delta method: √n (g(θ̂_n) − g(θ)) →(d) N_k(0, Γ(θ)), where Γ(θ) = ∇g(θ)ᵀ Σ(θ) ∇g(θ) ∈ IR^{k×k}. Assume Σ(θ) is invertible and ∇g(θ) has rank k. Then Γ(θ) is invertible and √n Γ(θ)^{−1/2} (g(θ̂_n) − g(θ)) →(d) N_k(0, I_k).
27 Testing implicit hypotheses (3) Then, by Slutsky's theorem, if Γ(θ) is continuous in θ, √n Γ(θ̂_n)^{−1/2} (g(θ̂_n) − g(θ)) →(d) N_k(0, I_k). Hence, if H_0 is true, i.e. g(θ) = 0, T_n := n g(θ̂_n)ᵀ Γ(θ̂_n)^{−1} g(θ̂_n) →(d) χ²_k. Test with asymptotic level α: ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_k (see tables).
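A minimal sketch of this recipe for g(θ) = θ_1 − θ_2 (k = 1), comparing two Bernoulli proportions estimated from equal-sized independent samples; the counts below are hypothetical, and the plug-in covariance Σ(θ̂) = diag(p̂_1(1 − p̂_1), p̂_2(1 − p̂_2)) with ∇g = (1, −1)ᵀ gives the scalar Γ̂ = p̂_1(1 − p̂_1) + p̂_2(1 − p̂_2):

```python
def implicit_test_stat(h1: int, h2: int, n: int) -> float:
    """T_n = n * g(theta_hat)^2 / Gamma_hat for g(theta) = theta_1 - theta_2,
    with two independent Bernoulli samples of common size n."""
    p1, p2 = h1 / n, h2 / n
    gamma_hat = p1 * (1.0 - p1) + p2 * (1.0 - p2)  # plug-in Gamma(theta)
    return n * (p1 - p2) ** 2 / gamma_hat

# Hypothetical counts: 60/100 heads for coin 1, 45/100 heads for coin 2.
t = implicit_test_stat(60, 45, 100)
print(round(t, 2), t > 3.841)  # ~4.62 > chi^2_1 95% table quantile: reject at 5%
```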
28 The multinomial case: χ² test (1) Let E = {a_1, ..., a_K} be a finite space and (IP_p)_{p∈Δ_K} the family of all probability distributions on E, where Δ_K = {p = (p_1, ..., p_K) ∈ (0, 1)^K : Σ_{j=1}^K p_j = 1}. For p ∈ Δ_K and X ~ IP_p, IP_p[X = a_j] = p_j, j = 1, ..., K.
29 The multinomial case: χ² test (2) Let X_1, ..., X_n iid ~ IP_p, for some unknown p ∈ Δ_K, and let p⁰ ∈ Δ_K be fixed. We want to test: H_0: p = p⁰ vs. H_1: p ≠ p⁰ with asymptotic level α ∈ (0, 1). Example: if p⁰ = (1/K, 1/K, ..., 1/K), we are testing whether IP_p is the uniform distribution on E.
30 The multinomial case: χ² test (3) Likelihood of the model: L_n(X_1, ..., X_n, p) = p_1^{N_1} p_2^{N_2} ... p_K^{N_K}, where N_j = #{i = 1, ..., n : X_i = a_j}. Let p̂ be the MLE: p̂_j = N_j/n, j = 1, ..., K. p̂ maximizes log L_n(X_1, ..., X_n, p) under the constraint Σ_{j=1}^K p_j = 1.
31 The multinomial case: χ² test (4) If H_0 is true, then √n (p̂ − p⁰) is asymptotically normal, and the following holds. Theorem: T_n := n Σ_{j=1}^K (p̂_j − p_j⁰)²/p_j⁰ →(d) χ²_{K−1} as n → ∞. χ² test with asymptotic level α: ψ_α = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_{K−1}. Asymptotic p-value of this test: p-value = IP[Z > T_n | T_n], where Z ~ χ²_{K−1} and Z ⊥ T_n.
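A sketch of T_n on hypothetical die-roll counts (180 rolls, testing p⁰ uniform on K = 6 faces); 11.07 is the table value of the 95% quantile of χ²_5:

```python
def chi2_stat(counts, p0):
    """T_n = n * sum_j (p_hat_j - p0_j)^2 / p0_j."""
    n = sum(counts)
    return n * sum((c / n - q) ** 2 / q for c, q in zip(counts, p0))

counts = [40, 20, 30, 30, 30, 30]  # hypothetical die rolls, n = 180
p0 = [1.0 / 6.0] * 6               # H0: the die is fair

t = chi2_stat(counts, p0)
q95_chi2_5 = 11.07  # (1 - alpha)-quantile of chi^2_{K-1} for alpha = 5% (table value)
print(round(t, 2), t > q95_chi2_5)  # ~6.67: do not reject H0 at 5%
```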
32 The Gaussian case: Student's test (1) Let X_1, ..., X_n iid ~ N(µ, σ²), for some unknown µ ∈ IR, σ² > 0, and let µ_0 ∈ IR be fixed and given. We want to test: H_0: µ = µ_0 vs. H_1: µ ≠ µ_0 with asymptotic level α ∈ (0, 1). If σ² is known: let T_n = √n (X̄_n − µ_0)/σ. Then T_n ~ N(0, 1) and ψ_α = 1I{|T_n| > q_{α/2}} is a test with (non-asymptotic) level α.
33 The Gaussian case: Student's test (2) If σ² is unknown: let T̃_n = √(n−1) (X̄_n − µ_0)/√S_n, where S_n is the sample variance. Cochran's theorem: X̄_n ⊥ S_n; nS_n/σ² ~ χ²_{n−1}. Hence, T̃_n ~ t_{n−1}: Student's distribution with n − 1 degrees of freedom.
34 The Gaussian case: Student's test (3) Student's test with (non-asymptotic) level α ∈ (0, 1): ψ_α = 1I{|T̃_n| > q_{α/2}}, where q_{α/2} is the (1 − α/2)-quantile of t_{n−1}. If H_1 is µ > µ_0, Student's test with level α ∈ (0, 1) is: ψ_α = 1I{T̃_n > q_α}, where q_α is the (1 − α)-quantile of t_{n−1}. Advantages of Student's test: non-asymptotic; can be run on small samples. Drawback of Student's test: it relies on the assumption that the sample is Gaussian.
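Worked on a hypothetical small sample (n = 6 running times, µ_0 = 100), using the slides' convention S_n = (1/n) Σ (X_i − X̄_n)² and T̃_n = √(n−1) (X̄_n − µ_0)/√S_n; 2.571 is the table value of the 97.5% quantile of t_5:

```python
from math import sqrt

x = [104, 99, 102, 97, 105, 101]  # hypothetical sample
mu0 = 100.0
n = len(x)

xbar = sum(x) / n
s_n = sum((xi - xbar) ** 2 for xi in x) / n  # biased sample variance, as in the slides
t_tilde = sqrt(n - 1) * (xbar - mu0) / sqrt(s_n)

q975_t5 = 2.571  # (1 - alpha/2)-quantile of t_{n-1} for alpha = 5% (table value)
print(round(t_tilde, 2), abs(t_tilde) > q975_t5)  # ~1.08: do not reject at 5%
```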
35 Two-sample test: large sample case (1) Consider two samples X_1, ..., X_n and Y_1, ..., Y_m of independent random variables such that IE[X_1] = ... = IE[X_n] = µ_X and IE[Y_1] = ... = IE[Y_m] = µ_Y. Assume that the variances are known, so assume (without loss of generality) that var(X_1) = ... = var(X_n) = var(Y_1) = ... = var(Y_m) = 1. We want to test: H_0: µ_X = µ_Y vs. H_1: µ_X ≠ µ_Y with asymptotic level α ∈ (0, 1).
36 Two-sample test: large sample case (2) From the CLT: √n (X̄_n − µ_X) →(d) N(0, 1) and √m (Ȳ_m − µ_Y) →(d) N(0, 1), hence √n (Ȳ_m − µ_Y) →(d) N(0, γ) as n, m → ∞ with n/m → γ. Moreover, the two samples are independent, so √n ((X̄_n − Ȳ_m) − (µ_X − µ_Y)) →(d) N(0, 1 + γ). Under H_0: µ_X = µ_Y, √n (X̄_n − Ȳ_m)/√(1 + n/m) →(d) N(0, 1). Test: ψ_α = 1I{√n |X̄_n − Ȳ_m|/√(1 + n/m) > q_{α/2}}.
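Note that √n (X̄_n − Ȳ_m)/√(1 + n/m) = (X̄_n − Ȳ_m)/√(1/n + 1/m), which is the form used below; the summary statistics are hypothetical, with unit variances as on the slide:

```python
from math import sqrt

def two_sample_z(xbar: float, n: int, ybar: float, m: int) -> float:
    """(X_bar - Y_bar) / sqrt(1/n + 1/m) under known unit variances."""
    return (xbar - ybar) / sqrt(1.0 / n + 1.0 / m)

# Hypothetical summary statistics.
z = two_sample_z(4.2, 100, 4.0, 120)
print(round(z, 2), abs(z) > 1.96)  # ~1.48: do not reject H0 at asymptotic level 5%
```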
37 Two-sample T-test If the variances are unknown but we know that X_i ~ N(µ_X, σ_X²) and Y_i ~ N(µ_Y, σ_Y²), then X̄_n − Ȳ_m ~ N(µ_X − µ_Y, σ_X²/n + σ_Y²/m). Under H_0: (X̄_n − Ȳ_m)/√(σ_X²/n + σ_Y²/m) ~ N(0, 1). For unknown variances: (X̄_n − Ȳ_m)/√(S_X²/n + S_Y²/m) ≈ t_N, where N = (S_X²/n + S_Y²/m)² / (S_X⁴/(n²(n−1)) + S_Y⁴/(m²(m−1))).
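A sketch of the Welch-Satterthwaite computation above; the sample sizes, means, and sample variances below are assumptions for illustration:

```python
from math import sqrt

def welch(xbar, sx2, n, ybar, sy2, m):
    """Two-sample T statistic and Welch-Satterthwaite degrees of freedom N."""
    se2 = sx2 / n + sy2 / m
    t = (xbar - ybar) / sqrt(se2)
    dof = se2 ** 2 / (sx2 ** 2 / (n ** 2 * (n - 1)) + sy2 ** 2 / (m ** 2 * (m - 1)))
    return t, dof

# Hypothetical summaries: n = 10, S_X^2 = 4, m = 12, S_Y^2 = 9, mean gap 2.
t, dof = welch(12.0, 4.0, 10, 10.0, 9.0, 12)
print(round(t, 3), round(dof, 1))  # statistic ~1.865 with ~19.2 degrees of freedom
```

The degrees of freedom N need not be an integer; one then compares the statistic to the quantile of t_N (interpolating in a table, or using a t CDF that accepts non-integer N).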
38 MIT OpenCourseWare. 18.650 Statistics for Applications, Fall 2016. For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
More informationChapters 10. Hypothesis Testing
Chapters 10. Hypothesis Testing Some examples of hypothesis testing 1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin? 2. Is the new treatment more effective than the old one? 3. Quality
More informationHypothesis Testing One Sample Tests
STATISTICS Lecture no. 13 Department of Econometrics FEM UO Brno office 69a, tel. 973 442029 email:jiri.neubauer@unob.cz 12. 1. 2010 Tests on Mean of a Normal distribution Tests on Variance of a Normal
More informationECE 275A Homework 7 Solutions
ECE 275A Homework 7 Solutions Solutions 1. For the same specification as in Homework Problem 6.11 we want to determine an estimator for θ using the Method of Moments (MOM). In general, the MOM estimator
More informationThe purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.
Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That
More informationWooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics
Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).
More informationparameter space Θ, depending only on X, such that Note: it is not θ that is random, but the set C(X).
4. Interval estimation The goal for interval estimation is to specify the accurary of an estimate. A 1 α confidence set for a parameter θ is a set C(X) in the parameter space Θ, depending only on X, such
More informationM(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1
Math 66/566 - Midterm Solutions NOTE: These solutions are for both the 66 and 566 exam. The problems are the same until questions and 5. 1. The moment generating function of a random variable X is M(t)
More informationHYPOTHESIS TESTING: FREQUENTIST APPROACH.
HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous
More information8. Hypothesis Testing
FE661 - Statistical Methods for Financial Engineering 8. Hypothesis Testing Jitkomut Songsiri introduction Wald test likelihood-based tests significance test for linear regression 8-1 Introduction elements
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop Niels Landwehr/Tobias Scheffer
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Hypothesis testing Anna Wegloop iels Landwehr/Tobias Scheffer Why do a statistical test? input computer model output Outlook ull-hypothesis
More informationML Testing (Likelihood Ratio Testing) for non-gaussian models
ML Testing (Likelihood Ratio Testing) for non-gaussian models Surya Tokdar ML test in a slightly different form Model X f (x θ), θ Θ. Hypothesist H 0 : θ Θ 0 Good set: B c (x) = {θ : l x (θ) max θ Θ l
More informationMTH U481 : SPRING 2009: PRACTICE PROBLEMS FOR FINAL
MTH U481 : SPRING 2009: PRACTICE PROBLEMS FOR FINAL 1). Two urns are provided as follows: urn 1 contains 2 white chips and 4 red chips, while urn 2 contains 5 white chips and 3 red chips. One chip is chosen
More informationSpring 2012 Math 541B Exam 1
Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote
More informationClassical Inference for Gaussian Linear Models
Classical Inference for Gaussian Linear Models Surya Tokdar Gaussian linear model Data (z i, Y i ), i = 1,, n, dim(z i ) = p Model Y i = z T i Parameters: β R p, σ 2 > 0 Inference needed on η = a T β IID
More informationPractice Problems Section Problems
Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,
More information2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008
MIT OpenCourseWare http://ocw.mit.edu 2.830J / 6.780J / ESD.63J Control of Processes (SMA 6303) Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationDetection theory 101 ELEC-E5410 Signal Processing for Communications
Detection theory 101 ELEC-E5410 Signal Processing for Communications Binary hypothesis testing Null hypothesis H 0 : e.g. noise only Alternative hypothesis H 1 : signal + noise p(x;h 0 ) γ p(x;h 1 ) Trade-off
More informationHypothesis testing: theory and methods
Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable
More informationCh. 5 Hypothesis Testing
Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,
More informationIntroduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33
Introduction 1 STA442/2101 Fall 2016 1 See last slide for copyright information. 1 / 33 Background Reading Optional Chapter 1 of Linear models with R Chapter 1 of Davison s Statistical models: Data, and
More informationSYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions
SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu
More informationThe Multinomial Model
The Multinomial Model STA 312: Fall 2012 Contents 1 Multinomial Coefficients 1 2 Multinomial Distribution 2 3 Estimation 4 4 Hypothesis tests 8 5 Power 17 1 Multinomial Coefficients Multinomial coefficient
More informationLecture 8. October 22, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
Lecture 8 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University October 22, 2007 1 2 3 4 5 6 1 Define convergent series 2 Define the Law of Large Numbers
More informationSTAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.
STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances
More information