
18.650 Statistics for Applications — Chapter 5: Parametric hypothesis testing

Cherry Blossom run (1)

The Credit Union Cherry Blossom Run is a 10-mile race that takes place every year in D.C. In 2009 there were 14,974 participants, and the average running time was 103.5 minutes.

Were runners faster in 2012? To answer this question, select n runners from the 2012 race at random and denote by X_1, ..., X_n their running times.

Cherry Blossom run (2)

We can see from past data that the running time has a Gaussian distribution. The variance was 373.

Cherry Blossom run (3)

We are given i.i.d. r.v. X_1, ..., X_n and we want to know whether X_1 ∼ N(103.5, 373). This is a hypothesis testing problem. There are many ways this could be false:
1. IE[X_1] ≠ 103.5
2. var[X_1] ≠ 373
3. X_1 may not even be Gaussian.
We are interested in a very specific question: is IE[X_1] < 103.5?

Cherry Blossom run (4)

We make the following assumptions:
1. var[X_1] = 373 (the variance is the same in 2009 and 2012);
2. X_1 is Gaussian.
The only thing that we did not fix is IE[X_1] = µ. Now we want to test (only): is µ = 103.5 or is µ < 103.5?

By making modeling assumptions, we have reduced the number of ways the hypothesis X_1 ∼ N(103.5, 373) may be rejected. The only way it can be rejected is if X_1 ∼ N(µ, 373) for some µ < 103.5. We compare an expected value to a fixed reference number (103.5).

Cherry Blossom run (5)

Simple heuristic: if X̄_n < 103.5, then conclude µ < 103.5. This could go wrong if I randomly pick only fast runners in my sample X_1, ..., X_n.

Better heuristic: if X̄_n < 103.5 − (something that → 0 as n → ∞), then conclude µ < 103.5.

To make this intuition more precise, we need to take the size of the random fluctuations of X̄_n into account!
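A minimal sketch of this standardization (the 2012 sample here is made up; only the 2009 numbers 103.5 and 373 come from the slides):

```python
import numpy as np

# Hypothetical 2012 sample of n = 100 running times (minutes); made-up data
rng = np.random.default_rng(0)
x = rng.normal(loc=101.0, scale=np.sqrt(373), size=100)

n = len(x)
z = np.sqrt(n) * (x.mean() - 103.5) / np.sqrt(373)
# Under H0: mu = 103.5, z is exactly N(0, 1) in this Gaussian model;
# a large negative value is evidence that the 2012 runners were faster.
print(z)
```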

Clinical trials (1)

Pharmaceutical companies use hypothesis testing to test whether a new drug is effective. To do so, they administer the drug to a group of patients (test group) and a placebo to another group (control group).

Assume that the drug is a cough syrup. Let µ_control denote the expected number of expectorations per hour after a patient has used the placebo, and let µ_drug denote the expected number of expectorations per hour after a patient has used the syrup. We want to know if µ_drug < µ_control. We compare two expected values; there is no reference number.

Clinical trials (2)

Let X_1, ..., X_{n_drug} denote n_drug i.i.d. r.v. with distribution Poiss(µ_drug), and let Y_1, ..., Y_{n_control} denote n_control i.i.d. r.v. with distribution Poiss(µ_control). We want to test if µ_drug < µ_control.

Heuristic: if X̄_drug < Ȳ_control − (something that → 0 as n_drug, n_control → ∞), then conclude that µ_drug < µ_control.

Heuristics (1)

Example 1: a coin is tossed 80 times, and Heads are obtained 54 times. Can we conclude that the coin is significantly unfair?

n = 80, X_1, ..., X_n iid ∼ Ber(p); X̄_n = 54/80 ≈ .68.

If it were true that p = .5, then by CLT + Slutsky's theorem,
√n (X̄_n − .5)/√(.5(1 − .5)) → N(0, 1) in distribution.
Our data give √n (X̄_n − .5)/√(.5(1 − .5)) ≈ 3.22.

Conclusion: it seems quite reasonable to reject the hypothesis p = .5.

Heuristics (2)

Example 2: a coin is tossed 30 times, and Heads are obtained 13 times. Can we conclude that the coin is significantly unfair?

n = 30, X_1, ..., X_n iid ∼ Ber(p); X̄_n = 13/30 ≈ .43.

If it were true that p = .5, then by CLT + Slutsky's theorem,
√n (X̄_n − .5)/√(.5(1 − .5)) → N(0, 1) in distribution.
Our data give √n (X̄_n − .5)/√(.5(1 − .5)) ≈ −.77.

The number −.77 is a plausible realization of a random variable Z ∼ N(0, 1). Conclusion: our data do not suggest that the coin is unfair.
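A sketch reproducing both coin computations (the slides round X̄_n before standardizing, which is where 3.22 and −.77 come from):

```python
import numpy as np

def coin_z(heads, n, p0=0.5):
    """Standardized statistic sqrt(n) * (xbar - p0) / sqrt(p0 * (1 - p0))."""
    xbar = heads / n
    return np.sqrt(n) * (xbar - p0) / np.sqrt(p0 * (1 - p0))

print(coin_z(54, 80))  # ~ 3.13 (~ 3.22 if xbar is first rounded to .68)
print(coin_z(13, 30))  # ~ -0.73 (~ -.77 if xbar is first rounded to .43)
```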

Statistical formulation (1)

Consider a sample X_1, ..., X_n of i.i.d. random variables and a statistical model (E, (IP_θ)_{θ∈Θ}). Let Θ_0 and Θ_1 be disjoint subsets of Θ, and consider the two hypotheses:
H_0: θ ∈ Θ_0    vs.    H_1: θ ∈ Θ_1.
H_0 is the null hypothesis, H_1 is the alternative hypothesis.

If we believe that the true θ is either in Θ_0 or in Θ_1, we may want to test H_0 against H_1. We want to decide whether to reject H_0 (look for evidence against H_0 in the data).

Statistical formulation (2)

H_0 and H_1 do not play a symmetric role: the data are only used to try to disprove H_0. In particular, lack of evidence does not mean that H_0 is true ("innocent until proven guilty").

A test is a statistic ψ ∈ {0, 1} such that: if ψ = 0, H_0 is not rejected; if ψ = 1, H_0 is rejected.

Coin example: H_0: p = 1/2 vs. H_1: p ≠ 1/2.
ψ = 1I{ √n |X̄_n − .5|/√(.5(1 − .5)) > C }, for some C > 0.
How to choose the threshold C?

Statistical formulation (3)

Rejection region of a test ψ: R_ψ = {x ∈ E^n : ψ(x) = 1}.

Type 1 error of a test ψ (rejecting H_0 when it is actually true):
α_ψ : Θ_0 → IR, θ ↦ IP_θ[ψ = 1].

Type 2 error of a test ψ (not rejecting H_0 although H_1 is actually true):
β_ψ : Θ_1 → IR, θ ↦ IP_θ[ψ = 0].

Power of a test ψ: π_ψ = inf_{θ∈Θ_1} (1 − β_ψ(θ)).

Statistical formulation (4)

A test ψ has level α if α_ψ(θ) ≤ α for all θ ∈ Θ_0. A test ψ has asymptotic level α if lim_{n→∞} α_ψ(θ) ≤ α for all θ ∈ Θ_0.

In general, a test has the form ψ = 1I{T_n > c}, for some statistic T_n and threshold c ∈ IR. T_n is called the test statistic, and the rejection region is R_ψ = {T_n > c}.
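A minimal simulation (a sketch with an arbitrary seed, n, and replication count) checking that the coin test ψ = 1I{T_n > q_{α/2}} has type 1 error close to α = 5% under H_0: p = 1/2:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, alpha, reps = 100, 0.05, 10_000
q = norm.ppf(1 - alpha / 2)                 # q_{alpha/2}, about 1.96

p_hat = rng.binomial(n, 0.5, size=reps) / n # reps draws of p-hat under H0
T = np.sqrt(n) * np.abs(p_hat - 0.5) / 0.5  # the coin test statistic
print((T > q).mean())                       # empirical type 1 error, ~ 0.05
```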

Example (1)

Let X_1, ..., X_n iid ∼ Ber(p), for some unknown p ∈ (0, 1). We want to test H_0: p = 1/2 vs. H_1: p ≠ 1/2 with asymptotic level α ∈ (0, 1).

Let T_n = √n |p̂_n − 0.5|/√(.5(1 − .5)), where p̂_n is the MLE. If H_0 is true, then by CLT and Slutsky's theorem,
IP[T_n > q_{α/2}] → α as n → ∞,
where q_{α/2} is the (1 − α/2)-quantile of N(0, 1).

Let ψ_α = 1I{T_n > q_{α/2}}.

Example (2)

Coming back to the two previous coin examples: for α = 5%, q_{α/2} = 1.96, so:
In Example 1, H_0 is rejected at the asymptotic level 5% by the test ψ_{5%};
In Example 2, H_0 is not rejected at the asymptotic level 5% by the test ψ_{5%}.

Question: in Example 1, for what level α would ψ_α not reject H_0? And in Example 2, at which level α would ψ_α reject H_0?

p-value

Definition: the (asymptotic) p-value of a test ψ_α is the smallest (asymptotic) level α at which ψ_α rejects H_0. It is random: it depends on the sample.

Golden rule: p-value ≤ α ⇔ H_0 is rejected by ψ_α at the (asymptotic) level α. The smaller the p-value, the more confidently one can reject H_0.

Example 1: p-value = IP[|Z| > 3.22] ≈ .0013.
Example 2: p-value = IP[|Z| > .77] ≈ .44.
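These p-values are standard normal tail probabilities; a sketch:

```python
from scipy.stats import norm

def p_value(t):
    """Two-sided asymptotic p-value: P(|Z| > |t|) for Z ~ N(0, 1)."""
    return 2 * norm.sf(abs(t))

print(p_value(3.22))  # Example 1: ~ 0.0013 -> reject H0 at level 5%
print(p_value(0.77))  # Example 2: ~ 0.44   -> do not reject H0 at level 5%
```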

Neyman-Pearson's paradigm

Idea: for given hypotheses, among all tests of level/asymptotic level α, is it possible to find one that has maximal power?

Example: the trivial test ψ = 0 that never rejects H_0 has a perfect level (α = 0) but poor power (π_ψ = 0).

Neyman-Pearson's theory provides the most powerful tests of a given level. In 18.650, we study only a few cases.

The χ² distributions

Definition: for a positive integer d, the χ² (pronounced "kai-squared") distribution with d degrees of freedom is the law of the random variable Z_1² + Z_2² + ... + Z_d², where Z_1, ..., Z_d iid ∼ N(0, 1).

Examples:
If Z ∼ N_d(0, I_d), then ‖Z‖_2² ∼ χ²_d.
χ²_2 = Exp(1/2).

Recall that the sample variance is given by
S_n = (1/n) Σ_{i=1}^n (X_i − X̄_n)² = (1/n) Σ_{i=1}^n X_i² − (X̄_n)².
Cochran's theorem implies that for X_1, ..., X_n iid ∼ N(µ, σ²), if S_n is the sample variance, then
n S_n/σ² ∼ χ²_{n−1}.
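A quick Monte Carlo check of the Cochran claim n S_n/σ² ∼ χ²_{n−1} (a sketch; µ, σ, n, and the seed are arbitrary):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
mu, sigma, n, reps = 103.5, 5.0, 10, 20_000

x = rng.normal(mu, sigma, size=(reps, n))
s_n = x.var(axis=1)                   # (1/n) sum (X_i - X bar)^2, the S_n above
stat = n * s_n / sigma**2             # should be chi^2 with n - 1 d.o.f.
print(stat.mean(), chi2.mean(n - 1))  # both ~ n - 1 = 9
```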

Student's T distributions

Definition: for a positive integer d, the Student's T distribution with d degrees of freedom (denoted by t_d) is the law of the random variable Z/√(V/d), where Z ∼ N(0, 1), V ∼ χ²_d, and Z ⊥ V (Z is independent of V).

Example: Cochran's theorem implies that for X_1, ..., X_n iid ∼ N(µ, σ²), if S_n is the sample variance, then
√(n − 1) (X̄_n − µ)/√S_n ∼ t_{n−1}.

Wald's test (1)

Consider an i.i.d. sample X_1, ..., X_n with statistical model (E, (IP_θ)_{θ∈Θ}), where Θ ⊆ IR^d (d ≥ 1), and let θ_0 ∈ Θ be fixed and given. Consider the following hypotheses:
H_0: θ = θ_0    vs.    H_1: θ ≠ θ_0.

Let θ̂_n^{MLE} be the MLE. Assume the MLE technical conditions are satisfied. If H_0 is true, then
√n I(θ̂_n^{MLE})^{1/2} (θ̂_n^{MLE} − θ_0) →(d) N_d(0, I_d) w.r.t. IP_{θ_0}.

Wald's test (2)

Hence,
T_n := n (θ̂_n^{MLE} − θ_0)ᵀ I(θ̂_n^{MLE}) (θ̂_n^{MLE} − θ_0) →(d) χ²_d w.r.t. IP_{θ_0}.

Wald's test with asymptotic level α ∈ (0, 1): ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_d (see tables).

Remark: Wald's test is also valid if H_1 has the form θ > θ_0, or θ < θ_0, or θ = θ_1, ...
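As a concrete instance (a sketch, not from the slides): for Ber(p), the MLE is p̂_n and the Fisher information is I(p) = 1/(p(1 − p)), so Wald's statistic for H_0: p = p_0 reduces to T_n = n (p̂_n − p_0)²/(p̂_n(1 − p̂_n)):

```python
import numpy as np
from scipy.stats import chi2

def wald_bernoulli(heads, n, p0=0.5, alpha=0.05):
    """Wald test of H0: p = p0 for an i.i.d. Ber(p) sample (d = 1)."""
    p_hat = heads / n                            # MLE
    fisher = 1.0 / (p_hat * (1 - p_hat))         # I(p) evaluated at the MLE
    T = n * fisher * (p_hat - p0) ** 2
    q = chi2.ppf(1 - alpha, df=1)                # (1-alpha)-quantile of chi2_1
    return T, T > q                              # statistic, reject H0?

print(wald_bernoulli(54, 80))   # T ~ 11.2 > 3.84: reject at 5%
print(wald_bernoulli(13, 30))   # T ~ 0.54 < 3.84: do not reject at 5%
```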

Likelihood ratio test (1)

Consider an i.i.d. sample X_1, ..., X_n with statistical model (E, (IP_θ)_{θ∈Θ}), where Θ ⊆ IR^d (d ≥ 1). Suppose the null hypothesis has the form
H_0: (θ_{r+1}, ..., θ_d) = (θ_{r+1}^{(0)}, ..., θ_d^{(0)}),
for some fixed and given numbers θ_{r+1}^{(0)}, ..., θ_d^{(0)}. Let
θ̂_n = argmax_{θ∈Θ} ℓ_n(θ)   (MLE)
and
θ̂_n^c = argmax_{θ∈Θ_0} ℓ_n(θ)   ("constrained MLE").

Likelihood ratio test (2)

Test statistic:
T_n = 2 (ℓ_n(θ̂_n) − ℓ_n(θ̂_n^c)).

Theorem: assume H_0 is true and the MLE technical conditions are satisfied. Then
T_n →(d) χ²_{d−r} w.r.t. IP_θ.

Likelihood ratio test with asymptotic level α ∈ (0, 1): ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_{d−r} (see tables).
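A sketch for the simplest case d = 1, r = 0 (the whole parameter is fixed under H_0): the LRT of H_0: λ = λ_0 for an i.i.d. Poisson sample, with T_n compared to χ²_1. The Poisson choice and the data below are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy.stats import chi2

def lrt_poisson(x, lam0, alpha=0.05):
    """LRT of H0: lambda = lam0 for i.i.d. Poisson data (d = 1, r = 0)."""
    x = np.asarray(x)
    n, lam_hat = len(x), x.mean()               # lam_hat is the MLE
    # T_n = 2 * (l_n(lam_hat) - l_n(lam0)); the log(x_i!) terms cancel
    T = 2 * (x.sum() * np.log(lam_hat / lam0) - n * (lam_hat - lam0))
    return T, T > chi2.ppf(1 - alpha, df=1)     # compare to chi2_{d-r} = chi2_1

rng = np.random.default_rng(3)
x = rng.poisson(lam=2.0, size=50)               # made-up "drug" counts
print(lrt_poisson(x, lam0=2.5))
```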

Testing implicit hypotheses (1)

Let X_1, ..., X_n be i.i.d. random variables and let θ ∈ IR^d be a parameter associated with the distribution of X_1 (e.g. a moment, the parameter of a statistical model, etc.). Let g: IR^d → IR^k be continuously differentiable (with k < d). Consider the following hypotheses:
H_0: g(θ) = 0    vs.    H_1: g(θ) ≠ 0.
E.g. g(θ) = (θ_1, θ_2) (k = 2), or g(θ) = θ_1 − θ_2 (k = 1), or ...

Testing implicit hypotheses (2)

Suppose an asymptotically normal estimator θ̂_n is available:
√n (θ̂_n − θ) →(d) N_d(0, Σ(θ)).

Delta method:
√n (g(θ̂_n) − g(θ)) →(d) N_k(0, Γ(θ)),
where Γ(θ) = ∇g(θ)ᵀ Σ(θ) ∇g(θ) ∈ IR^{k×k}.

Assume Σ(θ) is invertible and ∇g(θ) has rank k. Then Γ(θ) is invertible and
√n Γ(θ)^{−1/2} (g(θ̂_n) − g(θ)) →(d) N_k(0, I_k).

Testing implicit hypotheses (3)

Then, by Slutsky's theorem, if Γ(θ) is continuous in θ,
√n Γ(θ̂_n)^{−1/2} (g(θ̂_n) − g(θ)) →(d) N_k(0, I_k).

Hence, if H_0 is true, i.e., g(θ) = 0,
T_n := n g(θ̂_n)ᵀ Γ^{−1}(θ̂_n) g(θ̂_n) →(d) χ²_k.

Test with asymptotic level α: ψ = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_k (see tables).
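A sketch for g(θ) = θ_1 − θ_2 (k = 1), with two Bernoulli samples of the same size n stacked as θ = (p_1, p_2), so Σ(θ) = diag(p_1(1 − p_1), p_2(1 − p_2)) and Γ(θ) = p_1(1 − p_1) + p_2(1 − p_2). The equal-sample-size setup and the data are assumptions made for the example:

```python
import numpy as np
from scipy.stats import chi2

def test_equal_proportions(x, y, alpha=0.05):
    """Test H0: p1 - p2 = 0 from two Bernoulli samples of the same size n."""
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    assert len(y) == n, "this sketch assumes equal sample sizes"
    p1, p2 = x.mean(), y.mean()                  # theta-hat = (p1, p2)
    gamma = p1 * (1 - p1) + p2 * (1 - p2)        # Gamma(theta-hat), k = 1
    T = n * (p1 - p2) ** 2 / gamma
    return T, T > chi2.ppf(1 - alpha, df=1)      # chi2_k with k = 1

rng = np.random.default_rng(4)
print(test_equal_proportions(rng.binomial(1, 0.5, 200),
                             rng.binomial(1, 0.6, 200)))
```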

The multinomial case: χ² test (1)

Let E = {a_1, ..., a_K} be a finite space and (IP_p)_{p∈Δ_K} be the family of all probability distributions on E, where
Δ_K = { p = (p_1, ..., p_K) ∈ (0, 1)^K : Σ_{j=1}^K p_j = 1 }.

For p ∈ Δ_K and X ∼ IP_p, IP_p[X = a_j] = p_j, j = 1, ..., K.

The multinomial case: χ² test (2)

Let X_1, ..., X_n iid ∼ IP_p, for some unknown p ∈ Δ_K, and let p⁰ ∈ Δ_K be fixed. We want to test H_0: p = p⁰ vs. H_1: p ≠ p⁰ with asymptotic level α ∈ (0, 1).

Example: if p⁰ = (1/K, 1/K, ..., 1/K), we are testing whether IP_p is the uniform distribution on E.

The multinomial case: χ² test (3)

Likelihood of the model:
L_n(X_1, ..., X_n, p) = p_1^{N_1} p_2^{N_2} ... p_K^{N_K},
where N_j = #{i = 1, ..., n : X_i = a_j}.

Let p̂ be the MLE:
p̂_j = N_j/n, j = 1, ..., K.
p̂ maximizes log L_n(X_1, ..., X_n, p) under the constraint Σ_{j=1}^K p_j = 1.

The multinomial case: χ² test (4)

If H_0 is true, then √n (p̂ − p⁰) is asymptotically normal, and the following holds.

Theorem:
T_n := n Σ_{j=1}^K (p̂_j − p_j⁰)² / p_j⁰ →(d) χ²_{K−1}.

χ² test with asymptotic level α: ψ_α = 1I{T_n > q_α}, where q_α is the (1 − α)-quantile of χ²_{K−1}.

Asymptotic p-value of this test: p-value = IP[Z > T_n | T_n], where Z ∼ χ²_{K−1} and Z ⊥ T_n.
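scipy implements exactly this statistic, written with counts N_j rather than frequencies p̂_j; a sketch testing a die for uniformity (the counts are made up):

```python
import numpy as np
from scipy.stats import chisquare

# Hypothetical die-roll counts over n = 120 throws (made-up numbers)
counts = np.array([25, 17, 15, 23, 24, 16])
n, K = counts.sum(), len(counts)

# H0: uniform on {1,...,6}; f_exp are the expected counts n * p_j^0
res = chisquare(counts, f_exp=np.full(K, n / K))
print(res.statistic, res.pvalue)   # T_n and the asymptotic chi2_{K-1} p-value
```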

The Gaussian case: Student's test (1)

Let X_1, ..., X_n iid ∼ N(µ, σ²), for some unknown µ ∈ IR, σ² > 0, and let µ_0 ∈ IR be fixed and given. We want to test H_0: µ = µ_0 vs. H_1: µ ≠ µ_0 with asymptotic level α ∈ (0, 1).

If σ² is known: let T_n = √n (X̄_n − µ_0)/σ. Then T_n ∼ N(0, 1), and ψ_α = 1I{|T_n| > q_{α/2}} is a test with (non-asymptotic) level α.

The Gaussian case: Student's test (2)

If σ² is unknown: let T̃_n = √(n − 1) (X̄_n − µ_0)/√S_n, where S_n is the sample variance.

Cochran's theorem: X̄_n ⊥ S_n, and n S_n/σ² ∼ χ²_{n−1}.

Hence, T̃_n ∼ t_{n−1}: Student's distribution with n − 1 degrees of freedom.

The Gaussian case: Student's test (3)

Student's test with (non-asymptotic) level α ∈ (0, 1):
ψ_α = 1I{|T̃_n| > q_{α/2}},
where q_{α/2} is the (1 − α/2)-quantile of t_{n−1}.

If H_1 is µ > µ_0, Student's test with level α ∈ (0, 1) is:
ψ_α = 1I{T̃_n > q_α},
where q_α is the (1 − α)-quantile of t_{n−1}.

Advantages of Student's test: it is non-asymptotic and can be run on small samples.
Drawback of Student's test: it relies on the assumption that the sample is Gaussian.
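In practice one calls scipy, which uses the (n − 1)-normalized sample variance; that yields the same T̃_n, since √(n − 1)/√S_n = √n/s when s² = n S_n/(n − 1). A sketch on a made-up small sample (the `alternative=` keyword assumes scipy ≥ 1.6):

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(5)
x = rng.normal(loc=101.0, scale=19.3, size=12)   # made-up small Gaussian sample

# Two-sided: H0: mu = 103.5 vs H1: mu != 103.5
print(ttest_1samp(x, popmean=103.5))
# One-sided: H1: mu < 103.5 (the Cherry Blossom question)
print(ttest_1samp(x, popmean=103.5, alternative='less'))
```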

Two-sample test: large sample case (1)

Consider two samples X_1, ..., X_n and Y_1, ..., Y_m of independent random variables such that
IE[X_1] = ... = IE[X_n] = µ_X and IE[Y_1] = ... = IE[Y_m] = µ_Y.
Assume that the variances are known, so assume (without loss of generality) that
var(X_1) = ... = var(X_n) = var(Y_1) = ... = var(Y_m) = 1.

We want to test H_0: µ_X = µ_Y vs. H_1: µ_X ≠ µ_Y with asymptotic level α ∈ (0, 1).

Two-sample test: large sample case (2)

From the CLT, as n, m → ∞ with n/m → γ:
√n (X̄_n − µ_X) →(d) N(0, 1)
and
√m (Ȳ_m − µ_Y) →(d) N(0, 1), hence √n (Ȳ_m − µ_Y) →(d) N(0, γ).

Moreover, the two samples are independent, so
√n ((X̄_n − Ȳ_m) − (µ_X − µ_Y)) →(d) N(0, 1 + γ).

Under H_0: µ_X = µ_Y:
√n (X̄_n − Ȳ_m)/√(1 + n/m) →(d) N(0, 1).

Test: ψ_α = 1I{ √n |X̄_n − Ȳ_m|/√(1 + n/m) > q_{α/2} }.
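A sketch of this statistic; note that √n (X̄_n − Ȳ_m)/√(1 + n/m) is just (X̄_n − Ȳ_m)/√(1/n + 1/m). The data below are simulated placeholders:

```python
import numpy as np
from scipy.stats import norm

def two_sample_z(x, y, alpha=0.05):
    """Large-sample test of H0: mu_X = mu_Y, assuming unit variances."""
    x, y = np.asarray(x), np.asarray(y)
    n, m = len(x), len(y)
    T = (x.mean() - y.mean()) / np.sqrt(1 / n + 1 / m)
    return T, abs(T) > norm.ppf(1 - alpha / 2)   # statistic, reject H0?

rng = np.random.default_rng(6)
print(two_sample_z(rng.normal(0.0, 1, 500), rng.normal(0.2, 1, 300)))
```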

Two-sample T-test

If the variances are unknown but we know that X_i ∼ N(µ_X, σ_X²) and Y_i ∼ N(µ_Y, σ_Y²), then
X̄_n − Ȳ_m ∼ N(µ_X − µ_Y, σ_X²/n + σ_Y²/m).

Under H_0:
(X̄_n − Ȳ_m)/√(σ_X²/n + σ_Y²/m) ∼ N(0, 1).

For unknown variances:
(X̄_n − Ȳ_m)/√(S_X²/n + S_Y²/m) ≈ t_N,
where
N = (S_X²/n + S_Y²/m)² / ( S_X⁴/(n²(n − 1)) + S_Y⁴/(m²(m − 1)) ).
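This is Welch's two-sample t-test, which scipy implements (with (n − 1)-normalized sample variances) via equal_var=False; a sketch on made-up data:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(7)
x = rng.normal(2.0, 1.0, size=25)    # made-up "drug" measurements
y = rng.normal(2.6, 1.5, size=30)    # made-up "control" measurements

# Welch's t-test: unknown, possibly unequal variances -> equal_var=False
print(ttest_ind(x, y, equal_var=False))
```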

MIT OpenCourseWare http://ocw.mit.edu 18.650 / 18.6501 Statistics for Applications Fall 2016 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.