Lecture Notes 15 Hypothesis Testing (Chapter 10)

Similar documents
Lecture 6 Simple alternatives and the Neyman-Pearson lemma

Last Lecture. Wald Test

5. Likelihood Ratio Tests

Summary. Recap ... Last Lecture. Summary. Theorem

Composite Hypotheses

Introductory statistics

Last Lecture. Unbiased Test

Direction: This test is worth 150 points. You are required to complete this test within 55 minutes.

Asymptotics. Hypothesis Testing UMP. Asymptotic Tests and p-values

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

( θ. sup θ Θ f X (x θ) = L. sup Pr (Λ (X) < c) = α. x : Λ (x) = sup θ H 0. sup θ Θ f X (x θ) = ) < c. NH : θ 1 = θ 2 against AH : θ 1 θ 2

Tests of Hypotheses Based on a Single Sample (Devore Chapter Eight)

Frequentist Inference

STAT Homework 7 - Solutions

Lecture 12: September 27

Chapter 13: Tests of Hypothesis Section 13.1 Introduction

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times

32 estimating the cumulative distribution function

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Problem Set 4 Due Oct, 12

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

1.010 Uncertainty in Engineering Fall 2008

Stat 319 Theory of Statistics (2) Exercises

Direction: This test is worth 250 points. You are required to complete this test within 50 minutes.

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

STATISTICAL INFERENCE

Data Analysis and Statistical Methods Statistics 651

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Sample Size Determination (Two or More Samples)

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

Statistics 3858 : Likelihood Ratio for Multinomial Models

Math 140 Introductory Statistics

Topic 9: Sampling Distributions of Estimators

x = Pr ( X (n) βx ) =

Common Large/Small Sample Tests 1/55

Statistical Theory MT 2009 Problems 1: Solution sketches

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

Chapter 5: Hypothesis testing

Statistical Theory MT 2008 Problems 1: Solution sketches

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

Random Variables, Sampling and Estimation

Exam II Review. CEE 3710 November 15, /16/2017. EXAM II Friday, November 17, in class. Open book and open notes.

Efficient GMM LECTURE 12 GMM II

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3

Notes on Hypothesis Testing, Type I and Type II Errors

tests 17.1 Simple versus compound

Topic 18: Composite Hypotheses

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Lecture 12: November 13, 2018

Lecture 2: Monte Carlo Simulation

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

STAT431 Review. X = n. n )

Lecture 33: Bootstrap

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

6 Sample Size Calculations

LECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if

Expectation and Variance of a random variable

HOMEWORK I: PREREQUISITES FROM MATH 727

- E < p. ˆ p q ˆ E = q ˆ = 1 - p ˆ = sample proportion of x failures in a sample size of n. where. x n sample proportion. population proportion

Output Analysis and Run-Length Control

Kurskod: TAMS11 Provkod: TENB 21 March 2015, 14:00-18:00. English Version (no Swedish Version)

SDS 321: Introduction to Probability and Statistics

Module 1 Fundamentals in statistics

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

Simulation. Two Rule For Inverting A Distribution Function

6.3 Testing Series With Positive Terms

Statistical inference: example 1. Inferential Statistics

Class 23. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Properties and Hypothesis Testing

Asymptotic distribution of the first-stage F-statistic under weak IVs

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

Bayesian Methods: Introduction to Multi-parameter Models

Lecture 7: October 18, 2017

Chapter 6 Sampling Distributions

TAMS24: Notations and Formulas

HYPOTHESIS TESTS FOR ONE POPULATION MEAN WORKSHEET MTH 1210, FALL 2018

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

Machine Learning Brett Bernstein

STAC51: Categorical data Analysis

Stat 421-SP2012 Interval Estimation Section

4.1 Sigma Notation and Riemann Sums

Homework 5 Solutions

This is an introductory course in Analysis of Variance and Design of Experiments.

Discrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions

Power and Type II Error

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

Discrete Mathematics and Probability Theory Summer 2014 James Cook Note 15

Testing Statistical Hypotheses for Compare. Means with Vague Data

Parameter, Statistic and Random Samples

MAT1026 Calculus II Basic Convergence Tests for Series

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

LECTURE 8: ASYMPTOTICS I

Discrete Mathematics and Probability Theory Spring 2012 Alistair Sinclair Note 15

Lecture 12: Hypothesis Testing

Transcription:

1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we may wat to kow if the coi is fair; this correspods to p = 1/2. If we are testig the effect of two drugs whose meas effects are θ 1 ad θ 2 we may be iterested to kow if there is o differece, which correspods to θ 1 θ 2 = 0. We formalize this by statig a ull hypothesis H 0 ad a alterative hypothesis H 1. For example: H 0 : θ = θ 0 versus θ θ 0. More geerally, cosider a parameter space Θ. We cosider H 0 : θ Θ 0 versus H 1 : θ Θ 1 where Θ 0 Θ 1 =. If Θ 0 cosists of a sigle poit, we call this a simple ull hypothesis. If Θ 0 cosists of more tha oe poit, we call this a composite ull hypothesis. Example 1 X 1,..., X Beroullip). H 0 : p = 1 2 H 1 : p 1 2. The questio is ot whether H 0 is true or false. The questio is whether there is sufficiet evidece to reject H 0, much like a court case. Our possible actios are: reject H 0 or retai do t reject) H 0. H 0 true H 1 true Decisio Retai H 0 Reject H 0 Type I error false positive) Type II error false egative) Warig: Hypothesis testig should oly be used whe it is appropriate. Ofte times, people use hypothesis testig whe it would be much more appropriate to use cofidece itervals. 1

Notatio: Let Φ be the cdf of a stadard Normal radom variable Z. For 0 < α < 1, let z α = Φ 1 1 α). Hece, P Z > z α ) = α. Also, P Z < z α ) = α. I these otes we sometimes write px; θ) istead of p θ x). 2 Costructig Tests Hypothesis testig ivolves the followig steps: 1. Choose a test statistic T = T X 1,..., X ). 2. Choose a rejectio regio R. 3. If T R we reject H 0 otherwise we retai H 0. Example 2 Let X 1,..., X Beroullip). Suppose we test H 0 : p = 1 2 H 1 : p 1 2. Let T = 1 i=1 X i ad R = {x 1,..., x : T x 1,..., x ) 1/2 > δ}. So we reject H 0 if T 1/2 > δ. We eed to choose T ad R so that the test has good statistical properties. We will cosider the followig tests: 1. The Neyma-Pearso Test 2. The Wald test 3. The Likelihood Ratio Test LRT) 4. The permutatio test. Before we discuss these methods, we first eed to talk about how we evaluate tests. 3 Error Rates ad Power Suppose we reject H 0 whe X 1,..., X ) R. Defie the power fuctio by βθ) = P θ X 1,..., X R). We wat βθ) to be small whe θ Θ 0 ad we wat βθ) to be large whe θ Θ 1. The geeral strategy is: 1. Fix α [0, 1]. 2

2. Now try to maximize βθ) for θ Θ 1 subject to βθ) α for θ Θ 0. We eed the followig defiitios. A test is size α if sup βθ) α. θ Θ 0 Example 3 X 1,..., X Nθ, σ 2 ) with σ 2 kow. Suppose we test H 0 : θ = θ 0, H 1 : θ > θ 0. This is called a oe-sided alterative. Suppose we reject H 0 if T > c where The βθ) = P θ X θ 0 = P T = X θ 0. ) > c Z > c + θ where Φ is the cdf of a stadard Normal ad Z Φ. Now To get a size α test, set 1 Φc) = α so that X θ = P θ > c + θ ) ) 1 Φ c + θ ) sup βθ) = βθ 0 ) = 1 Φc). θ Θ 0 c = z α where z α = Φ 1 1 α). Our test is: reject H 0 whe T = X θ 0 > z α. Example 4 X 1,..., X Nθ, σ 2 ) with σ 2 kow. Suppose H 0 : θ = θ 0, H 1 : θ θ 0. This is called a two-sided alterative. We will reject H 0 if T > c where T is defied as before. Now βθ) = P θ T < c) + P θ T > c) ) ) X θ 0 = P θ < c X θ 0 + P θ > c = P Z < c + θ ) σ/ + P Z > c + θ ) = Φ c + θ ) σ/ + 1 Φ c + θ ) = Φ c + θ ) σ/ + Φ c θ ) 3

sice Φ x) = 1 Φx). The size is βθ 0 ) = 2Φ c). To get a size α test we set 2Φ c) = α so that c = Φ 1 α/2) = Φ 1 1 α/2) = z α/2. The test is: reject H 0 whe T = X θ 0 > z α/2. 4 The Neyma-Pearso Test Not i the book.) Let C α deote all level α tests. A test i C α with power fuctio β is uiformly most powerful UMP) if the followig holds: if β is the power fuctio of ay other test i C α the βθ) β θ) for all θ Θ 1. Cosider testig H 0 : θ = θ 0 versus H 1 : θ = θ 1. Simple ull ad simple alterative.) Theorem 5 Let Lθ) = px 1,..., X ; θ) ad T = Lθ 1) Lθ 0 ). Suppose we reject H 0 if T > k where k is chose so that This test is a UMP level α test. P θ0 X R) = α. The Neyma-Pearso test is quite limited because it ca be used oly for testig a simple ull versus a simple alterative. So it does ot get used i practice very ofte. But it is importat from a coceptual poit of view. 5 The Wald Test Let T = θ θ 0 se where θ is a asymptotically Normal estimator ad se is the estimated stadard error of θ or the stadard error uder H 0 ). Uder H 0, T N0, 1). Hece, a asymptotic level α test is to reject whe T > z α/2. That is P θ0 T > z α ) α. For example, with Beroulli data, to test H 0 : p = p 0, T = p p 0 4 p1 p).

You ca also use T = p p 0 p 0 1 p 0 ) I other words, to compute the stadard error, you ca replace θ with a estimate θ or by the ull value θ 0. 6 The Likelihood Ratio Test LRT) This test is simple: reject H 0 if λx 1,..., x ) c where where θ 0 maximizes Lθ) subject to θ Θ 0. Example 6 X 1,..., X Nθ, 1). Suppose After some algebra, So λx 1,..., x ) = sup θ Θ 0 Lθ) sup θ Θ Lθ) = L θ 0 ) L θ) H 0 : θ = θ 0, H 1 : θ θ 0. λ = exp { } 2 X θ 0 ) 2. R = {x : λ c} = {x : X θ 0 c } where c = 2 log c/. Choosig c to make this level α gives: reject if T > z α/2 where T = X θ 0 ) which is the test we costructed before. Example 7 X 1,..., X Nθ, σ 2 ). Suppose The H 0 : θ = θ 0, H 1 : θ θ 0. λx 1,..., x ) = Lθ 0, σ 0 ) L θ, σ) where σ 0 maximizes the likelihood subject to θ = θ 0. Exercise: Show that λx 1,..., x ) < c correspods to rejectig whe T > k for some costat k where T = X θ 0 S/. Uder H 0, T has a t-distributio with 1 degrees of freedom. So the fial test is: reject H 0 if T > t 1,α/2. This is called Studet s t-test. It was iveted by William Gosset workig at Guiess Breweries ad writig uder the pseudoym Studet. 5.

We ca simplify the LRT by usig a asymptotic approximatio. First, some otatio: Notatio: Let W χ 2 p. Defie χ 2 p,α by P W > χ 2 p,α) = α. Theorem 8 Cosider testig H 0 : θ = θ 0 versus H 1 : θ θ 0 where θ R. Uder H 0, 2 log λx 1,..., X ) χ 2 1. Hece, if we let T = 2 log λx ) the P θ0 T > χ 2 1,α) α as. Proof. Usig a Taylor expasio: ad so lθ) l θ) + l θ)θ θ) + l 2 θ θ) θ) 2 = l θ) + l 2 θ θ) θ) 2 P 2 log λx 1,..., x ) = 2l θ) 2lθ 0 ) 2l θ) 2l θ) l θ)θ θ) 2 = l θ)θ θ) 2 = l θ) I θ 0 ) I θ 0 ) θ θ 0 )) 2 = A B. Now A 1 by the WLLN ad B N0, 1). The result follows by Slutsky s theorem. Example 9 X 1,..., X Poissoλ). We wat to test H 0 : λ = λ 0 versus H 1 : λ λ 0. The 2 log λx ) = 2[λ 0 λ) λ logλ 0 / λ)]. We reject H 0 whe 2 log λx ) > χ 2 1,α. Now suppose that θ = θ 1,..., θ k ). Suppose that H 0 : θ Θ 0 fixes some of the parameters. The, uder coditios, where T = 2 log λx 1,..., X ) χ 2 ν ν = dimθ) dimθ 0 ). Therefore, a asymptotic level α test is: reject H 0 whe T > χ 2 ν,α. 6

Example 10 Cosider a multiomial with θ = p 1,..., p 5 ). So Suppose we wat to test Lθ) = p y 1 1 p y 5 5. H 0 : p 1 = p 2 = p 3 ad p 4 = p 5 versus the alterative that H 0 is false. I this case ν = 4 1 = 3. The LRT test statistic is 5 i=1 λx 1,..., x ) = j 0j 5 i=1 py j j where p j = Y j /, p 10 = p 20 = p 30 = Y 1 + Y 2 + Y 3 )/, p 40 = p 50 = 1 3 p 10 )/2. These calculatios are o p 491. Make sure you uderstad them. Now we reject H 0 if 2λX 1,..., X ) > χ 2 3,α. 7 p-values Whe we test at a give level α we will reject or ot reject. It is useful to summarize what levels we would reject at ad what levels we woud ot reject at. The p-value is the smallest α at which we would reject H 0. I other words, we reject at all α p. So, if the pvalue is 0.03, the we would reject at α = 0.05 but ot at α = 0.01. Hece, to test at level α whe p < α. Theorem 11 Suppose we have a test of the form: reject whe T X 1,..., X ) > c. The the p-value is p = sup θ Θ 0 P θ T X 1,..., X ) T x 1,..., x )) where x 1,..., x are the observed data ad X 1,..., X p θ0. Example 12 X 1,..., X Nθ, 1). Test that H 0 : θ = θ 0 versus H 1 : θ θ 0. We reject whe T is large, where T = X θ 0 ). Let t be the obsrved value of T. Let Z N0, 1). The, p = P θ0 X θ 0 ) > t ) = P Z > t ) = 2Φ t ). Theorem 13 Uder H 0, p Uif0, 1). Importat. Note that p is NOT equal to P H 0 X 1,..., X ). The latter is a Bayesia quatity which we will discuss later. 7

8 The Permutatio Test This is a very cool test. approximatios. Suppose we have data ad We wat to test: Let Create labels It is distributio free ad it does ot ivolve ay asymptotic X 1,..., X F Y 1,..., Y m G. H 0 : F = G versus H 1 : F G. Z = X 1,..., X, Y 1,..., Y m ). L = 1,..., 1, 2,..., 2). }{{}}{{} values m values A test statistic ca be writte as a fuctio of Z ad L. For example, if T = X Y m the we ca write N i=1 T = Z iil i = 1) N i=1 N i=1 IL Z iil i = 2) i = 1) N i=1 IL i = 2) where N = + m. So we write T = gl, Z). Defie p = 1 IgL π, Z) > gl, Z)) N! π where L π is a permutatio of the labels ad the sum is over all permutatios. Uder H 0, permutig the labels does ot chage the distributio. I other words, gl, Z) has a equal chace of havig ay rak amog all the permuted values. That is, uder H 0, Uif0, 1) ad if we reject whe p < α, the we have a level α test. Summig over all permutatios is ifeasible. But it suffices to use a radom sample of permutatios. So we do this: 1. Compute a radom permutatio of the labels ad compute W. Do this K times givig values T 1),..., T K). 2. Compute the p-value 1 K K IT j) > T ). j=1 8