Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma (CB 8.1, 8.3.1-8.3.3)

Uniformly Most Powerful Tests and the Neyman-Pearson Lemma

Let's return to the hypothesis testing problem within the Neyman-Pearson framework. Recall that we have a random variable $X$, with PDF/PMF $f_X(x;\theta)$, and we have a null and an alternative hypothesis:

$$H_0: \theta \in \Theta_0, \qquad H_a: \theta \in \Theta_0^c.$$

We need to construct a test statistic $T(X)$ and a critical region $C_T$, such that we reject the null hypothesis if $T(X) \in C_T$. The power function of a test is defined as

$$\beta(\theta) = \Pr_\theta\left(T(X) \in C_T\right).$$

Given a prespecified significance level $\alpha$ (for example $\alpha = 0.05$), we require our test to satisfy, for all $\theta \in \Theta_0$,

$$\beta(\theta) \le \alpha.$$

Subject to this restriction, we want $\beta(\theta)$ for $\theta \in \Theta_0^c$ to be as large as possible.

Now we define a criterion that will measure optimality of a test. It requires that the probability of a type II error is minimized for all values of the parameter consistent with the alternative hypothesis.

Definition. Consider all tests of level $\alpha$ for the null hypothesis $\theta \in \Theta_0$ against the alternative $\theta \in \Theta_0^c$. A test with power function $\beta(\theta)$ is uniformly most powerful if, for all alternative tests with level $\alpha$ and power function $\tilde\beta(\theta)$,

$$\beta(\theta) \ge \tilde\beta(\theta) \quad \text{for all } \theta \in \Theta_0^c.$$

There is no guarantee that uniformly most powerful tests actually exist. We first study a simple case where such tests are easy to find. We focus on the case where both the null hypothesis and the alternative hypothesis are simple, that is, where the sets $\Theta_0$ and $\Theta_0^c$ contain a single element each:

$$H_0: \theta = \theta_0, \qquad H_a: \theta = \theta_1.$$

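(If a hypothesis contains more than a single point, we say that it is a composite hypothesis.)

Result (Neyman-Pearson lemma). Consider testing the null hypothesis $H_0: \theta = \theta_0$ against the alternative $H_a: \theta = \theta_1$ using a critical region of the form

$$C_X = \{x : f_X(x;\theta_1) \ge k\, f_X(x;\theta_0)\}.$$

Let

$$\alpha = \int_{C_X} f_X(x;\theta_0)\,dx.$$

This test is the uniformly most powerful test of level $\alpha$.

Proof: Let $\beta(\theta)$ denote the power function of the test proposed. Consider any other test with a critical region $\tilde C_X$ and power function $\tilde\beta(\theta)$. Define the indicator functions

$$\phi(x) = 1\{x \in C_X\}, \qquad \tilde\phi(x) = 1\{x \in \tilde C_X\}.$$

Consider

$$\left(\phi(x) - \tilde\phi(x)\right)\left(f_X(x;\theta_1) - k\, f_X(x;\theta_0)\right).$$

If this expression differs from zero, we must either have $\phi(x) - \tilde\phi(x) = 1$ or $\phi(x) - \tilde\phi(x) = -1$. If $\phi(x) - \tilde\phi(x) = 1$, then $x \in C_X$, so $f_X(x;\theta_1) - k\, f_X(x;\theta_0)$ must be nonnegative by the form of the critical region $C_X$, and the entire expression is nonnegative. If $\phi(x) - \tilde\phi(x) = -1$, then $x \notin C_X$, so the second factor must be negative, and the product again is nonnegative. Hence, for all $x$,

$$\left(\phi(x) - \tilde\phi(x)\right)\left(f_X(x;\theta_1) - k\, f_X(x;\theta_0)\right) \ge 0,$$

and therefore

$$0 \le \int \left(\phi(x) - \tilde\phi(x)\right)\left(f_X(x;\theta_1) - k\, f_X(x;\theta_0)\right)dx = \int \left(\phi(x) - \tilde\phi(x)\right) f_X(x;\theta_1)\,dx - k \int \left(\phi(x) - \tilde\phi(x)\right) f_X(x;\theta_0)\,dx = \beta(\theta_1) - \tilde\beta(\theta_1) - k\left(\beta(\theta_0) - \tilde\beta(\theta_0)\right).$$

If both tests have size exactly $\alpha$, then $\beta(\theta_0) = \tilde\beta(\theta_0) = \alpha$, and so it must be the case that $\beta(\theta_1) - \tilde\beta(\theta_1) \ge 0$: the second test cannot be more powerful than the proposed test. More generally, a level $\alpha$ competitor has $\tilde\beta(\theta_0) \le \alpha$, and with $k \ge 0$ the same inequality gives $\beta(\theta_1) - \tilde\beta(\theta_1) \ge k\left(\alpha - \tilde\beta(\theta_0)\right) \ge 0$.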
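The lemma can be checked by brute force on a small discrete problem, where sums replace the integrals in the proof. The sketch below is only an illustration: the four-point sample space and the two PMFs `f0` and `f1` are made-up values, not taken from the notes, and exact fractions are used so that the size constraint is compared without rounding error. It builds the likelihood ratio region for one value of k, then enumerates every critical region whose size does not exceed that of the likelihood ratio region and confirms that none has higher power.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical PMFs on the sample space {0, 1, 2, 3} (illustrative values only).
f0 = [F(1, 10), F(2, 10), F(3, 10), F(4, 10)]  # PMF under H0: theta = theta_0
f1 = [F(4, 10), F(3, 10), F(2, 10), F(1, 10)]  # PMF under Ha: theta = theta_1
points = range(len(f0))


def size(region):
    """Probability of rejecting under the null, beta(theta_0)."""
    return sum(f0[x] for x in region)


def power(region):
    """Probability of rejecting under the alternative, beta(theta_1)."""
    return sum(f1[x] for x in region)


def np_region(k):
    """Critical region {x : f1(x) >= k * f0(x)} from the lemma."""
    return frozenset(x for x in points if f1[x] >= k * f0[x])


def best_region(alpha):
    """Most powerful region among all subsets with size at most alpha."""
    candidates = [
        frozenset(c)
        for r in range(len(f0) + 1)
        for c in combinations(points, r)
        if size(c) <= alpha
    ]
    return max(candidates, key=power)


k = F(3, 2)
region = np_region(k)      # likelihood ratio region for this k
alpha = size(region)       # its size defines the level
best = best_region(alpha)  # exhaustive search over all level-alpha regions

print("NP region:", sorted(region), "size:", alpha, "power:", power(region))
print("best region found by enumeration:", sorted(best))
```

In this toy problem the enumeration returns the same region as the likelihood ratio rule. With discrete data not every level can be attained exactly by such a region, which is why the level here is defined as the size of the region rather than fixed in advance.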

Example 1. Let us consider some examples of applications of the Neyman-Pearson Lemma. Suppose $X$ has an exponential distribution with arrival rate $\lambda$. We wish to test the hypothesis that $\lambda = 1$ against the alternative that $\lambda = 2$:

$$H_0: \lambda = 1; \qquad H_a: \lambda = 2.$$

By the Neyman-Pearson lemma, we should use a critical region of the form

$$C_X = \{x : f_X(x;2) \ge k\, f_X(x;1)\} = \{x : 2\exp(-2x) \ge k \exp(-x)\} = \{x : \exp(-2x) \ge \exp(k')\exp(-x)\} = \{x : -2x \ge k' - x\} = \{x : x \le k''\},$$

where $k' = \ln(k/2)$ and $k'' = -k'$. All that is left to determine is $k''$. Suppose we wish to test at the 0.05 level. Then we choose $k''$ to satisfy

$$0.05 = \Pr(X \le k'' \mid H_0) = \int_0^{k''} \exp(-x)\,dx = 1 - \exp(-k''),$$

or

$$k'' = -\ln(0.95) \approx 0.0513,$$

and the critical region is

$$C_X = [0, -\ln(0.95)].$$

Example 2. Suppose $X_1, \ldots, X_n$ are iid normal with mean $\mu$ and unit variance. We wish to test the null hypothesis $\mu = \mu_0$ against the alternative hypothesis that $\mu = \mu_1$, for some $\mu_1$ and $\mu_0$ with $\mu_1 > \mu_0$:

$$H_0: \mu = \mu_0; \qquad H_a: \mu = \mu_1.$$
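Before working through Example 2, the cutoff in Example 1 can be checked numerically. The following is a minimal sketch using only the exponential CDF, $1 - \exp(-\lambda x)$. For comparison it also evaluates an upper-tail test of the same size (reject when $x$ is large); that competitor is an assumption introduced here for illustration and does not appear in the notes.

```python
import math


def exp_cdf(x, rate):
    """CDF of the exponential distribution with the given arrival rate."""
    return 1.0 - math.exp(-rate * x)


alpha = 0.05

# Neyman-Pearson test from Example 1: reject when x <= cutoff.
cutoff = -math.log(1.0 - alpha)
size = exp_cdf(cutoff, 1.0)   # rejection probability under H0 (rate 1)
power = exp_cdf(cutoff, 2.0)  # rejection probability under Ha (rate 2)

# Hypothetical competitor of the same size: reject when x >= upper_cutoff.
upper_cutoff = -math.log(alpha)
upper_size = 1.0 - exp_cdf(upper_cutoff, 1.0)
upper_power = 1.0 - exp_cdf(upper_cutoff, 2.0)

print(f"NP test:         cutoff={cutoff:.4f} size={size:.4f} power={power:.4f}")
print(f"upper-tail test: cutoff={upper_cutoff:.4f} size={upper_size:.4f} power={upper_power:.4f}")
```

Both tests have size 0.05, but the lower-tail region from the lemma has power $1 - 0.95^2 = 0.0975$, while the upper-tail region has power $0.05^2 = 0.0025$. Small observations are more likely under the higher arrival rate, which is why the likelihood ratio rejects for small $x$.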

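By Neyman-Pearson, we want the test to reject the null if

$$f(x_1, \ldots, x_n; \mu_1) \ge k\, f(x_1, \ldots, x_n; \mu_0),$$

or equivalently:

$$\frac{L(\mu_1)}{L(\mu_0)} = \frac{f(x_1, \ldots, x_n; \mu_1)}{f(x_1, \ldots, x_n; \mu_0)} \ge k.$$

This ratio of likelihood functions is

$$\frac{L(\mu_1)}{L(\mu_0)} = \frac{\exp\left(-\frac{1}{2}\sum_i (x_i - \mu_1)^2\right)}{\exp\left(-\frac{1}{2}\sum_i (x_i - \mu_0)^2\right)} = \frac{\exp\left(-\frac{1}{2}\sum_i \left[x_i^2 - 2 x_i \mu_1 + \mu_1^2\right]\right)}{\exp\left(-\frac{1}{2}\sum_i \left[x_i^2 - 2 x_i \mu_0 + \mu_0^2\right]\right)} = \exp\left((\mu_1 - \mu_0)\sum_i x_i\right) \cdot C,$$

where $C = \exp\left(-n(\mu_1^2 - \mu_0^2)/2\right)$ is a constant which does not depend on $x$. Since $\mu_1 - \mu_0 > 0$, this ratio is larger than $k$ if and only if

$$\sum_i x_i \ge k',$$

or equivalently,

$$\bar x = \frac{1}{n}\sum_i x_i \ge k''.$$

The critical region is therefore of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \ge k''\}.$$

Suppose we wish to test at the 0.05 level. Then

$$0.05 = \Pr(\bar x \ge k'' \mid \mu = \mu_0).$$

Under the null the distribution of $\bar x$ is normal with mean $\mu_0$ and variance $1/n$:

$$\bar x \sim N\left(\mu_0, \frac{1}{n}\right), \quad \text{so} \quad \frac{\bar x - \mu_0}{\sqrt{1/n}} \sim N(0,1).$$

Using a table for the standard normal distribution, we can determine that

$$\Pr\left(\frac{\bar x - \mu_0}{\sqrt{1/n}} \ge 1.645\right) = 0.05.$$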
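The table lookup can be reproduced in software. The sketch below uses `statistics.NormalDist` from the Python standard library to recover the 0.95 quantile of the standard normal and to confirm the tail probability of the standardized sample mean under the null. The values `mu0 = 2.0` and `n = 25` are hypothetical and chosen only to make the check concrete.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# Upper-alpha quantile of the standard normal; the table value is 1.645.
z = NormalDist().inv_cdf(1.0 - alpha)

# Hypothetical null mean and sample size (not from the notes).
mu0, n = 2.0, 25

# Distribution of the sample mean under H0: N(mu0, 1/n), i.e. sd = sqrt(1/n).
null_dist = NormalDist(mu0, sqrt(1.0 / n))


def upper_tail_prob(z_value):
    """Pr((xbar - mu0) / sqrt(1/n) >= z_value) under the null."""
    xbar_threshold = mu0 + z_value * sqrt(1.0 / n)
    return 1.0 - null_dist.cdf(xbar_threshold)


tail = upper_tail_prob(z)
print(f"z = {z:.4f}, tail probability = {tail:.4f}")
```

The computed tail probability does not change with the choice of `mu0` or `n`, because standardizing removes both.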

So

$$\Pr\left(\bar x - \mu_0 \ge 1.645/\sqrt{n}\right) = 0.05,$$

and

$$\Pr\left(\bar x \ge \mu_0 + 1.645/\sqrt{n}\right) = 0.05.$$

Hence the critical region should be

$$C_X = \{(x_1, \ldots, x_n) : \bar x \ge \mu_0 + 1.645/\sqrt{n}\}.$$

Example 2 also illustrates an important phenomenon. There, the critical region does not depend on the value of the parameter under the alternative hypothesis, $\mu_1$. Whether the alternative is $\mu_1 = \mu_0 + 1$ or $\mu_1 = \mu_0 + 4$ leads to exactly the same critical region. Thus, we can use the same test if we are testing the composite alternative hypothesis $H_a: \mu > \mu_0$. Moreover, since the test is most powerful for each specific point in the alternative, the test is uniformly most powerful against the composite alternative.

Uniformly most powerful tests do not always exist. They exist for some special models like the normal model, when the alternative is one-sided (i.e. $H_a: \mu > \mu_0$ or $H_a: \mu < \mu_0$). What if we consider the same normal model, and test

$$H_0: \mu = \mu_0$$

against the two-sided alternative

$$H_1: \mu \ne \mu_0?$$

If the alternative is $\mu = \mu_1 > \mu_0$, the critical region for the most powerful test is of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \ge k\}.$$

If the alternative is $\mu = \mu_1 < \mu_0$, the critical region of the most powerful test is of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le k\}.$$

There is therefore no test that is most powerful for all values under the alternative. In other words, there is no uniformly most powerful test.

One way to get around this problem is to impose some additional restrictions on the test, and look for uniformly most powerful tests within the restricted set of tests. A test is unbiased if the power function satisfies

$$\beta(\theta_1) \ge \beta(\theta_0)$$

for all $\theta_1 \in \Theta_0^c$ and all $\theta_0 \in \Theta_0$. That is, the probability of rejecting the null hypothesis, or of an observation in the critical region, is at least as large for values of the parameters consistent with the alternative ($\theta \in \Theta_0^c$) as for values of the parameters consistent with the null hypothesis ($\theta \in \Theta_0$).

Let us consider this approach in detail for the case with a normal distribution with unknown mean and known variance. Let $X_1, \ldots, X_n$ be independent and normally distributed with unknown mean $\mu$ and known variance $\sigma^2$. We are interested in testing the null hypothesis

$$H_0: \mu = \mu_0$$

against the alternative

$$H_1: \mu \ne \mu_0.$$

Let us consider the ratio of density functions to determine the critical region:

$$\frac{f(x_1, \ldots, x_n \mid \mu_1)}{f(x_1, \ldots, x_n \mid \mu_0)} = \frac{(2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}\left(\sum_{i=1}^n x_i^2 - 2\mu_1 \sum_{i=1}^n x_i + n\mu_1^2\right)\right)}{(2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}\left(\sum_{i=1}^n x_i^2 - 2\mu_0 \sum_{i=1}^n x_i + n\mu_0^2\right)\right)} = \exp\left(\frac{1}{\sigma^2}(\mu_1 - \mu_0)\sum_{i=1}^n x_i\right) \exp\left(-(\mu_1^2 - \mu_0^2)\, n/(2\sigma^2)\right).$$

Hence if we are looking for a uniformly most powerful test against the alternative hypothesis $H_1: \mu > \mu_0$, the critical region ought to be of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \ge k\}.$$

If we were to test against the alternative hypothesis $H_1: \mu < \mu_0$, the critical region ought to be of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le k\}.$$

It therefore appears sensible to base a test on the value of $\bar x$, the sample average, which is a sufficient statistic for $\mu$. It seems fairly clear that the critical region should be of the form

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le a \text{ or } \bar x \ge b\}.$$

Unbiasedness of the test implies that the probability of not rejecting,

$$1 - \beta(\mu) = \int_a^b \frac{1}{\sqrt{2\pi\sigma^2/n}} \exp\left(-\frac{1}{2\sigma^2/n}(\bar x - \mu)^2\right) d\bar x,$$

is maximized at $\mu_0$. The function is maximized at $\mu = (a + b)/2$, so that for unbiasedness we must have $b - \mu_0 = \mu_0 - a$. Hence the critical region is

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le \mu_0 - c \text{ or } \bar x \ge \mu_0 + c\},$$

with the value of $c$ determined by the size of the test. Under the null hypothesis the distribution of $\bar x$ is normal with mean $\mu_0$ and variance $\sigma^2/n$. Hence, if we wish to test at the 10% level, recalling that for a standard normal random variable $Z$

$$\Pr(-1.645 < Z < 1.645) = 0.90,$$

the critical region is

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le \mu_0 - 1.645\,\sigma/\sqrt{n} \text{ or } \bar x \ge \mu_0 + 1.645\,\sigma/\sqrt{n}\}.$$

This is the uniformly most powerful unbiased test. If we wish to test at the 5% level, the critical region is

$$C_X = \{(x_1, \ldots, x_n) : \bar x \le \mu_0 - 1.96\,\sigma/\sqrt{n} \text{ or } \bar x \ge \mu_0 + 1.96\,\sigma/\sqrt{n}\}.$$

Equivalently we can use the critical region

$$C_X = \{(x_1, \ldots, x_n) : n(\bar x - \mu_0)^2/\sigma^2 \ge 3.84\},$$

which uses the chi-squared distribution for the square of a standard normal random variable. In fact a common way of doing the test is to calculate the test statistic, here $n(\bar x - \mu_0)^2/\sigma^2$, which under the null hypothesis has a known distribution, in this case a $\chi^2(1)$ distribution. We reject the null hypothesis if the test statistic exceeds the critical value, in this case 3.84 at the 5% level or 2.706 at the 10% level.
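The two-sided test and its unbiasedness can be sketched in a few lines. The code below assumes the setup above (known variance, critical region symmetric around $\mu_0$) and uses `statistics.NormalDist` from the Python standard library. The function names and the numerical inputs (`xbar = 0.9`, `mu0 = 0`, `sigma = 2`, `n = 25`) are hypothetical and chosen only for illustration. The chi-squared critical value is obtained as the square of the normal quantile, which is valid for one degree of freedom.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_test(xbar, mu0, sigma, n, alpha=0.05):
    """Return (statistic, critical value, reject?) for H0: mu = mu0 vs mu != mu0."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    stat = n * (xbar - mu0) ** 2 / sigma ** 2  # chi-squared(1) under H0
    crit = z_crit ** 2                         # square of the normal quantile
    return stat, crit, stat >= crit


def power(mu, mu0, sigma, n, alpha=0.05):
    """Rejection probability of the symmetric two-sided test when the mean is mu."""
    se = sigma / sqrt(n)
    c = NormalDist().inv_cdf(1.0 - alpha / 2.0) * se
    dist = NormalDist(mu, se)  # distribution of the sample mean
    return dist.cdf(mu0 - c) + 1.0 - dist.cdf(mu0 + c)


# Hypothetical data summary: sample mean 0.9 from n = 25 draws with sigma = 2.
stat, crit, reject = two_sided_test(0.9, 0.0, 2.0, 25)
print(f"statistic = {stat:.4f}, critical value = {crit:.4f}, reject = {reject}")

# Unbiasedness: power equals alpha at mu0 and is larger on either side of it.
for mu in (-0.3, 0.0, 0.3):
    print(f"mu = {mu:+.1f}: power = {power(mu, 0.0, 2.0, 25):.4f}")
```

At the 5% level the squared quantile is about 3.84, and at the 10% level it is about 2.706, matching the critical values quoted above. The power printed at $\mu_0$ equals the level, and the values on either side are larger and equal to each other, which is the symmetry that the unbiasedness argument relies on.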