
BEST TESTS

Abstract. We will discuss the Neyman-Pearson theorem and certain best tests, in which the power function is optimized.

1. Most powerful tests

Let $\{f_\theta\}_{\theta \in \Theta}$ be a family of pdfs. We will consider the simple case where $\Theta = \{\theta_0, \theta_1\}$, so that the family contains only two pdfs. Let $\theta \in \Theta$, where $\theta$ is unknown. Consider the following simple hypothesis test with null hypothesis $H_0 : \theta = \theta_0$ and critical region $C$, so that we reject $H_0$ if $X \in C$. Suppose $\alpha = P_{\theta_0}(X \in C)$. We say that the test is a best test, at size $\alpha$, if for any other region $A$ with $P_{\theta_0}(X \in A) \le \alpha$, we have $P_{\theta_1}(X \in C) \ge P_{\theta_1}(X \in A)$. Best tests are sometimes also called most powerful tests. In these notes, we will sometimes write $P_0 = P_{\theta_0}$ and $P_1 = P_{\theta_1}$.

Exercise 1. Suppose that $f_0$ is the pdf of a $N(0,1)$ random variable and $f_1$ is the pdf of a $N(1,1)$ random variable. We wish to test the hypothesis $H_0 : \mu = 0$ versus $H_1 : \mu = 1$. We have only a single sample $X$. Consider the set $C = \{x \in \mathbb{R} : 1 \le x \le 2\}$. Show that, as a critical region, the set $C$ does not correspond to a best test.

Solution. Let $b$ be such that $P_0(X \ge b) = P_0(X \in C)$, and set $A := \{x \in \mathbb{R} : x \ge b\}$. In fact, $P_0(X \in C) \approx 0.1359$ and $b \approx 1.099$. We claim that $A$ gives a rejection region with more power. We have that $P_1(X \in C) = P_0(0 \le X \le 1) \approx 0.34134$, whereas $P_1(X \in A) = P_0(X \ge b - 1) \approx 0.4606$.
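As a quick numerical check of the figures in this solution (a sketch, assuming `scipy` is available):

```python
# Numerical check of Exercise 1 using the standard normal cdf/quantile.
from scipy.stats import norm

# Size of the test with critical region C = [1, 2] under H0: X ~ N(0, 1).
alpha = norm.cdf(2) - norm.cdf(1)          # ~ 0.1359
# Threshold b with P0(X >= b) equal to that size.
b = norm.ppf(1 - alpha)                    # ~ 1.099

# Powers under H1: X ~ N(1, 1).
power_C = norm.cdf(2, loc=1) - norm.cdf(1, loc=1)   # ~ 0.34134
power_A = 1 - norm.cdf(b, loc=1)                    # ~ 0.4606

print(alpha, b, power_C, power_A)
```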

2. Randomization

Consider the following simple test. We have one sample $X \sim N(\mu, 1)$ and we want to test $H_0 : \mu = 0$ against $H_1 : \mu = 1$. It is easy to define a test with exact size $\alpha$, for any $\alpha \in (0,1)$; in particular, we even have notation for this: $P_0(X > z_\alpha) = \alpha$, so that we can consider the test $\varphi(x) = \mathbf{1}[x > z_\alpha]$, where we reject $H_0$ if $\varphi(X) = 1$. This is possible since $X$ is a continuous random variable. When $X$ is a discrete random variable, this is no longer possible without additional randomization.

Consider the following randomized test. We have one sample $X \sim \mathrm{Bern}(p)$, and we want to test $H_0 : p = 1/2$ against $H_1 : p = 3/4$. Any regular non-randomized test will reject $H_0$, when $H_0$ is true, with probability $0$, $1/2$, or $1$. However, we can get other values of $\alpha$ in the following way. Suppose $\alpha < 1/2$. Let $\varphi(x) = 2\alpha \, \mathbf{1}[x = 1]$. Let $U$ be uniform on $[0,1]$ and independent of $X$. We reject $H_0$ if $U \le \varphi(X)$; in other words, if $X = 1$, we reject $H_0$ with probability $2\alpha$, so that $E_0 \varphi(X) = \alpha$ is the probability that we reject $H_0$ when $H_0$ is true.
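A minimal simulation of this randomized test (a sketch using `numpy`; the seed and trial count are ours); the empirical rejection rate under $H_0$ should settle near $\alpha$:

```python
# Simulate the randomized test of H0: p = 1/2 based on one X ~ Bern(p):
# if X = 1, reject H0 with probability 2*alpha (here alpha < 1/2).
import numpy as np

rng = np.random.default_rng(0)
alpha, n_trials = 0.1, 200_000

X = rng.binomial(1, 0.5, size=n_trials)   # data generated under H0
U = rng.uniform(size=n_trials)            # independent randomization
reject = (X == 1) & (U <= 2 * alpha)

print(reject.mean())  # close to alpha = 0.1
```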

Let $X = (X_1, \ldots, X_n)$ be a random sample from $f_\theta$, where $\theta \in \{\theta_0, \theta_1\}$. Consider the hypothesis test of $H_0 : \theta = \theta_0$ with critical function $\varphi$. In the randomized setting, the power function is given by $\beta_\varphi(\theta) = E_\theta \varphi(X)$. The power of a test is given by $\beta_\varphi(\theta_1)$. We say that a critical function $\varphi$ defines a best test at level $\alpha$ if $E_0 \varphi(X) \le \alpha$ and for all critical functions $\varphi'$ with $E_0 \varphi'(X) \le \alpha$ we have that $\beta_{\varphi'}(\theta_1) \le \beta_\varphi(\theta_1)$.

Theorem 2 (Neyman-Pearson). Let $X = (X_1, \ldots, X_n)$ be a random sample from $f_\theta$, where $\theta \in \{\theta_0, \theta_1\}$. Consider the null hypothesis $H_0 : \theta = \theta_0$. Let $\alpha \in (0,1)$. Set
$$R(X) := \frac{L(X; \theta_0)}{L(X; \theta_1)}.$$
There exist a critical function $\varphi$ and a constant $k > 0$ such that
(a) $E_0 \varphi(X) = \alpha$, and
(b) $\varphi(X) = 1$ when $R(X) < k$, and $\varphi(X) = 0$ when $R(X) > k$.
Moreover, if a critical function satisfies both conditions, then it is a most powerful (randomized) test at level $\alpha$. In addition, if $\varphi'$ is (another) most powerful (randomized) test at level $\alpha$, then it satisfies the second condition, and it also satisfies the first condition, except in the case where there is a test of size $\alpha' < \alpha$ with power $1$.

Note that if $x$ is such that $L(x; \theta_1) = 0$ and $L(x; \theta_0) > 0$, then it does not make any practical sense to reject $H_0$ if $x$ is observed. Similarly, if $L(x; \theta_0) = 0$ and $L(x; \theta_1) > 0$, then we should reject $H_0$ if $x$ is observed.

Let us remark that in Theorem 2 we do not specify what happens to $\varphi$ when $R(X) = k$; it is on this event that randomization is necessary: on this event $\varphi$ may take values in $(0,1)$. Often, when the random variables involved are continuous, $R(X) = k$ happens with probability zero, and when the random variables involved are discrete, we may require additional randomization.

The idea of the proof of Theorem 2 is nice. Consider the case where the $X_i$ take values in a set $A$; we want to find $C \subseteq A^n$ that maximizes
$$P_1(X \in C) = \sum_{x \in C} L(x; \theta_1)$$
subject to
$$P_0(X \in C) = \sum_{x \in C} L(x; \theta_0) \le \alpha.$$
Which elements of $A^n$ should be allowed to be in the set $C$? Think of each element $x \in C$ as having a cost $L(x; \theta_0)$ and a value $L(x; \theta_1)$. One guess would be the elements $x \in A^n$ that have high relative value; that is, the $x \in A^n$ where $R(x)$ is small; how small depends on $\alpha$. So, one way to build the set $C$ is to order the elements of $A^n$ in terms of $R(x)$, and add elements to $C$ starting from high relative value to low. However, as the cost approaches $\alpha$, we may be forced to either break the order, that is, choose an element of lower relative value, and/or stop before reaching the spending limit $\alpha$. Randomization solves this problem, as it allows us to spend the maximum limit $\alpha$.
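This budget-spending idea can be sketched as a short program for a finite sample space; this is our illustration of the proof idea, not part of the notes, and the helper name `np_test` is ours. It orders points by $R(x)$ and randomizes on the first point that would overshoot the budget; when several points tie at the boundary value $k$, any split of the remaining budget among them yields the same power, since each tied point contributes value $f_1(x) = f_0(x)/k$ per unit of cost.

```python
# Sketch: build a Neyman-Pearson critical function on a finite sample space.
# f0, f1 are dicts mapping each point x to L(x; theta0) and L(x; theta1).
import math

def np_test(f0, f1, alpha):
    """Return a dict phi with phi[x] = rejection probability at x."""
    # Order points by the likelihood ratio R(x) = f0(x)/f1(x), smallest first.
    pts = sorted(f0, key=lambda x: f0[x] / f1[x] if f1[x] > 0 else math.inf)
    phi, budget = {x: 0.0 for x in pts}, alpha
    for x in pts:
        cost = f0[x]                # probability spent under H0
        if cost <= budget:
            phi[x] = 1.0            # reject outright
            budget -= cost
        elif cost > 0:
            phi[x] = budget / cost  # randomize on the boundary point
            budget = 0.0
        if budget == 0.0:
            break
    return phi
```

Applied to Exercise 3 below with $\alpha = 1/4$, this reproduces the randomized test found there (up to how the tie between the points $4$ and $8$ is split).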

Exercise 3. Let $X$ be an integer-valued random variable with pdf $f \in \{f_0, f_1\}$, where $f_0$ is the discrete uniform distribution on the 13 numbers $\{0, 1, 2, \ldots, 12\}$ and $f_1$ is the tent function given by $f_1(x) = x/36$ for all $x \in \{0, 1, \ldots, 6\}$ and $f_1(x) = 1/3 - x/36$ for all $x \in \{7, 8, \ldots, 12\}$. Consider the null hypothesis $H_0 : f = f_0$. On the basis of a single observation $X$, find the best test at significance level $\alpha = 3/13 \approx 0.23$. Define a randomized best test at level $\alpha = 0.25$. Find the power of your tests; that is, compute the power function at the alternative hypothesis.

Solution. Consider the set $C = \{5, 6, 7\}$ and the critical function $\mathbf{1}[X \in C]$, so that we reject $H_0$ if $X \in C$. We have that $P_0(X \in C) = 3/13 = \alpha$. The power of this test is given by $P_1(X \in C) = 5/36 + 6/36 + 5/36 = 4/9$. In order to show that it is a best test, we appeal to Theorem 2. We need to find a $k$ such that $R(X) < k$ if and only if $X \in C$. Note that $R(6) = (1/13)/(6/36) \approx 0.462$, $R(5) = R(7) = (1/13)/(5/36) \approx 0.554$, and $R(4) = R(8) = (1/13)/(4/36) \approx 0.692$. Take $k = 0.6$; then $R(X) < k$ if and only if $X \in \{5, 6, 7\}$.

If $\alpha = 0.25$, then we can apply randomization in the following way. Note that $P_0(X = 4) = 1/13$, so that if we expanded our critical set to contain $4$, then $P_0(X \in \{4, 5, 6, 7\}) = 4/13 \approx 0.308 > 0.25$. Moreover, if we wanted to apply Theorem 2, we would be forced to also include $8$, since $R(4) = R(8)$. Consider the test that is exactly the same as before, except that when $X = 4$, we reject $H_0$ with probability $1/4$; that is, set $\varphi(x) = 1$ if $x \in \{5, 6, 7\}$, $\varphi(x) = 0$ if $x \in \{0, 1, 2, 3, 8, 9, 10, 11, 12\}$, and $\varphi(4) = 1/4$. Clearly, $E_0 \varphi(X) = 3/13 + (1/4)(1/13) = 1/4 = \alpha$. The power is given by $4/9 + (1/9)(1/4) = 17/36$. Notice that if we take $k = R(4)$, then Theorem 2 applies.

Let us remark that, referring to Exercise 3, in practice one would prefer the non-randomized best test at level $\alpha = 3/13$ over the randomized best test at level $\alpha = 0.25$.

Exercise 4. Let $X$ be a continuous random variable with pdf $f \in \{f_0, f_1\}$, where $f_0$ is the pdf of a uniform distribution on $[0, 1]$ and $f_1$ is the pdf of a uniform distribution on $[0, 2]$. Consider the null hypothesis $H_0 : f = f_0$. On the basis of a single observation $X$, find the best test at significance level $\alpha$.

Exercise 5. Let $X$ be a continuous random variable with pdf given by $f(x; \theta) = \theta x^{\theta - 1} \mathbf{1}[x \in (0, 1)]$, where $\theta \in \{1, 2\}$. Consider the null hypothesis $H_0 : \theta = 1$. On the basis of a single observation $X$, find the best test at significance level $\alpha$.

Solution. By Theorem 2, we want to find $k > 0$ so that $P_0(R < k) = \alpha$, and we reject $H_0$ if $R < k$. Notice that under $H_0$, the variable $X$ is uniformly distributed on $[0, 1]$. We have that $R(X) = 1/(2X)$, so that
$$P_0(R < k) = P_0(X > 1/(2k)) = 1 - 1/(2k) = \alpha;$$
thus $k = \frac{1}{2(1 - \alpha)}$.
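A sketch verifying the arithmetic in the solution of Exercise 3 with exact rational arithmetic (`fractions` is in the standard library):

```python
# Check of Exercise 3: sizes and powers of the two best tests.
from fractions import Fraction as F

f0 = {x: F(1, 13) for x in range(13)}                           # uniform on {0,...,12}
f1 = {x: F(x, 36) if x <= 6 else F(1, 3) - F(x, 36) for x in range(13)}

# Non-randomized test at alpha = 3/13: reject iff X in {5, 6, 7}.
C = {5, 6, 7}
size_C  = sum(f0[x] for x in C)            # 3/13
power_C = sum(f1[x] for x in C)            # 4/9

# Randomized test at alpha = 1/4: additionally reject with prob. 1/4 at X = 4.
phi = {x: (F(1) if x in C else F(1, 4) if x == 4 else F(0)) for x in range(13)}
size_r  = sum(phi[x] * f0[x] for x in range(13))   # 1/4
power_r = sum(phi[x] * f1[x] for x in range(13))   # 17/36

print(size_C, power_C, size_r, power_r)
```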

Exercise 6. Let $X = (X_1, \ldots, X_n)$ be a random sample where $X_1 \sim N(\mu, 1)$, with $\mu \in \{0, 1\}$. Consider the null hypothesis $H_0 : \mu = 0$. On the basis of the random sample $X$, find the best test at significance level $\alpha$.

Solution. By Theorem 2, we want to find $k > 0$ so that $P_0(R < k) = \alpha$. Let $T = X_1 + \cdots + X_n$. We know that $T \sim N(n\mu, n)$ is a sufficient statistic for $\mu$; in particular, we know that
$$L(x; \mu) = g(T(x); \mu) h(x),$$
where $h$ does not depend on $\mu$ and $g(t; \mu)$ is the pdf of $T$. So
$$R(X) = \frac{g(T; 0)}{g(T; 1)} = \exp\Big[{-T} + \frac{n}{2}\Big].$$
Thus $R(X) < k$ if and only if $-T + \frac{n}{2} < \log k$, if and only if
$$Z := \frac{T}{\sqrt{n}} > \frac{\frac{n}{2} - \log k}{\sqrt{n}} =: c(k).$$
Notice that under $H_0$, we have that $Z \sim N(0, 1)$. Choose $k$ so that $c(k) = z_\alpha$.

Exercise 7. Let $X = (X_1, \ldots, X_n)$ be a random sample where $X_1$ is an exponential random variable with mean $\mu \in \{2, 3\}$. Consider the null hypothesis $H_0 : \mu = 2$. On the basis of the random sample $X$, find the best test at significance level $\alpha$.
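A sketch of the resulting test for Exercise 6, checked by Monte Carlo (the choices $n = 9$, $\alpha = 0.05$ and the replication count are ours); the closed-form power $1 - \Phi(z_\alpha - \sqrt{n})$ is our addition, using that $Z \sim N(\sqrt{n}, 1)$ under $H_1$:

```python
# Exercise 6: the best test rejects H0: mu = 0 when Z = T/sqrt(n) > z_alpha.
import numpy as np
from scipy.stats import norm

n, alpha = 9, 0.05
z_alpha = norm.ppf(1 - alpha)

rng = np.random.default_rng(1)
X = rng.normal(loc=1.0, scale=1.0, size=(100_000, n))  # data under H1: mu = 1
Z = X.sum(axis=1) / np.sqrt(n)

print((Z > z_alpha).mean())                 # Monte Carlo power
print(1 - norm.cdf(z_alpha - np.sqrt(n)))   # exact power: 1 - Phi(z_alpha - sqrt(n))
```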

Proof of Theorem 2. Let $F(t) = P_0(R(X) \le t)$ be the cdf of $R(X)$ under $H_0$. We have that $\lim_{t \to -\infty} F(t) = 0$ and $\lim_{t \to \infty} F(t) = 1$ (assuming that $P_0(R(X) = \infty) = 0$). Recall that $F$ is right-continuous, so that $\lim_{t \to a^+} F(t) = F(a)$; however, it may not be left-continuous. We set
$$F(a^-) := \lim_{t \to a^-} F(t) = P_0(R(X) < a).$$
We have that $F(a^-) \le F(a)$ and $P_0(R(X) = a) = F(a) - F(a^-)$. Given $\alpha \in (0, 1)$, let $k > 0$ be a point such that
$$F(k^-) \le \alpha \le F(k).$$
(We may be forced to take $k = \infty$ if $P_0(R(X) = \infty) > 0$.) Set
$$\varphi(x) := \mathbf{1}[R(x) < k] + \frac{\alpha - F(k^-)}{P_0(R(X) = k)} \, \mathbf{1}[R(x) = k]$$
if $P_0(R(X) = k) > 0$; otherwise, set $\varphi(x) := \mathbf{1}[R(x) < k]$. Clearly, $E_0 \varphi(X) = \alpha$, so we have constructed a critical function with the two required properties. Moreover, our $\varphi$ has the property that it is constant when $R(x) = k$.

Suppose now that $\varphi$ is a critical function that satisfies the two properties; we will show that $\varphi$ is a best test at level $\alpha$. Let $\varphi'$ be another critical function with $E_0 \varphi'(X) \le \alpha$. Note that
$$[\varphi(x) - \varphi'(x)][L(x; \theta_0) - k L(x; \theta_1)] \le 0,$$
since the two factors, when nonzero, have opposite signs: if the first factor is positive then $\varphi(x) > 0$, so $R(x) \le k$ and the second factor is nonpositive; if the first factor is negative then $\varphi(x) < 1$, so $R(x) \ge k$ and the second factor is nonnegative. Write $dx = dx_1 \cdots dx_n$. In the case that the $X_i$ are continuous random variables, we have that
$$\int [\varphi(x) - \varphi'(x)][L(x; \theta_0) - k L(x; \theta_1)] \, dx \le 0.$$
This gives us a bound on the difference of the powers, since it implies that
$$\beta_\varphi(\theta_1) - \beta_{\varphi'}(\theta_1) = \int [\varphi(x) - \varphi'(x)] L(x; \theta_1) \, dx \ge \frac{1}{k} \int [\varphi(x) - \varphi'(x)] L(x; \theta_0) \, dx = \frac{1}{k}\big[\alpha - E_0 \varphi'(X)\big] \ge 0.$$
In the discrete case, one replaces the integrals by sums.

Finally, suppose $\varphi'$ is a most powerful test. Let $\varphi$ be a most powerful test satisfying the two conditions. We will show that $\varphi$ and $\varphi'$ are equal on the set $\{x : R(x) \ne k\}$. Towards a contradiction, let
$$D = \{x : \varphi(x) - \varphi'(x) \ne 0\} \cap \{x : R(x) \ne k\}.$$
For the continuous case, assume that $D$ has positive Lebesgue measure; that is, $\int \mathbf{1}[x \in D] \, dx > 0$. Now we have that, for $x \in D$,
$$[\varphi(x) - \varphi'(x)][L(x; \theta_0) - k L(x; \theta_1)] < 0,$$
from which we deduce that $\varphi$ is strictly more powerful than $\varphi'$, a contradiction. In the discrete case, we need only assume that $D$ is non-empty for a similar contradiction. Thus $\varphi'$ satisfies the second condition. In order to argue that $E_0 \varphi'(X) = \alpha$, we note that if $E_0 \varphi'(X) < \alpha$ and the power is less than $1$, then we could include more points to be (randomly) rejected, thereby increasing the power (this is Exercise 9 below). Thus we must have that the power is $1$ or the size is $\alpha$. $\square$
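To see the optimality claim concretely (our illustration, not from the notes), one can exhaustively check in Exercise 3 that no other non-randomized test of size at most $3/13$ beats $C = \{5, 6, 7\}$:

```python
# Exhaustively confirm that C = {5, 6, 7} maximizes power among all
# non-randomized tests of size at most 3/13 in Exercise 3.
from fractions import Fraction as F
from itertools import combinations

f0 = {x: F(1, 13) for x in range(13)}
f1 = {x: F(x, 36) if x <= 6 else F(1, 3) - F(x, 36) for x in range(13)}

best = max(
    (C for r in range(4) for C in combinations(range(13), r)
     if sum(f0[x] for x in C) <= F(3, 13)),
    key=lambda C: sum(f1[x] for x in C),
)
print(best, sum(f1[x] for x in best))   # (5, 6, 7), power 4/9
```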

Exercise 8. Find an example where the power is $1$ and the size is not $1$.

Exercise 9. Let $X = (X_1, \ldots, X_n)$ be a random sample from $f_\theta$, where $\theta \in \{\theta_0, \theta_1\}$. Consider the null hypothesis $H_0 : \theta = \theta_0$. Let $\alpha \in (0, 1)$. Suppose $\varphi$ is a critical function for which $E_0 \varphi(X) < \alpha$ and $E_1 \varphi(X) < 1$. Show that there exists a critical function $\varphi'$ with $\varphi' \ge \varphi$, $E_0 \varphi'(X) \le \alpha$, and $E_1 \varphi'(X) > E_1 \varphi(X)$.

Corollary 10. In the context of Theorem 2, if $b$ is the power of a most powerful test at level $\alpha \in (0, 1)$, then $\alpha < b$, unless we are in the trivial case that $f_{\theta_0} = f_{\theta_1}$.

Proof of Corollary 10. Consider the test which ignores the data, where $D(x) = \alpha$ for all $x$. Clearly, $E_0 D(X) = \alpha$ and $E_1 D(X) = \alpha$, so the critical function $D$ gives a test of size $\alpha$ with power $\alpha$. Hence we must have $\alpha \le b$. Moreover, if $\alpha = b$, then $D$ is also a most powerful test, so $D$ satisfies the second condition of Theorem 2; since $D \equiv \alpha \in (0, 1)$, this forces the condition that $L(X; \theta_0) = k L(X; \theta_1)$ for some $k$, which in turn forces $k = 1$, since both likelihoods integrate to one; from this we deduce that $f_{\theta_0} = f_{\theta_1}$. $\square$

Sometimes a test with the property that the significance level is no greater than the power is called unbiased. Corollary 10 gives that a best test is unbiased.
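Corollary 10 can be seen concretely in Exercise 5 (our illustration): there the best test rejects when $X > 1 - \alpha$, and under $\theta = 2$ we have $P_1(X \le t) = t^2$, so the power is $1 - (1 - \alpha)^2 = \alpha + \alpha(1 - \alpha) > \alpha$ for every $\alpha \in (0, 1)$.

```python
# Check that the power 1 - (1 - alpha)^2 of the best test in Exercise 5
# strictly exceeds the level alpha, i.e., the test is unbiased.
import numpy as np

alphas = np.linspace(0.01, 0.99, 99)
power = 1 - (1 - alphas) ** 2
print(bool(np.all(power > alphas)))   # True
```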