Lecture 26: Likelihood ratio tests

Likelihood ratio

When both H0 and H1 are simple (i.e., Θ0 = {θ0} and Θ1 = {θ1}), Theorem 6.1 applies and a UMP test rejects H0 when f_θ1(X)/f_θ0(X) > c0 for some c0 > 0. The following definition is a natural extension of this idea.

Definition 6.2
Let l(θ) = f_θ(X) be the likelihood function. For testing H0: θ ∈ Θ0 versus H1: θ ∈ Θ1, a likelihood ratio (LR) test is any test that rejects H0 if and only if λ(X) < c, where c ∈ [0,1] and λ(X) is the likelihood ratio defined by

λ(X) = sup_{θ∈Θ0} l(θ) / sup_{θ∈Θ} l(θ).

UW-Madison (Statistics), Stat 710, Lecture 26, Jan 2012

Discussions

If λ(X) is well defined, then λ(X) ≤ 1. The rationale behind LR tests is that when H0 is true, λ(X) tends to be close to 1, whereas when H1 is true, λ(X) tends to be away from 1.

If there is a sufficient statistic, then λ(X) depends only on the sufficient statistic.

LR tests are as widely applicable as MLE's in §4.4 and, in fact, they are closely related to MLE's. If θ̂ is an MLE of θ and θ̂0 is an MLE of θ subject to θ ∈ Θ0 (i.e., Θ0 is treated as the parameter space), then λ(X) = l(θ̂0)/l(θ̂).

For a given α ∈ (0,1), if there exists a c_α ∈ [0,1] such that

sup_{θ∈Θ0} P_θ(λ(X) < c_α) = α,

then an LR test of size α can be obtained. Even when the c.d.f. of λ(X) is continuous or randomized LR tests are introduced, it is still possible that such a c_α does not exist.
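The relation λ(X) = l(θ̂0)/l(θ̂) is easy to check numerically. The following sketch (my own illustration, not from the lecture) computes the LR statistic for H0: μ = μ0 from an i.i.d. N(μ, 1) sample, where the unrestricted MLE is the sample mean; it also confirms that λ(X) depends on the data only through the sufficient statistic X̄:

```python
import math
import random

# Hypothetical illustration (not from the slides): the LR statistic for
# H0: mu = mu0 based on an i.i.d. N(mu, 1) sample.  The unrestricted MLE
# is the sample mean, and the restricted "MLE" under the simple H0 is
# mu0 itself, so lambda(X) = l(mu0) / l(xbar).

def log_lik(mu, xs):
    """Log-likelihood of N(mu, 1), up to the constant -(n/2) log(2*pi)."""
    return -0.5 * sum((x - mu) ** 2 for x in xs)

def lr_statistic(xs, mu0):
    xbar = sum(xs) / len(xs)  # unrestricted MLE of mu
    return math.exp(log_lik(mu0, xs) - log_lik(xbar, xs))

random.seed(0)
xs = [random.gauss(0.3, 1.0) for _ in range(50)]
lam = lr_statistic(xs, 0.0)

# lambda(X) depends on X only through the sufficient statistic xbar:
xbar = sum(xs) / len(xs)
closed_form = math.exp(-len(xs) * (xbar - 0.0) ** 2 / 2)
print(lam, closed_form)           # the two agree up to rounding
assert abs(lam - closed_form) < 1e-12
assert 0.0 <= lam <= 1.0          # lambda is well defined, hence <= 1
```

The closed form exp{−n(X̄ − μ0)²/2} makes explicit that λ(X) is a function of X̄ alone, as the sufficiency remark predicts.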

Optimality

When a UMP or UMPU test exists, an LR test is often the same as this optimal test.

Proposition 6.5
Suppose that X has a p.d.f. in a one-parameter exponential family:

f_θ(x) = exp{η(θ)Y(x) − ξ(θ)}h(x)

w.r.t. a σ-finite measure ν, where η is a strictly increasing and differentiable function of θ.
(i) For testing H0: θ ≤ θ0 versus H1: θ > θ0, there is an LR test whose rejection region is the same as that of the UMP test T given in Theorem 6.2.
(ii) For testing H0: θ ≤ θ1 or θ ≥ θ2 versus H1: θ1 < θ < θ2, there is an LR test whose rejection region is the same as that of the UMP test T given in Theorem 6.3.
(iii) For testing the other two-sided hypotheses, there is an LR test whose rejection region is equivalent to Y(X) < c1 or Y(X) > c2 for some constants c1 and c2.

Proof
We prove (i) only. Let θ̂ be the MLE of θ. Note that l(θ) is increasing when θ ≤ θ̂ and decreasing when θ > θ̂. Thus,

λ(X) = 1 if θ̂ ≤ θ0,  and  λ(X) = l(θ0)/l(θ̂) if θ̂ > θ0.

Then λ(X) < c is the same as θ̂ > θ0 and l(θ0)/l(θ̂) < c. From the property of exponential families, θ̂ is a solution of the likelihood equation

∂ log l(θ)/∂θ = η′(θ)Y(X) − ξ′(θ) = 0,

and ψ(θ) = ξ′(θ)/η′(θ) has a positive derivative ψ′(θ). Since η′(θ̂)Y − ξ′(θ̂) = 0, θ̂ is an increasing function of Y with dθ̂/dY > 0.

Proof (continued)
Consequently, for any θ0 ∈ Θ,

d/dY [log l(θ̂) − log l(θ0)]
  = d/dY [η(θ̂)Y − ξ(θ̂) − η(θ0)Y + ξ(θ0)]
  = (dθ̂/dY) η′(θ̂)Y + η(θ̂) − (dθ̂/dY) ξ′(θ̂) − η(θ0)
  = (dθ̂/dY) [η′(θ̂)Y − ξ′(θ̂)] + η(θ̂) − η(θ0)
  = η(θ̂) − η(θ0),

which is positive (or negative) if θ̂ > θ0 (or θ̂ < θ0), i.e., log l(θ̂) − log l(θ0) is strictly increasing in Y when θ̂ > θ0 and strictly decreasing in Y when θ̂ < θ0. Hence, for any c ∈ (0,1), "θ̂ > θ0 and l(θ0)/l(θ̂) < c" is equivalent to Y > d for some d ∈ ℝ.

Example 6.20
Consider the testing problem H0: θ = θ0 versus H1: θ ≠ θ0 based on i.i.d. X1,...,Xn from the uniform distribution U(0,θ). We now show that the UMP test with rejection region X_(n) > θ0 or X_(n) ≤ θ0 α^{1/n} given in Exercise 19(c) is an LR test. Note that l(θ) = θ^{−n} I_{(X_(n),∞)}(θ). Hence

λ(X) = (X_(n)/θ0)^n if X_(n) ≤ θ0,  and  λ(X) = 0 if X_(n) > θ0,

and λ(X) < c is equivalent to X_(n) > θ0 or X_(n)/θ0 < c^{1/n}. Taking c = α ensures that the LR test has size α.

Example 6.21
Consider the normal linear model X ~ N_n(Zβ, σ²I_n) and the hypotheses H0: Lβ = 0 versus H1: Lβ ≠ 0, where L is an s×p matrix of rank s ≤ r and all rows of L are in R(Z).
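Example 6.20's rejection region can be checked by simulation. The sketch below (an illustration I added; the constants θ0 = 2, n = 10 are arbitrary) draws U(0, θ0) samples under H0 and verifies that the region X_(n) > θ0 or X_(n) ≤ θ0 α^{1/n} rejects with probability close to α = 0.05:

```python
import random

# Monte Carlo check (my own sketch, not part of the slides) that the
# LR/UMP rejection region {X_(n) > theta0 or X_(n) <= theta0 * alpha^(1/n)}
# has size alpha under H0: theta = theta0 for U(0, theta) data.  Under H0
# the event X_(n) > theta0 has probability 0, and
# P(X_(n) <= theta0 * alpha^(1/n)) = (alpha^(1/n))^n = alpha exactly.
random.seed(1)
theta0, n, alpha, reps = 2.0, 10, 0.05, 200_000
cutoff = theta0 * alpha ** (1 / n)

rejections = 0
for _ in range(reps):
    x_max = max(random.uniform(0, theta0) for _ in range(n))  # X_(n) under H0
    if x_max > theta0 or x_max <= cutoff:
        rejections += 1

size_hat = rejections / reps
print(size_hat)   # close to alpha = 0.05
assert abs(size_hat - alpha) < 0.005
```

Because the size is exactly α here (the null distribution of X_(n)/θ0 is continuous), the Monte Carlo estimate fluctuates around 0.05 only by sampling noise.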

Example 6.21 (continued)
The likelihood function in this problem is

l(θ) = (2πσ²)^{−n/2} exp{ −‖X − Zβ‖²/(2σ²) },  θ = (β, σ²).

Since ‖X − Zβ‖² ≥ ‖X − Zβ̂‖² for any β and the LSE β̂,

l(θ) ≤ (2πσ²)^{−n/2} exp{ −‖X − Zβ̂‖²/(2σ²) }.

Treating the right-hand side of this expression as a function of σ², it is easy to show that it has a maximum at σ² = σ̂² = ‖X − Zβ̂‖²/n, and

sup_{θ∈Θ} l(θ) = (2πσ̂²)^{−n/2} e^{−n/2}.

Similarly, let β̂_{H0} be the LSE under H0 and σ̂²_{H0} = ‖X − Zβ̂_{H0}‖²/n. Then

sup_{θ∈Θ0} l(θ) = (2πσ̂²_{H0})^{−n/2} e^{−n/2}.

Thus,

λ(X) = (σ̂²/σ̂²_{H0})^{n/2} = ( ‖X − Zβ̂‖² / ‖X − Zβ̂_{H0}‖² )^{n/2}.
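The formula λ(X) = (‖X − Zβ̂‖²/‖X − Zβ̂_{H0}‖²)^{n/2} can be evaluated directly from two least squares fits. The following sketch (my own example; the simulated design matrix and the choice H0: β2 = 0, a special case of Lβ = 0 with L = (0, 1), are assumptions for illustration) computes λ by fitting the full and the restricted model:

```python
import numpy as np

# Sketch (assumed setup, not from the slides): in the normal linear model
# X ~ N_n(Z beta, sigma^2 I_n), the LR statistic is
#   lambda(X) = (||X - Z beta_hat||^2 / ||X - Z beta_hat_H0||^2)^(n/2).
# Here H0: beta_2 = 0 is imposed by dropping the second column of Z,
# which is the special case L = (0, 1) of L beta = 0.
rng = np.random.default_rng(0)
n = 40
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
X = Z @ np.array([1.0, 0.5]) + rng.normal(size=n)

def rss(design, y):
    """Residual sum of squares ||y - design @ beta_hat||^2 at the LSE."""
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ beta_hat) ** 2))

rss_full = rss(Z, X)         # n * sigma_hat^2, unrestricted fit
rss_null = rss(Z[:, :1], X)  # n * sigma_hat_H0^2, intercept-only fit
lam = (rss_full / rss_null) ** (n / 2)
print(lam)
assert 0.0 < lam <= 1.0
assert rss_full <= rss_null  # restricting the model cannot lower the RSS
```

The inequality rss_full ≤ rss_null is the finite-sample reason λ(X) ≤ 1: the supremum over Θ0 can never exceed the supremum over Θ.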

Example 6.21 (continued)
For a two-sample problem, we let n = n1 + n2, β = (μ1, μ2), and

Z = [ J_{n1}   0
        0    J_{n2} ],

where J_m denotes the m-vector of ones. Testing H0: μ1 = μ2 versus H1: μ1 ≠ μ2 is the same as testing H0: Lβ = 0 versus H1: Lβ ≠ 0 with L = (1, −1). Since β̂_{H0} = X̄ and β̂ = (X̄1, X̄2), where X̄1 and X̄2 are the sample means based on X1,...,X_{n1} and X_{n1+1},...,X_n, respectively, we have

n σ̂² = Σ_{i=1}^{n1} (X_i − X̄1)² + Σ_{i=n1+1}^{n} (X_i − X̄2)² = (n1 − 1)S1² + (n2 − 1)S2²

and

n σ̂²_{H0} = (n − 1)S² = (n1 n2 / n)(X̄1 − X̄2)² + (n1 − 1)S1² + (n2 − 1)S2².

Example 6.21 (continued)
Therefore, λ(X) < c is equivalent to |t(X)| > c0, where

t(X) = (X̄2 − X̄1) / sqrt{ (n1^{−1} + n2^{−1}) [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2) },

and LR tests are the same as the two-sample two-sided t-tests in §6.2.3.

Asymptotic tests

It is often difficult to construct tests (such as LR tests) with exact size α or level α. Asymptotic approximations can then be used. Statistical inference based on asymptotic criteria and approximations is called asymptotic statistical inference or, simply, asymptotic inference. We now focus on asymptotic hypothesis tests.
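The claimed equivalence between λ(X) < c and |t(X)| > c0 rests on the identity σ̂²_{H0}/σ̂² = 1 + t(X)²/(n − 2), so that λ(X) = (1 + t(X)²/(n1 + n2 − 2))^{−n/2} is strictly decreasing in |t(X)|. The sketch below (my own numerical check on simulated data; the sample sizes and means are arbitrary) verifies this identity:

```python
import math
import random

# Numerical check (my own sketch) of the two-sample equivalence on the
# slide: lambda(X) computed from its RSS definition equals
# (1 + t(X)^2 / (n1 + n2 - 2))^(-n/2), so small lambda is the same
# event as large |t|.
random.seed(2)
n1, n2 = 12, 15
x1 = [random.gauss(0.0, 1.0) for _ in range(n1)]
x2 = [random.gauss(0.8, 1.0) for _ in range(n2)]
n = n1 + n2

mean1, mean2 = sum(x1) / n1, sum(x2) / n2
s1sq = sum((x - mean1) ** 2 for x in x1) / (n1 - 1)
s2sq = sum((x - mean2) ** 2 for x in x2) / (n2 - 1)

# two-sample t statistic with the pooled variance estimator
pooled = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n - 2)
t = (mean2 - mean1) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# lambda from its definition: restricted fit uses the grand mean
rss_full = (n1 - 1) * s1sq + (n2 - 1) * s2sq       # n * sigma_hat^2
grand = (n1 * mean1 + n2 * mean2) / n
rss_null = sum((x - grand) ** 2 for x in x1 + x2)  # n * sigma_hat_H0^2
lam = (rss_full / rss_null) ** (n / 2)

lam_from_t = (1 + t ** 2 / (n - 2)) ** (-n / 2)
print(lam, lam_from_t)
assert abs(lam - lam_from_t) < 1e-10
```

Since the map t² ↦ (1 + t²/(n − 2))^{−n/2} is monotone, any cutoff c for λ translates into a cutoff c0 for |t|, which is exactly how the LR test reduces to the t-test.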

Definition 2.13
Let X = (X1,...,Xn) be a sample from P ∈ 𝒫 and T_n(X) be a test for H0: P ∈ 𝒫0 versus H1: P ∈ 𝒫1.
(i) If limsup_n α_{T_n}(P) ≤ α for any P ∈ 𝒫0, then α is an asymptotic significance level of T_n.
(ii) If lim_n sup_{P∈𝒫0} α_{T_n}(P) exists, it is called the limiting size of T_n.
(iii) T_n is consistent iff the type II error probability converges to 0.

Discussion

If 𝒫0 is not a parametric family, it is likely that the limiting size of T_n is 1 (see, e.g., Example 2.37). This is the reason why we consider the weaker requirement in Definition 2.13(i). If α ∈ (0,1) is a pre-assigned level of significance for the problem, then a consistent test T_n having asymptotic significance level α is called asymptotically correct, and a consistent test having limiting size α is called strongly asymptotically correct.
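To illustrate Definition 2.13 (an example I added, not from the lecture): by the CLT, the test that rejects H0: EX1 = 0 when √n |X̄|/S > z_{α/2} has asymptotic significance level α even for non-normal data, and it is consistent because its power at any fixed alternative tends to 1. A small simulation with (clearly non-normal) shifted exponential data:

```python
import math
import random

# Illustration of Definition 2.13 (assumed example, not from the slides):
# the test T_n = 1{ sqrt(n) |Xbar| / S > z_{alpha/2} } for H0: mean = 0
# has asymptotic significance level alpha by the CLT, and is consistent.
random.seed(3)
alpha, z = 0.05, 1.959964  # z_{0.025}, standard normal quantile

def rejects(xs):
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
    return math.sqrt(n) * abs(xbar) / s > z

def rej_rate(mean, n, reps=5_000):
    # exponential data recentered to the given mean: skewed, non-normal
    return sum(rejects([random.expovariate(1.0) - 1.0 + mean
                        for _ in range(n)]) for _ in range(reps)) / reps

size_hat = rej_rate(0.0, n=300)   # type I error rate under H0
power_hat = rej_rate(0.3, n=300)  # power at a fixed alternative
print(size_hat, power_hat)
assert abs(size_hat - alpha) < 0.02  # near alpha: asymptotic level
assert power_hat > 0.99              # type II error near 0: consistency
```

At moderate n the size is only approximately α (the skewness of the exponential distorts it slightly), which is precisely the situation the asymptotic notions of Definition 2.13 are designed to capture.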