Chapter 4. Theory of Tests

4.1 Introduction

Parametric model: $(\mathcal{X}, \mathcal{B}_{\mathcal{X}}, P_\theta)$, $P_\theta \in \mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$, where
$$\Theta = H_0 + H_1, \qquad \mathcal{X} = K + A,$$
with $K$ the critical region (= rejection region) and $A$ the acceptance region. A decision rule $d$, where
$$d(x) = \begin{cases} d_K, & \text{if } x \in K \\ d_A, & \text{if } x \in A, \end{cases}$$
is called a non-randomised test. One tries to choose $K$ in such a way that the number of wrong decisions becomes as small as possible. We distinguish:

Type I error: $H_0$ is correct, but is rejected (decision $d_K$).
Type II error: $H_1$ is correct, but the decision is for $H_0$ (decision $d_A$).

                     $H_0$ correct     $H_1$ correct
Decision for $H_0$   correct           Type II error
Decision for $H_1$   Type I error      correct

Given a bound $\alpha$ (the significance level) on the probability of committing a Type I error, one tries to find a test which minimizes the probability of a Type II error.
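To make the two error types concrete, the following is a minimal simulation sketch under an assumed example not taken from the notes: $X_1, \ldots, X_n \sim N(\theta, 1)$, $H_0: \theta = 0$ against $H_1: \theta = 0.5$, rejecting when $\bar{X}$ exceeds a critical value $c$ calibrated to $\alpha = 0.05$.

```python
# Minimal sketch (assumed example): Monte Carlo estimates of the Type I and
# Type II error probabilities of the test "reject H0 iff xbar > c" for
# X_1,...,X_n ~ N(theta, 1), H0: theta = 0 against H1: theta = 0.5.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, theta1 = 25, 0.05, 0.5
c = stats.norm.ppf(1 - alpha) / np.sqrt(n)    # critical value for xbar under H0

def rejection_rate(theta, reps=100_000):
    xbar = rng.normal(theta, 1.0, size=(reps, n)).mean(axis=1)
    return (xbar > c).mean()

print("P(Type I error)  ~", rejection_rate(0.0))        # close to alpha = 0.05
print("P(Type II error) ~", 1 - rejection_rate(theta1))
```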

Def. 4.1.1:

(a) A measurable function $\varphi: \mathcal{X} \to [0, 1]$ is called a test function (a test). $\varphi(x)$ is the probability for the decision $d_K$ if $x$ is the sample outcome.

(b) $\varphi$ is called an $\alpha$-level test if
$$\sup_{\theta \in H_0} E_\theta[\varphi(X)] \le \alpha. \qquad (4.1)$$

(c) The probability of rejecting $H_0$ if $P_\theta$ is the underlying distribution,
$$\beta_\varphi(\theta) := P_\theta(d_K) = E_\theta[\varphi(X)],$$
is called the power of the test. $\beta_\varphi: \Theta \to [0, 1]$ is the power function of the test $\varphi$. The left-hand side of (4.1) is called the size of the test $\varphi$.

(d) If $\Phi_\alpha$ is the set of all $\alpha$-level tests for the test problem $(H_0, H_1)$, then $\varphi_0 \in \Phi_\alpha$ is a most powerful test (MP test) for an alternative $\theta \in H_1$ if
$$\beta_{\varphi_0}(\theta) \ge \beta_\varphi(\theta) \quad \forall \varphi \in \Phi_\alpha,$$
and $\varphi^* \in \Phi_\alpha$ is a uniformly most powerful test (UMP test) for $H_0$ against $H_1$ of level $\alpha$ if
$$\beta_{\varphi^*}(\theta) = \sup_{\varphi \in \Phi_\alpha} \beta_\varphi(\theta) \quad \forall \theta \in H_1. \qquad (4.2)$$

(e) A test $\varphi \in \Phi_\alpha$ is called unbiased if
$$\beta_\varphi(\theta) \ge \alpha \quad \forall \theta \in H_1. \qquad (4.3)$$

(f) A test satisfying (4.1), (4.2) and (4.3) is called a uniformly most powerful unbiased (UMPU) $\alpha$-level test.
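As an illustration of size and power (continuing the assumed normal example above, not from the notes), the power function of the test "reject iff $\bar X > c$" is available in closed form, and its supremum over $H_0: \theta \le 0$ is attained at the boundary $\theta = 0$:

```python
# Minimal sketch (assumed example): power function beta(theta) of the test
# "reject iff xbar > c" for X_i ~ N(theta, 1); the size is the supremum of
# beta over H0: theta <= 0, attained here at theta = 0.
import numpy as np
from scipy import stats

n, alpha = 25, 0.05
c = stats.norm.ppf(1 - alpha) / np.sqrt(n)

def power(theta):
    # beta(theta) = P_theta(xbar > c), where xbar ~ N(theta, 1/n)
    return stats.norm.sf(np.sqrt(n) * (c - theta))

for theta in [-0.2, 0.0, 0.2, 0.5]:
    print(f"beta({theta:+.1f}) = {power(theta):.4f}")
# beta is nondecreasing, so size = sup_{theta <= 0} beta(theta) = beta(0) = alpha
```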

4.2 Test of a Simple Hypothesis against a Simple Alternative

In this section $\Theta = \{\theta_0, \theta_1\}$, $H_0 = \{\theta_0\}$, $H_1 = \{\theta_1\}$. In this case there always exists a dominating measure, e.g. the measure $\mu = P_{\theta_0} + P_{\theta_1}$. The densities are denoted by $f(\cdot\,; \theta_0) = f_0$, $f(\cdot\,; \theta_1) = f_1$.

Theorem 4.2.1 (Fundamental Lemma of Neyman and Pearson):

(a) Any test of the form
$$\varphi(x) = \begin{cases} 1, & \text{if } f_1(x) > k f_0(x) \\ \gamma(x), & \text{if } f_1(x) = k f_0(x) \\ 0, & \text{if } f_1(x) < k f_0(x) \end{cases} \qquad (4.4)$$
with $k \ge 0$, $0 \le \gamma(x) \le 1$, is a most powerful test of its size $\alpha$, $0 \le \alpha \le 1$, for $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$. For $k = \infty$, the test
$$\varphi(x) = \begin{cases} 1, & \text{if } f_0(x) = 0 \\ 0, & \text{if } f_0(x) > 0 \end{cases} \qquad (4.5)$$
is an MP test of its size $\alpha = 0$ for $H_0$ against $H_1$.

(b) For each level $\alpha$ with $0 \le \alpha \le 1$ there exists a test of the form (4.4) or (4.5) with $E_{\theta_0}[\varphi(X)] = \alpha$. Here $\gamma(x) \equiv \gamma$ (a constant). The constants $k$ and $\gamma$, $0 \le \gamma \le 1$, are determined by
$$\alpha = E_{\theta_0}[\varphi(X)] = P_{\theta_0}(f_1(X) > k f_0(X)) + \gamma\, P_{\theta_0}(f_1(X) = k f_0(X)). \qquad (4.6)$$

Remark: A test of type (4.4) is called a Neyman-Pearson test with accompanying number $k$.
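For a discrete family the randomisation constant $\gamma$ in (4.6) is genuinely needed. The following sketch, under an assumed binomial example not taken from the notes, exploits that $f_1(x)/f_0(x)$ is increasing in $x$ here, so (4.4) rejects for large $x$, and solves (4.6) for the critical value and $\gamma$:

```python
# Minimal sketch (assumed example): Neyman-Pearson test of H0: p = 0.3 against
# H1: p = 0.6 for X ~ Binomial(n, p). The likelihood ratio f1(x)/f0(x) is
# increasing in x, so the test (4.4) rejects for x > c and randomises at x = c;
# (4.6) determines c and gamma.
from scipy import stats

n, p0, p1, alpha = 20, 0.3, 0.6, 0.05

# smallest c with P_{p0}(X > c) <= alpha
c = min(x for x in range(n + 1) if stats.binom.sf(x, n, p0) <= alpha)
gamma = (alpha - stats.binom.sf(c, n, p0)) / stats.binom.pmf(c, n, p0)

size = stats.binom.sf(c, n, p0) + gamma * stats.binom.pmf(c, n, p0)
power = stats.binom.sf(c, n, p1) + gamma * stats.binom.pmf(c, n, p1)
print(f"c = {c}, gamma = {gamma:.4f}, size = {size:.4f}, power = {power:.4f}")
```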

Further remarks:

- As the reasoning in the proof shows, the case $\mu\{f_1 = k f_0\} = 0$ leads to a non-randomized test.
- Since the trivial $\alpha$-level test $\varphi^* \equiv \alpha$ with $E_{\theta_0}[\varphi^*(X)] = E_{\theta_1}[\varphi^*(X)] = \alpha$ does not have the form (4.4), it follows that $E_{\theta_1}[\varphi(X)] \ge \alpha$, which means that a Neyman-Pearson test is unbiased.
- If there exists a sufficient statistic $S$ for the family $\{f_0, f_1\}$, then the NP test is a function of $S$.

4.3 Families with Monotone Likelihood Ratio

In this section we consider the problem of testing one-sided hypotheses for $\Theta \subseteq \mathbb{R}$ an interval. In the sequel let $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\} \ll \mu$, and assume that the $\mu$-densities satisfy $f(\cdot\,; \theta) > 0$ $\mu$-a.e. for all $\theta \in \Theta$.

Def. 4.3.1: We say that the family $\mathcal{P}$ has a monotone likelihood ratio (MLR) in the statistic $T(X)$ if for $\theta_1 < \theta_2$ with $f(\cdot\,; \theta_1) \ne f(\cdot\,; \theta_2)$, the ratio $f(x; \theta_2)/f(x; \theta_1)$ is a nondecreasing (nonincreasing) function of $T(x)$ on $\{x \in \mathcal{X} \mid f(x; \theta_1) > 0 \text{ or } f(x; \theta_2) > 0\}$.

Theorem 4.3.1: The family $\mathcal{E}_1$ with density $f(x; \theta) = C(\theta) \exp\{Q(\theta) T(x)\} h(x)$ has, for $Q$ nondecreasing (nonincreasing), a monotone likelihood ratio in $T(X)$.

Remark: With the reparametrization $\lambda := Q(\theta)$ this property can always be achieved.

Theorem 4.3.2: Let the family $\mathcal{F} = \{f(\cdot\,; \theta) \mid \theta \in \Theta\}$ have a monotone likelihood ratio in $T(x)$. For testing $H_0: \theta \le \theta_0$ against $H_1: \theta > \theta_0$, any test of the form

$$\varphi(x) = \begin{cases} 1, & \text{if } T(x) > c \\ \gamma, & \text{if } T(x) = c \\ 0, & \text{if } T(x) < c \end{cases} \qquad (4.7)$$
has a nondecreasing power function and is UMP of its size $E_{\theta_0}[\varphi(X)] = \alpha$ (provided the size $\alpha > 0$).

Remark: For symmetry reasons Theorem 4.3.2 also yields a UMP test for the test problem $H_0: \theta \ge \theta_0$ against $H_1: \theta < \theta_0$.

In general, the results of the above theorem cannot be extended to two-sided problems. One exception is the family $\mathcal{E}_1$:

Theorem 4.3.3: For the family $\mathcal{E}_1$ there exists a UMP test of the hypothesis $H_0: \theta \le \theta_1$ or $\theta \ge \theta_2$ ($\theta_1 < \theta_2$) against $H_1: \theta_1 < \theta < \theta_2$ that is of the form
$$\varphi(x) = \begin{cases} 1, & \text{if } c_1 < T(x) < c_2 \\ \gamma_i, & \text{if } T(x) = c_i,\ i = 1, 2 \\ 0, & \text{if } T(x) < c_1 \text{ or } T(x) > c_2, \end{cases}$$
where the $c$'s and the $\gamma$'s are given by $E_{\theta_1}[\varphi(X)] = E_{\theta_2}[\varphi(X)] = \alpha$.

Remark: UMP tests for $H_0: \theta_1 \le \theta \le \theta_2$ or $H_0: \theta = \theta_0$ do not exist, even in the family $\mathcal{E}_1$.
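As a concrete instance of Theorem 4.3.2 (an assumed Poisson example, not from the notes): the family $\{\text{Poisson}(\theta)\}$ has MLR in $T(x) = \sum_i x_i$, so the test (4.7) rejecting for large $T$ is UMP for $H_0: \theta \le \theta_0$, with $c$ and $\gamma$ again fixed by the size condition.

```python
# Minimal sketch (assumed example): UMP test of H0: theta <= 2 against
# H1: theta > 2 for X_1,...,X_n iid Poisson(theta). The family has MLR in
# T = sum(X_i), and T ~ Poisson(n*theta), so the test (4.7) rejects for large T.
from scipy import stats

n, theta0, alpha = 10, 2.0, 0.05
mu0 = n * theta0                       # mean of T at the boundary theta = theta0

c = min(t for t in range(10 * int(mu0)) if stats.poisson.sf(t, mu0) <= alpha)
gamma = (alpha - stats.poisson.sf(c, mu0)) / stats.poisson.pmf(c, mu0)
print(f"c = {c}, gamma = {gamma:.4f}")

def power(theta):                      # nondecreasing in theta (Theorem 4.3.2)
    mu = n * theta
    return stats.poisson.sf(c, mu) + gamma * stats.poisson.pmf(c, mu)

for theta in [1.5, 2.0, 2.5, 3.0]:
    print(f"beta({theta}) = {power(theta):.4f}")
```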

4.4 Unbiased Tests

We already encountered unbiased tests in Def. 4.1.1. They have the property that $\beta_\varphi(\theta) \le \alpha$ for $\theta \in \Theta_0$ and $\beta_\varphi(\theta) \ge \alpha$ for $\theta \in \Theta_1$.

4.4.1 α-Similar Tests

Def. 4.4.1:

(1) Let $U_\alpha \subseteq \Phi_\alpha$ be the class of all unbiased size-$\alpha$ tests of $H_0$.

(2) A test $\varphi$ is said to be $\alpha$-similar on a subset $\Theta' \subseteq \Theta$ if $\beta_\varphi(\theta) = E_\theta[\varphi(X)] = \alpha$ for $\theta \in \Theta'$.

(3) A test is said to be similar on a set $\Theta' \subseteq \Theta$ if it is $\alpha$-similar for some $\alpha \in [0, 1]$.

Theorem 4.4.1: Let $\beta_\varphi(\theta)$ be continuous in $\theta$ for every $\varphi$. If $\varphi \in U_\alpha$ for $H_0$ against $H_1$, then it is $\alpha$-similar on the boundary $\Lambda = \bar{\Theta}_0 \cap \bar{\Theta}_1$.

Def. 4.4.2: A test $\varphi$ that is UMP among all $\alpha$-similar tests on the boundary $\Lambda$ is said to be a UMP $\alpha$-similar test.

Theorem 4.4.2: Let the power function $\beta_\varphi$ of every test $\varphi$ of $H_0$ against $H_1$ be continuous in $\theta$. Then a UMP $\alpha$-similar test is UMP unbiased, provided its size is $\alpha$.

Remark: The continuity of $\beta_\varphi$ is not always easy to show.

4.4.2 Locally MP Unbiased Tests

To test the hypothesis $H_0: \theta = \theta_0$ we try to find a locally optimal unbiased test which, in a neighbourhood of $\theta_0$, fulfils the following conditions:

(0) $\beta_\varphi$ is twice continuously differentiable with respect to $\theta$;
(1) $\beta_\varphi(\theta_0) = \alpha$;
(2) $\beta_\varphi'(\theta_0) = 0$;
(3) $\beta_\varphi''(\theta_0)$ is maximal.

Theorem 4.4.3 (Locally MP Unbiased Tests): Let $f_\theta \in \mathcal{F} = \{f_\theta \mid \theta \in \Theta\}$ be twice continuously differentiable in $\theta$, and write $\dot{f}$, $\ddot{f}$ for the first and second derivatives with respect to $\theta$. If the power function of the test $\varphi$ given by
$$\varphi(x; k_0, k_1, \gamma) = \begin{cases} 1, & \ddot{f}(x; \theta_0) > k_0 f(x; \theta_0) + k_1 \dot{f}(x; \theta_0) \\ \gamma, & \ddot{f}(x; \theta_0) = k_0 f(x; \theta_0) + k_1 \dot{f}(x; \theta_0) \\ 0, & \ddot{f}(x; \theta_0) < k_0 f(x; \theta_0) + k_1 \dot{f}(x; \theta_0) \end{cases}$$
fulfils the conditions (0), (1) and (2), then (3) is fulfilled as well.

The question whether one can always find constants $k_0$, $k_1$ and $\gamma$ such that (1) and (2) hold remains open, except for exponential families.

Theorem 4.4.4: Let $\mathcal{P} = \mathcal{E}_1$ with $\mu$-density $f(x; \theta) = C(\theta) e^{\theta T(x)} h(x)$. If the power function of the test
$$\varphi(x; T_1, T_2, \gamma) = \begin{cases} 1, & T(x) \notin [T_1, T_2] \\ \gamma, & T(x) = T_1 \text{ or } T(x) = T_2 \\ 0, & T(x) \in (T_1, T_2) \end{cases} \qquad (4.8)$$
fulfils (1) and (2), then also (3).

4.4.3 UMP Unbiased Tests in One-Parameter Exponential Families

Ref.: Lehmann, Testing Statistical Hypotheses (1997), pp. 134 ff.

In 4.3 we have seen that UMP tests exist for the hypotheses

(i) $H_0: \theta \le \theta_0$ against $H_1: \theta > \theta_0$, and
(ii) $H_0: \theta \le \theta_1$ or $\theta \ge \theta_2$ against $H_1: \theta_1 < \theta < \theta_2$,

but not for

(iii) $H_0: \theta_1 \le \theta \le \theta_2$ against $H_1: \theta < \theta_1$ or $\theta > \theta_2$.

Theorem 4.4.5: Let $\mathcal{P} = \mathcal{E}_1$ with $\mu$-density $f(x; \theta) = C(\theta) e^{\theta T(x)} h(x)$. Then for (iii) there exists a UMP unbiased test, which is given by (4.8), where the constants $T_1$, $T_2$ and $\gamma$ are determined by
$$E_{\theta_1}[\varphi(X)] = E_{\theta_2}[\varphi(X)] = \alpha. \qquad (4.9)$$
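The two size conditions (4.9) can be solved numerically. A minimal sketch under an assumed normal-mean example (not from the notes): for $X_1, \ldots, X_n \sim N(\theta, 1)$ we have $T = \sum_i X_i \sim N(n\theta, n)$; $T$ is continuous, so $\gamma$ plays no role and only $T_1, T_2$ must be found.

```python
# Minimal sketch (assumed example): numerically solve (4.9) for the UMPU test
# of H0: theta1 <= theta <= theta2 in the normal model X_i ~ N(theta, 1),
# where T = sum(X_i) ~ N(n*theta, n). T is continuous, so gamma is irrelevant.
import numpy as np
from scipy import stats, optimize

n, theta1, theta2, alpha = 25, -0.2, 0.2, 0.05
sd = np.sqrt(n)

def rejection_prob(theta, T1, T2):
    # E_theta[phi(X)] = P_theta(T < T1) + P_theta(T > T2)
    return stats.norm.cdf(T1, n * theta, sd) + stats.norm.sf(T2, n * theta, sd)

def equations(v):
    T1, T2 = v
    return [rejection_prob(theta1, T1, T2) - alpha,
            rejection_prob(theta2, T1, T2) - alpha]

T1, T2 = optimize.fsolve(equations, x0=[n * theta1 - 2 * sd, n * theta2 + 2 * sd])
print(f"T1 = {T1:.3f}, T2 = {T2:.3f}")
print("size at theta1:", rejection_prob(theta1, T1, T2))
print("size at theta2:", rejection_prob(theta2, T1, T2))
```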

4.4.4 Invariant Tests

Def. 4.4.3:

(1) A group $G$ of transformations on $\mathcal{X}$ leaves the hypothesis testing problem invariant if it leaves both $\{P_\theta \mid \theta \in \Theta_0\}$ and $\{P_\theta \mid \theta \in \Theta_1\}$ invariant.

(2) We say that $\varphi$ is invariant under $G$ if $\varphi(g(x)) = \varphi(x)$ for all $x \in \mathcal{X}$ and $g \in G$.

(3) A statistic $T$ is
(a) invariant under $G$ if $T(g(x)) = T(x)$ for all $x \in \mathcal{X}$ and $g \in G$;
(b) maximal invariant if, in addition, $T(x_1) = T(x_2)$ implies $x_1 = g(x_2)$ for some $g \in G$.

Def. 4.4.4: Let $\Phi I_\alpha$ denote the set of all invariant tests of size $\alpha$ with respect to $G$ for $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$. If there exists a UMP test in $\Phi I_\alpha$, then we call it a UMP invariant test of $H_0$ against $H_1$.

Theorem 4.4.6: Let $T$ be maximal invariant with respect to $G$. Then $\varphi$ is invariant under $G$ if and only if $\varphi$ is a function of $T$.

Remark: If a hypothesis testing problem is invariant under a group $G$, it suffices to restrict attention to functions of a maximal invariant statistic $T$.
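A small illustration of Theorem 4.4.6 (an assumed location-family example, not from the notes): under the translation group $g_c(x) = (x_1 + c, \ldots, x_n + c)$, the vector of differences $T(x) = (x_1 - x_n, \ldots, x_{n-1} - x_n)$ is a maximal invariant, so any invariant test depends on the data only through these differences.

```python
# Minimal sketch (assumed example): for the translation group g_c(x) = x + c
# on R^n, T(x) = (x_1 - x_n, ..., x_{n-1} - x_n) is a maximal invariant;
# a test that is a function of T is automatically invariant (Theorem 4.4.6).
import numpy as np

def T(x):
    return x[:-1] - x[-1]            # T(x + c) = T(x) for every shift c

def phi(x, k=2.0):
    # an invariant test: reject iff the spread of the differences is large
    return float(np.ptp(T(x)) > k)

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=5)
for c in [0.0, 3.7, -12.1]:
    print(f"shift c = {c:+6.1f}: T unchanged: {np.allclose(T(x + c), T(x))}, "
          f"phi = {phi(x + c)}")
```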

4.5 Likelihood Ratio Tests

Let $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\} \ll \mu$ and $\Theta = \Theta_0 + \Theta_1$. In many cases UMP tests do not exist, and where they exist, the approach can only be applied to particular families of distributions. The likelihood ratio (LR) test is an intuitive and plausible procedure which often leads to UMPU tests.

Def. 4.5.1: For testing $H_0$ against $H_1$, a test of the form: reject $H_0$ if and only if $\lambda(x) > c$, where $c$ is some constant and
$$\lambda(x) = \frac{\sup_{\theta \in \Theta} f(x_1, \ldots, x_n; \theta)}{\sup_{\theta \in \Theta_0} f(x_1, \ldots, x_n; \theta)} = \frac{f(x_1, \ldots, x_n; \hat{\theta}_{ML})}{f(x_1, \ldots, x_n; \tilde{\theta})},$$
is called a likelihood ratio test. Here $\hat{\theta}_{ML}$ is the unrestricted maximum likelihood estimator and $\tilde{\theta}$ is the ML estimator under the restriction $\theta \in \Theta_0$. The constant $c$ is determined from the size restriction
$$\sup_{\theta \in \Theta_0} P_\theta(\{x \mid \lambda(x) > c\}) = \alpha.$$

It is easily seen that for testing a simple hypothesis against a simple alternative at a given size $\alpha$ ($0 \le \alpha \le 1$), non-randomized Neyman-Pearson tests and LR tests are equivalent, if they exist; the LR test for $\theta \in \Theta_0$ against $\theta \in \Theta_1$ is a function of every sufficient statistic $S$ for $\theta$ (see Theorem 2.2.1 resp. 2.2.2).

Theorem 4.5.1: Let the regularity conditions of Theorem 3.2.5 (Cramér-Rao inequality) hold. Then under $H_0$ the statistic $2 \ln \lambda(X)$ is asymptotically distributed as a $\chi^2$ random variable with degrees of freedom equal to the difference between the number of independent parameters in $\Theta$ and the number in $\Theta_0$.
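A sketch of Theorem 4.5.1 in action, under an assumed exponential-rate example (not from the notes): testing $H_0: \theta = \theta_0$ in the Exponential($\theta$) model, where $2 \ln \lambda(X)$ is compared with the $\chi^2(1)$ distribution.

```python
# Minimal sketch (assumed example): LR test of H0: theta = theta0 for
# X_1,...,X_n iid Exponential(rate theta). Unrestricted MLE: theta_hat = 1/xbar;
# under H0, 2*ln(lambda) is asymptotically chi^2(1) (Theorem 4.5.1).
import numpy as np
from scipy import stats

def loglik(theta, x):
    return len(x) * np.log(theta) - theta * x.sum()

rng = np.random.default_rng(2)
theta0, n = 1.0, 200
x = rng.exponential(scale=1 / 1.3, size=n)   # true rate 1.3, so H0 is false

theta_hat = 1 / x.mean()                     # unrestricted MLE
lr_stat = 2 * (loglik(theta_hat, x) - loglik(theta0, x))
p_value = stats.chi2.sf(lr_stat, df=1)
print(f"theta_hat = {theta_hat:.3f}, 2 ln(lambda) = {lr_stat:.3f}, p = {p_value:.4f}")
```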

4.6 Asymptotic Tests

For 4.6.1-4.6.3 see Buse, The American Statistician, 1982, 36, pp. 153-157.

Let $\Theta \subseteq \mathbb{R}^k$ and $H_0: h(\theta) = 0$, where $h: \mathbb{R}^k \to \mathbb{R}^r$ ($r \le k$).

4.6.1 Wald Test

See: Transactions of the American Mathematical Society 1943, pp. 426-482.

Let $R_\theta := \partial h(\theta)/\partial \theta^T$ with rank $R_\theta = r$, and
$$W = h(\hat{\theta}_{ML})^T \left[ R_{\hat{\theta}_{ML}} [I(\hat{\theta}_{ML})]^{-1} R_{\hat{\theta}_{ML}}^T \right]^{-1} h(\hat{\theta}_{ML}),$$
where $\hat{\theta}_{ML}$ is the unrestricted ML estimator. Under $H_0$, $W$ is asymptotically $\chi^2(r)$ distributed and the test is of the form
$$\varphi(x) = \begin{cases} 1, & W > c \\ 0, & W \le c \end{cases}$$
for a certain constant $c$.

4.6.2 Lagrange Multiplier Test

It is based on the Lagrange multiplier approach:
$$\Phi(\theta; \eta) = l(\theta) + \eta^T h(\theta),$$
where $\eta$ is the Lagrange multiplier and $l$ the log-likelihood. Let $\hat{\theta}^{(r)} = \arg\sup_{\theta \in H_0} L(\theta; X_1, \ldots, X_n)$ be the restricted ML estimator. The test statistic is
$$LM = \Psi(\hat{\theta}^{(r)})^T [I(\hat{\theta}^{(r)})]^{-1} \Psi(\hat{\theta}^{(r)}),$$
where $\Psi(\hat{\theta}^{(r)})$ is the score function $\partial \ln f_\theta / \partial \theta$ evaluated at $\theta = \hat{\theta}^{(r)}$. Under $H_0$, $LM$ has an asymptotic $\chi^2(r)$ distribution and the test has the same form as in 4.6.1.

4.6.3 Likelihood Ratio Test

It works as described in 4.5, where for determining the constant $c$ the asymptotic $\chi^2(r)$ distribution is used.
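To see the three statistics side by side, here is a sketch under an assumed Bernoulli example (not from the notes) for $H_0: h(\theta) = \theta - \theta_0 = 0$; each of $W$, $LM$ and $2\ln\lambda$ is asymptotically $\chi^2(1)$ under $H_0$.

```python
# Minimal sketch (assumed example): Wald, LM (score) and LR statistics for
# H0: theta = theta0 in the Bernoulli(theta) model; each is asymptotically
# chi^2(1) under H0. Here I(theta) = n / (theta*(1-theta)) is the Fisher
# information of the whole sample.
import numpy as np
from scipy import stats

theta0, n, s = 0.5, 400, 226          # s = number of successes (assumed data)
theta_hat = s / n                     # unrestricted MLE

def loglik(t):
    return s * np.log(t) + (n - s) * np.log(1 - t)

info = lambda t: n / (t * (1 - t))                      # Fisher information
score = lambda t: s / t - (n - s) / (1 - t)             # Psi(theta)

W  = (theta_hat - theta0) ** 2 * info(theta_hat)        # Wald
LM = score(theta0) ** 2 / info(theta0)                  # Lagrange multiplier
LR = 2 * (loglik(theta_hat) - loglik(theta0))           # 2 ln(lambda)

for name, stat in [("Wald", W), ("LM", LM), ("LR", LR)]:
    print(f"{name:4s} = {stat:6.3f}, p = {stats.chi2.sf(stat, df=1):.4f}")
```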

4.7 Goodness of Fit Tests

We consider the general testing problem $H_0: P \in \mathcal{P}_0$ against $H_1: P \in \mathcal{P}_1$, where $\mathcal{P} = \mathcal{P}_0 + \mathcal{P}_1$.

Def. 4.7.1: A sequence of tests $(\varphi_n(X))_{n \in \mathbb{N}}$ is called consistent for the testing problem $H_0: P \in \mathcal{P}_0$ against $H_1: P \in \mathcal{P}_1$ if
$$\lim_{n \to \infty} \beta_{\varphi_n}(P) = \begin{cases} 1, & P^X \in \mathcal{P}_1 \\ 0, & P^X \in \mathcal{P}_0. \end{cases}$$

For consistent tests the power function converges to the ideal power function, which for the one-sided problem $H_0: \theta \le \theta_0$ against $H_1: \theta > \theta_0$ is given by the Heaviside function
$$H_{\theta_0}(\theta) = \begin{cases} 1, & \theta > \theta_0 \\ 0, & \theta \le \theta_0. \end{cases}$$

First we look at the (two-sided) testing problem $H_0: P^X = P_0$ with c.d.f. $F_0$ against $H_1: P^X \ne P_0$. Here the class $\mathcal{P}$ is the class of all distributions with densities with respect to Lebesgue measure. Under $H_0$ the c.d.f. $F_0$ has to be specified completely. According to the Glivenko-Cantelli lemma (Theorem 1.2) the empirical c.d.f. $F_n$ converges almost surely uniformly to $F_0$. For the maximal difference
$$\Delta_n := \sup_{x \in \mathbb{R}} |F_n(x) - F_0(x)|$$
the following result holds:

Theorem 4.7.1 (Kolmogorov): Let the c.d.f. $F_0$ be continuous. Then
$$\lim_{n \to \infty} P(\sqrt{n}\, \Delta_n \le z) = H(z),$$
where
$$H(z) = \begin{cases} \sum_{k=-\infty}^{\infty} (-1)^k e^{-2 k^2 z^2}, & z > 0 \\ 0, & z \le 0. \end{cases}$$

The limit distribution $H$ obviously does not depend on $F_0$. Hence the asymptotic test
$$\varphi(x) = \begin{cases} 1, & \sqrt{n}\, \Delta_n > k \\ 0, & \sqrt{n}\, \Delta_n \le k \end{cases}$$
with $k$ equal to the $(1 - \alpha)$ quantile of $H$ is distribution-free. The Glivenko-Cantelli lemma ensures consistency.
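A sketch of the Kolmogorov test under an assumed standard-normal null (not from the notes), computing $\Delta_n$ once by hand from the empirical c.d.f. and once via the packaged routine:

```python
# Minimal sketch (assumed example): Kolmogorov goodness-of-fit test of
# H0: F = N(0,1). Delta_n is computed from the empirical c.d.f.;
# scipy.stats.kstest returns the same statistic together with a p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = np.sort(rng.standard_t(df=3, size=500))   # data actually t_3, so H0 is false
n = len(x)

F0 = stats.norm.cdf(x)
ecdf_hi = np.arange(1, n + 1) / n             # F_n(x_(i))
ecdf_lo = np.arange(0, n) / n                 # F_n(x_(i)-)
delta_n = max(np.max(ecdf_hi - F0), np.max(F0 - ecdf_lo))
print("sqrt(n) * Delta_n =", np.sqrt(n) * delta_n)

stat, p = stats.kstest(x, "norm")
print(f"kstest: Delta_n = {stat:.4f}, p = {p:.4g}")
```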

A further asymptotic goodness of fit test is the so-called $\chi^2$ goodness of fit test. It is based on a comparison between observed and, under $H_0$, expected frequencies. The starting point is the following asymptotic result.

Theorem 4.7.2: Let the random vector $(X_1, \ldots, X_k)$ have a multinomial $M(n; p_1, \ldots, p_k)$ distribution with $0 < p_i < 1$, $\sum_{i=1}^k p_i = 1$. Then the statistic
$$X^2 = \sum_{i=1}^k \frac{(X_i - n p_i)^2}{n p_i}, \qquad X_1 + \ldots + X_k = n,$$
is asymptotically $\chi^2(k-1)$ distributed.

The $\chi^2$ test is
$$\varphi(x) = \begin{cases} 1, & X^2 > c \\ 0, & X^2 \le c, \end{cases}$$
where $c$ is the $(1 - \alpha)$ quantile of the $\chi^2(k-1)$ distribution.

Some rules of prudence are appropriate when this test is applied in practice. For $F_0$ continuous, an appropriate division of $\mathbb{R}$ into $k$ classes is necessary, such that $p_i = F_0(x_i) - F_0(x_{i-1})$. The $p_i$ should be approximately equal, and as a rule of thumb $n p_i \ge 5$ for all $i = 1, \ldots, k$ is recommended. The test is also applicable if the parameters of some $F_\theta$ are estimated by ML (with a corresponding reduction of the degrees of freedom).
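Finally, a sketch of the $\chi^2$ goodness-of-fit test under an assumed standard-normal null (not from the notes), following the rules above: approximately equal cell probabilities $p_i = 1/k$ and $n p_i \ge 5$.

```python
# Minimal sketch (assumed example): chi^2 goodness-of-fit test of H0: F = N(0,1)
# with k equiprobable classes, so p_i = 1/k and n*p_i = 20 >= 5 (rule of thumb).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=200)
n, k = len(x), 10

edges = stats.norm.ppf(np.arange(1, k) / k)        # k-1 interior class boundaries
observed = np.bincount(np.searchsorted(edges, x), minlength=k)
expected = np.full(k, n / k)                       # n * p_i under H0

X2 = np.sum((observed - expected) ** 2 / expected)
p_value = stats.chi2.sf(X2, df=k - 1)              # compare with chi^2(k-1)
print(f"X^2 = {X2:.3f}, p = {p_value:.4f}")
```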