Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata


Hypothesis Testing. "It is a mistake to confound strangeness with mystery." (Sherlock Holmes, A Study in Scarlet)

Outline. 1. Methods of evaluating tests: the power function. 2. Hypothesis testing for N(µ, σ²); the method of inversion. 3. Wald test; score test; application to the Pareto distribution. 4. p-values; application to the Poisson distribution.

Hypothesis testing is an inferential procedure used to weigh the evidence for or against a conjecture about a population. A test of a hypothesis is a process that uses sample statistics to test a claim about the value of a population parameter.

What is a Hypothesis? A hypothesis is a statement about a population parameter (for example, the population mean, a population proportion, or the population variance). The two complementary hypotheses in a hypothesis testing problem are called the null hypothesis and the alternative hypothesis. The claim about the parameter needs to be translated from a verbal statement into a mathematical statement, together with its complement.

The null hypothesis, denoted H0, is a hypothesis that is held to be true unless sufficient evidence to the contrary is obtained. The alternative hypothesis, denoted H1, is a new hypothesis to be tested: a hypothesis against which the null hypothesis is tested and which will be held to be true if the null is held false. A simple hypothesis completely determines the distribution of X; it specifies only one possible distribution for the sample. A composite hypothesis specifies more than one possible distribution for the sample.

What is a Hypothesis Test? Definition: A hypothesis testing procedure, or hypothesis test, is a rule that specifies for which sample values the decision is made to accept H0 as true, and for which sample values H0 is rejected and H1 is accepted as true.

Rejection Region. The subset of the sample space for which H0 will be rejected is called the rejection region or critical region. The complement of the rejection region is called the acceptance region. The rejection region R of a hypothesis test is usually defined through a test statistic W(X), a function of the sample: R = {x : W(x) > c} (reject H0), A = {x : W(x) ≤ c} (accept H0).

Types of Errors. Since the decision is based on a sample and not on the entire population, there is always the possibility of making the wrong decision. There are two types of errors. A Type I error occurs if H0 is rejected when it is actually true. A Type II error occurs if H0 is not rejected when it is actually false.

How can we go wrong?

              H0 is true      H0 is false
Accept H0     No error        Type II error
Reject H0     Type I error    No error

A Type I error is considered more serious; we do our best not to commit a Type I error.

Errors in Making Decisions. Type I error: rejecting a true null hypothesis; it has serious consequences; the probability of a Type I error is α, called the level of significance. Type II error: failing to reject a false null hypothesis; the probability of a Type II error is β.

Thought Questions. In the courtroom, juries must make a decision about the guilt or innocence of a defendant. Suppose you are on the jury in a murder trial. It is obviously a mistake if the jury claims the suspect is guilty when in fact he or she is innocent. What is the other type of mistake the jury could make? Which is more serious?

Decision Results. [figure omitted]

Example: N(µ, σ = 10), n = 100. Test H0: µ = 0 vs H1: µ = 2 with rejection region R = {x : x̄ > 0.8}.

[figures omitted: the sampling distribution of X̄ under H0 and H1, with shaded areas showing α and β, and the same example with σ = 100]

The Power Function. Definition: the power function of a hypothesis test with rejection region R is the function of θ defined by β(θ) = P_θ(X ∈ R). Example: let X1, ..., X10 be a random sample from a N(µ, σ² = 4) population, let H0: µ ≤ 0 and H1: µ > 0, and consider the test that rejects H0 when X̄ > 1. The power function is β(µ) = P(X̄ > 1 | µ) = P(Z > (1 − µ)/(2/√10)).
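As a quick numerical illustration (not part of the original slides), here is a minimal Python sketch, assuming scipy is available, that evaluates this power function at a few values of µ:

from scipy.stats import norm

n, sigma, cutoff = 10, 2.0, 1.0   # N(mu, sigma^2 = 4), reject H0 when Xbar > 1

def power(mu):
    # beta(mu) = P(Xbar > cutoff) with Xbar ~ N(mu, sigma^2 / n)
    se = sigma / n ** 0.5
    return norm.sf((cutoff - mu) / se)   # sf(z) = P(Z > z)

for mu in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"beta({mu}) = {power(mu):.4f}")

As expected for a one-sided test, β(µ) is increasing in µ.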

The Power Function. [figure omitted: plot of β(µ)]

α and β. Definition: for 0 ≤ α ≤ 1, a test with power function β(θ) is a size α test if sup_{θ∈Θ0} β(θ) = α. Definition: for 0 ≤ α ≤ 1, a test with power function β(θ) is a level α test if sup_{θ∈Θ0} β(θ) ≤ α.

Example: computation of α and β. Let X1, X2, X3 be a random sample from a N(µ, σ² = 4) population, let H0: µ = 0 and H1: µ = 1, and consider a test with the following rejection region: R = {(x1, x2, x3) : x1 + x2 − x3 > 0.8}. Calculate α and β.

α = P(X1 + X2 − X3 > 0.8 | µ = 0) and β = P(X1 + X2 − X3 ≤ 0.8 | µ = 1). Note that X1 + X2 − X3 ~ N(µ, 3σ²). Let Y = X1 + X2 − X3. Under the null hypothesis Y ~ N(0, 12); under the alternative hypothesis Y ~ N(1, 12).

Example. [figure omitted]

α = P(Y > 0.8 | µ = 0) = P((Y − 0)/√12 > (0.8 − 0)/√12) = P(Z > 0.23) = 0.41
β = P(Y ≤ 0.8 | µ = 1) = P((Y − 1)/√12 ≤ (0.8 − 1)/√12) = P(Z ≤ −0.058) = 0.477
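A short check of these numbers (a sketch, not from the slides; it assumes scipy):

from math import sqrt
from scipy.stats import norm

var_Y = 3 * 4                               # Var(X1 + X2 - X3) = 3 sigma^2 = 12
alpha = norm.sf((0.8 - 0) / sqrt(var_Y))    # P(Y > 0.8 | mu = 0), approx 0.41
beta = norm.cdf((0.8 - 1) / sqrt(var_Y))    # P(Y <= 0.8 | mu = 1), approx 0.477
print(alpha, beta)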

Example: computation of α and β. Let X1, X2, X3 be a random sample from a N(µ, σ² = 4) population, let H0: µ = 0 and H1: µ = 1, and consider a test with the following rejection region: R = {(x1, x2, x3) : (x1 + x2 + x3)/3 > 0.8}. Calculate α and β.

α = P((X1 + X2 + X3)/3 > 0.8 | µ = 0) and β = P((X1 + X2 + X3)/3 ≤ 0.8 | µ = 1). Note that (X1 + X2 + X3)/3 ~ N(µ, σ²/3). Let Y = (X1 + X2 + X3)/3. Under the null hypothesis Y ~ N(0, 4/3); under the alternative hypothesis Y ~ N(1, 4/3).

Example. [figure omitted]

α = P(Y > 0.8 | µ = 0) = P((Y − 0)/√(4/3) > (0.8 − 0)/√(4/3)) = P(Z > 0.69) = 0.2442
β = P(Y ≤ 0.8 | µ = 1) = P((Y − 1)/√(4/3) ≤ (0.8 − 1)/√(4/3)) = P(Z ≤ −0.17) = 0.4312
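The same check for the sample-mean version (again a sketch assuming scipy); it confirms α ≈ 0.2442 and β ≈ 0.4312:

from math import sqrt
from scipy.stats import norm

sd_Y = sqrt(4 / 3)                   # Y = (X1 + X2 + X3)/3 ~ N(mu, 4/3)
alpha = norm.sf((0.8 - 0) / sd_Y)    # P(Y > 0.8 | mu = 0)
beta = norm.cdf((0.8 - 1) / sd_Y)    # P(Y <= 0.8 | mu = 1)
print(alpha, beta)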

Most Powerful Tests. Definition: the region C is a uniformly most powerful critical region of size α for testing the simple hypothesis H0 against a composite alternative hypothesis H1 if C is a best critical region of size α for testing H0 against each simple hypothesis contained in H1. The resulting test is said to be uniformly most powerful.

Most Powerful Tests. Definition: let C be a class of tests for testing H0: θ ∈ Θ0 versus H1: θ ∈ Θ0^c. A test in class C, with power function β(θ), is a uniformly most powerful (UMP) class C test if β(θ) ≥ β′(θ) for every θ ∈ Θ0^c and for every β′(θ) that is a power function of a test in class C.

The Neyman-Pearson Lemma. Theorem: consider testing H0: θ = θ0 versus H1: θ = θ1, where the pdf or pmf corresponding to θi is f(x | θi), i = 0, 1, using a test with rejection region R that satisfies: x ∈ R if f(x | θ1) > k f(x | θ0), and x ∈ R^c if f(x | θ1) ≤ k f(x | θ0), for some k ≥ 0, with α = P_{θ0}(X ∈ R). Any test that satisfies these conditions is a UMP level α test.

Corollary of Neyman-Pearson. Corollary: consider testing H0: θ = θ0 versus H1: θ = θ1. Suppose T(X) is a sufficient statistic for θ and g(t | θi), i = 0, 1, is the pdf or pmf of T corresponding to θi. Then any test based on T with rejection region S is a UMP level α test if it satisfies: t ∈ S if g(t | θ1) > k g(t | θ0), and t ∈ S^c if g(t | θ1) ≤ k g(t | θ0), with α = P_{θ0}(T ∈ S).

Application. Suppose X has an exponential distribution with rate λ. We wish to test the hypothesis that λ = 1 against the alternative that λ = 2. Suppose X has a Gaussian distribution with mean µ and known variance. We wish to test the hypothesis that µ = µ0 against the alternative that µ = µ1.
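For the exponential case the Neyman-Pearson construction can be carried out explicitly: the ratio f(x | λ = 2)/f(x | λ = 1) = 2 exp(−x) is decreasing in x, so the test rejects H0 for small values of x. A minimal sketch (assuming scipy; not part of the original slides):

from math import log
from scipy.stats import expon

alpha = 0.05
# Reject H0: lambda = 1 when X <= c, where P(X <= c | lambda = 1) = alpha,
# i.e. 1 - exp(-c) = alpha, so c = -log(1 - alpha).
c = -log(1 - alpha)
power = expon(scale=1 / 2).cdf(c)    # P(X <= c | lambda = 2); scale = 1/lambda
print(f"reject if X <= {c:.4f}; power = {power:.4f}")   # power approx 0.0975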

LRT and UMP test. Theorem (UMP tests): if the LR test of size α for H0: θ = θ0 versus the simple alternative H1: θ = θ1 is the same test for all θ1 ∈ Θ1 (for example Θ1 = (θ1, ∞)), then this is the UMP test of size α for H0: θ = θ0 against the composite alternative H1: θ ∈ Θ1. Otherwise no UMP test of size α exists for these hypotheses. Remark: so whenever a UMP test exists for a simple H0, it is an LR test.

Generalized Likelihood Ratio Tests. We have been looking at rather simple problems, typically with just one parameter, but in the real world of applied statistics we constantly deal with models having many parameters. In that context we never have a simple null hypothesis. Our null hypotheses often assert that one of the parameters takes a specific value, but such a hypothesis is not simple, since being simple would entail specifying the values of all the parameters.

Likelihood Ratio Tests. Definition: the likelihood ratio test statistic for testing H0: θ ∈ Θ0 versus H1: θ ∈ Θ0^c is λ(x) = sup_{Θ0} L(θ | x) / sup_Θ L(θ | x), where Θ = Θ0 ∪ Θ0^c. Definition: the likelihood ratio test (LRT) is any test that has a rejection region of the form {x : λ(x) ≤ c}, where c is any number satisfying 0 ≤ c ≤ 1. The statistic should be reduced to its simplest possible form.

The rationale behind LRTs may best be understood in the situation in which f(x | θ) is the probability distribution of a discrete random variable. In this case, the numerator of λ(x) is the maximum probability of the observed sample, the maximum being computed over parameters in the null hypothesis. The denominator of λ(x) is the maximum probability of the observed sample over all possible parameters.

The ratio of these two maxima is small if there are parameter points in H1 for which the observed sample is much more likely than for any parameter in H0. In this situation, the LRT criterion says H0 should be rejected and H1 accepted as true.

Relation between LRT and MLE. Suppose the MLE of θ exists. Let θ̂0 be the MLE of θ over the null set Θ0 (restricted maximization) and θ̂ the MLE of θ over the full set Θ (unrestricted maximization). Then the LRT statistic, a function of x (not of θ), is λ(x) = sup_{Θ0} L(θ | x) / sup_Θ L(θ | x) = L(θ̂0 | x) / L(θ̂ | x). In R = {x : λ(x) ≤ c}, different values of c give different rejection regions and hence different tests.

LRT and sufficiency. Theorem: if T(X) is a sufficient statistic for θ, λ*(t) is the LRT statistic based on T, and λ(x) is the LRT statistic based on x, then λ*(T(x)) = λ(x) for every x in the sample space. Thus the simplified expression for λ(x) should depend on x only through T(x) whenever T(X) is a sufficient statistic for θ.

Example: likelihood ratio test. Let X1, X2, ..., Xn be a random sample from an exponential population with pdf f(x | θ) = exp(−(x − θ)) for x ≥ θ, and 0 for x < θ. The likelihood function is L(θ | x) = exp(−Σi xi + nθ) for θ ≤ x(1), and 0 for θ > x(1).

Consider testing H0: θ ≤ θ0 versus H1: θ > θ0, where θ0 is a specified value. The likelihood function is an increasing function of θ on −∞ < θ ≤ x(1). The likelihood ratio test statistic is λ(x) = 1 for x(1) ≤ θ0, and exp(−n(x(1) − θ0)) for x(1) > θ0. An LRT, a test that rejects H0 if λ(x) ≤ c, is a test with rejection region {x : x(1) ≥ θ0 − (log c)/n}. Note that the rejection region depends on the sample only through the sufficient statistic X(1).
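A minimal sketch of this LRT on made-up data (the sample below is hypothetical, purely for illustration):

from math import exp, log

def lrt_shifted_exponential(x, theta0, c):
    # lambda(x) = 1 if x_(1) <= theta0, else exp(-n (x_(1) - theta0));
    # reject H0 when lambda(x) <= c, i.e. when x_(1) >= theta0 - log(c)/n.
    n, x1 = len(x), min(x)
    lam = 1.0 if x1 <= theta0 else exp(-n * (x1 - theta0))
    return lam, x1 >= theta0 - log(c) / n

x = [2.3, 2.9, 2.1, 3.4, 2.6]        # hypothetical sample
print(lrt_shifted_exponential(x, theta0=2.0, c=0.1))   # (0.6065..., False)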

LRTs and the Neyman-Pearson lemma. The Neyman-Pearson lemma asserts that, in general, a best critical region can be found by finding the n-dimensional points in the sample space for which the likelihood ratio is smaller than some constant.

Monotone Likelihood Ratio (MLR). Definition: a family of pdfs or pmfs {g(t; θ) : θ ∈ Θ} for a univariate random variable T with real-valued parameter θ has a monotone likelihood ratio (MLR) if, for every θ2 > θ1, g(t; θ2)/g(t; θ1) is an increasing function of t on {t : g(t; θ1) > 0 or g(t; θ2) > 0}.

Monotone Likelihood Ratio (MLR). Theorem: suppose that X has density f(x; θ) with MLR in T(x). Then: 1. There exists a UMP level α test of H0: θ ≤ θ0 versus H1: θ > θ0 which is of the form C(x) = {T(x) ≥ k}. 2. β(θ) is increasing in θ. 3. For every θ*, this same test is the UMP level α* = β(θ*) test of H0: θ ≤ θ* versus H1: θ > θ*. 4. For all θ < θ0, the test in (1) minimizes β(θ) among all tests of size α.

N(µ, σ²), with σ² known. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² known.
H0: µ = µ0 vs H1: µ = µ1, with µ1 < µ0: R = { (X̄ − µ0)/(σ/√n) ≤ −z_{1−α} }
H0: µ = µ0 vs H1: µ < µ0: R = { (X̄ − µ0)/(σ/√n) ≤ −z_{1−α} }
H0: µ ≥ µ0 vs H1: µ < µ0: R = { (X̄ − µ0)/(σ/√n) ≤ −z_{1−α} }

N(µ, σ²), with σ² known. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² known.
H0: µ = µ0 vs H1: µ = µ1, with µ1 > µ0: R = { (X̄ − µ0)/(σ/√n) ≥ z_{1−α} }
H0: µ = µ0 vs H1: µ > µ0: R = { (X̄ − µ0)/(σ/√n) ≥ z_{1−α} }
H0: µ ≤ µ0 vs H1: µ > µ0: R = { (X̄ − µ0)/(σ/√n) ≥ z_{1−α} }

N(µ, σ²), with σ² known. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² known.
H0: µ = µ0 vs H1: µ ≠ µ0: R = { |X̄ − µ0|/(σ/√n) ≥ z_{1−α/2} }

N(µ, σ²), with σ² unknown. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² unknown.
H0: µ = µ0 vs H1: µ = µ1, with µ1 < µ0: R = { (X̄ − µ0)/(S/√n) ≤ −t_{n−1, 1−α} }
H0: µ = µ0 vs H1: µ < µ0: R = { (X̄ − µ0)/(S/√n) ≤ −t_{n−1, 1−α} }
H0: µ ≥ µ0 vs H1: µ < µ0: R = { (X̄ − µ0)/(S/√n) ≤ −t_{n−1, 1−α} }

N(µ, σ²), with σ² unknown. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² unknown.
H0: µ = µ0 vs H1: µ = µ1, with µ1 > µ0: R = { (X̄ − µ0)/(S/√n) ≥ t_{n−1, 1−α} }
H0: µ = µ0 vs H1: µ > µ0: R = { (X̄ − µ0)/(S/√n) ≥ t_{n−1, 1−α} }
H0: µ ≤ µ0 vs H1: µ > µ0: R = { (X̄ − µ0)/(S/√n) ≥ t_{n−1, 1−α} }

N(µ, σ²), with σ² unknown. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as N(µ, σ²), with σ² unknown.
H0: µ = µ0 vs H1: µ ≠ µ0: R = { |X̄ − µ0|/(S/√n) ≥ t_{n−1, 1−α/2} }
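Collecting the two-sided rules above in code (a sketch assuming scipy; the sample is hypothetical):

from math import sqrt
from statistics import mean, stdev
from scipy.stats import norm, t

def z_test_two_sided(x, mu0, sigma, alpha=0.05):
    # sigma known: reject H0 when |Xbar - mu0| / (sigma/sqrt(n)) >= z_{1-alpha/2}
    z = (mean(x) - mu0) / (sigma / sqrt(len(x)))
    return abs(z) >= norm.ppf(1 - alpha / 2)

def t_test_two_sided(x, mu0, alpha=0.05):
    # sigma unknown: reject H0 when |Xbar - mu0| / (S/sqrt(n)) >= t_{n-1, 1-alpha/2}
    stat = (mean(x) - mu0) / (stdev(x) / sqrt(len(x)))
    return abs(stat) >= t.ppf(1 - alpha / 2, df=len(x) - 1)

x = [4.8, 5.3, 5.1, 4.6, 5.4, 5.0]   # hypothetical sample
print(z_test_two_sided(x, mu0=5.0, sigma=0.3), t_test_two_sided(x, mu0=5.0))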

Method of inversion. There is a one-to-one correspondence between tests and confidence sets. Hypothesis testing: fix the parameter and ask which sample values (in the appropriate region) are consistent with that fixed value. Confidence set: fix the sample value and ask which parameter values make this sample most plausible. For each θ0 ∈ Θ, let A(θ0) be the acceptance region of a level α test of H0: θ = θ0. Define the set C(x) = {θ0 : x ∈ A(θ0)}. Then C(x) is a (1 − α) confidence set.

Hypothesis testing and confidence intervals. In general, inverting the acceptance region of a two-sided test gives a two-sided interval, and inverting the acceptance region of a one-sided test gives an interval open at one end. Theorem: let the acceptance region of a two-sided test be of the form A(θ) = {x : c1(θ) ≤ T(x) ≤ c2(θ)}, and let the cutoffs be symmetric, that is, P_θ(T(X) > c2(θ)) = α/2 and P_θ(T(X) < c1(θ)) = α/2. If T has the MLR property, then both c1(θ) and c2(θ) are increasing in θ.
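For the two-sided z test, carrying out the inversion reproduces the familiar interval X̄ ± z_{1−α/2} σ/√n; a minimal sketch (assuming scipy):

from math import sqrt
from scipy.stats import norm

def invert_z_test(xbar, sigma, n, alpha=0.05):
    # C(x) = {mu0 : |xbar - mu0| / (sigma/sqrt(n)) < z_{1-alpha/2}},
    # i.e. the set of mu0 accepted by the level-alpha two-sided z test.
    half = norm.ppf(1 - alpha / 2) * sigma / sqrt(n)
    return (xbar - half, xbar + half)

print(invert_z_test(xbar=5.03, sigma=0.3, n=6))   # a (1 - alpha) confidence interval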

Three asymptotic tests. Suppose we wish to test H0: θ = θ0 against H1: θ ≠ θ0. The likelihood gives rise to several possible tests. Let log L(θ) denote the log-likelihood and θ̂n the consistent root of the likelihood equation. Intuitively, the farther θ0 is from θ̂n, the stronger the evidence against the null hypothesis. But how far is far enough? Note that if θ0 is close to θ̂n, then log L(θ0) should also be close to log L(θ̂n), and (log L)′(θ0) should also be close to (log L)′(θ̂n) = 0.

If we base a test upon the value of (θ̂n − θ0), we obtain a Wald test. If we base a test upon the value of log L(θ̂n) − log L(θ0), we obtain a likelihood ratio test. If we base a test upon the value of (log L)′(θ0), we obtain a (Rao) score test.

Three asymptotic tests. [figure omitted]

We argued that under certain regularity conditions, the following facts are true under H0:
√n (θ̂n − θ0) → N(0, 1/I(θ0))
(1/√n) (log L)′(θ0) → N(0, I(θ0))
−(1/n) (log L)″(θ0) → I(θ0)

Wald Test. Under certain regularity conditions, the maximum likelihood estimator θ̂ has approximately, in large samples, a (multivariate) normal distribution with mean equal to the true parameter value and variance-covariance matrix given by the inverse of the information matrix, so that θ̂ ≈ N(θ0, 1/(n I(θ0))).

Wald test statistic (parameter θ scalar): (θ̂ − θ0) √(n I(θ0)) ≈ N(0, 1). If Var(θ̂) is unknown, a consistent estimator can be used: (θ̂ − θ0) √(n I(θ̂)) ≈ N(0, 1). More generally, (θ̂ − θ0)/√(Var̂(θ̂)) ≈ N(0, 1).

Wald Test. When θ is a parameter of dimension p, this result provides a basis for constructing tests of hypotheses and confidence regions. For example, under the hypothesis H0: θ = θ0 for a fixed value θ0, the quadratic form W = (θ̂ − θ0)′ Var̂(θ̂)⁻¹ (θ̂ − θ0) has approximately, in large samples, a chi-squared distribution with p degrees of freedom.

Score Test. Under some regularity conditions the score itself has an asymptotic normal distribution with mean 0 and variance equal to the information, so that u(θ) ≈ N(0, I_n(θ)).

Score test statistic: u(θ0) ≈ N(0, I_n(θ0)), that is, [∂ log L(θ | x1, ..., xn)/∂θ]_{θ=θ0} / √(n I(θ0)) ≈ N(0, 1).

Score Test. When θ is a parameter of dimension p, under some regularity conditions the score itself has an asymptotic normal distribution with mean 0 and variance-covariance matrix equal to the information matrix, so that u(θ) ≈ N_p(0, I_n(θ)).

This result provides another basis for constructing tests of hypotheses and confidence regions. For example, under H0: θ = θ0 for a fixed value θ0, the quadratic form Q = u(θ0)′ I_n(θ0)⁻¹ u(θ0) has approximately, in large samples, a chi-squared distribution with p degrees of freedom.

Likelihood Ratio Test. Under certain regularity conditions, minus twice the log of the likelihood ratio has approximately, in large samples, a chi-squared distribution: −2 log λ = 2 log L(θ̂; x) − 2 log L(θ0; x). In case θ is a scalar, −2 log λ has a chi-squared distribution with 1 degree of freedom.

Likelihood Ratio Test. Under certain regularity conditions, minus twice the log of the likelihood ratio has approximately, in large samples, a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the two models. Thus −2 log λ = 2 log L(θ̂_{ω2}; x) − 2 log L(θ̂_{ω1}; x), where the degrees of freedom are ν = dim(ω2) − dim(ω1), the number of parameters in the larger model ω2 minus the number of parameters in the smaller model ω1.

The trinity of tests. The LR, Wald, and score tests (the trinity of test statistics) require different models to be estimated. The LR test requires both the restricted and unrestricted models to be estimated. The Wald test requires only the unrestricted model to be estimated. The score test requires only the restricted model to be estimated. The applicability of each test then depends on the nature of the hypotheses. For H0: θ = θ0, the restricted model is trivial to estimate, so an LR or score test might be preferred. For H0: θ > θ0, the restricted model is a constrained maximization problem, so a Wald test might be preferred.

The score test does not require θ̂n, whereas the other two tests do. The Wald test is the most easily interpretable and yields immediate confidence intervals. The score test and the likelihood ratio test are invariant under reparameterization, whereas the Wald test is not.

Pareto. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as a Pareto distribution with parameters α and x_m: f(x; α, x_m) = α x_m^α / x^{α+1}, for x ≥ x_m.

Pareto. The Pareto distribution is used to describe the distribution of income in the top portion of the income distribution: if x is income, the pdf is defined for incomes in excess of x_m, and it is usually used to describe income in the higher income groups. E(X) = α x_m / (α − 1). Consider a sample of 100 families with incomes in excess of $50,000, and suppose we want to test H0: α = 5 versus the alternative H1: α ≠ 5. Suppose (x1, ..., x100) has an observed sample mean of $60,000 and an observed mean of the logarithms equal to 11.0698.

Pareto. Observe that the joint pdf of X = (X1, ..., Xn) is f(x; α, x_m) = ∏_{i=1}^n α x_m^α / x_i^{α+1} = α^n x_m^{nα} / ∏_{i=1}^n x_i^{α+1} = g(t, α) h(x), where t = ∏_{i=1}^n x_i, g(t, α) = α^n x_m^{nα} t^{−(α+1)}, and h(x) = 1. By the factorization theorem, T(X) = ∏_{i=1}^n X_i is sufficient for α.

Pareto. The log-likelihood function is l(α, x_m) = n log α + n α log(x_m) − (α + 1) Σ_{i=1}^n log x_i. Thus ∂l(α, x_m)/∂α = n/α + n log(x_m) − Σ_{i=1}^n log x_i. Solving ∂l(α, x_m)/∂α = 0, the MLE of α is α̂ = n / (Σ_{i=1}^n log x_i − n log(x_m)) = 1 / (mean of log x_i − log(x_m)); here α̂ = 1/(11.0698 − log 50000) = 4.

Pareto. ∂l(α, x_m)/∂α = n/α + n log(x_m) − Σ_{i=1}^n log x_i, and ∂²l(α, x_m)/∂α² = −n/α².

Pareto. Since ∂²l(α, x_m)/∂α² = −n/α², we have Var(α̂) ≈ α²/n, so 1/I_n(α0) = 25/100 = 0.25 and 1/I_n(α̂) = 16/100 = 0.16.

Pareto: Wald test. α̂ ≈ N(α0, 1/(n I(α0))), so (α̂ − α0)² I_n(α0) ~ χ²₁. Observed value of the test statistic: (4 − 5)/√0.25 = −2, p-value = 0.0455; equivalently, (4 − 5)²/0.25 = 4 as a χ²₁ statistic, p-value = 0.0455.

Pareto: score test. l′(α0) ≈ N(0, n I(α0)). Observed value of the test statistic: l′(α0) = n/α0 + n log(x_m) − Σ_{i=1}^n log x_i = 100/5 + 100 log(50000) − 100 × 11.069 = −4.922, so l′(α0)/√(n I(α0)) = −4.922/2 = −2.461, p-value = 0.014.

Pareto: likelihood ratio test. l(α) = n log α + n α log(x_m) − (α + 1) Σ_{i=1}^n log x_i, so l(5) = −1071.0 and l(4) = −1068.4. Then 2 (l(α̂) − l(α0)) = 5.3756, p-value = 0.0204.

Pareto: large-sample tests. Wald test statistic: (n/(Σ_{i=1}^n log x_i − n log(x_m)) − α0) / √(α0²/n), approximate distribution N(0, 1). Score test statistic: (n/α0 + n log(x_m) − Σ_{i=1}^n log x_i) / √(n/α0²), approximate distribution N(0, 1).

Pareto: large-sample tests. Likelihood ratio statistic: 2 [ n log(α̂/α0) + (α̂ − α0)(n log(x_m) − Σ_{i=1}^n log x_i) ], approximate distribution χ²₁.
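The three Pareto tests can be reproduced numerically from the summary statistics given above (n = 100, x_m = 50000, mean log income 11.0698); a sketch assuming scipy:

from math import log
from scipy.stats import norm, chi2

n, xm, meanlog, a0 = 100, 50_000, 11.0698, 5.0
sumlog = n * meanlog
a_hat = n / (sumlog - n * log(xm))                         # MLE, approx 4

wald = (a_hat - a0) / (a0 / n ** 0.5)                      # approx -2
score = (n / a0 + n * log(xm) - sumlog) / (n ** 0.5 / a0)  # approx -2.5

def loglik(a):
    return n * log(a) + n * a * log(xm) - (a + 1) * sumlog

lrt = 2 * (loglik(a_hat) - loglik(a0))                     # approx 5.38
print(2 * norm.sf(abs(wald)),    # approx 0.0455
      2 * norm.sf(abs(score)),   # approx 0.0124 (the slides' 0.014 uses the rounded 11.069)
      chi2.sf(lrt, df=1))        # approx 0.0204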

p-values. A common complaint about hypothesis testing is that the choice of the significance level α is essentially arbitrary. To make matters worse, the conclusions (reject or do not reject H0) depend on what value of α is used.

Definition: a p-value p(X) is a test statistic satisfying 0 ≤ p(x) ≤ 1 for every sample point x. Small values of p(x) give evidence that H1 is true. Theorem: let W(X) be a test statistic such that large values of W give evidence that H1 is true. For each sample point x, define p(x) = sup_{θ∈Θ0} P_θ(W(X) ≥ W(x)). Then p(X) is a valid p-value.

Steps to find a p-value: 1. Let T be the test statistic. 2. Compute the value of T using the sample x1, ..., xn; say it is a. 3. The p-value is given by p-value = P(T < a | H0) for a lower-tail test, P(T > a | H0) for an upper-tail test, and P(|T| > |a| | H0) for a two-tail test.

The p-value is the smallest level at which H0 would be rejected. It is the probability that you would obtain evidence against H0 at least as strong as that actually observed, if H0 were true. The smaller the p-value, the stronger the evidence that H0 is false.

Example: psychiatrist problem. A psychiatrist wants to determine whether a small amount of coke decreases reaction time in adults. The reaction time for a specified test is assumed normally distributed with mean µ = 0.15 seconds and σ = 0.2. A sample of 81 people were given a small amount of coke, and their reaction times averaged 0.11 seconds. Does this sample result support the psychiatrist's hypothesis?

The reaction time is expected to decrease. What are the null and alternative hypotheses? H0: µ = 0.15 and H1: µ < 0.15. If the null hypothesis were true, how likely would it be to observe a sample mean of 0.11 from a sampling distribution centered at 0.15 with standard deviation σ = 0.2? P(X̄ < 0.11 | µ = 0.15) = P((X̄ − 0.15)/(0.2/√81) < (0.11 − 0.15)/(0.2/√81)) = 0.0359.
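A one-line check of this p-value (a sketch assuming scipy):

from math import sqrt
from scipy.stats import norm

p = norm.cdf((0.11 - 0.15) / (0.2 / sqrt(81)))   # P(Xbar < 0.11 | mu = 0.15)
print(p)                                          # approx 0.0359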

Example: psychiatrist problem. Hypothesis testing process (illustration): I believe the population mean age is 50 (hypothesis). A random sample gives mean X̄ = 20. Reject the hypothesis: 20 is not close to 50.

For our problem: I believe the population reaction time is 0.15 (hypothesis). The sample mean is 0.11. Is x̄ = 0.11 consistent with µ = 0.15? If yes, do not reject the hypothesis; if no, reject it.

Sampling distribution: it is unlikely that we would get a sample mean of this value (0.11) if in fact µ = 0.15 were the population mean. Therefore, we reject the null hypothesis that µ = 0.15. [figure omitted: sampling distribution of X̄ under H0, with the observed value 0.11 in the lower tail]

p-value: 0.0359 represents the probability of observing a sample mean at least as extreme as 0.11, that is, the probability of obtaining a test statistic as extreme as, or more extreme than, the actual sample value, given that H0 is true. If we think the p-value is too low to believe the observed test statistic was obtained by chance alone, then we reject chance (reject the null hypothesis). Otherwise, we fail to reject chance and do not reject the null hypothesis.

Example: psychiatrist problem, decision. The p-value 0.0359 represents the probability of observing a sample mean at least as extreme as 0.11. If α = 0.05, we reject the null hypothesis: coke does have an effect on reaction time (coke decreases reaction time). If α = 0.01, we do not reject the null hypothesis: the evidence is not strong enough to conclude that coke has an effect on reaction time.

Application to the Poisson distribution. Let (X1, ..., Xn) be a random sample of i.i.d. random variables distributed as a Poisson distribution with parameter λ: f(x; λ) = λ^x exp(−λ) / x!. We wish to test H0: λ = 6 versus H1: λ ≠ 6. We observe a random sample (X1, ..., X100) such that Σi xi = 550.

Observe that the joint pmf of X = (X1, ..., Xn) is given by f(x1, ..., xn; λ) = ∏i λ^{xi} exp(−λ)/xi!, so L(λ) = λ^{Σi xi} exp(−nλ) / ∏i xi!, and l(λ) = Σi xi log(λ) − nλ − log(∏i xi!). Then dl(λ)/dλ = Σi xi/λ − n and d²l(λ)/dλ² = −Σi xi/λ², giving λ̂ = x̄ = 5.5.

Moreover, I_n(λ) = −E(d²l(λ)/dλ²) = n/λ, and λ̂ = x̄.

Poisson: large-sample test. Wald test statistic: (x̄ − λ0)/√(λ0/n), approximate distribution N(0, 1). Observed value: (5.5 − 6)/√(6/100) = −2.0412, p-value = 0.0412.

Poisson: large-sample test. Score test statistic: (Σi xi − nλ0)/√(nλ0), approximate distribution N(0, 1). Observed value: (550 − 600)/√600 = −2.0412.

Poisson: large-sample test. Likelihood ratio statistic: 2 (l(λ̂) − l(λ0)) = 2 [ Σi xi log(λ̂/λ0) − n(λ̂ − λ0) ]. Here Λ(x) = 2 [ Σi xi log(x̄/λ0) − n(x̄ − λ0) ] = 4.2875, approximate distribution χ²₁, p-value = 0.0384.
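These Poisson tests can likewise be checked numerically (a sketch assuming scipy); note that with the null value λ0 used in the variance, the Wald and score statistics coincide here:

from math import log
from scipy.stats import norm, chi2

n, s, lam0 = 100, 550, 6.0
xbar = s / n                                          # lambda_hat = 5.5

z = (xbar - lam0) / (lam0 / n) ** 0.5                 # approx -2.0412 (Wald = score)
lrt = 2 * (s * log(xbar / lam0) - n * (xbar - lam0))  # approx 4.2875

print(2 * norm.sf(abs(z)),   # approx 0.0412
      chi2.sf(lrt, df=1))    # approx 0.0384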