Statistics 3858 : Likelihood Ratio for Multinomial Models


Suppose $X$ is multinomial on $M$ categories, that is $X \sim \text{Multinomial}(n, p)$, where $p = (p_1, p_2, \ldots, p_M) \in A$, and the parameter space is

$$ A = \Big\{ p : p_j \ge 0, \ \sum_{j=1}^{M} p_j = 1 \Big\}. $$

The dimension of this parameter space is $M - 1$. It is a simplex of dimension $M - 1$. The likelihood function is

$$ L(p) = c(n, X_1, \ldots, X_M) \prod_{j=1}^{M} p_j^{X_j} $$

where the data is $X = (X_1, X_2, \ldots, X_M)$. Notice that $X_j \ge 0$ and $X_1 + X_2 + \cdots + X_M = n$, and

$$ c(n, x_1, \ldots, x_M) = \binom{n}{x_1 \, \ldots \, x_M} = \frac{n!}{x_1! \, x_2! \cdots x_M!} $$

is the multinomial coefficient. The MLE is easily found using the log-likelihood and Lagrange multipliers and is

$$ \hat p = \left( \frac{X_1}{n}, \ldots, \frac{X_M}{n} \right). $$

In certain applications the multinomial probabilities take the special form

$$ p = p(\theta) = (p_1(\theta), \ldots, p_M(\theta)) $$

where the components are functions of some other parameter. For example, in the Hardy-Weinberg model with $M = 3$,

$$ p = \big( (1-\theta)^2, \ 2\theta(1-\theta), \ \theta^2 \big) $$

where $\theta \in \Theta = (0, 1)$. We can view this as a particular 1 dimensional subset or sub-manifold, say $A_0$, of the $M - 1$ dimensional simplex $A$ that is the general parameter space for multinomials on $M$ categories.

This section constructs the generalized likelihood ratio (GLR) statistic for

$$ H_0 : p \in A_0 $$
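The two maximum likelihood estimates introduced above can be sketched numerically. This is a minimal illustration, not part of the handout: the counts are hypothetical, the function names are made up for this example, and the closed form $\hat\theta = (X_2 + 2X_3)/(2n)$ is an assumption not stated in the text (it follows from setting the derivative of the Hardy-Weinberg log-likelihood to zero).

```python
def multinomial_mle(counts):
    """Unrestricted MLE of p: p_hat_j = X_j / n."""
    n = sum(counts)
    return [x / n for x in counts]


def hardy_weinberg_probs(theta):
    """Cell probabilities p(theta) = ((1-theta)^2, 2 theta (1-theta), theta^2)."""
    return [(1 - theta) ** 2, 2 * theta * (1 - theta), theta ** 2]


def hardy_weinberg_mle(counts):
    """MLE of theta under Hardy-Weinberg (assumed closed form, see lead-in)."""
    x1, x2, x3 = counts
    n = x1 + x2 + x3
    return (x2 + 2 * x3) / (2 * n)


# Hypothetical counts for the M = 3 categories (illustrative only).
counts = [30, 50, 20]

p_hat = multinomial_mle(counts)            # unrestricted MLE in the simplex A
theta_hat = hardy_weinberg_mle(counts)     # restricted MLE of theta
p_fit = hardy_weinberg_probs(theta_hat)    # MLE of p(theta), a point of A_0

print("p_hat     =", p_hat)
print("theta_hat =", theta_hat)
print("p_fit     =", p_fit)
```

Both `p_hat` and `p_fit` sum to 1, but only `p_fit` is constrained to lie on the one dimensional curve $A_0$.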

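versus

$$ H_A : p \in A_A = A \setminus A_0. $$

We often write the alternative as $H_A$ : $p$ is not in $A_0$ (or simply refer to it as the general alternative in this context).

Let $\hat\theta$ be the MLE of $\theta$. Then the MLE of $p(\theta)$ is given by $p(\hat\theta)$. Since $A_A \cup A_0 = A$, the denominator for the GLR is the likelihood evaluated at the general or unrestricted MLE of $p$. Thus the GLR is

$$ \Lambda(X) = \frac{\prod_{j=1}^{M} p_j(\hat\theta)^{X_j}}{\prod_{j=1}^{M} \hat p_j^{X_j}}. $$

The rejection region is of the form

$$ \Lambda(x) < c $$

where $c$ is determined by

$$ \alpha = P_0(\Lambda(X) < c). $$

By Theorem 9.4A, $c$ is obtained through $c_1 = -2\log(c)$, where

$$ \alpha = P_0\big( -2\log\Lambda(X) > c_1 \big). $$

The constant $c_1$ is approximately the $1 - \alpha$ quantile (the upper $\alpha$ point) of a $\chi^2_{(d)}$ distribution, where the degrees of freedom is $d = M - 1 - \dim(A_0)$. For the Hardy-Weinberg model this is

$$ M - 1 - \dim(A_0) = 3 - 1 - 1 = 1. $$

A size $\alpha = .05$ test will have $c_1 = 1.96^2 = 3.84$ and $c = e^{-3.84/2} = e^{-1.92} \approx 0.146$.

Consider the function $g : \mathbb{R}^+ \to \mathbb{R}$ given by

$$ g(y) = y \log(y/y_0) $$

where $y_0$ is a given number. The first two derivatives are

$$ g'(y) = \log(y/y_0) + y \cdot \frac{1}{y} = \log(y/y_0) + 1 $$

and

$$ g''(y) = \frac{1}{y}. $$

At $y = y_0$ these give

$$ g(y_0) = y_0 \log(y_0/y_0) = 0, \qquad g'(y_0) = \log(y_0/y_0) + 1 = 1, \qquad g''(y_0) = \frac{1}{y_0}. $$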
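The GLR statistic and the two equivalent forms of the rejection rule can be sketched as follows. The counts are hypothetical, the function name `neg2_log_glr` is made up for this example, and the formula $\hat\theta = (X_2 + 2X_3)/(2n)$ for the Hardy-Weinberg MLE is an assumption not stated in the text.

```python
import math


def neg2_log_glr(counts, p_null):
    """-2 log Lambda = 2 * sum_j X_j * log(p_hat_j / p_j(theta_hat))."""
    n = sum(counts)
    total = 0.0
    for x, p0 in zip(counts, p_null):
        if x > 0:  # a zero count contributes nothing to the sum
            total += x * math.log((x / n) / p0)
    return 2.0 * total


# Hypothetical counts (illustrative only).
counts = [40, 30, 30]
n = sum(counts)

# Assumed closed-form Hardy-Weinberg MLE of theta (see lead-in).
theta_hat = (counts[1] + 2 * counts[2]) / (2 * n)
p_null = [(1 - theta_hat) ** 2, 2 * theta_hat * (1 - theta_hat), theta_hat ** 2]

stat = neg2_log_glr(counts, p_null)   # -2 log Lambda(X)
lam = math.exp(-stat / 2.0)           # Lambda(X) itself

c1 = 1.96 ** 2                        # approximate chi-square(1) critical value, alpha = .05
c = math.exp(-c1 / 2.0)               # matching critical value on the Lambda scale

print("-2 log Lambda =", stat, " critical value c1 =", c1)
print("Lambda        =", lam, " critical value c  =", c)
print("reject H0:", stat > c1)
```

Rejecting when `lam < c` and rejecting when `stat > c1` are the same rule, since $-2\log$ is a decreasing function of $\Lambda$.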

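When we take $-1$ times the log of the GLR $\Lambda(X)$ we see, after gathering up some common terms, that it contains

$$ n \hat p_j \left( -\log p_j(\hat\theta) + \log \hat p_j \right) = n \hat p_j \log\left( \frac{\hat p_j}{p_j(\hat\theta)} \right). $$

Aside : We are interested in a negative number times the log GLR, since the GLR is $\le 1$, and this will result in the negative log being nonnegative. If one were to define the GLR with the ratio reversed this would not be the case, but by convention the GLR is defined as this ratio. Some textbooks unfortunately do not follow this convention.

Thus for a given $j$, taking $y_0 = p_j(\hat\theta)$ and $y = \hat p_j$, a second order Taylor expansion gives

$$ g(\hat p_j) \approx g(p_j(\hat\theta)) + g'(p_j(\hat\theta)) \big( \hat p_j - p_j(\hat\theta) \big) + \frac{1}{2} g''(p_j(\hat\theta)) \big( \hat p_j - p_j(\hat\theta) \big)^2 = \big( \hat p_j - p_j(\hat\theta) \big) + \frac{\big( \hat p_j - p_j(\hat\theta) \big)^2}{2 p_j(\hat\theta)}. $$

Below consider $g_j$ to be the function $g$ above with $y_0 = p_j(\hat\theta)$. It then follows that

$$
\begin{aligned}
-2 \log \Lambda(X)
&= -2 \sum_{j=1}^{M} X_j \log\big( p_j(\hat\theta) / \hat p_j \big) \\
&= 2 \sum_{j=1}^{M} X_j \log\big( \hat p_j / p_j(\hat\theta) \big) \\
&= 2n \sum_{j=1}^{M} \hat p_j \log\big( \hat p_j / p_j(\hat\theta) \big) \\
&= 2n \sum_{j=1}^{M} g_j(\hat p_j) \\
&\approx 2n \sum_{j=1}^{M} \big( \hat p_j - p_j(\hat\theta) \big) + 2n \sum_{j=1}^{M} \frac{\big( \hat p_j - p_j(\hat\theta) \big)^2}{2 p_j(\hat\theta)} \\
&= 2n (1 - 1) + n \sum_{j=1}^{M} \frac{\big( \hat p_j - p_j(\hat\theta) \big)^2}{p_j(\hat\theta)}.
\end{aligned}
$$

The first sum vanishes because both $\hat p$ and $p(\hat\theta)$ sum to 1.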
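The quality of the second order Taylor approximation of $g$ used above can be checked numerically. This is only a sketch: the value of $y_0$ and the evaluation points are arbitrary illustrative choices, and the function names are made up for this example.

```python
import math


def g(y, y0):
    """g(y) = y * log(y / y0)."""
    return y * math.log(y / y0)


def g_taylor2(y, y0):
    """Second order Taylor expansion of g about y0: (y - y0) + (y - y0)^2 / (2 y0)."""
    d = y - y0
    return d + d * d / (2.0 * y0)


y0 = 0.5  # arbitrary illustrative expansion point
errors = []
for y in (0.45, 0.49, 0.499):
    err = abs(g(y, y0) - g_taylor2(y, y0))
    errors.append(err)
    print(f"y = {y}: g = {g(y, y0):.8f}, approx = {g_taylor2(y, y0):.8f}, |error| = {err:.2e}")
```

The error shrinks roughly like the cube of $y - y_0$, which is why the approximation is accurate when $\hat p_j$ is close to $p_j(\hat\theta)$, as it tends to be for large $n$ under the null hypothesis.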

so that

$$ -2 \log \Lambda(X) \approx n \sum_{j=1}^{M} \frac{\big( \hat p_j - p_j(\hat\theta) \big)^2}{p_j(\hat\theta)}. $$

This last expression is often written with $n \hat p_j = X_j = O_j$, where $O_j$ is the observed count in the $j$-th category, and $n p_j(\hat\theta) = \hat E_j$ (or sometimes $E_j$) as the expected count for the best fit of the statistical model with parameter $\theta$, that is the restricted multinomial model that corresponds to the null hypothesis. When doing this we obtain

$$ \chi^2 = n \sum_{j=1}^{M} \frac{\big( \hat p_j - p_j(\hat\theta) \big)^2}{p_j(\hat\theta)} = \sum_{j=1}^{M} \frac{\big( O_j - \hat E_j \big)^2}{\hat E_j}. $$

This last formula is called Pearson's chi-squared statistic. Thus in this multinomial setting Pearson's chi-squared statistic is, up to the second order Taylor approximation, equivalent to the generalized likelihood ratio test. It also has the very natural property of comparing the observed counts with the fitted model. We reject if the GLR $\Lambda$ is very small, or equivalently when $-2\log(\Lambda) \approx \chi^2$ is very large. This of course is a measure which is large if $O_j$ is far from the expected counts for the best fitted model in the null hypothesis.

In order to assess when the observed value of $\chi^2$ is large, we need to compute for a given $\alpha$ the critical value $c$ so that

$$ \alpha = P_0(\chi^2 > c). $$

By Theorem 9.4A (Rice), when the statistical model is one according to the null hypothesis, the sampling distribution of $\chi^2$ converges to $\chi^2_{(d)}$, where the degrees of freedom is $d = M - 1 - \dim(A_0)$.

In the Hardy-Weinberg example, $M = 3$ and the null hypothesis is that $p \in A_0$ in the notation at the beginning of this handout, thus the degrees of freedom is $3 - 1 - 1 = 1$. The size $\alpha = .05$ critical value to determine the rejection region is thus $c = 3.84$. The decision rule is thus to reject if

$$ \Lambda(X) < e^{-3.84/2} \approx 0.146 $$

or equivalently if

$$ \chi^2 = \sum_{j=1}^{3} \frac{\big( O_j - \hat E_j \big)^2}{\hat E_j} > 3.84. $$
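Pearson's statistic and the decision rule for the Hardy-Weinberg example can be sketched as below. The observed counts are hypothetical, the function name is made up for this example, and the formula $\hat\theta = (O_2 + 2O_3)/(2n)$ for the restricted MLE is an assumption not stated in the text.

```python
def pearson_chi2(observed, expected):
    """Pearson chi-squared: sum_j (O_j - E_j)^2 / E_j."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts O_j (illustrative only).
observed = [40, 30, 30]
n = sum(observed)

# Assumed closed-form Hardy-Weinberg MLE of theta (see lead-in).
theta_hat = (observed[1] + 2 * observed[2]) / (2 * n)

# Fitted expected counts E_j = n * p_j(theta_hat) under the null hypothesis.
expected = [
    n * (1 - theta_hat) ** 2,
    n * 2 * theta_hat * (1 - theta_hat),
    n * theta_hat ** 2,
]

chi2 = pearson_chi2(observed, expected)
critical = 3.84  # size .05 critical value with d = 3 - 1 - 1 = 1 degree of freedom

print("expected counts:", expected)
print("chi-squared    :", chi2)
print("reject H0      :", chi2 > critical)
```

The expected counts always sum to $n$, since the fitted probabilities $p_j(\hat\theta)$ sum to 1.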

Alternatively we could observe the corresponding statistic and calculate the p-value. If we observe $\chi^2_{Obs}$ then the p-value is

$$ \text{p-value} = P(Y > \chi^2_{Obs}), $$

where $Y$ has the limiting $\chi^2_{(d)}$ distribution.

Remark : This is of course the size $\alpha$ whose critical constant $c$ equals $\chi^2_{Obs}$, so that $\chi^2_{Obs}$ falls on the boundary of the rejection region.
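For the Hardy-Weinberg case with $d = 1$, the p-value can be sketched with the standard library alone, using the fact that a $\chi^2_{(1)}$ variable is the square of a standard normal, so $P(Y > x) = \operatorname{erfc}(\sqrt{x/2})$. The observed value below is hypothetical and the function name is made up for this example; for other degrees of freedom a statistics library would be needed.

```python
import math


def chi2_sf_df1(x):
    """Upper tail P(Y > x) for Y ~ chi-square with 1 degree of freedom.

    Uses Y = Z^2 with Z standard normal: P(Z^2 > x) = erfc(sqrt(x / 2)).
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


# Sanity check: the size .05 critical value 1.96^2 has tail area close to .05.
alpha_check = chi2_sf_df1(1.96 ** 2)

# Hypothetical observed statistic (illustrative only).
chi2_obs = 15.52
p_value = chi2_sf_df1(chi2_obs)

print("tail area at 1.96^2:", alpha_check)
print("p-value            :", p_value)
```

The null hypothesis is rejected at size $\alpha$ exactly when the p-value is below $\alpha$, which matches the remark above about the boundary of the rejection region.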