TESTS BASED ON MAXIMUM LIKELIHOOD


ESE 502 Tony E. Smith

1. The Basic Example

To illustrate the properties of maximum-likelihood estimates and tests, we consider the simplest possible case of estimating the mean of the normal distribution with known variance, σ². Given a random sample x = (x_1,..,x_n) from a normal distribution, N(μ, σ²), we first derive the maximum-likelihood estimate of μ with σ² known:

(1.1)   L_n(\mu \mid x, \sigma^2) = \ln \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_i-\mu)^2/2\sigma^2}
        = \sum_{i=1}^{n}\Big[ \ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{(x_i-\mu)^2}{2\sigma^2} \Big]
        = n \ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2

So by solving the first-order condition for μ we obtain:

(1.2)   \frac{dL_n}{d\mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0 \;\Longrightarrow\; \sum_{i=1}^{n} x_i - n\mu = 0 \;\Longrightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
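Since (1.1) and (1.2) are completely explicit, they are easy to check numerically. The following minimal Python sketch (not part of the original notes; it assumes numpy and scipy and uses simulated data) maximizes the log-likelihood in (1.1) over μ and confirms that the maximizer agrees with the sample mean in (1.2).

```python
# Minimal numerical check of (1.1)-(1.2): maximizing the normal log-likelihood
# over mu (with sigma known) reproduces the sample mean. Requires numpy/scipy.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
sigma = 2.0                                       # known standard deviation
x = rng.normal(loc=5.0, scale=sigma, size=200)    # simulated sample

def neg_loglik(mu):
    # negative of L_n(mu) in (1.1); constants kept for clarity
    return -np.sum(-np.log(sigma * np.sqrt(2 * np.pi))
                   - (x - mu) ** 2 / (2 * sigma ** 2))

res = minimize_scalar(neg_loglik, bounds=(x.min(), x.max()), method="bounded")
print(res.x, x.mean())   # the two values agree: mu_hat = x_bar
```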

2. Covariance of Maximum Likelihood Estimators

For this simple case it is well known that the variance of the sample mean, μ̂_n = (1/n)Σ_i X_i, is given by var(μ̂_n) = σ²/n. But by taking the second derivative of L_n with respect to μ, and evaluating it at the true mean value, say μ_0, we see that

(2.1)   \frac{d^2 L_n(\mu_0)}{d\mu^2} = -\frac{n}{\sigma^2} \;\Longrightarrow\; \mathrm{var}(\hat{\mu}_n) = \frac{\sigma^2}{n} = -\Big[\frac{d^2 L_n(\mu_0)}{d\mu^2}\Big]^{-1}

Since in this case d²L_n/dμ² = −n/σ² is independent of the data (x_1,..,x_n), it follows that in fact E[d²L_n(μ_0)/dμ²] = −n/σ², so that

(2.2)   \mathrm{var}(\hat{\mu}_n) = \Big(-E\Big[\frac{d^2 L_n(\mu_0)}{d\mu^2}\Big]\Big)^{-1} \equiv I_n(\mu_0)^{-1}

where I_n(μ_0) is designated as the Fisher information about μ_0 (and is seen to increase as the variance of the maximum-likelihood estimator, μ̂_n, decreases). Finally, since the Law of Large Numbers shows in this case that μ̂_n = X̄_n ≈ μ_0 for sufficiently large n, we may conclude from (2.2) that

(2.3)   \mathrm{var}(\hat{\mu}_n) \approx I_n(\hat{\mu}_n)^{-1}

More generally, for any parameter vector, θ = (θ_1,..,θ_k), defining a well-behaved distribution with likelihood function, L_n(θ), it can be shown that if θ_0 denotes the true value of θ, then the covariance matrix of the maximum-likelihood estimator, θ̂, is well approximated for large n by
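As a quick illustration of (2.2)-(2.3), a small simulation (Python with numpy assumed; illustrative values only) shows that the sampling variance of μ̂ = x̄ is essentially the reciprocal Fisher information, σ²/n.

```python
# Simulation check of (2.2)-(2.3): the sampling variance of mu_hat = x_bar is
# close to I_n(mu)^(-1) = sigma^2 / n. Illustrative values only.
import numpy as np

rng = np.random.default_rng(0)
mu0, sigma, n, reps = 5.0, 2.0, 100, 20000

mu_hats = rng.normal(mu0, sigma, size=(reps, n)).mean(axis=1)
print(mu_hats.var())        # empirical variance of the MLE across replications
print(sigma**2 / n)         # I_n(mu)^(-1) = sigma^2/n, the Fisher-information value
```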

(2.4)   \mathrm{cov}(\hat{\theta}) \approx \Big(-E\Big[\frac{\partial^2 L_n(\theta_0)}{\partial\theta\,\partial\theta'}\Big]\Big)^{-1} \equiv I_n(\theta_0)^{-1}

Moreover, as with the simple case of the normal mean above, it can also be shown that θ̂ is always a consistent estimator of θ, so that θ̂ ≈ θ_0 for large n. Thus as an extension of (2.4) we obtain the sample approximation:

(2.5)   \mathrm{cov}(\hat{\theta}) \approx I_n(\hat{\theta})^{-1}

3. Wald Tests of Parameters

For the simple case of the normal mean above it follows at once (from the fact that linear combinations of independent normals are normal) that:

(3.1)   \bar{X}_n \sim N(\mu, \sigma^2/n)

This in turn implies from (2.1) and (2.2) that

(3.2)   \hat{\mu}_n \sim N\big(\mu,\; I_n(\hat{\mu}_n)^{-1}\big)

More generally it can be shown that for large n the distribution of the general maximum-likelihood estimator, θ̂, defined above is well approximated by

(3.3)   \hat{\theta} \sim N\big(\theta_0,\; I_n(\theta_0)^{-1}\big)

where θ_0 again denotes the true value of θ. This forms the basis for all standard tests of hypotheses about the components of θ.
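The practical content of (3.2)-(3.3) is the usual Wald test: standardize θ̂ − θ_0 by the estimated standard error I_n(θ̂)^{-1/2} and refer the result to the normal distribution. A minimal sketch for the normal-mean case (simulated data; numpy/scipy assumed):

```python
# Sketch of a Wald test of H0: mu = mu0 based on (3.2):
# z = (mu_hat - mu0) / sqrt(I_n^{-1}), where I_n = n / sigma^2 in this model.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
sigma, mu0 = 2.0, 5.0
x = rng.normal(5.3, sigma, size=100)       # true mean differs slightly from mu0

mu_hat = x.mean()
se = np.sqrt(sigma**2 / len(x))            # sqrt of I_n(mu)^{-1}
z = (mu_hat - mu0) / se                    # Wald statistic, approx N(0,1) under H0
p_value = 2 * norm.sf(abs(z))
print(z, p_value)
```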

4. Likelihood-Ratio Tests of Parameters

Another way to test the hypothesis that, say, μ_0 is the true value of μ is simply to compare the likelihood of μ_0 with that of the most likely value, μ̂. If the resulting ratio of likelihood values is close to one, then this implies that μ_0 is a good candidate for the true value of μ. In terms of log likelihoods, this is in turn equivalent to a difference, L_n(μ̂) − L_n(μ_0), close to zero.

[Figure 1. Likelihood Ratios]

If we observe from (1.1) that

(4.1)   L_n(\mu \mid x, \sigma^2) = n \ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2

then this difference can be evaluated as:

(4.2)   L_n(\hat{\mu}) - L_n(\mu_0) = \frac{1}{2\sigma^2}\Big[ \sum_i (x_i-\mu_0)^2 - \sum_i (x_i-\bar{x})^2 \Big]
        = \frac{1}{2\sigma^2}\Big[ \sum_i (x_i^2 - 2\mu_0 x_i + \mu_0^2) - \sum_i (x_i^2 - 2\bar{x}x_i + \bar{x}^2) \Big]
        = \frac{1}{2\sigma^2}\big( n\mu_0^2 - 2n\mu_0\bar{x} + 2n\bar{x}^2 - n\bar{x}^2 \big)
        = \frac{n}{2\sigma^2}\big( \bar{x}^2 - 2\mu_0\bar{x} + \mu_0^2 \big)
        = \frac{n}{2\sigma^2}\,( \bar{x} - \mu_0 )^2

Hence it follows that

(4.3)   2\,\big[\,L_n(\hat{\mu}) - L_n(\mu_0)\,\big] = \frac{(\bar{x} - \mu_0)^2}{\sigma^2/n}

But under the null hypothesis, H_0: μ = μ_0, it follows that

(4.4)   \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)

and hence by definition that

(4.5)   \frac{(\bar{x} - \mu_0)^2}{\sigma^2/n} \sim \chi^2_1

Thus we see that under H_0:

(4.6)   2\,\big[\,L_n(\hat{\mu}) - L_n(\mu_0)\,\big] \sim \chi^2_1

More generally, for any partition of parameters, θ = (θ_1, θ_2), the null hypothesis, H_0: θ_1 = θ_1^0, can be tested in the same way by letting θ̂_2^0 be defined by the condition,

(4.7)   L_n(\theta_1^0, \hat{\theta}_2^0) = \max_{\theta_2} L_n(\theta_1^0, \theta_2)

and comparing L_n(θ_1^0, θ̂_2^0) with the unconstrained maximum, L_n(θ̂). If θ_1 has k_1 components, then under H_0 we now have

(4.8)   2\,\big[\,L_n(\hat{\theta}) - L_n(\theta_1^0, \hat{\theta}_2^0)\,\big] \sim \chi^2_{k_1}
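A minimal sketch of the likelihood-ratio test (4.3)-(4.6) for the normal-mean case (simulated data; numpy/scipy assumed). In this model the LR statistic coincides exactly with n(x̄ − μ_0)²/σ², as the algebra in (4.2) shows.

```python
# Sketch of the likelihood-ratio test (4.6): 2[L_n(mu_hat) - L_n(mu0)] ~ chi^2_1
# under H0. For this normal/known-sigma case it equals n*(x_bar - mu0)^2/sigma^2.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
sigma, mu0 = 2.0, 5.0
x = rng.normal(5.3, sigma, size=100)
n, x_bar = len(x), x.mean()

def loglik(mu):
    return np.sum(-np.log(sigma * np.sqrt(2 * np.pi)) - (x - mu)**2 / (2 * sigma**2))

lr = 2 * (loglik(x_bar) - loglik(mu0))      # LR statistic of (4.3)/(4.6)
print(lr, n * (x_bar - mu0)**2 / sigma**2)  # identical, by the algebra in (4.2)
print(chi2.sf(lr, df=1))                    # p-value from the chi^2_1 reference distribution
```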

5. Lagrange Multiplier (Score) Tests of Parameters

A final way to test the hypothesis that μ_0 is the true value of μ is to simply examine the slope of the likelihood function at μ_0. If this slope is close to zero, then it follows (by continuity) that μ_0 must be close to μ̂, and thus must again be a good candidate for the true value. Hence one may simply test whether the slope, L_n′(μ_0), is significantly different from zero. But by (1.1) it follows that the score function, s(μ_0) ≡ L_n′(μ_0), is given by

(5.1)   s(\mu_0) = \frac{d}{d\mu}\Big\{ n \ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{1}{2\sigma^2}\sum_i (x_i-\mu)^2 \Big\}\Big|_{\mu=\mu_0} = \frac{1}{\sigma^2}\sum_i (x_i-\mu_0) = \frac{n}{\sigma^2}(\bar{x}-\mu_0)

Hence it follows that, under H_0,

(5.2)   \frac{\sigma^2}{n}\, s(\mu_0) = \bar{x} - \mu_0 \;\Longrightarrow\; \frac{(\sigma^2/n)\, s(\mu_0)}{\sigma/\sqrt{n}} = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} \sim N(0,1)

[Figure 2. Score Function]

As in (4.4) and (4.5) above, this in turn implies that

(5.3)   \frac{\sigma^2}{n}\, s(\mu_0)^2 \sim \chi^2_1

Finally, recalling that var(μ̂) = σ²/n, we obtain the following score test statistic:

(5.4)   \mathrm{var}(\mu_0)\, s(\mu_0)^2 \sim \chi^2_1

where var(μ_0) is formally the variance of μ̂ under the null hypothesis (which is completely independent of μ_0 in this simple case).

It is important to note that this score test is also called a Lagrange multiplier test. To see the reason for this, observe that finding the maximum-likelihood estimate of μ under the null hypothesis, μ = μ_0, is formally equivalent to the Lagrangian maximization problem:

(5.5)   \max_{\mu}\ \varphi(\mu) = L_n(\mu) + \lambda(\mu_0 - \mu)

with solution given by

(5.6)   \varphi'(\mu) = L_n'(\mu) - \lambda = 0

(5.7)   \frac{\partial \varphi}{\partial \lambda} = \mu_0 - \mu = 0

By combining these two conditions, we see that the optimal value, λ̂, of the Lagrange multiplier is given by

(5.8)   \hat{\lambda} = L_n'(\mu_0)

and thus that the score function, s(μ_0) = L_n′(μ_0), can also be interpreted as a Lagrange multiplier. However, the above slope interpretation seems to be simpler and more intuitive.
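A matching sketch of the score (LM) test in (5.4) for the same setup (simulated data; numpy/scipy assumed). Unlike the Wald and LR tests, it uses only quantities available under the null hypothesis; in this simple model all three statistics coincide.

```python
# Sketch of the score (LM) test (5.4): var(mu_hat) * s(mu0)^2 ~ chi^2_1 under H0,
# with s(mu0) = (n/sigma^2)(x_bar - mu0) from (5.1). Only null-hypothesis
# quantities are needed (here none at all, since sigma is known).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
sigma, mu0 = 2.0, 5.0
x = rng.normal(5.3, sigma, size=100)
n, x_bar = len(x), x.mean()

score = n * (x_bar - mu0) / sigma**2        # s(mu0) in (5.1)
lm = (sigma**2 / n) * score**2              # var(mu_hat) * s(mu0)^2, eq. (5.4)
print(lm, n * (x_bar - mu0)**2 / sigma**2)  # same value: LM = LR = Wald in this model
print(chi2.sf(lm, df=1))
```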

More generally, for any partition of parameters, θ = (θ_1, θ_2), as in Section 4 above, the null hypothesis, H_0: θ_1 = θ_1^0, can be tested in the same way by again letting θ̂_2^0 be defined by (4.7) and setting θ̂_0 = (θ_1^0, θ̂_2^0). If θ_1 is of dimension k_1, then the score vector, s_1(θ) = ∂L_n(θ)/∂θ_1, also has dimension k_1, and the variance term var(μ_0) in (5.4) is now replaced by the covariance submatrix, cov_{11}(θ), where

(5.9)   \mathrm{cov}(\theta) = \begin{pmatrix} \mathrm{cov}_{11}(\theta) & \mathrm{cov}_{12}(\theta) \\ \mathrm{cov}_{21}(\theta) & \mathrm{cov}_{22}(\theta) \end{pmatrix}

If the Fisher information matrix is given by

(5.10)   I_n(\theta) = \begin{pmatrix} I_{11}(\theta) & I_{12}(\theta) \\ I_{21}(\theta) & I_{22}(\theta) \end{pmatrix}

then it can easily be shown (by partitioned-inverse identities) that

(5.11)   \mathrm{cov}_{11}(\theta) = \big[\, I_{11}(\theta) - I_{12}(\theta)\, I_{22}(\theta)^{-1} I_{21}(\theta) \,\big]^{-1}

so that the following score test statistic is obtained as a direct generalization of (5.4):

(5.12)   s_1(\hat{\theta}_0)'\, \mathrm{cov}_{11}(\hat{\theta}_0)\, s_1(\hat{\theta}_0) \sim \chi^2_{k_1}

The single most important feature of this score test is that it only requires maximum-likelihood estimation under the null hypothesis, i.e.,
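The only nonstandard ingredient in (5.12) is the partitioned-inverse identity behind (5.11). The following small numpy check (a random positive-definite stand-in for I_n(θ), purely illustrative) verifies that the (1,1) block of I_n(θ)^{-1} equals [I_{11} − I_{12}I_{22}^{-1}I_{21}]^{-1}.

```python
# Numerical check of the partitioned-inverse identity behind (5.11): the (1,1)
# block of I^{-1} equals [I11 - I12 I22^{-1} I21]^{-1}. Random symmetric
# positive-definite matrix used purely for illustration (k1 = 2, k2 = 3).
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 5))
I_n = A @ A.T + 5 * np.eye(5)        # stand-in Fisher information matrix
k1 = 2
I11, I12 = I_n[:k1, :k1], I_n[:k1, k1:]
I21, I22 = I_n[k1:, :k1], I_n[k1:, k1:]

block_of_inverse = np.linalg.inv(I_n)[:k1, :k1]
schur_inverse = np.linalg.inv(I11 - I12 @ np.linalg.inv(I22) @ I21)
print(np.allclose(block_of_inverse, schur_inverse))   # True
```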

conditional maximum-likelihood estimation of θ_2 given θ_1 = θ_1^0. This is generally much easier to calculate. For example, if θ_1 = ρ in the SAR model, or θ_1 = λ in the SL model, then the remaining parameters θ_2 = (β, σ²) are directly obtainable from OLS.

6. Moran's I as a Score Test for SAR

One key feature of score tests for our present purposes is that the score test statistic for ρ = 0 in the SAR model turns out to be precisely Moran's I statistic (up to a scale factor). To see this, observe first (from Section 8.3 in the BULKPACK) that

(6.1)   L(\rho, \beta, \sigma^2) = \mathrm{const} + \ln|B_\rho| - \frac{1}{2\sigma^2}\,(y - X\beta)'\, B_\rho' B_\rho\, (y - X\beta)
        = \mathrm{const} + \sum_{i=1}^{n} \ln(1 - \rho\,\omega_i) - \frac{1}{2\sigma^2}\,(y - X\beta)'\, B_\rho' B_\rho\, (y - X\beta)

where B_ρ = I_n − ρW and where (ω_i : i = 1,..,n) are the eigenvalues of W, so that

(6.2)   W v_i = \omega_i v_i, \quad i = 1,..,n

To analyze (6.1) we first assume (for simplicity) that the eigenvectors in (6.2) are linearly independent, so that the matrix V = (v_1,..,v_n) is nonsingular. Hence if Δ denotes the diagonal matrix of eigenvalues in (6.2), then by definition,

(6.3)   W[v_1,..,v_n] = [\omega_1 v_1,..,\omega_n v_n] \;\Longrightarrow\; WV = V\Delta \;\Longrightarrow\; W = V \Delta V^{-1}
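The determinant identity behind the second line of (6.1), ln|I_n − ρW| = Σ_i ln(1 − ρω_i), is easy to confirm numerically. A minimal sketch with a small symmetric contiguity-style weight matrix (numpy assumed; the matrix is purely illustrative):

```python
# Quick check of the determinant identity used in (6.1): ln|I - rho*W| equals
# sum_i ln(1 - rho*omega_i) over the eigenvalues omega_i of W.
import numpy as np

n, rho = 6, 0.3
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0   # symmetric circular contiguity weights

omega = np.linalg.eigvalsh(W)                      # real eigenvalues of the symmetric W
sign, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
print(sign * logdet, np.sum(np.log(1 - rho * omega)))   # the two values agree
```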

Next we observe that if the trace of a matrix A = (a_{ij} : i, j = 1,..,n) is defined by tr(A) = Σ_i a_{ii}, then it follows that for any n × n matrices, A = (a_{ij}) and B = (b_{ij}), tr(AB) = Σ_i Σ_j a_{ij} b_{ji} = tr(BA). Moreover, since tr(W) = 0 for all weight matrices (which have zero diagonals), it follows from (6.2) and (6.3) that

(6.4)   0 = \mathrm{tr}(W) = \mathrm{tr}(V \Delta V^{-1}) = \mathrm{tr}(\Delta V^{-1} V) = \mathrm{tr}(\Delta) = \sum_{i=1}^{n} \omega_i

Hence for the single most important null hypothesis, ρ = 0, we have

(6.5)   \frac{\partial}{\partial\rho} \sum_i \ln(1 - \rho\,\omega_i)\Big|_{\rho=0} = -\sum_i \frac{\omega_i}{1 - \rho\,\omega_i}\Big|_{\rho=0} = -\sum_i \omega_i = 0

and it follows that the ρ-derivative of (6.1) at ρ = 0 reduces to

(6.6)   \frac{\partial}{\partial\rho} L(\rho, \beta, \sigma^2)\Big|_{\rho=0}
        = -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\rho}\big\{(y-X\beta)'\, B_\rho' B_\rho\, (y-X\beta)\big\}\Big|_{\rho=0}
        = -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\rho}\big\{(y-X\beta)'(I_n-\rho W')(I_n-\rho W)(y-X\beta)\big\}\Big|_{\rho=0}
        = -\frac{1}{2\sigma^2}\,\frac{\partial}{\partial\rho}\big\{(y-X\beta)'\big[\,I_n-\rho(W+W')+\rho^2 W'W\,\big](y-X\beta)\big\}\Big|_{\rho=0}
        = -\frac{1}{2\sigma^2}\,(y-X\beta)'\big[\,-(W+W')+2\rho\,W'W\,\big](y-X\beta)\Big|_{\rho=0}
        = \frac{1}{2\sigma^2}\,(y-X\beta)'(W+W')(y-X\beta)
        = \frac{1}{2\sigma^2}\,\big[(y-X\beta)'W(y-X\beta)+(y-X\beta)'W'(y-X\beta)\big]
        = \frac{1}{\sigma^2}\,(y-X\beta)'W(y-X\beta)

where the last line uses the fact that (y − Xβ)'W'(y − Xβ) is a scalar, and hence equals its own transpose, (y − Xβ)'W(y − Xβ).
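A quick numerical illustration of (6.4)-(6.5) (numpy assumed; the small row-normalized weight matrix is hypothetical): the eigenvalues of W sum to tr(W) = 0, so the derivative of ln|I − ρW| vanishes at ρ = 0.

```python
# Numerical illustration of (6.4)-(6.5) for a small row-normalized weight matrix
# W (zero diagonal): the eigenvalues sum to tr(W) = 0, so the derivative of
# ln|I - rho*W| vanishes at rho = 0.
import numpy as np

W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
W = W / W.sum(axis=1, keepdims=True)       # row-normalize

eigvals = np.linalg.eigvals(W)
print(eigvals.sum().real)                  # ~ 0 = tr(W), as in (6.4)

h = 1e-6                                   # central-difference check of (6.5) at rho = 0
logdet = lambda r: np.linalg.slogdet(np.eye(4) - r * W)[1]
print((logdet(h) - logdet(-h)) / (2 * h))  # ~ 0
```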

Hence if we let θ̂_0 = (0, β̂, σ̂²), with β̂ and σ̂² the OLS estimates of (β, σ²) [i.e., the maximum-likelihood estimates under the hypothesis ρ = 0], then the score function is given by:

(6.7)   s_\rho(\hat{\theta}_0) = \frac{\partial}{\partial\rho} L(0, \hat{\beta}, \hat{\sigma}^2) = \frac{1}{\hat{\sigma}^2}\,(y - X\hat{\beta})'\, W\, (y - X\hat{\beta})

But if we now let ε̂ = y − Xβ̂ denote the OLS residuals, so that

(6.8)   \hat{\sigma}^2 = \frac{1}{n}\,(y - X\hat{\beta})'(y - X\hat{\beta}) = \frac{1}{n}\,\hat{\varepsilon}'\hat{\varepsilon}

then (6.7) takes the form:

(6.9)   s_\rho(\hat{\theta}_0) = \frac{n}{\hat{\varepsilon}'\hat{\varepsilon}}\,(y - X\hat{\beta})'\, W\, (y - X\hat{\beta}) \;\Longrightarrow\; s_\rho(\hat{\theta}_0) = n\left[\frac{\hat{\varepsilon}'\, W\, \hat{\varepsilon}}{\hat{\varepsilon}'\hat{\varepsilon}}\right]

where the expression in brackets is precisely Moran's I. To complete the argument, recall from expression (77) in the BULKPACK that the off-diagonal (ρ, σ²) term in the Fisher information matrix, I_n(θ), is proportional to tr(G_ρ), with G_ρ = W B_ρ^{-1}, where

(6.10)   \mathrm{tr}(G_\rho) = \mathrm{tr}\big[ W (I_n - \rho W)^{-1} \big] = \mathrm{tr}\big[ W (I_n + \rho W + \rho^2 W^2 + \cdots) \big] = \sum_{k \ge 0} \rho^k\, \mathrm{tr}(W^{k+1})

Hence under the null hypothesis, ρ = 0, it follows that

(6.11)   \mathrm{tr}(G_\rho)\big|_{\rho=0} = \mathrm{tr}(W) = 0

But this implies that I_n(θ̂_0) is block diagonal, so that in this simple case, (5.11) reduces to

(6.12)   \mathrm{cov}_{\rho\rho}(\hat{\theta}_0) = I_{\rho\rho}(\hat{\theta}_0)^{-1}

which is given from (77) in the BULKPACK by

(6.13)   I_{\rho\rho}(\theta)\big|_{\rho=0} = \mathrm{tr}\big[ G_\rho (G_\rho + G_\rho') \big]\Big|_{\rho=0} = \mathrm{tr}\big[ W (W + W') \big] = \mathrm{tr}(WW + W'W)

Hence the final score statistic is

(6.14)   s_\rho(\hat{\theta}_0)\, I_{\rho\rho}(\hat{\theta}_0)^{-1}\, s_\rho(\hat{\theta}_0) = \frac{n^2}{\mathrm{tr}(WW + W'W)} \left[ \frac{\hat{\varepsilon}'\, W\, \hat{\varepsilon}}{\hat{\varepsilon}'\hat{\varepsilon}} \right]^2 \sim \chi^2_1

Note also that an equivalent test statistic, which is proportional to Moran's I, is obtained by taking the square root of (6.14):

(6.15)   \frac{n}{\sqrt{\mathrm{tr}(WW + W'W)}} \left[ \frac{\hat{\varepsilon}'\, W\, \hat{\varepsilon}}{\hat{\varepsilon}'\hat{\varepsilon}} \right] \sim N(0,1)
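The following sketch computes the score statistic (6.14) and its N(0,1) form (6.15) from OLS residuals (Python with numpy/scipy assumed; the regression data and the circular weight matrix W are simulated purely for illustration, with no spatial dependence in the errors, so the statistic should be unexceptional under H_0: ρ = 0).

```python
# Sketch of the score (LM) statistic (6.14) and its N(0,1) form (6.15), computed
# from OLS residuals and a spatial weight matrix W. Data and W are simulated
# purely for illustration.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(5)
n, k = 50, 3

# simple circular "nearest-neighbour" weight matrix, row-normalized
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)    # rho = 0: no spatial error dependence

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta_hat                                       # OLS residuals

moran_ratio = (e @ W @ e) / (e @ e)                        # e'We / e'e, the bracketed term in (6.9)
T = np.trace(W @ W + W.T @ W)                              # tr(WW + W'W) in (6.13)
lm = n**2 * moran_ratio**2 / T                             # score statistic (6.14), ~ chi^2_1 under H0
z = n * moran_ratio / np.sqrt(T)                           # equivalent N(0,1) form (6.15)
print(lm, chi2.sf(lm, df=1), z)
```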