Hypothesis Tests within the Maximum Likelihood Framework
Christian Julliard
Department of Economics and FMG, London School of Economics

There are three main frequentist approaches to inference within the Maximum Likelihood framework: the Wald test, the Likelihood Ratio test, and the Lagrange Multiplier test. Bayesian inference will not be presented.

Key assumptions

We have already seen that even if observations are dependent, the results derived for the MLE in the iid setting carry over to ergodic processes, and we will be assuming that:

1. the MLE of a vector of parameters $\psi$ is consistent;
2. and that $\sqrt{T}(\hat\psi - \psi) \xrightarrow{D} N\big(0, [T^{-1} I(\psi)]^{-1}\big)$, where $I(\psi)$ is the information matrix.

Outline

1. Wald Test
   - Linear restrictions for the standard linear regression
   - Wald test for nonlinear constraints
2. The Likelihood Ratio Test
3. Lagrange Multiplier Tests
   - Example: LM test in Nonlinear Least Squares
4. Comparison between the Wald, LR and LM tests
5. Durbin-Watson Test
The Wald Test

Idea: use the MLE of the unrestricted model.

Suppose we have a model with $k$ unknown parameters $\psi$ that delivers the log likelihood $\log L(\psi)$. We know that

$$\sqrt{T}(\hat\psi - \psi) \xrightarrow{d} N\big(0, I_A(\psi)^{-1}\big), \quad \text{where } I_A(\psi) = T^{-1} I(\psi), \quad I(\psi) = -E\left[\frac{\partial^2 \log L(\psi)}{\partial\psi\,\partial\psi'}\right].$$

Suppose we want to test a linear hypothesis $H_0: R\psi = q$ vs. $H_A: R\psi \neq q$, where $R$ has $r < k$ linearly independent rows ($r$ restrictions). Then, under $H_0$,

$$\sqrt{T}\,R(\hat\psi - \psi) = \sqrt{T}(R\hat\psi - q) \xrightarrow{d} N\big(0, \underbrace{R\, I_A(\psi)^{-1} R'}_{r \times r}\big).$$

Recall: if the $n$-dimensional vector $x \sim N(0, A)$, then $x' A^{-1} x \sim \chi^2_n$. This implies that

$$T (R\hat\psi - q)' \big[R\, I_A(\psi)^{-1} R'\big]^{-1} (R\hat\psi - q) \xrightarrow{d} \chi^2_r.$$

But we do not observe $I_A(\psi)$. If we can find a consistent estimator, the distribution remains unchanged. Possible estimators are the empirical information matrix, $T^{-1} I(\hat\psi)$, and the empirical Hessian, $-T^{-1}\,\partial^2 \log L(\hat\psi)/\partial\psi\,\partial\psi'$. Assuming the first is available,

$$W = T (R\hat\psi - q)' \big[R\,[T^{-1} I(\hat\psi)]^{-1} R'\big]^{-1} (R\hat\psi - q) \xrightarrow{d} \chi^2_r.$$

Wald test for the standard linear regression

We showed last time that for the standard linear regression model

$$y_i = x_i'\beta + \varepsilon_i, \quad \varepsilon_i \sim N(0, \sigma^2), \quad i = 1, \dots, n, \quad E[x_i \varepsilon_s] = 0 \ \forall\, s, i,$$

with unknown parameters $(\beta, \sigma^2)$, we have that

$$\sqrt{T}(\hat\beta - \beta) \xrightarrow{d} N\Big(0, \Big[\tfrac{1}{T}\textstyle\sum_t x_t x_t'\Big]^{-1} \hat\sigma^2\Big).$$

Suppose $\beta = [\beta_1, \beta_2]'$ and we want to test the null

$$\beta_1 - \beta_2 = q,$$

or equivalently, in matrix form,

$$\underbrace{[1, \; -1]}_{R} \underbrace{\begin{bmatrix}\beta_1 \\ \beta_2\end{bmatrix}}_{\beta} = q.$$

Noting that under the null

$$\sqrt{T}\,R(\hat\beta - \beta) = \sqrt{T}(R\hat\beta - q) \xrightarrow{D} N\Big(0, R\Big[\tfrac{1}{T}\textstyle\sum_t x_t x_t'\Big]^{-1} \hat\sigma^2 R'\Big),$$

we can construct the Wald test as

$$W = T(R\hat\beta - q)' \Big[R\Big[\tfrac{1}{T}\textstyle\sum_t x_t x_t'\Big]^{-1} \hat\sigma^2 R'\Big]^{-1} (R\hat\beta - q) = \frac{(\hat\beta_1 - \hat\beta_2 - q)^2}{R\big[\sum_t x_t x_t'\big]^{-1} \hat\sigma^2 R'} \xrightarrow{d} \chi^2_1.$$
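The Wald test for a single linear restriction can be computed in a few lines. The sketch below uses simulated data (the sample size, seed, and true coefficients are illustrative choices, not from the slides) and tests $\beta_1 - \beta_2 = 0$ in a two-regressor model where the null is true by construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated standard linear regression, T = 500 observations, k = 2 regressors.
T, k = 500, 2
X = rng.normal(size=(T, k))
beta_true = np.array([1.0, 1.0])              # null beta1 - beta2 = 0 holds
y = X @ beta_true + rng.normal(size=T)

# OLS = ML estimates of beta and sigma^2 (ML variance uses divisor T).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / T

# Null: R beta = q with R = [1, -1], q = 0.
R = np.array([[1.0, -1.0]])
q = np.array([0.0])

# W = (R b - q)' [ R (X'X)^{-1} sigma2 R' ]^{-1} (R b - q)  ~  chi^2(1) under H0.
V = R @ np.linalg.inv(X.T @ X) @ R.T * sigma2_hat
diff = R @ beta_hat - q
W = float(diff @ np.linalg.solve(V, diff))
print(f"W = {W:.3f}  (chi^2_1 5% critical value: 3.841)")
```

Under the null, $W$ should fall below the 5% critical value in about 95% of simulated samples; replacing `beta_true` with, say, `[1.0, 2.0]` makes $W$ explode.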
Wald test with nonlinear constraints

Consider $H_0: R(\psi) = 0$, a set of $r$ linear or nonlinear constraints; $R(\psi)$ is a column $r$-vector. Let $R_1(\psi) = \partial R(\psi)/\partial\psi'$ be a well defined $r \times k$ matrix ($k$ is the number of parameters in $\psi$). Then, under $H_0$, the statistic

$$W = R(\hat\psi)' \big[R_1(\hat\psi)\, I(\hat\psi)^{-1} R_1(\hat\psi)'\big]^{-1} R(\hat\psi) \xrightarrow{d} \chi^2_r.$$

Intuition: Delta Method / Taylor expansion.

Example: nonlinear restrictions for the SLR

Suppose $\beta = [\beta_1, \beta_2]'$ and we want to test, in the standard linear regression model, the null

$$\underbrace{\tfrac{1}{2}\beta_1^2 + \tfrac{1}{2}\beta_2^2 - q}_{R(\beta)} = 0.$$

Noticing that

$$R_1(\beta) = \frac{\partial R(\beta)}{\partial\beta'} = [\beta_1, \; \beta_2],$$

the Wald statistic becomes

$$W = \frac{\big(\tfrac{1}{2}\hat\beta_1^2 + \tfrac{1}{2}\hat\beta_2^2 - q\big)^2}{[\hat\beta_1, \; \hat\beta_2]\big[\sum_t x_t x_t'\big]^{-1} \hat\sigma^2\, [\hat\beta_1, \; \hat\beta_2]'} \xrightarrow{d} \chi^2_1.$$

The Likelihood Ratio Test

Again suppose the model can be expressed in terms of a likelihood function $L(\psi)$, and that we also have a set of $r$ restrictions, either linear or nonlinear, i.e. $R\psi = q$ or $R(\psi) = 0$.

Idea:
1. Estimate the unrestricted model to obtain the ML estimates $\hat\psi$ and $L(\hat\psi)$.
2. Estimate the model under the restrictions to obtain the restricted estimates $\hat\psi_0$ and $L(\hat\psi_0)$.
3. Compare $L(\hat\psi)$ and $L(\hat\psi_0)$.

It can be shown that under the null

$$LR = -2\log\left\{\frac{L(\hat\psi_0)}{L(\hat\psi)}\right\} = 2\big[\log L(\hat\psi) - \log L(\hat\psi_0)\big] \xrightarrow{d} \chi^2_r.$$

If the data conform with the null, you expect $L(\hat\psi)$ to be close to $L(\hat\psi_0)$ and LR to be close to 0. If the data do not conform, you expect $L(\hat\psi) \gg L(\hat\psi_0)$ and $LR \gg 0$. Hence the test is to reject $H_0$ at the $\alpha$ level if $LR > \chi^2_\alpha(r)$.
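The three LR steps can be sketched numerically. For a Gaussian linear model the concentrated log likelihood implies $LR = T\log(\hat\sigma_0^2/\hat\sigma^2)$, so the statistic only needs the restricted and unrestricted ML residual variances. The data-generating process, seed, and the restriction $\beta_2 = 0$ below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400
x1, x2 = rng.normal(size=T), rng.normal(size=T)
y = 1.0 + 0.5 * x1 + 0.0 * x2 + rng.normal(size=T)    # restriction beta2 = 0 is true

def ml_sigma2(y, X):
    """Concentrated ML: OLS residual variance with divisor T."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e / len(y)

ones = np.ones(T)
s2_u = ml_sigma2(y, np.column_stack([ones, x1, x2]))   # unrestricted model
s2_r = ml_sigma2(y, np.column_stack([ones, x1]))       # restricted model (beta2 = 0)

# For Gaussian likelihoods, 2[logL(psi_hat) - logL(psi_hat_0)] = T log(s2_r / s2_u).
LR = float(T * np.log(s2_r / s2_u))
print(f"LR = {LR:.3f}  (chi^2_1 5% critical value: 3.841)")
```

Because the restricted model is nested in the unrestricted one, $\hat\sigma_0^2 \geq \hat\sigma^2$ always, so $LR \geq 0$ by construction, matching the intuition on the slide.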
Lagrange Multiplier Tests

Again suppose the model can be expressed in terms of a likelihood function $L(\psi)$ and that we have $r$ restrictions $R(\psi) = 0$. If the restrictions are valid, $\hat\psi_0$ (the MLE of the restricted model) will be close to $\hat\psi$ (the MLE of the unrestricted model), and the partial derivatives in the vector $\partial\log L(\hat\psi_0)/\partial\psi$ will also be close to zero (note: $\partial\log L(\hat\psi)/\partial\psi = 0$ by construction).

It can be shown that under the null, the quadratic form

$$LM = \frac{1}{T}\,\frac{\partial\log L(\hat\psi_0)}{\partial\psi'}\, I_A(\psi_0)^{-1}\, \frac{\partial\log L(\hat\psi_0)}{\partial\psi} \xrightarrow{d} \chi^2_r.$$

As usual, we normally do not know $I_A(\psi_0)$, and it must be replaced by a consistent estimate. Assuming that $T^{-1} I(\hat\psi)$ (or a consistent alternative) is available, then

$$LM = \frac{\partial\log L(\hat\psi_0)}{\partial\psi'}\, I(\hat\psi_0)^{-1}\, \frac{\partial\log L(\hat\psi_0)}{\partial\psi} \xrightarrow{d} \chi^2_r \tag{2}$$

and is referred to as a Lagrange Multiplier statistic.

The LM test in Nonlinear Least Squares

This result can be specialized to nonlinear least squares problems. Thus we have

$$y_t = g(x_t; \beta) + \varepsilon_t, \quad \varepsilon_t \overset{iid}{\sim} N(0, \sigma^2), \quad x_t \text{ independent of } \varepsilon_t, \quad t = 1, \dots, T.$$

Then the unrestricted log likelihood has the form

$$\log L(\beta, \sigma^2) = -\frac{T}{2}\log 2\pi - \frac{T}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^T \varepsilon_t(\beta)^2, \quad \varepsilon_t(\beta) = y_t - g(x_t; \beta).$$

Assume that the $r$ restrictions involve only $\beta$ (not $\sigma^2$): $R(\beta) = 0$. Then

$$\frac{\partial\log L(\beta, \sigma^2)}{\partial\beta} = \frac{1}{\sigma^2}\sum_t z_t \varepsilon_t, \quad z_t = -\frac{\partial\varepsilon_t(\beta)}{\partial\beta}, \tag{3}$$

and, as before, $I(\psi) = -E\big[\partial^2\log L(\psi)/\partial\psi\,\partial\psi'\big]$. But as $\sigma^2$ is not in the restriction, the information matrix is block diagonal, so consider only the sub-matrix associated with $\beta$. Since $x_t$ is independent of $\varepsilon_t$,

$$I_{\beta\beta}(\psi) = -E\left[\frac{\partial^2\log L}{\partial\beta\,\partial\beta'}\right] = \frac{1}{\sigma^2}\, E\Big[\sum_t z_t z_t'\Big]. \tag{4}$$
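The score formula (3) can be verified numerically. The sketch below assumes a hypothetical nonlinear regression function $g(x; \beta) = e^{\beta x}$ (not from the slides) and checks that $\sigma^{-2}\sum_t z_t\varepsilon_t$, with $z_t = -\partial\varepsilon_t/\partial\beta = x_t e^{\beta x_t}$, matches a finite-difference derivative of the Gaussian log likelihood:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
x = rng.normal(size=T)
beta_true, sigma2 = 0.3, 1.0
y = np.exp(beta_true * x) + rng.normal(size=T)   # hypothetical g(x; b) = exp(b x)

def loglik(b, s2=sigma2):
    """Gaussian NLS log likelihood, sigma^2 held fixed at its known value."""
    e = y - np.exp(b * x)
    return -T / 2 * np.log(2 * np.pi) - T / 2 * np.log(s2) - e @ e / (2 * s2)

# Analytic score at an arbitrary evaluation point b0:
# dlogL/db = (1/s2) * sum_t z_t eps_t,  with z_t = x_t * exp(b0 * x_t).
b0 = 0.25
eps = y - np.exp(b0 * x)
z = x * np.exp(b0 * x)
score_analytic = float(z @ eps / sigma2)

# Central finite-difference approximation of the same derivative.
h = 1e-6
score_fd = float((loglik(b0 + h) - loglik(b0 - h)) / (2 * h))
print(score_analytic, score_fd)
```

The two numbers agree to several decimal places, which is exactly the content of equation (3) for this choice of $g$.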
Evaluating the LM statistic at $(\hat\beta_0, \hat\sigma_0^2)$, where $\hat\sigma_0^2 = \frac{1}{T}\sum_t \varepsilon_t^2(\hat\beta_0)$, and replacing the expectations with their sample analogues,

$$LM = \frac{1}{\hat\sigma_0^2}\Big[\sum_t z_t\varepsilon_t\Big]' \Big[\sum_t z_t z_t'\Big]^{-1} \Big[\sum_t z_t\varepsilon_t\Big].$$

By inspection, LM is related to the regression of $\varepsilon_t$ on $z_t$, i.e. $\varepsilon_t = z_t'\gamma + u_t$, $\hat\gamma = \big[\sum_t z_t z_t'\big]^{-1}\sum_t z_t\varepsilon_t$. Define the fitted values for such a regression:

$$\eta_t = z_t'\hat\gamma = z_t'\Big[\sum_t z_t z_t'\Big]^{-1}\sum_t z_t\varepsilon_t.$$

Now consider the $R^2$ from this regression:

$$T R^2 = T\,\frac{\sum_t \eta_t^2}{\sum_t \varepsilon_t^2} = \frac{\eta'\eta}{\frac{1}{T}\,\varepsilon'\varepsilon} = \frac{\big[\sum_t z_t\varepsilon_t\big]'\big[\sum_t z_t z_t'\big]^{-1}\big[\sum_t z_t z_t'\big]\big[\sum_t z_t z_t'\big]^{-1}\big[\sum_t z_t\varepsilon_t\big]}{\hat\sigma_0^2} = LM.$$

Hence a valid LM statistic can always be obtained by regressing $\varepsilon_t(\hat\psi_0)$ on $z_t(\hat\psi_0)$ and calculating $LM = T R^2$. Then reject $H_0$ at the $\alpha$ level if $LM > \chi^2_\alpha(r)$.

Intuition: if $\hat\beta_0$ is close to $\hat\beta$, the $\varepsilon_t(\hat\beta_0)$ should not be forecastable.

Comparison between the Wald, LR and LM tests

All three tests are asymptotically equivalent. Warning: these are asymptotic distribution results, so caution should be used in small samples. In small samples (but there are exceptions):

1. In general the LR test is the best, in the sense that its finite-sample behavior most closely approximates its expected large-sample properties.
2. The Wald test is second best, and the LM procedure worst.

The Durbin-Watson Test

The Durbin-Watson test is the only test for which we have small sample properties. Unfortunately, the circumstances in which it is valid are so restricted that it is almost always inappropriate. The model:

$$y_t = x_t'\beta + u_t, \quad u_t = \phi u_{t-1} + \varepsilon_t, \quad \varepsilon_t \overset{iid}{\sim} N(0, \sigma^2).$$

We want to test $H_0: \phi = 0$ against $H_A: \phi > 0$.
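The $LM = TR^2$ recipe is short to implement. The sketch below uses the simplest special case, a linear model $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ (so $z_t = (1, x_t)'$) with the restriction $\beta_2 = 0$; the restricted MLE is then just $\hat\beta_{1,0} = \bar y$. The data and seed are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300
x = rng.normal(size=T)
y = 2.0 + 0.0 * x + rng.normal(size=T)    # restriction beta2 = 0 is true

# Restricted ML estimate under beta2 = 0: intercept only, so beta1_0 = mean(y).
eps0 = y - y.mean()                        # restricted residuals eps_t(beta_hat_0)
Z = np.column_stack([np.ones(T), x])       # z_t = dg/dbeta at the restricted estimate

# Auxiliary regression of eps0 on Z; LM = T * (uncentered) R^2.
gamma = np.linalg.lstsq(Z, eps0, rcond=None)[0]
eta = Z @ gamma                            # fitted values eta_t = z_t' gamma_hat
LM = float(T * (eta @ eta) / (eps0 @ eps0))
print(f"LM = {LM:.3f}  (chi^2_1 5% critical value: 3.841)")
```

Since $\eta'\eta \leq \varepsilon'\varepsilon$ for any least-squares fit, $LM$ is guaranteed to lie in $[0, T]$, and under the null it should typically fall below the $\chi^2_1$ critical value.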
Under the null, estimate the model by least squares and calculate the test statistic

$$d = \frac{\sum_{t=2}^T (\hat u_t - \hat u_{t-1})^2}{\sum_{t=1}^T \hat u_t^2} = \frac{\sum_{t=2}^T \hat u_t^2 + \sum_{t=2}^T \hat u_{t-1}^2 - 2\sum_{t=2}^T \hat u_t \hat u_{t-1}}{\sum_{t=1}^T \hat u_t^2}.$$

Note: $d \approx 2(1 - r)$, where $r$ is the simple correlation between $\hat u_t$ and $\hat u_{t-1}$, so $d$ lies in the interval $[0, 4]$.

Unfortunately, the exact distribution of $d$ depends on $X$. But $d$ is subject to an upper bound ($d_U$) and a lower bound ($d_L$) that depend on both the sample size and the number of regressors. We are testing against positive serial correlation, so we reject if $d$ is too small: if $d < d_L$, reject; if $d > d_U$, fail to reject; if $d_L < d < d_U$, the test is inconclusive.

Note: to be valid, (i) the regression must contain a constant, and (ii) all RHS variables must be generated independently of the errors.
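Computing $d$ itself is a one-liner once the OLS residuals are in hand. The sketch below simulates a regression with no serial correlation ($\phi = 0$, so $d$ should come out near 2); the data, seed, and coefficients are illustrative, and looking up $d_L$, $d_U$ still requires the published Durbin-Watson tables:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 100
x = rng.normal(size=T)
y = 1.0 + 0.5 * x + rng.normal(size=T)    # phi = 0: no serial correlation

X = np.column_stack([np.ones(T), x])      # regression includes a constant, as required
b = np.linalg.lstsq(X, y, rcond=None)[0]
u = y - X @ b                             # OLS residuals u_hat_t

# d = sum_{t=2}^T (u_t - u_{t-1})^2 / sum_{t=1}^T u_t^2, which lies in [0, 4].
d = float(np.sum(np.diff(u) ** 2) / np.sum(u ** 2))
print(f"d = {d:.3f}  (values near 2 indicate no first-order autocorrelation)")
```

Replacing the error term with an AR(1) process with $\phi > 0$ drags $d$ below 2, which is why the rejection region sits at small values of $d$.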