Lecture Note to Rice Chapter 8


ECON 430, HG, revised Nov 06

1  Random matrices

Let $Y_{ij}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, be random variables (r.v.'s). The matrix

$$Y = \begin{pmatrix} Y_{11} & Y_{12} & \cdots & Y_{1m} \\ Y_{21} & Y_{22} & \cdots & Y_{2m} \\ \vdots & & & \vdots \\ Y_{n1} & Y_{n2} & \cdots & Y_{nm} \end{pmatrix}$$

is called a random matrix (with a joint $nm$-dimensional distribution, $f(y_{11}, y_{12}, \ldots, y_{nm})$). The expected value of $Y$ is defined as

(1)  $E(Y) \overset{\text{def}}{=} \begin{pmatrix} E(Y_{11}) & E(Y_{12}) & \cdots & E(Y_{1m}) \\ E(Y_{21}) & E(Y_{22}) & \cdots & E(Y_{2m}) \\ \vdots & & & \vdots \\ E(Y_{n1}) & E(Y_{n2}) & \cdots & E(Y_{nm}) \end{pmatrix}$

The expectation satisfies the following rules (which follow directly from the definition (1) combined with the corresponding linear properties of the expectation in the scalar case):

i.  $E(AY + C) = A\,E(Y) + C$, where $A$, $C$ are any matrices of constants with dimensions compatible with $Y$ (i.e. $A \sim k \times n$ and $C \sim k \times m$, where $k$ is arbitrary).

ii.  $E(AYB + C) = A\,E(Y)\,B + C$, where $A$, $B$, $C$ are any constant matrices compatible with $Y$ in dimension, so that the products and sums are well defined.

iii.  $E(Y') = E(Y)'$, where $A'$ denotes the transposed matrix.

If $Y = (Y_1, \ldots, Y_n)'$ is an $n$-dimensional random vector, its expectation, $\mu = E(Y)$ (sometimes written $\mu_Y$), is therefore the vector of individual expectations,

$$\mu = E(Y) = \begin{pmatrix} E(Y_1) \\ \vdots \\ E(Y_n) \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix}$$

Let $\sigma_{ij} = E\big[(Y_i - \mu_i)(Y_j - \mu_j)\big] = \sigma_{ji}$ be the covariance between $Y_i$ and $Y_j$. In particular we have $\sigma_{ii} = E\big[(Y_i - \mu_i)^2\big] = \operatorname{var}(Y_i)$. The covariance matrix of $Y$ (denoted $\operatorname{cov}(Y)$) is defined as the matrix

$$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1n} \\ \vdots & & \vdots \\ \sigma_{n1} & \cdots & \sigma_{nn} \end{pmatrix} = \begin{pmatrix} \operatorname{var}(Y_1) & \cdots & \operatorname{cov}(Y_1, Y_n) \\ \vdots & & \vdots \\ \operatorname{cov}(Y_n, Y_1) & \cdots & \operatorname{var}(Y_n) \end{pmatrix}$$

which can be expressed as

$$\operatorname{cov}(Y) = E\big[(Y - \mu)(Y - \mu)'\big] = E\left[\begin{pmatrix} Y_1 - \mu_1 \\ \vdots \\ Y_n - \mu_n \end{pmatrix}(Y_1 - \mu_1, \ldots, Y_n - \mu_n)\right] = E\begin{pmatrix} (Y_1-\mu_1)^2 & \cdots & (Y_1-\mu_1)(Y_n-\mu_n) \\ \vdots & & \vdots \\ (Y_n-\mu_n)(Y_1-\mu_1) & \cdots & (Y_n-\mu_n)^2 \end{pmatrix} = \Sigma$$

Example 1. Suppose that $Y_1, Y_2, \ldots, Y_n$ are iid with expectation $E(Y_i) = \eta$ and $\operatorname{var}(Y_i) = \sigma^2$. Then the vector $Y' = (Y_1, \ldots, Y_n)$ has expectation

$$E(Y) = \begin{pmatrix} \eta \\ \vdots \\ \eta \end{pmatrix}$$
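The definition $\operatorname{cov}(Y) = E[(Y-\mu)(Y-\mu)']$ can be checked numerically by brute-force enumeration of a small discrete distribution. A minimal Python sketch (the joint pmf below is a made-up illustration, not data from the note):

```python
# Hypothetical discrete joint distribution of Y = (Y1, Y2):
# each key is an outcome (y1, y2), each value its probability.
pmf = {(0, 0): 0.1, (0, 2): 0.2, (1, 0): 0.3, (1, 2): 0.4}

def mean_vector(pmf):
    """mu_i = E(Y_i), computed by enumerating the joint pmf."""
    mu = [0.0, 0.0]
    for (y1, y2), p in pmf.items():
        mu[0] += p * y1
        mu[1] += p * y2
    return mu

def cov_matrix(pmf):
    """Sigma = E[(Y - mu)(Y - mu)'], element by element from the definition."""
    mu = mean_vector(pmf)
    sigma = [[0.0, 0.0], [0.0, 0.0]]
    for (y1, y2), p in pmf.items():
        d = (y1 - mu[0], y2 - mu[1])
        for i in range(2):
            for j in range(2):
                sigma[i][j] += p * d[i] * d[j]
    return sigma

mu = mean_vector(pmf)        # [0.7, 1.2]
sigma = cov_matrix(pmf)      # symmetric, with var(Y_i) on the diagonal
```

Note that the resulting matrix is automatically symmetric, with $\operatorname{var}(Y_i)$ on the main diagonal, exactly as in the display above.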

and covariance matrix (since $\sigma_{ij} = \operatorname{cov}(Y_i, Y_j) = 0$ for $i \neq j$):

$$\operatorname{cov}(Y) = \begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = \sigma^2 I_n$$

where $I_n$ is the $n$-dimensional identity matrix. (End of example.)

If $Y' = (Y_1, \ldots, Y_n)$ is a random vector, and $A$ a $p \times n$ constant matrix, we obtain from i.–iii. (and the fact that $(BC)' = C'B'$ for matrices $B$ and $C$):

(2)  $E(AY) = A\,E(Y) = A\mu$

and

(3)  $\operatorname{cov}(AY) = A\operatorname{cov}(Y)A' = A\Sigma A'$  (i.e. a $p \times p$ matrix)

which follows from

$$\operatorname{cov}(AY) = E\big[(AY - A\mu)(AY - A\mu)'\big] = E\big[A(Y - \mu)(Y - \mu)'A'\big] = A\,E\big[(Y - \mu)(Y - \mu)'\big]\,A' = A\Sigma A'$$

In particular, if $Z$ is a linear combination of $Y_1, \ldots, Y_n$, i.e. $Z = a_1 Y_1 + \cdots + a_n Y_n$, then

(4)  $\operatorname{var}(Z) = \operatorname{var}(a'Y) = a'\Sigma a$

where $a' = (a_1, \ldots, a_n)$ and $\Sigma = \operatorname{cov}(Y)$.

[Proof: Since $Z = (a_1, \ldots, a_n)\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = a'Y$ can be considered a $1 \times 1$ matrix, we must have $Z' = Z$, and, therefore, $\operatorname{cov}(Z) = \operatorname{var}(Z)$ (i.e., $\operatorname{cov}(Z) = E\big[(Z - E(Z))(Z - E(Z))'\big] = E\big[(Z - E(Z))^2\big] = \operatorname{var}(Z)$). We then see that (4) is a special case of (3) with $A = a'$.]
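The variance rule $\operatorname{var}(a'Y) = a'\Sigma a$ can likewise be verified by enumeration: compute $\operatorname{var}(Z)$ directly from the distribution of $Z$ and compare with the quadratic form. A small self-contained Python sketch (pmf and coefficients are made-up illustrations):

```python
# Hypothetical joint pmf of (Y1, Y2) and a linear combination Z = 2*Y1 - Y2.
pmf = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
a = (2.0, -1.0)

# Expectation operator over the joint pmf.
E = lambda g: sum(p * g(y) for y, p in pmf.items())

mu = (E(lambda y: y[0]), E(lambda y: y[1]))
sigma = [[E(lambda y: (y[i] - mu[i]) * (y[j] - mu[j])) for j in range(2)]
         for i in range(2)]

# var(Z) two ways: directly, and via the quadratic form a' Sigma a.
mu_Z = E(lambda y: a[0] * y[0] + a[1] * y[1])
var_Z_direct = E(lambda y: (a[0] * y[0] + a[1] * y[1] - mu_Z) ** 2)
var_Z_formula = sum(a[i] * sigma[i][j] * a[j]
                    for i in range(2) for j in range(2))
```

Both routes give the same number, which is the content of (4).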

Example 2 (Ordinary least squares (OLS)). Consider the standard multiple regression model with one response, $Y$, and $p$ explanatory variables,

(5)  $Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + u_i$,  for $i = 1, 2, \ldots, n$,

where, for simplicity, all $x_{ij}$ are considered fixed, non-random quantities, and the errors, $u_1, u_2, \ldots, u_n$, are assumed to be iid and normally distributed with expectation $E(u_i) = 0$ and $\operatorname{var}(u_i) = \sigma^2$. We can write (5) in matrix form as follows:

$$\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} \beta_0 + \beta_1 x_{11} + \cdots + \beta_p x_{1p} + u_1 \\ \beta_0 + \beta_1 x_{21} + \cdots + \beta_p x_{2p} + u_2 \\ \vdots \\ \beta_0 + \beta_1 x_{n1} + \cdots + \beta_p x_{np} + u_n \end{pmatrix} = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix} + \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$$

The three matrices on the right we denote by $X$, $\beta$, and $u$ respectively. The model can now be written as

(6)  $Y = X\beta + u$

where $X$ is the $n \times (p+1)$ (so-called) design matrix, $\beta$ the $(p+1) \times 1$ vector of regression coefficients, and $u$ the $n \times 1$ vector of errors. Since

$$E(u) = \begin{pmatrix} E(u_1) \\ \vdots \\ E(u_n) \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} = 0$$

(where $0$ denotes a vector of zeroes), we get from i. (noting that $X\beta$ is non-random)

(7)  $E(Y) = X\beta + E(u) = X\beta$

The covariance matrix of $Y$ becomes, since $Y - X\beta = u$, and using example 1,

(8)  $\Sigma_Y = \operatorname{cov}(Y) = E\big[(Y - X\beta)(Y - X\beta)'\big] = E\big[uu'\big] = \operatorname{cov}(u) = \sigma^2 I_n$

The OLS estimator, $\hat\beta$, of $\beta$ is obtained by minimizing the sum of squares

$$Q = \sum_{i=1}^n \big(Y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip}\big)^2$$

with respect to $\beta$. Differentiating $Q$ with respect to all the $\beta_j$'s, and setting the derivatives equal to 0, leads to the following system of equations that the $\hat\beta_j$'s must satisfy:

$$\begin{aligned}
n\hat\beta_0 + \Big(\sum_i x_{i1}\Big)\hat\beta_1 + \cdots + \Big(\sum_i x_{ip}\Big)\hat\beta_p &= \sum_i Y_i \\
\Big(\sum_i x_{i1}\Big)\hat\beta_0 + \Big(\sum_i x_{i1}^2\Big)\hat\beta_1 + \cdots + \Big(\sum_i x_{i1}x_{ip}\Big)\hat\beta_p &= \sum_i x_{i1} Y_i \\
&\ \,\vdots \\
\Big(\sum_i x_{ip}\Big)\hat\beta_0 + \Big(\sum_i x_{ip}x_{i1}\Big)\hat\beta_1 + \cdots + \Big(\sum_i x_{ip}^2\Big)\hat\beta_p &= \sum_i x_{ip} Y_i
\end{aligned}$$

Noting that the coefficients on the left side are exactly the elements in the $(p+1) \times (p+1)$ matrix $X'X$, and that the right side, written as a vector, simply is $X'Y$, we can write the system more compactly as

$$X'X\hat\beta = X'Y$$

Assuming that $X'X$ is non-singular (which can be shown to be the case if no single x-variable can be written exactly as a linear combination of the other x-variables, i.e., there is no exact collinearity between the explanatory variables), we obtain the solution (the OLS estimator)

(9)  $\hat\beta = (X'X)^{-1}X'Y$

It is now easy to prove that $\hat\beta$ is unbiased since, from i. and (7),

(10)  $E(\hat\beta) = E\big[(X'X)^{-1}X'Y\big] = (X'X)^{-1}X'E(Y) = (X'X)^{-1}X'X\beta = I_{p+1}\beta = \beta$

Writing $C = (X'X)^{-1}X'$, we have $\hat\beta = CY$, and obtain the covariance matrix from (3) and (8) [also using the rule that the transpose of an inverse square matrix is the inverse of the transpose, $(A^{-1})' = (A')^{-1}$, which is seen by taking the transpose of the equation $A^{-1}A = I$. Remember also that $AI = A$ for any matrix $A$, and that, if $c$ is a scalar, then $c$ as a factor can be taken outside a matrix product, $A(cB) = cAB$]:

$$\operatorname{cov}(\hat\beta) = \operatorname{cov}(CY) = C\,\Sigma_Y\,C' = C(\sigma^2 I)C' = \sigma^2 CC' = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}$$
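For the simple-regression case ($p = 1$) the normal equations are a $2 \times 2$ system that can be solved by hand. A minimal pure-Python sketch, with made-up data (not from the note), solving $X'X\hat\beta = X'Y$ by Cramer's rule:

```python
# Hypothetical small data set; the x-values are treated as fixed constants.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.1, 8.9]          # roughly y = 1 + 2x plus small errors

n = len(x)
# The elements of X'X and X'Y when the design matrix has rows (1, x_i):
Sx = sum(x)
Sxx = sum(xi * xi for xi in x)
Sy = sum(y)
Sxy = sum(xi * yi for xi, yi in zip(x, y))

# Normal equations:  n*b0 + Sx*b1 = Sy,   Sx*b0 + Sxx*b1 = Sxy.
# Solve the 2x2 system by Cramer's rule (X'X assumed non-singular).
det = n * Sxx - Sx * Sx
b0 = (Sy * Sxx - Sx * Sxy) / det
b1 = (n * Sxy - Sx * Sy) / det
```

With these numbers the fitted line is close to the generating line, as one would hope; for general $p$ one would solve the same system with a linear-algebra routine instead of Cramer's rule.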

Hence

(11)  $\operatorname{cov}(\hat\beta) = \sigma^2 (X'X)^{-1}$

(End of example 2.)

2  Multinormal distributions

We say that the vector $X' = (X_1, \ldots, X_n)$ is (multi)normally distributed with expectation $\mu = E(X)$ and covariance matrix $\Sigma = \operatorname{cov}(X)$ (written shortly $X \sim N(\mu, \Sigma)$), if the joint pdf is given by

(12)  $f(x_1, \ldots, x_n \mid \mu, \Sigma) = \dfrac{1}{(2\pi)^{n/2}\sqrt{\det(\Sigma)}}\; e^{-\frac12 (x-\mu)'\Sigma^{-1}(x-\mu)}$

where $x = (x_1, \ldots, x_n)'$ and $\det(\Sigma)$ means the determinant of $\Sigma$.

This distribution has a lot of convenient mathematical properties (see e.g. Greene, Econometric Analysis, chapter 3, for a summary), but here we only need the following:

(13)  If $X \sim N(\mu, \Sigma)$, $A$ is a $p \times n$ constant matrix ($p \leq n$), and $b$ a $p \times 1$ constant vector, then $Y = AX + b \sim N(E(Y), \operatorname{cov}(Y)) = N(A\mu + b,\; A\Sigma A')$.

[For a proof see e.g. Greene chapter 3.]

In particular, this shows that all marginal distributions are also normal. For example, the marginal distribution of $(X_1, X_2)$ is normal since

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = AX \quad\text{where}\quad A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \end{pmatrix}$$

which gives (check!)

(14)  $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left(E\begin{pmatrix} X_1 \\ X_2 \end{pmatrix},\; \operatorname{cov}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\right) = N\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\; \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\right)$

i.e. a bivariate normal distribution.

Exercise. Show that the pdf (14), as defined in (12), reduces to the bivariate normal density as defined in Example F in Rice section 3.3 (both editions). [Hint:
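The mechanics behind (14) — that the selection matrix $A$ picks out the relevant sub-vector of $\mu$ and sub-block of $\Sigma$ — can be checked with a few lines of Python (the numbers in $\mu$ and $\Sigma$ below are made-up illustrations):

```python
# Hypothetical mean vector and (symmetric) covariance matrix of a 3-dim X.
mu = [1.0, -2.0, 0.5]
Sigma = [[4.0, 1.0, 0.5],
         [1.0, 2.0, 0.3],
         [0.5, 0.3, 1.0]]

# A picks out the first two coordinates, as in the marginal-distribution argument.
A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

def matmul(P, Q):
    """Plain-Python matrix product."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

A_T = [list(row) for row in zip(*A)]          # transpose of A
mu_marg = [sum(A[i][k] * mu[k] for k in range(3)) for i in range(2)]   # A mu
cov_marg = matmul(matmul(A, Sigma), A_T)                               # A Sigma A'
```

The result is exactly the upper-left $2 \times 2$ block of $\Sigma$ and the first two entries of $\mu$, i.e. the parameters of the bivariate marginal in (14).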

Introduce the correlation, $\rho$, between $X_1$ and $X_2$, $\rho = \sigma_{12}/(\sigma_1\sigma_2)$, implying $\sigma_{12} = \sigma_1\sigma_2\rho$, and the determinant, $\det\left(\operatorname{cov}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\right) = \sigma_1^2\sigma_2^2 - \sigma_{12}^2 = \sigma_1^2\sigma_2^2(1 - \rho^2)$, etc.]

Example 3 (continuation of example 2). The error vector, $u$, in (6) has expectation $0$ and covariance matrix $\Sigma_u = \operatorname{cov}(u) = \sigma^2 I_n$. We see from (12) that saying that $u \sim N(0, \sigma^2 I_n)$ is the same as saying that $u_1, u_2, \ldots, u_n$ are iid and normally distributed with expectation $E(u_i) = 0$ and $\operatorname{var}(u_i) = \sigma^2$. In fact, we have for the determinant

$$\det(\Sigma_u) = \det(\sigma^2 I_n) = \det\begin{pmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{pmatrix} = (\sigma^2)^n$$

and the exponent in (12) reduces to

$$(u - E(u))'\,\Sigma_u^{-1}\,(u - E(u)) = u'\Big(\tfrac{1}{\sigma^2} I_n\Big)u = \tfrac{1}{\sigma^2}\,u'u = \tfrac{1}{\sigma^2}\sum_{i=1}^n u_i^2$$

Substituting in (12) shows that the joint distribution (12) reduces to the product of $n$ one-dimensional $N(0, \sigma^2)$ densities, as the iid statement would imply.

By (13), (7), and (8) we obtain that $Y$ is normally distributed, $Y \sim N(E(Y), \operatorname{cov}(Y)) = N(X\beta, \sigma^2 I_n)$, and, by (13) again, that $\hat\beta = (X'X)^{-1}X'Y$ is normally distributed,

$$\hat\beta \sim N(E(\hat\beta), \operatorname{cov}(\hat\beta)) = N(\beta,\; \sigma^2 (X'X)^{-1})$$

(End of example 3.)
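The factorization claim in example 3 — that the $n$-dimensional $N(0, \sigma^2 I_n)$ density (12) equals the product of $n$ univariate $N(0, \sigma^2)$ densities — can be confirmed numerically at any evaluation point. A short Python sketch (the point $u$ and variance are arbitrary made-up numbers):

```python
import math

sigma2 = 2.0
u = [0.3, -1.2, 0.8]       # an arbitrary evaluation point
n = len(u)

# Formula (12) with mu = 0 and Sigma = sigma2 * I:
# det(Sigma) = sigma2**n and the exponent is u'u / sigma2.
quad = sum(ui * ui for ui in u) / sigma2
f_joint = math.exp(-0.5 * quad) / ((2 * math.pi) ** (n / 2)
                                   * math.sqrt(sigma2 ** n))

# Product of n one-dimensional N(0, sigma2) densities.
def phi(t):
    return math.exp(-t * t / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

f_prod = math.prod(phi(ui) for ui in u)
```

Both expressions evaluate to the same number, which is the algebraic identity behind the "iid is the same as $N(0, \sigma^2 I_n)$" statement.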

3  On the asymptotic distribution of mle estimators (the multi-parameter case)

In this section we will only describe how to determine the asymptotic distribution of the mle estimator in the case where there are several unknown parameters in the model, without going into details of derivations and proofs. A good summary of the theory can be found in chapter 4 of Greene's book, Econometric Analysis. See also Rice at the end of section 8.5 (both editions).

Suppose that $X_1, X_2, \ldots, X_n$ are iid with $X_i \sim f(x \mid \theta)$ (pdf), where $\theta' = (\theta_1, \theta_2, \ldots, \theta_r)$ is an $r$-dimensional vector of unknown parameters. Then the joint pdf is $\prod_{i=1}^n f(x_i \mid \theta)$ and the log likelihood is

$$l(\theta) = \ln\prod_{i=1}^n f(x_i \mid \theta) = \sum_{i=1}^n \ln f(x_i \mid \theta)$$

The mle estimator, $\hat\theta$, solves the $r$ equations

$$\frac{\partial}{\partial\theta_j}\sum_{i=1}^n \ln f(x_i \mid \theta) = 0, \qquad j = 1, 2, \ldots, r$$

In order to define the $r \times r$ Fisher information matrix that is needed in the asymptotic distribution of $\hat\theta$, we introduce

$$m_{ij}(\theta) = -E\left[\frac{\partial^2 \ln f(X_1 \mid \theta)}{\partial\theta_i\,\partial\theta_j}\right], \qquad i, j = 1, 2, \ldots, r$$

Then the Fisher information matrix for one observation is defined as

$$I(\theta) = \begin{pmatrix} m_{11}(\theta) & \cdots & m_{1r}(\theta) \\ \vdots & & \vdots \\ m_{r1}(\theta) & \cdots & m_{rr}(\theta) \end{pmatrix}$$

Under regularity conditions similar to the one-parameter case (see Greene for details), we have that the mle satisfies

$$\sqrt{n}\,(\hat\theta - \theta) \xrightarrow{D} N\big(0,\; I(\theta)^{-1}\big)$$

The definition of convergence in distribution for random vectors is similar to, but slightly more technical than, the definition for the one-dimensional case, and we skip the details

here (see Greene for a precise definition). However, the interpretation of the result is the same as in the one-dimensional case, i.e., that for large $n$, approximately,

$$\hat\theta \sim N\Big(\theta,\; \tfrac{1}{n} I(\theta)^{-1}\Big)$$

Hence we can say that $\hat\theta$ is asymptotically unbiased with asymptotic covariance matrix $\frac{1}{n}I(\theta)^{-1}$. This matrix is unknown since $\theta$ is unknown, but can be consistently estimated by replacing $\theta$ by $\hat\theta$ (or any other consistent estimator of $\theta$). [That $\hat\theta$ is consistent means simply that $\hat\theta_j \xrightarrow{P} \theta_j$ for all $j = 1, 2, \ldots, r$.]

A generalization of Slutsky's lemma to the multivariate case (details omitted) now allows us to conclude that, for large $n$, approximately,

(15)  $\hat\theta \sim N\Big(\theta,\; \tfrac{1}{n} I(\hat\theta)^{-1}\Big)$

which is the important result that you should know. Using (13) we also have, approximately,

(16)  $A\hat\theta \sim N\Big(A\theta,\; \tfrac{1}{n}\,A\,I(\hat\theta)^{-1}A'\Big)$

for any constant $p \times r$ matrix $A$. From this we get the following: Let $k_{ij}(\theta)$ denote element $(i,j)$ in $I(\theta)^{-1}$. Then the estimated asymptotic variance of $\hat\theta_i$ is the $i$-th element on the main diagonal of the estimated covariance matrix, i.e. $k_{ii}(\hat\theta)/n$.

[This follows from (16). In fact, let $a' = (0, \ldots, 0, 1, 0, \ldots, 0)$ with the 1 in position $i$ and zeroes elsewhere. Then, from (16), approximately,

$$\hat\theta_i = a'\hat\theta \sim N\Big(a'\theta,\; \tfrac{1}{n}\,a'I(\hat\theta)^{-1}a\Big) = N\Big(\theta_i,\; \tfrac{k_{ii}(\hat\theta)}{n}\Big)$$ ]

Hence, we obtain an approximate $1-\alpha$ CI for $\theta_i$:

$$\hat\theta_i \pm z_{\alpha/2}\sqrt{\frac{k_{ii}(\hat\theta)}{n}}$$

where $z_{\alpha/2}$ is the upper $\alpha/2$-point in $N(0,1)$.

Example 4. Assume we want a CI for the transformed parameter $\eta = \theta_1 - \theta_2$. This we obtain from (16): Let $b' = (1, -1, 0, \ldots, 0)$. Then, by (16), approximately,

$$\hat\eta = \hat\theta_1 - \hat\theta_2 = b'\hat\theta \sim N\Big(b'\theta,\; \tfrac{1}{n}\,b'I(\hat\theta)^{-1}b\Big) = N\Big(\theta_1 - \theta_2,\; \tfrac{1}{n}\big(k_{11}(\hat\theta) + k_{22}(\hat\theta) - 2k_{12}(\hat\theta)\big)\Big)$$

which leads to the approximate $1-\alpha$ CI for $\theta_1 - \theta_2$:

$$\hat\theta_1 - \hat\theta_2 \pm z_{\alpha/2}\sqrt{\frac{k_{11}(\hat\theta) + k_{22}(\hat\theta) - 2k_{12}(\hat\theta)}{n}}$$

[Note that all covariance matrices are symmetric. Hence $k_{12}(\hat\theta) = k_{21}(\hat\theta)$.]

(End of example 4.)

Example 5 (on example C in Rice section 8.5 (both editions) — precipitation data). Let $X_i$ be the amount of precipitation for rainstorm no. $i$, $i = 1, 2, \ldots, n$ ($n = 227$ observations). Model: $X_1, X_2, \ldots, X_n$ are iid with $X_i \sim \Gamma(\alpha, \lambda)$. The joint distribution is

$$f(x_1, \ldots, x_n \mid \alpha, \lambda) = \prod_{i=1}^n \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x_i^{\alpha-1}e^{-\lambda x_i} = \frac{\lambda^{n\alpha}}{\Gamma(\alpha)^n}\,(x_1 x_2 \cdots x_n)^{\alpha-1}\,e^{-\lambda\sum_i x_i}$$

The log likelihood is

(17)  $l(\alpha, \lambda) = n\alpha\ln\lambda + (\alpha - 1)\sum_i \ln x_i - \lambda\sum_i x_i - n\ln\Gamma(\alpha)$

The first derivatives of $l$ are

$$\frac{\partial l}{\partial\alpha} = n\ln\lambda + \sum_i \ln x_i - n\,\frac{\partial}{\partial\alpha}\ln\Gamma(\alpha), \qquad \frac{\partial l}{\partial\lambda} = \frac{n\alpha}{\lambda} - \sum_i x_i$$

Setting the derivatives equal to zero and solving with respect to $\alpha$ and $\lambda$ gives the mle estimators $\hat\alpha$ and $\hat\lambda$. [Note: There are no explicit formulas for the solution; it must be found by numerical iteration. For example, Excel works well in this case via the Solver

module: choose two cells for the arguments $\alpha$ and $\lambda$, with start values e.g. at the moment estimates, and then a third cell for the function (17). Then use Solver to maximize (17). This can also be done in STATA by the ml command, but that is slightly more involved.]

Using his program, Rice obtained the mle estimates

$$\hat\alpha = 0.441 \qquad\text{and}\qquad \hat\lambda = 1.96$$

We want approximate 90% CI's for $\alpha$ and $\lambda$ based on the asymptotic normal distribution of $\hat\alpha$ and $\hat\lambda$. In order to calculate the asymptotic standard errors we need the so-called di- and trigamma functions:

Digamma function:  $\psi(\alpha) = \dfrac{\partial}{\partial\alpha}\ln\Gamma(\alpha)$

Trigamma function:  $\psi'(\alpha) = \dfrac{\partial^2}{\partial\alpha^2}\ln\Gamma(\alpha)$

Both functions can be calculated in STATA (under the names digamma and trigamma). We need the Fisher information matrix. From

$$\ln f(X \mid \alpha, \lambda) = \alpha\ln\lambda - \ln\Gamma(\alpha) + (\alpha - 1)\ln X - \lambda X$$

we get

$$\frac{\partial\ln f}{\partial\alpha} = \ln\lambda - \psi(\alpha) + \ln X \qquad\text{and}\qquad \frac{\partial\ln f}{\partial\lambda} = \frac{\alpha}{\lambda} - X$$

Hence

$$\frac{\partial^2\ln f}{\partial\alpha^2} = -\psi'(\alpha)\ \text{(trigamma)}, \qquad \frac{\partial^2\ln f}{\partial\alpha\,\partial\lambda} = \frac{\partial^2\ln f}{\partial\lambda\,\partial\alpha} = \frac{1}{\lambda}, \qquad \frac{\partial^2\ln f}{\partial\lambda^2} = -\frac{\alpha}{\lambda^2}$$

Hence the Fisher information matrix for one observation is
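The same maximization that the note does with Excel's Solver or STATA's ml command can be sketched in Python. Since $\partial l/\partial\lambda = 0$ gives $\hat\lambda(\alpha) = n\alpha/\sum_i x_i$ for any fixed $\alpha$, one can profile $\lambda$ out of (17) and search over $\alpha$ alone. The data below are simulated (not Rice's precipitation series), so the estimates only need to land near the true parameters:

```python
import math
import random

# Simulated gamma data with known parameters (illustration only).
random.seed(1)
alpha_true, lam_true = 0.5, 2.0
# random.gammavariate takes a *scale* parameter beta = 1/lambda.
xs = [random.gammavariate(alpha_true, 1.0 / lam_true) for _ in range(2000)]

n = len(xs)
Sx = sum(xs)
Slog = sum(math.log(x) for x in xs)

def profile_loglik(alpha):
    """Log likelihood (17) with lambda profiled out: lambda_hat(alpha) = n*alpha/Sx."""
    lam = n * alpha / Sx
    return (n * alpha * math.log(lam) + (alpha - 1) * Slog
            - lam * Sx - n * math.lgamma(alpha))

# Crude but robust one-dimensional maximization: golden-section search.
lo, hi = 0.01, 5.0
g = (math.sqrt(5) - 1) / 2
for _ in range(200):
    m1 = hi - g * (hi - lo)
    m2 = lo + g * (hi - lo)
    if profile_loglik(m1) < profile_loglik(m2):
        lo = m1
    else:
        hi = m2

alpha_hat = (lo + hi) / 2
lam_hat = n * alpha_hat / Sx
```

With 2000 observations the estimates should sit within a few asymptotic standard errors of the true values; in practice one would of course use a library optimizer rather than this hand-rolled search.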

$$I(\alpha, \lambda) = -E\begin{pmatrix} -\psi'(\alpha) & \frac{1}{\lambda} \\[2pt] \frac{1}{\lambda} & -\frac{\alpha}{\lambda^2} \end{pmatrix} = \begin{pmatrix} \psi'(\alpha) & -\frac{1}{\lambda} \\[2pt] -\frac{1}{\lambda} & \frac{\alpha}{\lambda^2} \end{pmatrix}$$

The inverse of a symmetric $2 \times 2$ matrix is

$$\begin{pmatrix} a & c \\ c & b \end{pmatrix}^{-1} = \frac{1}{ab - c^2}\begin{pmatrix} b & -c \\ -c & a \end{pmatrix}$$

Hence

$$I(\alpha, \lambda)^{-1} = \frac{\lambda^2}{\alpha\psi'(\alpha) - 1}\begin{pmatrix} \frac{\alpha}{\lambda^2} & \frac{1}{\lambda} \\[2pt] \frac{1}{\lambda} & \psi'(\alpha) \end{pmatrix} = \begin{pmatrix} \frac{\alpha}{\alpha\psi'(\alpha) - 1} & \frac{\lambda}{\alpha\psi'(\alpha) - 1} \\[2pt] \frac{\lambda}{\alpha\psi'(\alpha) - 1} & \frac{\lambda^2\psi'(\alpha)}{\alpha\psi'(\alpha) - 1} \end{pmatrix}$$

We obtain an estimate of this by substituting the mle, $\hat\alpha = 0.441$ and $\hat\lambda = 1.96$, for $\alpha$ and $\lambda$:

$$I(\hat\alpha, \hat\lambda)^{-1} \approx \begin{pmatrix} 0.2590 & 1.152 \\ 1.152 & 13.84 \end{pmatrix}$$

Here we found $\psi'(\hat\alpha) = \psi'(0.441) \approx 6.128$ from STATA by the command: di trigamma(0.441).

From the theory we have that, approximately,

$$\begin{pmatrix} \hat\alpha \\ \hat\lambda \end{pmatrix} \sim N\left(\begin{pmatrix} \alpha \\ \lambda \end{pmatrix},\; C\right), \qquad\text{where the asymptotic covariance matrix is}$$

$$C = \frac{1}{n}\,I(\hat\alpha, \hat\lambda)^{-1} \approx \begin{pmatrix} 0.001141 & 0.005075 \\ 0.005075 & 0.060950 \end{pmatrix}$$

Hence the asymptotic standard errors are

$$\operatorname{se}(\hat\alpha) = \sqrt{0.001141} = 0.03378 \qquad\text{and}\qquad \operatorname{se}(\hat\lambda) = \sqrt{0.060950} = 0.2469$$

According to the theory we then obtain the approximate 90% CI's for $\alpha$ and $\lambda$:

$$\hat\alpha \pm 1.64\,\operatorname{se}(\hat\alpha) = 0.441 \pm (1.64)(0.03378) = [0.386,\ 0.496]$$

$$\hat\lambda \pm 1.64\,\operatorname{se}(\hat\lambda) = 1.96 \pm (1.64)(0.247) = [1.55,\ 2.37]$$

Rice (example E, section 8.5.3) obtains approximate 90% CI's by the parametric bootstrap method:

$$\alpha:\ [0.404,\ 0.53] \qquad\qquad \lambda:\ [1.46,\ 2.3]$$

The difference between the asymptotic intervals and the bootstrap intervals does not appear to be substantial. With as many as 227 observations it is to be expected that the asymptotic theory should work well.
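The standard-error and CI calculation in example 5 can be reproduced without STATA; all that is needed is a trigamma evaluation. A self-contained Python sketch, using the standard recurrence $\psi'(x) = \psi'(x+1) + 1/x^2$ together with the usual asymptotic series for large arguments (the series coefficients are the standard Bernoulli-number terms; $z = 1.645$ replaces the note's rounded 1.64):

```python
import math

def trigamma(x):
    """psi'(x) via the recurrence psi'(x) = psi'(x+1) + 1/x^2
    plus the asymptotic expansion once the argument is >= 6."""
    s = 0.0
    while x < 6.0:
        s += 1.0 / (x * x)
        x += 1.0
    s += (1.0 / x + 1.0 / (2 * x * x) + 1.0 / (6 * x**3)
          - 1.0 / (30 * x**5) + 1.0 / (42 * x**7))
    return s

# Rice's mle estimates for the precipitation data (n = 227 rainstorms).
n, a, lam = 227, 0.441, 1.96

d = a * trigamma(a) - 1.0                 # common factor alpha*psi'(alpha) - 1
var_a = a / d / n                         # [I^{-1}]_11 / n
var_lam = lam**2 * trigamma(a) / d / n    # [I^{-1}]_22 / n
se_a = math.sqrt(var_a)
se_lam = math.sqrt(var_lam)

z = 1.645                                 # upper 5% point of N(0,1), 90% CI
ci_a = (a - z * se_a, a + z * se_a)
ci_lam = (lam - z * se_lam, lam + z * se_lam)
```

The resulting standard errors (about 0.034 and 0.247) and intervals match the note's hand calculation up to rounding.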