ECON 4130 HG, revised Nov 06

Lecture note to Rice chapter 8

1  Random matrices

Let Y_ij, i = 1, 2, …, m, j = 1, 2, …, n, be random variables (r.v.'s). The matrix

    Y = [ Y_11  Y_12  …  Y_1n ]
        [ Y_21  Y_22  …  Y_2n ]
        [   ⋮     ⋮          ⋮ ]
        [ Y_m1  Y_m2  …  Y_mn ]

is called a random matrix (with a joint mn-dimensional distribution, f(y_11, y_12, …, y_mn)). The expected value of Y is defined as

(1)  E(Y) =def [ E(Y_11)  E(Y_12)  …  E(Y_1n) ]
               [ E(Y_21)  E(Y_22)  …  E(Y_2n) ]
               [    ⋮        ⋮            ⋮   ]
               [ E(Y_m1)  E(Y_m2)  …  E(Y_mn) ]

The expectation satisfies the following rules (which follow directly from the definition (1) combined with the corresponding linear properties of the expectation in the scalar case):

1. E(AY + C) = A E(Y) + C, where A, C are any matrices of constants with dimensions compatible with Y (i.e. A ~ k×m and C ~ k×n, where k is arbitrary).

2. E(AYB + C) = A E(Y) B + C, where A, B, C are any constant matrices compatible with Y in dimension, so that the products and sums are well defined.

3. E(Y') = [E(Y)]', where A' denotes the transposed matrix of A.
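The linearity rule E(AYB + C) = A E(Y) B + C is easy to check numerically. The following sketch (made-up numbers, numpy assumed available; not part of the original note) represents a random matrix Y taking two equally likely values, so that every expectation can be computed exactly:

```python
import numpy as np

# A random 2x2 matrix Y taking two equally likely values Y1 and Y2
# (made-up numbers), so that expectations can be computed exactly.
Y1 = np.array([[1.0, 0.0], [2.0, 3.0]])
Y2 = np.array([[3.0, 4.0], [0.0, 1.0]])
EY = 0.5 * Y1 + 0.5 * Y2              # E(Y), elementwise

# Constant matrices A (2x2), B (2x1), C (2x1), as in the rule
A = np.array([[1.0, 2.0], [0.0, -1.0]])
B = np.array([[0.5], [1.5]])
C = np.array([[1.0], [2.0]])

# E(AYB + C), computed directly from the two-point distribution ...
lhs = 0.5 * (A @ Y1 @ B + C) + 0.5 * (A @ Y2 @ B + C)
# ... equals A E(Y) B + C, as the rule claims
rhs = A @ EY @ B + C
print(lhs, rhs)
```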
If Y = (Y_1, …, Y_n)' is an n-dimensional random vector, its expectation, μ = E(Y) (sometimes written μ_Y), is therefore the vector of individual expectations,

    μ = E(Y) = ( E(Y_1), E(Y_2), …, E(Y_n) )' = ( μ_1, μ_2, …, μ_n )'

Let σ_ij = E[(Y_i − μ_i)(Y_j − μ_j)] = σ_ji be the covariance between Y_i and Y_j. In particular we have σ_ii = E[(Y_i − μ_i)²] = var(Y_i). The covariance matrix of Y (denoted cov(Y)) is defined as the matrix

    Σ = [ σ_11 … σ_1n ]   [  var(Y_1)    …  cov(Y_1, Y_n) ]
        [  ⋮        ⋮  ] = [     ⋮                 ⋮       ]
        [ σ_n1 … σ_nn ]   [ cov(Y_n, Y_1) …   var(Y_n)    ]

which can be expressed as

(2)  cov(Y) = E[(Y − μ)(Y − μ)'] = E[ ( Y_1 − μ_1 )                            ]
                                     [ (    ⋮     ) (Y_1 − μ_1, …, Y_n − μ_n) ]
                                     [ ( Y_n − μ_n )                           ]

            = [ E(Y_1 − μ_1)²          …  E(Y_1 − μ_1)(Y_n − μ_n) ]   [ σ_11 … σ_1n ]
              [        ⋮                          ⋮               ] = [  ⋮        ⋮  ] = Σ
              [ E(Y_n − μ_n)(Y_1 − μ_1) …  E(Y_n − μ_n)²          ]   [ σ_n1 … σ_nn ]

Example 1. Suppose that Y_1, Y_2, …, Y_n are iid with expectation E(Y_i) = η and var(Y_i) = σ². Then the vector Y' = (Y_1, …, Y_n) has expectation

    E(Y) = ( η, η, …, η )'

and covariance matrix (since σ_ij = cov(Y_i, Y_j) = 0 for i ≠ j):

    cov(Y) = [ σ²  0  …  0 ]
             [ 0  σ²  …  0 ] = σ² I_n
             [ ⋮          ⋮ ]
             [ 0   0  … σ² ]

where I_n is the n-dimensional identity matrix. (End of example 1.)

If Y' = (Y_1, …, Y_n) is a random vector, and A a p×n constant matrix, we obtain from rules 1.–2. (and the fact that (BC)' = C'B' for matrices B and C):

    E(AY) = A E(Y) = Aμ

and

(3)  cov(AY) = A cov(Y) A' = AΣA'  (i.e., a p×p matrix)

which follows from

    cov(AY) = E[(AY − Aμ)(AY − Aμ)'] = E[A(Y − μ)(Y − μ)'A'] = A E[(Y − μ)(Y − μ)'] A' = AΣA'

In particular, if Z is a linear combination of Y_1, …, Y_n, i.e. Z = a_1 Y_1 + … + a_n Y_n, then

(4)  var(Z) = var(a'Y) = a'Σa

where a' = (a_1, …, a_n) and Σ = cov(Y). [Proof: Since Z = (a_1, …, a_n)(Y_1, …, Y_n)' = a'Y can be considered a 1×1 matrix, we must have Z' = Z, and, therefore, cov(Z) = var(Z) (i.e., cov(Z) = E[(Z − E(Z))(Z − E(Z))'] = E[(Z − E(Z))²] = var(Z)). We then see that (4) is a special case of (3) with A = a'.]
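Rules (3) and (4) can also be checked numerically. A minimal sketch (made-up covariance matrix Σ and matrix A, numpy assumed available; not part of the original note):

```python
import numpy as np

# Made-up 3x3 covariance matrix Sigma (symmetric) and a 2x3 constant matrix A
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
A = np.array([[1.0, -1.0, 0.0],
              [2.0,  1.0, 1.0]])

# Rule (3): cov(AY) = A Sigma A'
cov_AY = A @ Sigma @ A.T

# Rule (4): var(a'Y) = a' Sigma a for a single coefficient vector a
a = np.array([1.0, -1.0, 0.0])    # chosen equal to the first row of A
var_Z = a @ Sigma @ a             # a scalar

# Since a is the first row of A, var_Z must equal the (1,1) element of cov(AY)
print(var_Z, cov_AY[0, 0])
```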
Example 2 (Ordinary least squares (OLS)). Consider the standard multiple regression model with one response, Y, and p explanatory variables,

(5)  Y_i = β_0 + β_1 x_1i + β_2 x_2i + … + β_p x_pi + u_i   for i = 1, 2, …, n

where, for simplicity, all x_ji are considered fixed, non-random quantities, and the errors u_1, u_2, …, u_n are assumed to be iid and normally distributed with expectation E(u_i) = 0 and var(u_i) = σ². We can write (5) in matrix form as follows:

    Y = [ Y_1 ]   [ β_0 + β_1 x_11 + … + β_p x_p1 + u_1 ]   [ 1  x_11 … x_p1 ] [ β_0 ]   [ u_1 ]
        [ Y_2 ] = [ β_0 + β_1 x_12 + … + β_p x_p2 + u_2 ] = [ 1  x_12 … x_p2 ] [ β_1 ] + [ u_2 ]
        [  ⋮  ]   [                 ⋮                    ]   [ ⋮            ⋮ ] [  ⋮  ]   [  ⋮  ]
        [ Y_n ]   [ β_0 + β_1 x_1n + … + β_p x_pn + u_n ]   [ 1  x_1n … x_pn ] [ β_p ]   [ u_n ]

The three matrices on the right we denote by X, β, and u respectively. The model can now be written as

(6)  Y = Xβ + u

where X is the n × (p+1) (so-called) design matrix, β the (p+1) × 1 vector of regression coefficients, and u the n × 1 vector of errors. Since

    E(u) = ( E(u_1), E(u_2), …, E(u_n) )' = ( 0, 0, …, 0 )' = 0

(where 0 denotes a vector of zeroes), we get from rule 1. (noting that Xβ is non-random)

(7)  E(Y) = Xβ + E(u) = Xβ

The covariance matrix of Y becomes, since Y − Xβ = u, and using example 1,

(8)  Σ_Y = cov(Y) = E[(Y − Xβ)(Y − Xβ)'] = E[uu'] = cov(u) = σ² I_n

The OLS estimator, β̂, of β is obtained by minimizing the sum of squares

    Q = Σ_{i=1}^n ( Y_i − β_0 − β_1 x_1i − … − β_p x_pi )²

with respect to β. Differentiating Q with respect to all the β_j's, and setting the derivatives equal to 0, leads to the following system of equations that the β̂_j's must satisfy:

    n β̂_0        + (Σ_i x_1i) β̂_1     + … + (Σ_i x_pi) β̂_p     = Σ_i Y_i
    (Σ_i x_1i) β̂_0 + (Σ_i x_1i²) β̂_1   + … + (Σ_i x_1i x_pi) β̂_p = Σ_i x_1i Y_i
        ⋮
    (Σ_i x_pi) β̂_0 + (Σ_i x_pi x_1i) β̂_1 + … + (Σ_i x_pi²) β̂_p   = Σ_i x_pi Y_i

Noting that the coefficients on the left side are exactly the elements of the (p+1) × (p+1) matrix X'X, and that the right side, written as a vector, simply is X'Y, we can write the system more compactly as

    X'X β̂ = X'Y

Assuming that X'X is non-singular (which can be shown to be the case if no single x-variable can be written exactly as a linear combination of the other x-variables, i.e., there is no exact collinearity between the explanatory variables), we obtain the solution (the OLS estimator)

(9)  β̂ = (X'X)⁻¹ X'Y

It is now easy to prove that β̂ is unbiased since, from rule 1. and (7),

(10)  E(β̂) = E[(X'X)⁻¹X'Y] = (X'X)⁻¹X' E(Y) = (X'X)⁻¹X'Xβ = I_{p+1} β = β

Writing C = (X'X)⁻¹X', we have β̂ = CY, and obtain the covariance matrix from (3) and (8) [also using the rule that the transpose of the inverse of a square matrix is the inverse of the transpose, (A⁻¹)' = (A')⁻¹, which is seen by taking the transpose of the equation A⁻¹A = I. Remember also that AI = A for any p×n matrix A, and that, if c is a scalar, then c as a factor can be taken outside a matrix product, A(cB) = cAB.]:

    cov(β̂) = cov(CY) = C Σ_Y C' = C (σ² I_n) C' = σ² CC' = σ² (X'X)⁻¹ X'X (X'X)⁻¹
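As a numerical illustration of (9) and the covariance calculation, here is a small simulation sketch (made-up data, numpy assumed available; not part of the original note). The closed-form estimator is checked against numpy's least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate model (5)-(6) with n = 50 observations and p = 2 regressors (made-up values)
n = 50
x1 = rng.uniform(0.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
X = np.column_stack([np.ones(n), x1, x2])   # n x (p+1) design matrix
beta = np.array([1.0, 2.0, -0.5])           # true coefficients, chosen arbitrarily
sigma = 0.3
u = rng.normal(0.0, sigma, n)               # iid N(0, sigma^2) errors
Y = X @ beta + u

# OLS estimator (9): beta_hat = (X'X)^{-1} X'Y, computed via a linear solve
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ Y)

# Estimated covariance matrix sigma^2 (X'X)^{-1}, with sigma^2 replaced by the
# usual residual-based estimator
resid = Y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])
cov_beta_hat = sigma2_hat * np.linalg.inv(XtX)

print(beta_hat)
```

Averaging beta_hat over many simulated samples would illustrate the unbiasedness result (10).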
Hence

(11)  cov(β̂) = σ² (X'X)⁻¹

(End of example 2.)

2  Multinormal distributions

We say that the vector X' = (X_1, …, X_n) is (multi)normally distributed with expectation μ = E(X) and covariance matrix Σ = cov(X) (written shortly X ~ N(μ, Σ)), if the joint pdf is given by

(12)  f(x_1, …, x_n | μ, Σ) = 1 / ( (2π)^{n/2} √(det(Σ)) ) · e^{ −(1/2)(x − μ)' Σ⁻¹ (x − μ) }

where x = (x_1, …, x_n)' and det(Σ) means the determinant of Σ. This distribution has a lot of convenient mathematical properties (see e.g. Greene, Econometric Analysis, chapter 3, for a summary), but here we only need the following:

(13)  If X ~ N(μ, Σ), A is a p×n constant matrix (p ≤ n), and b a p×1 constant vector, then

      Y = AX + b ~ N( E(Y), cov(Y) ) = N( Aμ + b, AΣA' )

[For a proof see e.g. Greene chapter 3.] In particular, this shows that all marginal distributions are also normal. For example, the marginal distribution of (X_1, X_2) is normal since

    ( X_1, X_2 )' = AX   where   A = [ 1  0  0  …  0 ]
                                     [ 0  1  0  …  0 ]

which gives (check!)

(14)  ( X_1 ) ~ N( E( X_1 ), cov( X_1 ) ) = N( ( μ_1 ), [ σ_11  σ_12 ] )
      ( X_2 )      ( X_2 )      ( X_2 )        ( μ_2 )  [ σ_21  σ_22 ]

i.e. a bivariate normal distribution.

Exercise 1. Show that the pdf of (14), as defined in (12), reduces to the bivariate normal density as defined in Example F in Rice section 3.3 (both editions). [Hint: Introduce the correlation, ρ, between X_1 and X_2, ρ = σ_12/(σ_1 σ_2), implying σ_12 = σ_1 σ_2 ρ, and the determinant det(cov(X_1, X_2)') = σ_1²σ_2² − σ_12² = σ_1²σ_2²(1 − ρ²), etc.]

Example 3 (continuation of example 2). The error vector, u, in (6) has expectation 0 and covariance Σ_u = cov(u) = σ² I_n. We see from (12) that saying that u ~ N(0, σ² I_n) is the same as saying that u_1, u_2, …, u_n are iid and normally distributed with expectation E(u_i) = 0 and var(u_i) = σ². In fact, we have the determinant

    det(Σ_u) = det(σ² I_n) = det [ σ²  0  …  0 ]
                                 [ 0  σ²  …  0 ] = (σ²)^n
                                 [ ⋮          ⋮ ]
                                 [ 0   0  … σ² ]

and the exponent in (12) reduces to

    (u − E(u))' Σ_u⁻¹ (u − E(u)) = u' (σ² I_n)⁻¹ u = (1/σ²) u' I_n u = (1/σ²) u'u = (1/σ²) Σ_i u_i²

Substituting in (12) shows that the joint distribution (12) reduces to the product of n one-dimensional N(0, σ²) densities, as the iid statement would imply. By (13), (7), and (8) we obtain that Y is normally distributed,

    Y ~ N( E(Y), cov(Y) ) = N( Xβ, σ² I_n )

and, by (13) again, that β̂ = (X'X)⁻¹X'Y is normally distributed,

    β̂ ~ N( E(β̂), cov(β̂) ) = N( β, σ² (X'X)⁻¹ )

(End of example 3.)
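The equivalence asked for in the exercise above (the general density (12) restricted to two coordinates versus the bivariate normal density in Rice) can also be checked numerically by evaluating both densities at a point. A sketch with made-up parameter values (numpy assumed available; not part of the original note):

```python
import numpy as np

# General multinormal density, written directly from definition (12)
def mvn_pdf(x, mu, Sigma):
    n = len(mu)
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)          # (x - mu)' Sigma^{-1} (x - mu)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm

# Bivariate normal density parametrized by (mu1, mu2, sigma1, sigma2, rho),
# as in Rice section 3.3
def bvn_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    q = (z1**2 - 2 * rho * z1 * z2 + z2**2) / (1 - rho**2)
    return np.exp(-0.5 * q) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

# Check that the two agree, using sigma_12 = sigma1*sigma2*rho (made-up values)
mu = np.array([1.0, -2.0])
s1, s2, rho = 1.5, 0.8, 0.6
Sigma = np.array([[s1**2,        rho * s1 * s2],
                  [rho * s1 * s2, s2**2       ]])
x = np.array([0.7, -1.1])
print(mvn_pdf(x, mu, Sigma), bvn_pdf(x[0], x[1], mu[0], mu[1], s1, s2, rho))
```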
3  On the asymptotic distribution of mle estimators (the multi-parameter case)

In this section we will only describe how to determine the asymptotic distribution of the mle estimator in the case where there are several unknown parameters in the model, without going into details of derivations and proofs. A good summary of the theory can be found in chapter 4 of Greene's book, Econometric Analysis. See also Rice at the end of section 8.5 (both editions).

Suppose that X_1, X_2, …, X_n are iid with X_i ~ f(x | θ) (pdf), where θ' = (θ_1, θ_2, …, θ_r) is an r-dimensional vector of unknown parameters. Then the joint pdf is ∏_{i=1}^n f(x_i | θ), and the log likelihood is

    l(θ) = ln ∏_{i=1}^n f(x_i | θ) = Σ_{i=1}^n ln f(x_i | θ)

The mle estimator, θ̂, solves the r equations

    ∂l/∂θ_j = Σ_{i=1}^n ∂ ln f(x_i | θ)/∂θ_j = 0,   j = 1, 2, …, r

In order to define the r × r Fisher information matrix that is needed in the asymptotic distribution of θ̂, we introduce

    m_ij(θ) = −E[ ∂² ln f(X | θ) / ∂θ_i ∂θ_j ],   i, j = 1, 2, …, r

Then the Fisher information matrix for one observation is defined as

    I(θ) = [ m_11(θ)  …  m_1r(θ) ]
           [    ⋮            ⋮   ]
           [ m_r1(θ)  …  m_rr(θ) ]

Under regularity conditions similar to the one-parameter case (see Greene for details), we have that the mle satisfies

    √n ( θ̂ − θ ) →D N( 0, I(θ)⁻¹ )

The definition of convergence in distribution for random vectors is similar to, but slightly more technical than, the definition in the one-dimensional case, and we skip the details here (see Greene for a precise definition). However, the interpretation of the result is the same as in the one-dimensional case, i.e., that for large n, approximately,

    θ̂ ~ N( θ, (1/n) I(θ)⁻¹ )

Hence we can say that θ̂ is asymptotically unbiased with asymptotic covariance matrix (1/n) I(θ)⁻¹. This matrix is unknown since θ is unknown, but it can be consistently estimated by replacing θ by θ̂ (or any other consistent estimator of θ). [That θ̂ is consistent means simply that θ̂_j →P θ_j for all j = 1, 2, …, r.] A generalization of Slutsky's lemma to the multivariate case (details omitted) now allows us to conclude that, for large n, approximately,

(15)  θ̂ ~ N( θ, (1/n) I(θ̂)⁻¹ )

which is the important result that you should know. Using (13) we also have, approximately,

(16)  Aθ̂ ~ N( Aθ, (1/n) A I(θ̂)⁻¹ A' )

for any constant p × r matrix A. From this we get the following: Let k_ij(θ) denote element (i, j) in I(θ)⁻¹. Then the estimated asymptotic variance of θ̂_i is the i-th element on the main diagonal in the estimated covariance matrix, i.e. k_ii(θ̂)/n. [Follows from (16). In fact, let a' = (0, …, 1, …, 0) with the 1 in position i and zeroes elsewhere. Then, from (16), approximately,

    θ̂_i = a'θ̂ ~ N( a'θ, (1/n) a' I(θ̂)⁻¹ a ) = N( θ_i, k_ii(θ̂)/n ) ]

Hence we obtain an approximate 1 − α CI for θ_i:

    θ̂_i ± z_{α/2} √( k_ii(θ̂)/n )

where z_{α/2} is the upper α/2-point in N(0, 1).
Example 4. Assume we want a CI for the transformed parameter η = θ_1 − θ_2. This we obtain from (16): Let b' = (1, −1, 0, …, 0). Then, by (16), approximately,

    η̂ = θ̂_1 − θ̂_2 = b'θ̂ ~ N( b'θ, (1/n) b' I(θ̂)⁻¹ b ) = N( θ_1 − θ_2, (1/n)( k_11(θ̂) + k_22(θ̂) − 2 k_12(θ̂) ) )

which leads to the approximate 1 − α CI for θ_1 − θ_2:

    θ̂_1 − θ̂_2 ± z_{α/2} √( ( k_11(θ̂) + k_22(θ̂) − 2 k_12(θ̂) ) / n )

[Note that all covariance matrices are symmetric. Hence k_12(θ) = k_21(θ).] (End of example 4.)

Example 5 (on example C in Rice section 8.5 (both editions) — precipitation data). Let X_i be the amount of precipitation for rainstorm no. i, i = 1, 2, …, n (n = 227 observations). Model: X_1, X_2, …, X_n are iid with X_i ~ Γ(α, λ). The joint distribution is

    X_1, …, X_n ~ f(x | α, λ) = ∏_{i=1}^n [ λ^α / Γ(α) ] x_i^{α−1} e^{−λ x_i} = [ λ^{nα} / Γ(α)^n ] ( x_1 x_2 ⋯ x_n )^{α−1} e^{−λ Σ_i x_i}

The log likelihood is

(17)  l(α, λ) = nα ln λ + (α − 1) Σ_i ln x_i − λ Σ_i x_i − n ln Γ(α)

The first derivatives of l are

    ∂l/∂α = n ln λ + Σ_i ln x_i − n (∂/∂α) ln Γ(α)
    ∂l/∂λ = nα/λ − Σ_i x_i

Setting the derivatives equal to zero and solving with respect to α and λ gives the mle estimators α̂ and λ̂. [Note. There are no explicit formulas for the solution; they must be found by numerical iterations. For example, Excel works well in this case via the Solver module: choose two cells for the arguments α and λ, with start values e.g. at the moment estimates, and then a third cell for the function (17). Then use Solver to maximize (17). This can also be done in STATA by the ml command, but that is slightly more involved.]

Using his program, Rice obtained the mle estimates

    α̂ = 0,441   and   λ̂ = 1,96

We want approximate 90% CIs for α and λ based on the asymptotic normal distribution of α̂ and λ̂. In order to calculate the asymptotic standard errors we need the so-called di- and trigamma functions:

    Digamma function:   ψ(α) = (d/dα) ln Γ(α)
    Trigamma function:  ψ'(α) = (d²/dα²) ln Γ(α)

Both functions can be calculated in STATA (under the names digamma and trigamma). We need the Fisher information matrix. From

    ln f(X | α, λ) = α ln λ − ln Γ(α) + (α − 1) ln X − λX

we get

    ∂ ln f/∂α = ln λ − ψ(α) + ln X   and   ∂ ln f/∂λ = α/λ − X

Hence

    ∂² ln f/∂α² = −ψ'(α)   (trigamma)
    ∂² ln f/∂α∂λ = ∂² ln f/∂λ∂α = 1/λ
    ∂² ln f/∂λ² = −α/λ²

Hence the Fisher information matrix for one observation is

    I(α, λ) = −E[ ∂² ln f/∂θ ∂θ' ] = [ ψ'(α)   −1/λ  ]
                                     [ −1/λ    α/λ²  ]

The inverse of a symmetric 2×2 matrix is

    [ a  c ]⁻¹       1       [  b  −c ]
    [ c  b ]    = ────────── [ −c   a ]
                  (ab − c²)

Hence

    I(α, λ)⁻¹ = λ² / (αψ'(α) − 1) · [ α/λ²   1/λ    ]  =      1        [ α      λ       ]
                                    [ 1/λ    ψ'(α)  ]    ──────────── [ λ      λ²ψ'(α) ]
                                                         (αψ'(α) − 1)

We obtain an estimate of this by substituting the mle, α̂ = 0,441 and λ̂ = 1,96, for α and λ:

    I(α̂, λ̂)⁻¹ =      1         [ α̂     λ̂        ] = [ 0,2590   1,1512  ]
                 ────────────── [ λ̂     λ̂²ψ'(α̂) ]   [ 1,1512   13,8277 ]
                 (α̂ψ'(α̂) − 1)

Here we found ψ'(α̂) = 6,1282 from STATA by the command: di trigamma(0.441).

From the theory we have that, approximately,

    ( α̂ )  ~  N( ( α ), C )
    ( λ̂ )       ( λ )

where the asymptotic covariance matrix is

    C = (1/n) I(α̂, λ̂)⁻¹ = [ 0,0011411   0,0050715 ]
                           [ 0,0050715   0,0609150 ]

Hence the asymptotic standard errors are

    se(α̂) = √0,0011411 = 0,0338   and   se(λ̂) = √0,0609150 = 0,2468
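These numbers can be reproduced without STATA. A sketch in Python (standard library only; not part of the original note): the trigamma function is implemented via the recurrence ψ'(x) = ψ'(x+1) + 1/x² together with the standard asymptotic series, and the estimated I⁻¹, C and standard errors then follow from the formulas above:

```python
import math

def trigamma(x):
    # psi'(x) via the recurrence psi'(x) = psi'(x+1) + 1/x^2, shifting the
    # argument up until the asymptotic expansion is accurate
    value = 0.0
    while x < 6.0:
        value += 1.0 / (x * x)
        x += 1.0
    # asymptotic series: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    inv = 1.0 / x
    inv2 = inv * inv
    value += inv * (1.0 + inv * (0.5 + inv * (1.0 / 6.0
                     - inv2 * (1.0 / 30.0 - inv2 / 42.0))))
    return value

# mle estimates and sample size from the example
alpha, lam, n = 0.441, 1.96, 227

# I(alpha, lambda)^{-1} = 1/(alpha psi'(alpha) - 1) [[alpha, lam], [lam, lam^2 psi'(alpha)]]
tg = trigamma(alpha)
den = alpha * tg - 1.0
Iinv = [[alpha / den, lam / den],
        [lam / den, lam * lam * tg / den]]

# Estimated asymptotic covariance matrix C = I^{-1}/n, and the standard errors
C = [[Iinv[i][j] / n for j in range(2)] for i in range(2)]
se_alpha = math.sqrt(C[0][0])
se_lam = math.sqrt(C[1][1])
print(round(tg, 4), round(se_alpha, 4), round(se_lam, 4))
```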
According to the theory we then obtain approximate 90% CIs for α and λ:

    α̂ ± 1,64 se(α̂) = 0,441 ± (1,64)(0,0338) = [0,386, 0,496]
    λ̂ ± 1,64 se(λ̂) = 1,96 ± (1,64)(0,247) = [1,55, 2,37]

Rice (example E in section 8.5.3) obtains approximate 90% CIs by the parametric bootstrap method:

    α: [0,404, 0,53]
    λ: [1,46, 2,3]

The difference between the asymptotic intervals and the bootstrap intervals does not appear to be substantial. With as many as 227 observations it is to be expected that the asymptotic theory should work well.
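The interval arithmetic is simple enough to check directly. A sketch using the rounded estimates and standard errors quoted above (standard library only; not part of the original note):

```python
# mle estimates and rounded asymptotic standard errors from the example
alpha_hat, se_alpha = 0.441, 0.0338
lam_hat, se_lam = 1.96, 0.247
z = 1.64   # approximate upper 5% point of N(0,1), as used in the note (more precisely 1.645)

# approximate 90% CIs: estimate +/- z * se
ci_alpha = (alpha_hat - z * se_alpha, alpha_hat + z * se_alpha)
ci_lam = (lam_hat - z * se_lam, lam_hat + z * se_lam)
print([round(v, 3) for v in ci_alpha], [round(v, 2) for v in ci_lam])
```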