Classical Linear Regression Model: Normality Assumption, Hypothesis Testing under Normality, Maximum Likelihood Estimation, Generalized Least Squares
Normality Assumption

Assumption 5: $\varepsilon \mid X \sim N(0, \sigma^2 I_n)$

Implications of the Normality Assumption:
- $(b - \beta) \mid X \sim N(0, \sigma^2 (X'X)^{-1})$
- $(b_k - \beta_k) \mid X \sim N(0, \sigma^2 [(X'X)^{-1}]_{kk})$
- $z_k = \dfrac{b_k - \beta_k}{\sqrt{\sigma^2 [(X'X)^{-1}]_{kk}}} \sim N(0,1)$
Hypothesis Testing under Normality

Implications of the Normality Assumption: because $\varepsilon_i/\sigma \sim N(0,1)$,
$\dfrac{e'e}{\sigma^2} = \dfrac{\varepsilon' M \varepsilon}{\sigma^2} \sim \chi^2(\operatorname{trace}(M)) = \chi^2(n-K)$,
where $M = I_n - X(X'X)^{-1}X'$ and $\operatorname{trace}(M) = n - K$.
Hypothesis Testing under Normality

If $\sigma^2$ is not known, replace it with $s^2$. The standard error of the OLS estimator $b_k$ is
$SE(b_k) = \sqrt{s^2 [(X'X)^{-1}]_{kk}}$
Suppose A.1-5 hold. Under $H_0$: $\beta_k = \bar{\beta}_k$, the t-statistic defined as
$t_k = \dfrac{b_k - \bar{\beta}_k}{SE(b_k)} = \dfrac{b_k - \bar{\beta}_k}{\sqrt{s^2 [(X'X)^{-1}]_{kk}}} \sim t(n-K)$
Hypothesis Testing under Normality

Proof of $t_k = \dfrac{b_k - \bar{\beta}_k}{SE(b_k)} = \dfrac{b_k - \bar{\beta}_k}{\sqrt{s^2 [(X'X)^{-1}]_{kk}}} \sim t(n-K)$:
$t_k = \dfrac{(b_k - \bar{\beta}_k)\big/\sqrt{\sigma^2 [(X'X)^{-1}]_{kk}}}{\sqrt{\dfrac{(n-K)\, s^2/\sigma^2}{n-K}}} = \dfrac{N(0,1)}{\sqrt{\chi^2(n-K)\big/(n-K)}} \sim t(n-K)$,
since the numerator is $N(0,1)$, $(n-K)s^2/\sigma^2 \sim \chi^2(n-K)$, and the two are independent.
Hypothesis Testing under Normality

Testing a hypothesis about an individual regression coefficient, $H_0$: $\beta_k = \bar{\beta}_k$:
- If $\sigma^2$ is known, use $z_k \sim N(0,1)$.
- If $\sigma^2$ is not known, use $t_k \sim t(n-K)$.
Given a level of significance $\alpha$,
$\operatorname{Prob}(-t_{\alpha/2}(n-K) < t < t_{\alpha/2}(n-K)) = 1 - \alpha$, i.e.
$\operatorname{Prob}\left(\left|\dfrac{b_k - \bar{\beta}_k}{SE(b_k)}\right| \le t_{\alpha/2}(n-K)\right) = 1 - \alpha$
Hypothesis Testing under Normality

Confidence interval:
$b_k - SE(b_k)\, t_{\alpha/2}(n-K) \le \bar{\beta}_k \le b_k + SE(b_k)\, t_{\alpha/2}(n-K)$, i.e.
$[\,b_k - SE(b_k)\, t_{\alpha/2}(n-K),\; b_k + SE(b_k)\, t_{\alpha/2}(n-K)\,]$
p-value: $p = \operatorname{Prob}(t > |t_k|) \times 2$, so that $\operatorname{Prob}(-|t_k| < t < |t_k|) = 1 - p$, since $\operatorname{Prob}(t > |t_k|) = \operatorname{Prob}(t < -|t_k|)$. Accept $H_0$ if $p > \alpha$; reject otherwise.
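As a concrete illustration (not from the slides), here is a minimal sketch that computes the t-statistic, two-sided p-value, and 95% confidence interval for one coefficient on simulated data satisfying A.1-5; the data-generating values are arbitrary.

```python
# Minimal sketch: t-test, p-value, and confidence interval for one OLS
# coefficient, on simulated data (all parameter values are illustrative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(scale=1.5, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                    # OLS estimator
e = y - X @ b                            # residuals
s2 = e @ e / (n - K)                     # s^2 = e'e/(n-K)
se = np.sqrt(s2 * np.diag(XtX_inv))      # SE(b_k)

k, beta_k0 = 2, 0.0                      # test H0: third coefficient = 0
t_k = (b[k] - beta_k0) / se[k]
p = 2 * stats.t.sf(abs(t_k), df=n - K)   # two-sided p-value
t_crit = stats.t.ppf(0.975, df=n - K)    # t_{alpha/2}(n-K) for alpha = 0.05
ci = (b[k] - se[k] * t_crit, b[k] + se[k] * t_crit)
print(f"t = {t_k:.3f}, p = {p:.3f}, 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```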
Hypothesis Testing under Normality

Linear hypotheses $H_0$: $R\beta = q$:
$\begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1K} \\ r_{21} & r_{22} & \cdots & r_{2K} \\ \vdots & \vdots & & \vdots \\ r_{J1} & r_{J2} & \cdots & r_{JK} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_J \end{pmatrix}$
Hypothesis Testing under Normality

Let $m = Rb - q$, where $b$ is the unrestricted least squares estimator of $\beta$.
- $E(m \mid X) = E(Rb - q \mid X) = R\beta - q = 0$ (under $H_0$)
- $\operatorname{Var}(m \mid X) = \operatorname{Var}(Rb - q \mid X) = R \operatorname{Var}(b \mid X) R' = \sigma^2 R(X'X)^{-1}R'$
Wald Principle:
$W = m' [\operatorname{Var}(m \mid X)]^{-1} m = (Rb - q)'[\sigma^2 R(X'X)^{-1}R']^{-1}(Rb - q) \sim \chi^2(J)$,
where $J$ is the number of restrictions. Define
$F = (W/J)\big/(s^2/\sigma^2) = (Rb - q)'[s^2 R(X'X)^{-1}R']^{-1}(Rb - q)/J$
Hypothesis Testing under Normality

Suppose A.1-5 hold. Under $H_0$: $R\beta = q$, where $R$ is $J \times K$ with $\operatorname{rank}(R) = J$, the F-statistic defined as
$F = \dfrac{(Rb - q)'[s^2 R(X'X)^{-1}R']^{-1}(Rb - q)}{J} = \dfrac{(Rb - q)'\{R[\operatorname{Est\,Var}(b \mid X)]R'\}^{-1}(Rb - q)}{J}$
is distributed as $F(J, n-K)$, the F distribution with $J$ and $n-K$ degrees of freedom.
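A minimal sketch of this Wald-form F statistic, again on simulated data; the restriction matrix $R$ and vector $q$ below are arbitrary choices for illustration.

```python
# Minimal sketch: Wald-form F statistic for H0: R beta = q (illustrative R, q).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - K)

R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])          # J = 2 restrictions: beta_2 = 2, beta_3 = 0
q = np.array([2.0, 0.0])
J = R.shape[0]

m = R @ b - q                            # m = Rb - q
F = m @ np.linalg.solve(s2 * R @ XtX_inv @ R.T, m) / J
print(f"F = {F:.3f}, p = {stats.f.sf(F, J, n - K):.3f}")
```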
Discussions

Residuals: $e = y - Xb \sim N(0, \sigma^2 M)$ if $\sigma^2$ is known, where $M = I_n - X(X'X)^{-1}X'$.
If $\sigma^2$ is unknown and estimated by $s^2$,
$\dfrac{e_i}{\sqrt{s^2 [1 - x_i'(X'X)^{-1}x_i]}} \sim t(n-K), \qquad i = 1, 2, \ldots, n$
Discussions

Wald Principle vs. Likelihood Principle: comparing the restricted (R) and unrestricted (UR) least squares fits, the F-statistic can be shown to equal
$F = \dfrac{(SSR_R - SSR_{UR})/J}{SSR_{UR}/(n-K)} = \dfrac{(R^2_{UR} - R^2_R)/J}{(1 - R^2_{UR})/(n-K)}$
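The same F statistic can be computed from the restricted and unrestricted sums of squared residuals, as in the sketch below; here the (illustrative) restriction is simply that the last coefficient is zero, so the restricted fit drops that regressor.

```python
# Minimal sketch: F statistic from restricted vs. unrestricted SSR.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, K, J = 100, 3, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

def ssr(Z, y):
    """Sum of squared residuals from an OLS fit of y on Z."""
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ b
    return e @ e

ssr_ur = ssr(X, y)             # unrestricted model
ssr_r = ssr(X[:, :2], y)       # restricted model: beta_3 = 0 imposed
F = ((ssr_r - ssr_ur) / J) / (ssr_ur / (n - K))
print(f"F = {F:.3f}, p = {stats.f.sf(F, J, n - K):.3f}")
```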
Discussions

Testing $R^2 = 0$: equivalently, $H_0$: $R\beta = q$ with $J = K - 1$ restrictions,
$\beta_2 = \beta_3 = \cdots = \beta_K = 0$,
leaving the constant term $\beta_1$ unrestricted. The F-statistic follows $F(K-1, n-K)$.
Discussions

Testing $\beta_k = 0$: equivalently, $H_0$: $R\beta = q$, where
$R = (0, \ldots, 0, 1, 0, \ldots, 0)$ (a 1 in the $k$-th position) and $q = 0$.
$F(1, n-K) = b_k \{[\operatorname{Est\,Var}(b \mid X)]_{kk}\}^{-1} b_k$
t-ratio: $t(n-K) = b_k / SE(b_k)$
Discussions

t vs. F: $t^2(n-K) = F(1, n-K)$ under $H_0$: $R\beta = q$ when $J = 1$. For $J > 1$, the F test is preferred to multiple t tests.
Durbin-Watson test statistic for time series models:
$DW = \dfrac{\sum_{i=2}^{n}(e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
The conditional distribution, and hence the critical values, of DW depend on $X$.
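A minimal sketch of the Durbin-Watson statistic computed from OLS residuals (simulated data, arbitrary parameters); with serially uncorrelated residuals DW should be near 2.

```python
# Minimal sketch: Durbin-Watson statistic from OLS residuals e_1, ..., e_n.
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
# sum over i = 2..n of (e_i - e_{i-1})^2, divided by sum of e_i^2
DW = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(f"DW = {DW:.3f}")   # near 2 when residuals are serially uncorrelated
```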
Maximum Likelihood

Assumptions 1, 2, 4, and 5 imply $y \mid X \sim N(X\beta, \sigma^2 I_n)$.
The conditional density, or likelihood, of $y$ given $X$ is
$f(y \mid X; \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left(-\dfrac{(y - X\beta)'(y - X\beta)}{2\sigma^2}\right)$
Maximum Likelihood

Likelihood function: $L(\beta, \sigma^2) = f(y \mid X; \beta, \sigma^2)$
Log-likelihood function:
$\log L(\beta, \sigma^2) = -\dfrac{n}{2}\log(2\pi) - \dfrac{n}{2}\log(\sigma^2) - \dfrac{1}{2\sigma^2}(y - X\beta)'(y - X\beta) = -\dfrac{n}{2}\log(2\pi) - \dfrac{n}{2}\log(\sigma^2) - \dfrac{1}{2\sigma^2}SSR(\beta)$
Maximum Likelihood

ML estimator of $(\beta, \sigma^2)$ $= \operatorname{argmax}_{\beta, \gamma} \log L(\beta, \gamma)$, where we set $\gamma = \sigma^2$. The first-order conditions use
$\dfrac{\partial \log L(\beta, \gamma)}{\partial \beta} = -\dfrac{1}{2\gamma}\dfrac{\partial SSR(\beta)}{\partial \beta}, \qquad \dfrac{\partial \log L(\beta, \gamma)}{\partial \gamma} = -\dfrac{n}{2\gamma} + \dfrac{1}{2\gamma^2}SSR(\beta)$
Maximum Likelihood

Suppose Assumptions 1-5 hold. Then the ML estimator of $\beta$ is the OLS estimator $b$, and the ML estimator of $\gamma = \sigma^2$ is
$\hat{\sigma}^2_{ML} = \dfrac{SSR}{n} = \dfrac{e'e}{n} = \dfrac{n-K}{n}\, s^2$
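A small numerical check on simulated data: the ML variance estimate $SSR/n$ equals $s^2$ scaled by $(n-K)/n$, and is therefore biased downward.

```python
# Minimal sketch: ML variance estimate SSR/n vs. unbiased s^2 = SSR/(n-K).
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)   # ML estimate of beta = OLS
e = y - X @ b
ssr = e @ e
print("sigma2_ML =", ssr / n)               # ML estimator: SSR/n (biased)
print("s2 (OLS)  =", ssr / (n - K))         # unbiased estimator: SSR/(n-K)
print("ratio     =", (n - K) / n)           # sigma2_ML / s2 = (n-K)/n
```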
Maximum Likelihood

Maximum Likelihood Principle: let $\theta = (\beta', \gamma)'$.
- Score: $s(\theta) = \partial \log L(\theta)/\partial \theta$
- Information matrix: $I(\theta) = E(s(\theta) s(\theta)' \mid X)$
- Information matrix equality:
$I(\theta) = -E\left(\dfrac{\partial^2 \log L(\theta)}{\partial \theta \,\partial \theta'}\,\Big|\, X\right) = \begin{pmatrix} \dfrac{1}{\sigma^2}X'X & 0 \\ 0 & \dfrac{n}{2\sigma^4} \end{pmatrix}$
Maximum Likelihood

Maximum Likelihood Principle (continued):
Cramer-Rao bound: $I(\theta)^{-1}$. That is, for an unbiased estimator $\hat{\theta}$ of $\theta$ with a finite variance-covariance matrix,
$\operatorname{Var}(\hat{\theta} \mid X) \ge I(\theta)^{-1} = \begin{pmatrix} \sigma^2 (X'X)^{-1} & 0 \\ 0 & \dfrac{2\sigma^4}{n} \end{pmatrix}$
Maximum Likelihood

Under Assumptions 1-5, the ML or OLS estimator $b$ of $\beta$, with variance $\sigma^2(X'X)^{-1}$, attains the Cramer-Rao bound.
The ML estimator of $\sigma^2$ is biased, so the Cramer-Rao bound does not apply to it.
The OLS estimator of $\sigma^2$, $s^2 = e'e/(n-K)$, with $E(s^2 \mid X) = \sigma^2$ and $\operatorname{Var}(s^2 \mid X) = 2\sigma^4/(n-K)$, does not attain the Cramer-Rao bound $2\sigma^4/n$.
Discussions

Concentrated log-likelihood function:
$\log L_c(\beta) = \log L(\beta, SSR(\beta)/n) = -\dfrac{n}{2}\log(2\pi) - \dfrac{n}{2}\log(SSR(\beta)/n) - \dfrac{n}{2} = -\dfrac{n}{2}\left(1 + \log(2\pi) - \log n\right) - \dfrac{n}{2}\log(SSR(\beta))$
Therefore, $\operatorname{argmax}_\beta \log L_c(\beta) = \operatorname{argmin}_\beta SSR(\beta)$.
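A grid-search sketch (illustrative only, one-regressor model) confirming that the concentrated log-likelihood peaks exactly where $SSR(\beta)$ is smallest.

```python
# Minimal sketch: argmax of the concentrated log-likelihood = argmin of SSR.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

def ssr(beta):
    return np.sum((y - beta * x) ** 2)

def logL_c(beta):
    # log L(beta, SSR(beta)/n) = -(n/2)(log 2*pi + log(SSR/n) + 1)
    return -n / 2 * (np.log(2 * np.pi) + np.log(ssr(beta) / n) + 1)

grid = np.linspace(1.0, 3.0, 2001)
print("argmax logL_c:", grid[np.argmax([logL_c(b) for b in grid])])
print("argmin SSR:   ", grid[np.argmin([ssr(b) for b in grid])])   # identical
```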
Discussions

Hypothesis testing $H_0$: $R\beta = q$
Likelihood ratio test: compare the maximized likelihoods of the unrestricted model, $L_{UR}$, and the restricted model, $L_R$.
F test as a likelihood ratio test:
$F = \dfrac{(SSR_R - SSR_{UR})/J}{SSR_{UR}/(n-K)} = \dfrac{n-K}{J}\left(\dfrac{SSR_R}{SSR_{UR}} - 1\right) = \dfrac{n-K}{J}\left[\left(\dfrac{L_{UR}}{L_R}\right)^{2/n} - 1\right]$
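Since the concentrated likelihood is proportional to $(SSR/n)^{-n/2}$, we have $(L_{UR}/L_R)^{2/n} = SSR_R/SSR_{UR}$, and the identity above can be checked numerically; a sketch with an arbitrary simulated design:

```python
# Minimal sketch: F = ((n-K)/J) * ((L_UR/L_R)^(2/n) - 1), checked numerically
# using (L_UR/L_R)^(2/n) = SSR_R / SSR_UR.
import numpy as np

rng = np.random.default_rng(0)
n, K, J = 100, 3, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, 0.3]) + rng.normal(size=n)

def ssr(Z):
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return np.sum((y - Z @ b) ** 2)

ssr_ur, ssr_r = ssr(X), ssr(X[:, :2])      # restricted model imposes beta_3 = 0
F_ssr = ((ssr_r - ssr_ur) / J) / (ssr_ur / (n - K))
F_lr = (n - K) / J * (ssr_r / ssr_ur - 1)  # likelihood-ratio form
print(F_ssr, F_lr)                         # identical
```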
Discussions

Quasi-Maximum Likelihood: without normality (Assumption 5), there is no guarantee that the ML estimator of $\beta$ is OLS or that the OLS estimator $b$ achieves the Cramer-Rao bound. However, $b$ is a quasi- (or pseudo-) maximum likelihood estimator: an estimator that maximizes a misspecified (normal) likelihood function.
Generalized Least Squares

Assumption 4 revisited: $E(\varepsilon\varepsilon' \mid X) = \operatorname{Var}(\varepsilon \mid X) = \sigma^2 I_n$
Assumption 4 relaxed (Assumption 4'): $E(\varepsilon\varepsilon' \mid X) = \operatorname{Var}(\varepsilon \mid X) = \sigma^2 V(X)$, with $V(X)$ nonsingular and known.
The OLS estimator of $\beta$, $b = (X'X)^{-1}X'y$, is then not efficient, although it is still unbiased. The t-test and F-test are no longer valid.
Generalized Least Squares

Since $V = V(X)$ is known, write $V^{-1} = C'C$. Let $y^* = Cy$, $X^* = CX$, $\varepsilon^* = C\varepsilon$, so that
$y = X\beta + \varepsilon \;\Rightarrow\; y^* = X^*\beta + \varepsilon^*$
Checking A.2: $E(\varepsilon^* \mid X^*) = E(\varepsilon^* \mid X) = 0$
Checking A.4: $E(\varepsilon^*\varepsilon^{*\prime} \mid X^*) = E(\varepsilon^*\varepsilon^{*\prime} \mid X) = \sigma^2 CVC' = \sigma^2 I_n$
GLS: OLS applied to the transformed model $y^* = X^*\beta + \varepsilon^*$.
Generalized Least Squares

$b_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'V^{-1}X)^{-1}X'V^{-1}y$
$\operatorname{Var}(b_{GLS} \mid X) = \sigma^2 (X^{*\prime}X^*)^{-1} = \sigma^2 (X'V^{-1}X)^{-1}$
If $V = V(X) = \operatorname{Var}(\varepsilon \mid X)/\sigma^2$ is known,
$b_{GLS} = (X'[\operatorname{Var}(\varepsilon \mid X)]^{-1}X)^{-1}X'[\operatorname{Var}(\varepsilon \mid X)]^{-1}y$
$\operatorname{Var}(b_{GLS} \mid X) = (X'[\operatorname{Var}(\varepsilon \mid X)]^{-1}X)^{-1}$
The GLS estimator $b_{GLS}$ of $\beta$ is BLUE.
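A minimal GLS sketch under an assumed (known) heteroskedastic $V$: factor $V^{-1} = C'C$ with a Cholesky decomposition, transform the data, and run OLS; the direct formula $(X'V^{-1}X)^{-1}X'V^{-1}y$ gives the same answer.

```python
# Minimal sketch: GLS via the transformation y* = Cy, X* = CX with V^{-1} = C'C.
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
v = np.exp(X[:, 1])          # assumed known: Var(eps_i | X) = sigma^2 * v_i
V = np.diag(v)
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * np.sqrt(v)

V_inv = np.linalg.inv(V)
C = np.linalg.cholesky(V_inv).T          # V^{-1} = C'C
y_star, X_star = C @ y, C @ X
b_gls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)   # OLS on transformed data

# direct formula (X'V^{-1}X)^{-1} X'V^{-1} y gives the same estimate:
b_direct = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
print(b_gls, b_direct)
```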
Generalized Least Squares

Under Assumptions 1-3, $E(b_{GLS} \mid X) = \beta$.
Under Assumptions 1-3 and 4', $\operatorname{Var}(b_{GLS} \mid X) = \sigma^2 (X'V(X)^{-1}X)^{-1}$.
Under Assumptions 1-3 and 4', the GLS estimator is efficient in that the conditional variance of any unbiased estimator that is linear in $y$ is greater than or equal to $\operatorname{Var}(b_{GLS} \mid X)$.
Discussions

Weighted Least Squares (WLS)
Assumption 4': $V(X)$ is a diagonal matrix, i.e. $E(\varepsilon_i^2 \mid X) = \operatorname{Var}(\varepsilon_i \mid X) = \sigma^2 v_i(X)$. Then
$y_i^* = \dfrac{y_i}{\sqrt{v_i(X)}}, \qquad x_i^* = \dfrac{x_i}{\sqrt{v_i(X)}} \qquad (i = 1, 2, \ldots, n)$
WLS is a special case of GLS.
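A WLS sketch under an assumed known variance function $v_i(X)$: rescale each observation by $1/\sqrt{v_i}$ and run OLS on the rescaled data.

```python
# Minimal sketch: WLS as GLS with diagonal V (illustrative variance function).
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
v = 1.0 + X[:, 1] ** 2                   # assumed known weights v_i(X)
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * np.sqrt(v)

w = 1.0 / np.sqrt(v)
y_star = y * w                           # y_i* = y_i / sqrt(v_i)
X_star = X * w[:, None]                  # x_i* = x_i / sqrt(v_i)
b_wls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
print(b_wls)
```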
Discussions

If $V = V(X)$ is not known, we can estimate its functional form from the sample. This approach is called Feasible GLS (FGLS). Since the estimated $V$ is then a random variable, very little is known about the distribution and finite-sample properties of the FGLS estimator.
Example

Cobb-Douglas Cost Function for Electricity Generation (Christensen and Greene [1976])
Data: Greene's Table F4.3
- Id = Observation id, 123 firms + 35 holding companies
- Year = 1970 for all observations
- Cost = Total cost
- Q = Total output
- Pl = Wage rate; Sl = Cost share for labor
- Pk = Capital price index; Sk = Cost share for capital
- Pf = Fuel price; Sf = Cost share for fuel
Example

Cobb-Douglas Cost Function for Electricity Generation (Christensen and Greene [1976])
$\ln(Cost) = \beta_1 + \beta_2 \ln(Pl) + \beta_3 \ln(Pk) + \beta_4 \ln(Pf) + \beta_5 \ln(Q) + \tfrac{1}{2}\beta_6 [\ln(Q)]^2 + \beta_7 \ln(Q)\ln(Pl) + \beta_8 \ln(Q)\ln(Pk) + \beta_9 \ln(Q)\ln(Pf) + \varepsilon$
Linear homogeneity in prices: $\beta_2 + \beta_3 + \beta_4 = 1$, $\beta_7 + \beta_8 + \beta_9 = 0$
Imposing the restrictions:
$\ln(Cost/Pf) = \beta_1 + \beta_2 \ln(Pl/Pf) + \beta_3 \ln(Pk/Pf) + \beta_5 \ln(Q) + \tfrac{1}{2}\beta_6 [\ln(Q)]^2 + \beta_7 \ln(Q)\ln(Pl/Pf) + \beta_8 \ln(Q)\ln(Pk/Pf) + \varepsilon$
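A sketch of how the homogeneity-restricted regression might be run. The file name "TableF4-3.csv" and the column names are assumptions about a local copy of Greene's Table F4.3, not part of the slides; adjust them to your copy of the data.

```python
# Minimal sketch of the restricted Cobb-Douglas cost regression.
# Assumption: the data sit in a local file "TableF4-3.csv" with columns
# Cost, Q, Pl, Pk, Pf (hypothetical names; rename to match your data).
import numpy as np
import pandas as pd

df = pd.read_csv("TableF4-3.csv")
lq = np.log(df["Q"])
lpl = np.log(df["Pl"] / df["Pf"])            # prices normalized by Pf
lpk = np.log(df["Pk"] / df["Pf"])

X = np.column_stack([
    np.ones(len(df)), lpl, lpk,              # beta_1, beta_2, beta_3
    lq, 0.5 * lq**2,                         # beta_5, beta_6
    lq * lpl, lq * lpk,                      # beta_7, beta_8
])
y = np.log(df["Cost"] / df["Pf"])
b, *_ = np.linalg.lstsq(X, y, rcond=None)    # homogeneity-restricted estimates
print(b)
```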