Chapter 1 Econometrics

There are no exercises or applications in Chapter 1.

© Pearson Education, Inc. Publishing as Prentice Hall
Chapter 2 The Linear Regression Model

There are no exercises or applications in Chapter 2.
Chapter 3 Least Squares

Exercises

1. Let X = [i, x], a column of ones and the column of observations on x.

a. The normal equations are given by (3-12), X′e = 0 (we drop the minus sign); hence, for each of the columns of X, xk, we know that xk′e = 0. This implies that Σi ei = 0 and Σi xiei = 0.

b. Use Σi ei = 0 to conclude from the first normal equation that a = ȳ − bx̄.

c. We know that Σi ei = 0 and Σi xiei = 0. It follows then that Σi (xi − x̄)ei = 0 because Σi x̄ei = x̄ Σi ei = 0. Substitute ei to obtain Σi (xi − x̄)(yi − a − bxi) = 0, or Σi (xi − x̄)(yi − ȳ − b(xi − x̄)) = 0. Then, Σi (xi − x̄)(yi − ȳ) = b Σi (xi − x̄)(xi − x̄), so b = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)².

d. The first derivative vector of e′e is −2X′e. (The normal equations.) The second derivative matrix is ∂²(e′e)/∂b∂b′ = 2X′X. We need to show that this matrix is positive definite. The diagonal elements are 2n and 2Σi xi², which are clearly both positive. The determinant is (2n)(2Σi xi²) − (2Σi xi)² = 4nΣi xi² − 4(Σi xi)² = 4n[Σi xi² − nx̄²] = 4n[Σi (xi − x̄)²], which is positive. Note that a much simpler proof appears after (3-6).

2. Write c as b + (c − b). Then, the sum of squared residuals based on c is

(y − Xc)′(y − Xc) = [y − X(b + (c − b))]′[y − X(b + (c − b))]
                  = [(y − Xb) − X(c − b)]′[(y − Xb) − X(c − b)]
                  = (y − Xb)′(y − Xb) + (c − b)′X′X(c − b) − 2(c − b)′X′(y − Xb).

But, the third term is zero, as 2(c − b)′X′(y − Xb) = 2(c − b)′X′e = 0. Therefore,

(y − Xc)′(y − Xc) = e′e + (c − b)′X′X(c − b),

or (y − Xc)′(y − Xc) − e′e = (c − b)′X′X(c − b). The right-hand side can be written as d′d where d = X(c − b), so it is necessarily positive. This confirms what we knew at the outset: least squares is least squares.
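The algebra in Exercises 1 and 2 is easy to verify numerically. The following sketch (NumPy, with simulated data; the variable names are illustrative, not from the text) checks the normal equations, the intercept and slope formulas, and the result that any coefficient vector other than b gives a larger sum of squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 1.5 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Solve the normal equations X'Xb = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

# (a) residuals are orthogonal to every column of X: X'e = 0
assert np.allclose(X.T @ e, 0.0)
# (b) intercept: a = ybar - b*xbar
assert np.isclose(b[0], y.mean() - b[1] * x.mean())
# (c) slope: cross-deviation sum over squared-deviation sum
slope = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
assert np.isclose(b[1], slope)
# Exercise 2: any other coefficient vector c yields a larger sum of squares
c = b + np.array([0.1, -0.05])
assert ((y - X @ c) ** 2).sum() > (e ** 2).sum()
```

Each assertion mirrors one step of the derivation above; the last line is the "least squares is least squares" result.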
Greene • Econometric Analysis, Seventh Edition

3. In the regression of y on i and X, the coefficients on X are b = (X′M⁰X)⁻¹X′M⁰y. M⁰ = I − (1/n)ii′ is the matrix which transforms observations into deviations from their column means. Since M⁰ is idempotent and symmetric, we may also write the preceding as [(X′M⁰′)(M⁰X)]⁻¹(X′M⁰′)(M⁰y), which implies that the regression of M⁰y on M⁰X produces the least squares slopes. If only X is transformed to deviations, we would compute [(X′M⁰′)(M⁰X)]⁻¹(X′M⁰′)y but, of course, this is identical. However, if only y is transformed, the result is (X′X)⁻¹X′M⁰y, which is likely to be quite different.

4. What is the result of the matrix product M1M where M1 is defined in (3-19) and M is defined in (3-14)?

M1M = (I − X1(X1′X1)⁻¹X1′)(I − X(X′X)⁻¹X′) = M − X1(X1′X1)⁻¹X1′M.

There is no need to multiply out the second term. Each column of MX1 is the vector of residuals in the regression of the corresponding column of X1 on all of the columns in X. Since that x is one of the columns in X, this regression provides a perfect fit, so the residuals are zero. Thus, MX1 is a matrix of zeroes, which implies that M1M = M.

5. The original X matrix has n rows. We add an additional row, xs′. The new y vector likewise has an additional element. Thus,

Xn,s = [ Xn ]   and   yn,s = [ yn ].
       [ xs′]                [ ys ]

The new coefficient vector is bn,s = (Xn,s′Xn,s)⁻¹(Xn,s′yn,s). The matrix is Xn,s′Xn,s = Xn′Xn + xsxs′. To invert this, use (A-66):

(Xn,s′Xn,s)⁻¹ = (Xn′Xn)⁻¹ − [1/(1 + xs′(Xn′Xn)⁻¹xs)] (Xn′Xn)⁻¹xsxs′(Xn′Xn)⁻¹.

The vector is Xn,s′yn,s = Xn′yn + xsys. Multiplying out the four terms and collecting them, the terms in bn and the terms in ys combine to give

bn,s = (Xn,s′Xn,s)⁻¹(Xn,s′yn,s) = bn + [1/(1 + xs′(Xn′Xn)⁻¹xs)] (Xn′Xn)⁻¹xs(ys − xs′bn).

6. Define the data matrix as follows:

X = [ i   x1   0 ] = [X1, d]   and   y = [ yo ],
    [ 1   0    1 ]                       [ ym ]

where d = (0, …, 0, 1)′ is a dummy variable equal to one for the last observation, whose value of x is missing and has been filled with zero. (The subscripts on the parts of y refer to the observed and missing rows of X.)
We will use Frisch–Waugh to obtain the first two columns of the least squares coefficient vector,

b1 = (X1′MdX1)⁻¹(X1′Mdy), where Md = I − d(d′d)⁻¹d′.

Multiplying it out, we find that Md is an identity matrix, save for the last diagonal element, which is equal to 0. Premultiplying by Md therefore zeroes out the last row, so X1′MdX1 just drops the last observation from the sums of squares and cross products. X1′Mdy is computed likewise. Thus, the coefficients on the first two columns are the same as if yo had been linearly regressed on X1.
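The device in Exercise 6 can be checked directly: filling the missing x with zero and adding the dummy reproduces the coefficients obtained by simply dropping the observation, and the dummy absorbs that observation completely. A NumPy sketch with simulated data (not the text's data; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
x = rng.normal(size=n)
y = 0.5 + 1.2 * x + rng.normal(size=n)

# Strategy 1: drop the observation whose x is missing (say, the last one)
X1 = np.column_stack([np.ones(n - 1), x[:-1]])
b_drop = np.linalg.lstsq(X1, y[:-1], rcond=None)[0]

# Strategy 2: fill the missing x with zero and add a dummy for that row
x_fill = x.copy(); x_fill[-1] = 0.0
d = np.zeros(n); d[-1] = 1.0
X = np.column_stack([np.ones(n), x_fill, d])
b_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Coefficients on the constant and x match the drop-the-observation fit
assert np.allclose(b_drop, b_full[:2])
# The dummy gives the last observation a zero residual, so its
# coefficient equals y_n minus the constant term (x was zero-filled)
assert np.isclose(b_full[2], y[-1] - b_drop[0])
```

The second assertion illustrates why the strategy is harmless: the dummy simply soaks up the observation with the missing regressor.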
The denominator of R² is different for the two cases (drop the observation, or keep it with the zero fill and the dummy variable). For the first strategy, the mean of the n − 1 observations should be different from the mean of the full n, unless the last observation happens to equal the mean of the first n − 1. For the second strategy, replacing the missing value with the mean of the other observations, we can deduce the new slope vector logically. Using Frisch–Waugh, we can replace the column of x's with deviations from the means, which then turns the last observation to zero. Thus, once again, the coefficient on x equals what it is using the earlier strategy. The constant term will be the same as well.

7. For convenience, reorder the variables so that X = [i, Pd, Pn, Ps, Y]. The three dependent variables are Ed, En, and Es, and Y = Ed + En + Es. The coefficient vectors are

bd = (X′X)⁻¹X′Ed,  bn = (X′X)⁻¹X′En,  and  bs = (X′X)⁻¹X′Es.

The sum of the three vectors is

b = (X′X)⁻¹X′[Ed + En + Es] = (X′X)⁻¹X′Y.

Now, Y is the last column of X, so the preceding sum is the vector of least squares coefficients in the regression of the last column of X on all of the columns of X, including the last. Of course, we get a perfect fit. In addition, X′[Ed + En + Es] is the last column of X′X, so the matrix product is equal to the last column of an identity matrix. Thus, the sum of the coefficients on all variables except income is 0, while that on income is 1.

8. Let R̄²K denote the adjusted R² in the full regression on K variables including xk, and let R̄²1 denote the adjusted R² in the short regression on K − 1 variables when xk is omitted. Let R²K and R²1 denote their unadjusted counterparts. Then,

R²K = 1 − e′e/y′M⁰y,
R²1 = 1 − e1′e1/y′M⁰y,

where e′e is the sum of squared residuals in the full regression, e1′e1 is the (larger) sum of squared residuals in the regression which omits xk, and y′M⁰y = Σi (yi − ȳ)². Then,

R̄²K = 1 − [(n − 1)/(n − K)](1 − R²K)
and R̄²1 = 1 − [(n − 1)/(n − (K − 1))](1 − R²1).

The difference is the change in the adjusted R² when xk is added to the regression,

R̄²K − R̄²1 = [(n − 1)/(n − K + 1)][e1′e1/y′M⁰y] − [(n − 1)/(n − K)][e′e/y′M⁰y].
The difference is positive if and only if the ratio is greater than 1. After cancelling terms, we require for the adjusted R² to increase that [e1′e1/(n − K + 1)]/[e′e/(n − K)] > 1. From the previous problem, we have that e1′e1 = e′e + bK²(xk′M1xk), where M1 is defined above and bK is the least squares coefficient on xk in the full regression of y on X1 and xk. Making the substitution, we require [(e′e + bK²(xk′M1xk))(n − K)]/[(n − K + 1)e′e] > 1. Since e′e = (n − K)s², this simplifies to [e′e + bK²(xk′M1xk)]/[e′e + s²] > 1. Since all terms are positive, the fraction is greater than one if and only if bK²(xk′M1xk) > s², or bK²/[s²/(xk′M1xk)] > 1. The denominator is the estimated variance of bk, so the result is proved.
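The result — the adjusted R² rises when xk is added if and only if the squared t ratio on xk exceeds one — can be illustrated numerically. A NumPy sketch with simulated data (not the text's data):

```python
import numpy as np

def fit(Z, y):
    """Return the OLS coefficients and the sum of squared residuals."""
    b = np.linalg.lstsq(Z, y, rcond=None)[0]
    e = y - Z @ b
    return b, e @ e

rng = np.random.default_rng(2)
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
xk = rng.normal(size=n)
y = X1 @ np.array([1.0, 0.5]) + 0.02 * xk + rng.normal(size=n)

def adj_r2(ee, K):
    ym0y = ((y - y.mean()) ** 2).sum()
    return 1 - (n - 1) / (n - K) * (ee / ym0y)

X = np.column_stack([X1, xk])       # long regression includes xk
b, ee = fit(X, y)
_, ee1 = fit(X1, y)                 # short regression omits xk

K = X.shape[1]
s2 = ee / (n - K)
# conventional t ratio on xk in the long regression
t = b[-1] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[-1, -1])

# adjusted R^2 rises when xk is added iff t^2 > 1
assert (adj_r2(ee, K) > adj_r2(ee1, K - 1)) == (t ** 2 > 1)
```

Because the equivalence is an exact algebraic identity, the assertion holds for any data set, whichever side of 1 the t ratio falls on.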
9. This R² must be lower. The sum of squares associated with the coefficient vector which omits the constant term must be higher than the one which includes it. We can write the coefficient vector in the regression without a constant as c = (0, b*′)′ where b* = (W′W)⁻¹W′y, with W being the other K − 1 columns of X. Then, the result of the previous exercise applies directly.

10. We use the notation Var[·] and Cov[·] to indicate the sample variances and covariances. Our information is Var[N] = 1, Var[D] = 1, Var[Y] = 1. Since C = N + D, Var[C] = Var[N] + Var[D] + 2Cov[N, D] = 2(1 + Cov[N, D]). From the regressions, we have Cov[C, Y]/Var[Y] = Cov[C, Y] = 0.8. But, Cov[C, Y] = Cov[N, Y] + Cov[D, Y]. Also, Cov[C, N]/Var[N] = Cov[C, N] = 0.5, but Cov[C, N] = Var[N] + Cov[N, D] = 1 + Cov[N, D], so Cov[N, D] = −0.5, so that Var[C] = 2(1 + (−0.5)) = 1. And, Cov[D, Y]/Var[Y] = Cov[D, Y] = 0.4. Since Cov[C, Y] = 0.8 = Cov[N, Y] + Cov[D, Y], Cov[N, Y] = 0.4. Finally, Cov[C, D] = Cov[N, D] + Var[D] = −0.5 + 1 = 0.5. Now, in the regression of C on D, the sum of squared residuals is (n − 1){Var[C] − (Cov[C, D]/Var[D])²Var[D]}, based on the general regression result Σe² = Σ(yi − ȳ)² − b²Σ(xi − x̄)². All of the necessary figures were obtained above. Inserting these and n − 1 = 20 produces a sum of squared residuals of 15.

11. The relevant submatrices to be used in the calculations are

             Investment   Constant   GNP      Interest
Investment       *         3.0500    3.996     3.5
Constant                   5         9.30      .79
GNP                                  5.8      48.98
Interest                                     943.86

The inverse of the lower right 3 × 3 block is

(X′X)⁻¹ = [ 7.5874                        ]
          [ 7.4859    7.84078             ]
          [  .733      .598953   .0654637 ]   (symmetric; lower triangle shown)

The coefficient vector is b = (X′X)⁻¹X′y = (.077985, .356, .00364866)′. The total sum of squares is y′y = .6365, so we can obtain e′e = y′y − b′X′y = .0036; X′y is given in the top row of the matrix. To compute R², we require Σi (yi − ȳ)² = y′y − nȳ² = .0635333, so R² = 1 − .0036/.063533 = .7795.

12. The results cannot be correct. Since log S/N = log S/Y + log Y/N by simple, exact algebra, the same result must apply to the least squares regression results. That means that the second equation estimated must equal the first one plus log Y/N.
Looking at the equations, that means that all of the coefficients would have to be identical save for the second, which would have to equal its counterpart in the first equation, plus 1. Therefore, the results cannot be correct. In an exchange between Leff and Arthur Goldberger that appeared later in the same journal, Leff argued that the difference was a simple rounding error. You can see that the results in the second equation resemble those in the first, but
not enough so that the explanation is credible. Further discussion about the data themselves appeared in a subsequent discussion. [See Goldberger (1973) and Leff (1973).]

Application

Chapter 3 Application

? Read $ (Data appear in the text.)
Namelist ; X1 = one,educ,exp,ability$
Namelist ; X2 = mothered,fathered,sibs$
? a.
Regress ; Lhs = wage ; Rhs = x1$

Ordinary least squares regression
LHS=WAGE    Mean                 =    .059333
            Standard deviation   =    .583869
WTS=one     Number of observs.   =   15
Model size  Parameters           =    4
            Degrees of freedom   =   11
Residuals   Sum of squares       =    .763363
            Standard error of e  =    .63444
Fit         R-squared            =    .8335
            Adjusted R-squared   =   -.393736E-01
Model test  F[ 3, 11] (prob)     =    .8 (.5080)

Variable   Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
Constant   .66364000     .685538          .690      .00
EDUC       .0453897      .049049          .97       .773       .8666667
EXP        .070300       .0480345         .479      .673       .80000000
ABILITY    .066537       .09973           .69       .7933      .36600000

? b.
Regress ; Lhs = wage ; Rhs = x1,x2$

Ordinary least squares regression
LHS=WAGE    Mean                 =    .059333
            Standard deviation   =    .583869
WTS=one     Number of observs.   =   15
Model size  Parameters           =    7
            Degrees of freedom   =    8
Residuals   Sum of squares       =    .4566
            Standard error of e  =    .377673
Fit         R-squared            =    .5634
            Adjusted R-squared   =    .53347
Model test  F[ 6, 8] (prob)      =    .4 (.340)

Variable   Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
Constant   .04899633     .9488076         .05       .960
EDUC       .0583         .0446859         .578      .5793      .8666667
EXP        .03395        .0473454         .84       .0605      .80000000
ABILITY    .03074355     .033             .54       .806       .36600000
MOTHERED   .063069       .070750          .448      .856       .0666667
FATHERED   .0064437      .0446490         .037      .975       .6666667
SIBS       .05969        .069080          .857      .46        .0000000
? c.
Regress ; Lhs = mothered ; Rhs = x1 ; Res = meds $
Regress ; Lhs = fathered ; Rhs = x1 ; Res = feds $
Regress ; Lhs = sibs     ; Rhs = x1 ; Res = sibss $
Namelist ; XS = meds,feds,sibss $
Matrix ; list ; Mean(XS) $

Matrix Result has 3 rows and 1 columns.
 1   -.8438D-14
 2    .657933D-14
 3   -.5989D-16

The means are (essentially) zero. The sums must be zero, as these new variables are orthogonal to the columns of X1. The first column in X1 is a column of ones, so this means that these residuals must sum to zero.

? d.
Namelist ; X = X1,X2 $
Matrix ; i = init(n,1,1) $
Matrix ; M0 = iden(n) - 1/n*i*i' $
Matrix ; b = <X'X>*X'wage$
Calc ; list ; ym0y = (n-1)*var(wage) $
Matrix ; list ; cod = 1/ym0y * b'*X'*M0*X*b $

Matrix COD has 1 rows and 1 columns.
  .5634

Matrix ; e = wage - X*b $
Calc ; list ; cod = 1 - 1/ym0y * e'e $

COD = .5634

The R squared is the same using either method of computation.

Calc ; list ; RsqAd = 1 - (n-1)/(n-col(x))*(1-cod)$

RSQAD = .5335

? Now drop the constant
Namelist ; X0 = educ,exp,ability,X2 $
Matrix ; b0 = <X0'X0>*X0'wage$
Matrix ; list ; cod = 1/ym0y * b0'*X0'*M0*X0*b0 $

Matrix COD has 1 rows and 1 columns.
  .5953

Matrix ; e0 = wage - X0*b0 $
Calc ; list ; cod = 1 - 1/ym0y * e0'e0 $

Listed Calculator Results
COD = .55973

The R squared now changes depending on how it is computed. It also goes up, completely artificially.
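The two computations of R² used above (explained sum of squares over y′M⁰y versus one minus e′e/y′M⁰y) can be replicated outside the package. A NumPy sketch with simulated data (not the text's data set) shows they coincide when a constant is included and diverge when it is dropped:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 15
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)
ym0y = ((y - y.mean()) ** 2).sum()   # y'M0y, total sum of squares

def two_r2(Z):
    """R^2 computed two ways: from fitted deviations, and from residuals."""
    b = np.linalg.lstsq(Z, y, rcond=None)[0]
    e = y - Z @ b
    yhat = Z @ b
    dev = yhat - yhat.mean()         # M0 applied to the fitted values
    return (dev @ dev) / ym0y, 1 - (e @ e) / ym0y

# with a constant, the two computations give the same number
r2a, r2b = two_r2(X)
assert np.isclose(r2a, r2b)

# without the constant, the decomposition breaks and they differ
r2a0, r2b0 = two_r2(X[:, 1:])
assert not np.isclose(r2a0, r2b0)
```

The decomposition y′M⁰y = b′X′M⁰Xb + e′e relies on the residuals having a zero mean, which the constant term guarantees; drop it, and the two "R squareds" part company, as the output above shows.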
? e. The R squared for the full regression appears immediately below.
? f.
Regress ; Lhs = wage ; Rhs = X1,X2 $

Ordinary least squares regression
WTS=one     Number of observs.   =   15
Model size  Parameters           =    7
            Degrees of freedom   =    8
Fit         R-squared            =    .5634

Variable   Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
Constant   .04899633     .9488076         .05       .960
EDUC       .0583         .0446859         .578      .5793      .8666667
EXP        .03395        .0473454         .84       .0605      .80000000
ABILITY    .03074355     .033             .54       .806       .36600000
MOTHERED   .063069       .070750          .448      .856       .0666667
FATHERED   .0064437      .0446490         .037      .975       .6666667
SIBS       .05969        .069080          .857      .46        .0000000

Regress ; Lhs = wage ; Rhs = X1,XS $

Ordinary least squares regression
WTS=one     Number of observs.   =   15
Model size  Parameters           =    7
            Degrees of freedom   =    8
Fit         R-squared            =    .5634
            Adjusted R-squared   =    .53347

Variable   Coefficient   Standard Error   t-ratio   P[|T|>t]   Mean of X
Constant   .66364000     .5583076         .980      .076
EDUC       .0453897      .0444689         .39       .7509      .8666667
EXP        .070300       .0433557         .638      .400       .80000000
ABILITY    .066537       .08946345        .97       .7737      .36600000
MEDS       .063069       .070750          .448      .856       -.844D-14
FEDS       .0064437      .0446490         .037      .975        .65793D-14
SIBSS      .05969        .069080          .857      .46        -.599D-16

In the first set of results, the coefficient vectors are b1 = (X1′M2X1)⁻¹X1′M2y and b2 = (X2′M1X2)⁻¹X2′M1y. In the second regression, the second set of regressors is M1X2, so b1 = (X1′M12X1)⁻¹X1′M12y, where M12 = I − (M1X2)[(M1X2)′(M1X2)]⁻¹(M1X2)′. Thus, because the M matrix is different, the coefficient vector is different. The second set of coefficients in the second regression is

b2 = [(M1X2)′M1(M1X2)]⁻¹(M1X2)′M1y = (X2′M1X2)⁻¹X2′M1y

because M1 is idempotent.
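The algebraic explanation above can be confirmed numerically: regressing y on [X1, M1X2] leaves the coefficients on the second block unchanged from the full regression, while the X1 coefficients revert to those of a regression on X1 alone (exactly the pattern in the two output tables). A NumPy sketch with simulated data (not the text's data set):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 15
X1 = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
X2 = rng.normal(size=(n, 2)) + 0.5 * X1[:, [1]]   # correlated with X1
y = (X1 @ np.array([1.0, 0.4, -0.2])
     + X2 @ np.array([0.3, 0.1]) + rng.normal(size=n))

ols = lambda Z, t: np.linalg.lstsq(Z, t, rcond=None)[0]

# residualize X2 on X1, i.e., apply M1 column by column
XS = X2 - X1 @ ols(X1, X2)

b_full  = ols(np.column_stack([X1, X2]), y)
b_resid = ols(np.column_stack([X1, XS]), y)

# coefficients on the residualized block equal the full-regression b2
# (M1 is idempotent, as in the text)
assert np.allclose(b_full[3:], b_resid[3:])
# the X1 block now matches a regression of y on X1 alone,
# because XS is orthogonal to X1
assert np.allclose(b_resid[:3], ols(X1, y))
```

The second assertion explains why the Constant, EDUC, EXP, and ABILITY coefficients in the second output table match those from part (a): with the second block orthogonalized, the X1 coefficients no longer adjust for X2.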