2. IV ESTIMATION AND TWO STAGE LEAST SQUARES

[1] Motivation

(1) Consider a regression model:

$y_t = \beta_1 + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t = x_t'\beta + \varepsilon_t$.

In what cases is the OLS estimator of β ($\hat\beta$) consistent? When $E(x_t\varepsilon_t) = 0$. In what cases is this condition violated?

(2) Examples

1) Measurement errors in regressors

A simple example:
True model: $y_t = \beta x_t^* + \varepsilon_t$.
But we only observe $x_t = x_t^* + v_t$ ($v_t$: measurement error). [$x_t$ may be a proxy variable for $x_t^*$.]
If we use $x_t$ for $x_t^*$, $y_t = x_t\beta + [\varepsilon_t - \beta v_t]$ (model we estimate).
$x_t$ and $(\varepsilon_t - \beta v_t)$ are correlated.
Assume that the $\varepsilon_t$ are i.i.d. $N(0,\sigma^2)$; the $v_t$ are i.i.d. $N(0,\sigma_v^2)$; and $\varepsilon_t$ and $v_t$ are stochastically independent.

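Then,

$\text{plim}_{T\to\infty}\,\hat\beta = \dfrac{\beta}{1 + \sigma_v^2/a}$, where $a = \text{plim}\; T^{-1}\sum_t (x_t^*)^2$. (Greene)

$|\text{plim}_{T\to\infty}\,\hat\beta| < |\beta|$, if $\sigma_v^2 > 0$ (the estimate is attenuated toward zero).
$\text{plim}_{T\to\infty}\,\hat\beta = \beta$, only if $\sigma_v^2 = 0$.

2) Omitted Variables

True model: $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \varepsilon_t$.
The model you estimate: $y_t = \beta_1 + \beta_2 x_{t2} + v_t$, where $v_t = \beta_3 x_{t3} + \varepsilon_t$.
The OLS estimator of $\beta_2$ from this misspecified model is generally inconsistent. It is consistent only if $x_{t2}$ and $x_{t3}$ are uncorrelated.

3) Simultaneous Equations models

(a) $c_t = \beta_1 + \beta_2 y_t + \varepsilon_t$; (b) $c_t + i_t = y_t$.
Substituting (a) into (b): $y_t = \beta_1 + \beta_2 y_t + \varepsilon_t + i_t$, so
$y_t = \dfrac{\beta_1}{1-\beta_2} + \dfrac{1}{1-\beta_2}\, i_t + \dfrac{\varepsilon_t}{1-\beta_2}$.
$y_t$ is correlated with $\varepsilon_t$ in (a). OLS is inconsistent.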
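The attenuation result for the measurement-error example can be checked with a small simulation. The sketch below is illustrative only: the parameter values (β = 2, σ_v = 1, a = 1), the sample size, and the seed are assumptions chosen for the demonstration, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200_000
beta = 2.0
sigma_v = 1.0                         # std. dev. of the measurement error v_t (assumed)

x_star = rng.normal(0.0, 1.0, T)      # true regressor x*_t, so a = plim T^-1 sum (x*_t)^2 = 1
eps = rng.normal(0.0, 1.0, T)         # regression error
v = rng.normal(0.0, sigma_v, T)       # measurement error, independent of eps and x*

y = beta * x_star + eps               # true model
x = x_star + v                        # what we actually observe

beta_ols = (x @ y) / (x @ x)          # OLS of y on the error-ridden x (no intercept)
a = 1.0
beta_plim = beta / (1.0 + sigma_v**2 / a)   # probability limit from the formula

print(f"OLS estimate: {beta_ols:.3f}, formula plim: {beta_plim:.3f}, true beta: {beta}")
```

With these settings the OLS estimate should settle near half of the true β, which is what the formula predicts when σ_v² equals a.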

[2] The Method of Instrumental Variables (IV)

(1) Intuition:

Consider a simple regression model $y_t = \beta x_t + \varepsilon_t$.
Suppose there is a variable $z_t$ such that (i) $E(z_t x_t) \neq 0$ and (ii) $E(z_t\varepsilon_t) = 0$.
Assume that $(z_t, x_t, \varepsilon_t)$ is iid.
Let $\hat\beta_{IV} = \dfrac{\sum_t z_t y_t}{\sum_t z_t x_t}$. Then,

$\hat\beta_{IV} = \dfrac{\sum_t z_t y_t}{\sum_t z_t x_t} = \dfrac{\sum_t z_t (x_t\beta + \varepsilon_t)}{\sum_t z_t x_t} = \beta + \dfrac{T^{-1}\sum_t z_t\varepsilon_t}{T^{-1}\sum_t z_t x_t} \to_p \beta + \dfrac{0}{E(z_t x_t)} = \beta$.

(2) Formal Assumptions for instrumental variables:

Digression to Weak Ideal Conditions (WIC):

(WIC.1) The conditional mean of $y_t$ (dependent variable) given $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{t-1}, x_1, x_2, \ldots, x_t$ (vectors of explanatory variables) is linear in $x_t$:

$y_t = E(y_t \mid x_1, \ldots, x_t, \varepsilon_1, \ldots, \varepsilon_{t-1}) + \varepsilon_t = x_t'\beta + \varepsilon_t$.
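The consistency argument for the simple IV estimator can be illustrated numerically. This is a minimal sketch under assumed values: the data-generating process (a common shock `u` that enters both x and ε), β = 1.5, and the seed are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 100_000
beta = 1.5

z = rng.normal(size=T)                    # instrument
u = rng.normal(size=T)                    # common shock: makes x endogenous
x = 0.8 * z + u + rng.normal(size=T)      # (i)  E(z_t x_t) = 0.8 != 0
eps = u + rng.normal(size=T)              # (ii) E(z_t eps_t) = 0, but E(x_t eps_t) = 1 != 0
y = beta * x + eps

beta_ols = (x @ y) / (x @ x)              # inconsistent: x is correlated with eps
beta_iv = (z @ y) / (z @ x)               # sum(z*y) / sum(z*x)

print(f"OLS: {beta_ols:.3f}   IV: {beta_iv:.3f}   true beta: {beta}")
```

In this design OLS is pulled upward because x and ε share the shock `u`, while the IV ratio should stay close to the true β.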

Comment: (WIC.1) implies $E(x_s\varepsilon_t) = 0_{k\times 1}$ for all $t \geq s$. It also rules out autocorrelation: for $t > s$,

$\text{cov}(\varepsilon_t, \varepsilon_s) = E(\varepsilon_t\varepsilon_s) = E_{\varepsilon_s}[E(\varepsilon_t\varepsilon_s \mid \varepsilon_s)] = E_{\varepsilon_s}[\varepsilon_s E(\varepsilon_t \mid \varepsilon_s)] = E_{\varepsilon_s}(0) = 0$ (LIE: Law of Iterated Expectations).

(WIC.2) $E(\varepsilon_t \mid x_1, x_2, \ldots, x_t, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{t-1}) = 0$.

Comment: In fact, (WIC.1) implies (WIC.2). But we write (WIC.2) as an independent assumption for convenience. By LIE, $E(\varepsilon_t) = 0$ for all t.

(WIC.3) The series $\{x_t\}$ are covariance-stationary and ergodic.
We no longer require the random sample assumption (SIC.3) and the nonstochastic regressor assumption (SIC.4).

(WIC.4) $X'X$ is positive definite and $\text{plim}_{T\to\infty}\, T^{-1}X'X = \text{plim}_{T\to\infty}\, T^{-1}\sum_t x_t x_t'$ ($\equiv Q_o$) is finite.

Comment: In fact, $Q_o = \lim_{T\to\infty} T^{-1}\sum_t E(x_t x_t')$ [By GWLLN]. This rules out perfect multicollinearity among regressors.

(WIC.5) $\text{var}(\varepsilon_t \mid x_1, x_2, \ldots, x_t, \varepsilon_1, \ldots, \varepsilon_{t-1}) = \sigma^2$ for all $x_t$ (Homoskedasticity Assumption).

Comment: By the law of iterated expectations, $\text{var}(\varepsilon_t) = \sigma^2$ for all t.

(WIC.6) $x_{t1} = 1$, for all t = 1, ..., T.

(WIC.7) The error terms $\varepsilon_t$ are normally distributed.

End of Digression

Assumptions for Instrumental Variables:

1) $y_t = x_t'\beta + \varepsilon_t = x_{t1}\beta_1 + \cdots + x_{tk}\beta_k + \varepsilon_t$, or, $y = X\beta + \varepsilon$.

2) There exists a set of instrumental variables $z_t = (z_{t1}, \ldots, z_{tq})'$ such that:

2.1) $E(\varepsilon_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_1, z_t, \ldots, z_1) = 0$;
[$E(\varepsilon_t) = 0$, $\text{cov}(\varepsilon_t, \varepsilon_s) = 0$, and $E(z_t\varepsilon_t) = 0$; $\text{plim}\; T^{-1}Z'\varepsilon = \text{plim}\; T^{-1}\sum_t z_t\varepsilon_t = 0$.]

2.2) The $(z_t, x_t)$ are stationary and ergodic;
[$\text{plim}\; T^{-1}Z'Z = \text{plim}\; T^{-1}\sum_t z_t z_t' \equiv Q_z$ is finite and positive definite; $\text{plim}\; T^{-1}\sum_t z_t x_t'$ is finite and rank = k.]

2.3) $\text{var}(\varepsilon_t \mid \varepsilon_{t-1}, \ldots, \varepsilon_1, z_t, \ldots, z_1) = \sigma^2$ (Homoskedasticity Assumption); [$\text{var}(\varepsilon_t) = \sigma^2$.]

2.4) $\text{plim}\; T^{-1}Z'X = \text{plim}\; T^{-1}\sum_t z_t x_t'$ is finite, nonzero, with rank = k.

Comment on 2): $z_t$ can contain exogenous regressors in $x_t$. Suppose $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t$, where $x_{t2}$ is endogenous and the other regressors are exogenous. Suppose that some extra variables $h_{t1}, h_{t2}, \ldots, h_{tr}$ are correlated with $x_{t2}$ but not correlated with $\varepsilon_t$. Then, $z_t$ can contain $(1, x_{t3}, \ldots, x_{tk}, h_{t1}, \ldots, h_{tr})$.

Comment on 2.1): The most important part of this assumption is that $z_t$ and $\varepsilon_t$ should be uncorrelated; that is, $E(z_t\varepsilon_t) = 0$.

Comment on 2.4): q must be greater than or equal to k. Suppose $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t$, where $x_{t2}$ is endogenous and the other regressors are exogenous. Then, there must be at least one extra variable (say, $h_{t1}, \ldots, h_{tr}$) that is correlated with $x_{t2}$ conditional on $(1, x_{t3}, x_{t4}, \ldots, x_{tk})$ and uncorrelated with $\varepsilon_t$.

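(3) IV estimator (q = k):

$\hat\beta_{IV} = (Z'X)^{-1}Z'y$.

Theorem:

$\text{plim}_{T\to\infty}\,\hat\beta_{IV} = \beta$;
$\sqrt{T}(\hat\beta_{IV} - \beta) \to_d N\left(0_{k\times 1},\; \sigma^2\,\text{plim}\; T\,(Z'X)^{-1}Z'Z(X'Z)^{-1}\right)$;
$\text{plim}_{T\to\infty}\, s_{IV}^2 = \sigma^2$, where $s_{IV}^2 = \dfrac{(y - X\hat\beta_{IV})'(y - X\hat\beta_{IV})}{T - k}$.

Implication:

The IV estimator is consistent.
$\hat\beta_{IV} \approx N(\beta, \hat\Omega)$, where $\hat\Omega = s_{IV}^2 (Z'X)^{-1}Z'Z(X'Z)^{-1}$.
We can use t- or Wald statistics to test hypotheses regarding β.
[e.g., $t = \dfrac{R\hat\beta_{IV} - r}{\sqrt{R\hat\Omega R'}}$, $W_T = (R\hat\beta_{IV} - r)'(R\hat\Omega R')^{-1}(R\hat\beta_{IV} - r)$.]

Digression to t and Wald test

(D.1) Testing a single restriction on β:

$H_o$: $R\beta - r = 0$, where R is 1×k and r is a scalar.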

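Example: $y_t = x_{t1}\beta_1 + x_{t2}\beta_2 + x_{t3}\beta_3 + \varepsilon_t$.

We would like to test $H_o$: $\beta_3 = 0$. Define R = [0 0 1] and r = 0. Then, $R\beta - r = 0 \Leftrightarrow \beta_3 = 0$.

$H_o$: $\beta_2 - \beta_3 = 0$ (or $\beta_2 = \beta_3$). Define R = [0 1 -1] and r = 0. Then, $R\beta - r = 0 \Leftrightarrow \beta_2 - \beta_3 = 0$.

$H_o$: $2\beta_2 + 3\beta_3 = 3$. Define R = [0 2 3] and r = 3. Then, $R\beta - r = 0 \Leftrightarrow H_o$.

(D.2) Testing several restrictions

Assume that R is m×k and r is an m×1 vector, and $H_o$: $R\beta = r$.

Example: A model is given: $y_t = x_{t1}\beta_1 + x_{t2}\beta_2 + x_{t3}\beta_3 + \varepsilon_t$.
Wish to test $H_o$: $\beta_1 = 0$ and $\beta_2 + \beta_3 = 1$.
Define: $R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}$; $r = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
Then, $H_o \Leftrightarrow R\beta = r$.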

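(D.3) Testing Nonlinear restrictions:

General form of hypotheses:

Let $w(\theta) = [w_1(\theta), w_2(\theta), \ldots, w_m(\theta)]'$, where $w_j(\theta) = w_j(\theta_1, \theta_2, \ldots, \theta_p)$ = a function of $\theta_1, \ldots, \theta_p$.
$H_o$: The true θ ($\theta_o$) satisfies the m restrictions, $w(\theta_o) = 0_{m\times 1}$ (m ≤ p).

Examples:

1) θ: a scalar.
$H_o$: $\theta_o = 2 \Leftrightarrow \theta_o - 2 = 0 \Leftrightarrow w(\theta_o) = 0$, where $w(\theta) = \theta - 2$.

2) $\theta = (\theta_1, \theta_2, \theta_3)'$.
$H_o$: $\theta_{o,1}^2 = \theta_{o,2} + 2$ and $\theta_{o,3} = \theta_{o,1} + \theta_{o,2}$
$\Leftrightarrow \theta_{o,1}^2 - \theta_{o,2} - 2 = 0$ and $\theta_{o,3} - \theta_{o,1} - \theta_{o,2} = 0$
$\Leftrightarrow w(\theta) = \begin{pmatrix} w_1(\theta) \\ w_2(\theta) \end{pmatrix} = \begin{pmatrix} \theta_1^2 - \theta_2 - 2 \\ \theta_3 - \theta_1 - \theta_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

3) Linear restrictions: $\theta = [\theta_1, \theta_2, \theta_3]'$.
$H_o$: $\theta_{o,1} = \theta_{o,2} + 2$ and $\theta_{o,3} = \theta_{o,1} + \theta_{o,2}$
$\Leftrightarrow w(\theta) = \begin{pmatrix} w_1(\theta) \\ w_2(\theta) \end{pmatrix} = \begin{pmatrix} \theta_1 - \theta_2 - 2 \\ \theta_3 - \theta_1 - \theta_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$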

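$\Leftrightarrow H_o$: $w(\theta) = \begin{pmatrix} 1 & -1 & 0 \\ -1 & -1 & 1 \end{pmatrix}\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix} - \begin{pmatrix} 2 \\ 0 \end{pmatrix} \equiv R\theta - r = 0$.

Remark: If all restrictions are linear in θ, $H_o$ takes the following form:
$H_o$: $R\theta_o - r = 0_{m\times 1}$, where R and r are known m×p and m×1 matrices, respectively.

Definition:

$W(\theta) \equiv \dfrac{\partial w(\theta)}{\partial\theta'} = \begin{pmatrix} \partial w_1(\theta)/\partial\theta_1 & \partial w_1(\theta)/\partial\theta_2 & \cdots & \partial w_1(\theta)/\partial\theta_p \\ \partial w_2(\theta)/\partial\theta_1 & \partial w_2(\theta)/\partial\theta_2 & \cdots & \partial w_2(\theta)/\partial\theta_p \\ \vdots & \vdots & & \vdots \\ \partial w_m(\theta)/\partial\theta_1 & \partial w_m(\theta)/\partial\theta_2 & \cdots & \partial w_m(\theta)/\partial\theta_p \end{pmatrix}_{m\times p}$.

Example: (Nonlinear restrictions)

Let $\theta = [\theta_1, \theta_2, \theta_3]'$.
$H_o$: $\theta_{o,1}^2 - \theta_{o,2} = 0$ and $\theta_{o,1} - \theta_{o,2} - \theta_{o,3}^2 = 0$.

$w(\theta) = \begin{pmatrix} \theta_1^2 - \theta_2 \\ \theta_1 - \theta_2 - \theta_3^2 \end{pmatrix}$; $W(\theta) = \begin{pmatrix} 2\theta_1 & -1 & 0 \\ 1 & -1 & -2\theta_3 \end{pmatrix}$.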
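A Jacobian such as W(θ) in the nonlinear example is easy to get wrong by hand, so a finite-difference check can be useful. The sketch below assumes the restrictions are w(θ) = (θ₁² − θ₂, θ₁ − θ₂ − θ₃²)'; the minus signs are a reading of the example, and the evaluation point `theta0` is arbitrary.

```python
import numpy as np

def w(theta):
    """Restrictions from the nonlinear example (signs are an assumed reading)."""
    t1, t2, t3 = theta
    return np.array([t1**2 - t2, t1 - t2 - t3**2])

def W_analytic(theta):
    """Analytic Jacobian dw/dtheta' (m x p = 2 x 3)."""
    t1, t2, t3 = theta
    return np.array([[2.0 * t1, -1.0, 0.0],
                     [1.0, -1.0, -2.0 * t3]])

def W_numeric(theta, h=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = h
        cols.append((w(theta + step) - w(theta - step)) / (2.0 * h))
    return np.column_stack(cols)

theta0 = np.array([1.5, -0.3, 0.7])       # arbitrary evaluation point
max_gap = np.max(np.abs(W_analytic(theta0) - W_numeric(theta0)))
print("largest analytic-vs-numeric gap:", max_gap)
```

The same `W_numeric` helper can stand in for W(θ̂) in a Wald statistic when an analytic derivative is inconvenient.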

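Example: (Linear restrictions)

$\theta = [\theta_1, \theta_2, \theta_3]'$.
$H_o$: $\theta_{o,1} = 0$ and $\theta_{o,2} + \theta_{o,3} = 1$.

$w(\theta) = \begin{pmatrix} \theta_1 \\ \theta_2 + \theta_3 - 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$,

which is of form $w(\theta) = R\theta - r$.

Theorem:

Assume that $\sqrt{T}(\hat\theta - \theta_o) \to_d N(0_{p\times 1}, \text{plim}\; T\hat\Omega)$. Then, under $H_o$: $w(\theta_o) = 0_{m\times 1}$,

$\sqrt{T}\left(w(\hat\theta) - w(\theta_o)\right) \to_d N\left(0_{m\times 1},\; W(\theta_o)\,[\text{plim}\; T\hat\Omega]\,W(\theta_o)'\right)$.

<Proof>

Taylor's expansion around $\theta_o$:
$w(\hat\theta) = w(\theta_o) + W(\bar\theta)(\hat\theta - \theta_o)$, where $\bar\theta$ is between $\hat\theta$ and $\theta_o$.
Since $\hat\theta$ is consistent, so is $\bar\theta$. Thus,

$\sqrt{T}\left(w(\hat\theta) - w(\theta_o)\right) \approx W(\theta_o)\sqrt{T}(\hat\theta - \theta_o) \to_d N\left(0_{m\times 1},\; W(\theta_o)\,[\text{plim}\; T\hat\Omega]\,W(\theta_o)'\right)$.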

Implication:

$w(\hat\theta) - w(\theta_o) = w(\hat\theta) \approx N\left(0_{m\times 1},\; W(\hat\theta)\hat\Omega W(\hat\theta)'\right)$.

Comment: If m = 1,

$t = \dfrac{w(\hat\theta)}{\sqrt{W(\hat\theta)\hat\Omega W(\hat\theta)'}} \to_d N(0,1)$.

Theorem:

Under WIC and $H_o$: $w(\theta_o) = 0$,

$W_T = w(\hat\theta)'\left[W(\hat\theta)\hat\Omega W(\hat\theta)'\right]^{-1} w(\hat\theta) \to_d \chi^2(m)$.

<Proof>

Under $H_o$: $w(\theta_o) = 0$, $w(\hat\theta) \approx N\left(0,\; W(\hat\theta)\hat\Omega W(\hat\theta)'\right)$.
For a normal random vector $h_{m\times 1} \sim N(0_{m\times 1}, \Psi_{m\times m})$, $h'\Psi^{-1}h \sim \chi^2(m)$.
Thus, we obtain the desired result.

Question: What does Wald test mean?
A test based on the unrestricted estimator only.

End of Digression
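For linear restrictions the Wald statistic from the digression reduces to a few matrix operations. The sketch below uses the two-restriction example (β₁ = 0 and β₂ + β₃ = 1); the estimate `beta_hat` and the covariance matrix `omega_hat` are made-up numbers for illustration, not results from any data set in these notes.

```python
import numpy as np

def wald_linear(beta_hat, omega_hat, R, r):
    """W_T = (R b - r)' (R Omega R')^{-1} (R b - r) for H_o: R beta = r."""
    R = np.atleast_2d(np.asarray(R, dtype=float))
    gap = R @ np.asarray(beta_hat, dtype=float) - np.atleast_1d(np.asarray(r, dtype=float))
    V = R @ np.asarray(omega_hat, dtype=float) @ R.T
    return float(gap @ np.linalg.solve(V, gap))

# H_o: beta_1 = 0 and beta_2 + beta_3 = 1
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
r = np.array([0.0, 1.0])

beta_hat = np.array([0.2, 0.5, 0.7])      # hypothetical estimate
omega_hat = 0.01 * np.eye(3)              # hypothetical estimated Cov(beta_hat)

W_T = wald_linear(beta_hat, omega_hat, R, r)
CRIT_CHI2_2_5PCT = 5.991                  # 5% critical value of chi-square with m = 2 df
reject = W_T > CRIT_CHI2_2_5PCT
print(f"W_T = {W_T:.3f}, reject H_o at 5%: {reject}")
```

The function only needs the unrestricted estimate and its covariance matrix, which is the point made in the closing question of the digression.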

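[Proof of the IV theorem]

$\hat\beta_{IV} = (Z'X)^{-1}Z'y = (Z'X)^{-1}Z'(X\beta + \varepsilon) = \beta + (Z'X)^{-1}Z'\varepsilon$,

so that

$\hat\beta_{IV} = \beta + \left(\dfrac{Z'X}{T}\right)^{-1}\dfrac{Z'\varepsilon}{T}$,

$\text{plim}_{T\to\infty}\,\hat\beta_{IV} = \beta + \left(\text{plim}\,\dfrac{Z'X}{T}\right)^{-1}\text{plim}\,\dfrac{Z'\varepsilon}{T} = \beta$.

By the central limit theorem and the given assumptions,

$\dfrac{Z'\varepsilon}{\sqrt{T}} \to_d N\left(0_{q\times 1},\; \sigma^2\,\text{plim}\,\dfrac{Z'Z}{T}\right)$.

Hence,

$\sqrt{T}(\hat\beta_{IV} - \beta) = \left(\dfrac{Z'X}{T}\right)^{-1}\dfrac{Z'\varepsilon}{\sqrt{T}} \to_d N\left(0_{k\times 1},\; \sigma^2\left(\text{plim}\,\dfrac{Z'X}{T}\right)^{-1}\left(\text{plim}\,\dfrac{Z'Z}{T}\right)\left(\text{plim}\,\dfrac{X'Z}{T}\right)^{-1}\right)$.

(4) Two-Stage Least Squares (2SLS) (q > k)

$\hat\beta_{2SLS} = \left(X'P(Z)X\right)^{-1}X'P(Z)y$, where $P(Z) = Z(Z'Z)^{-1}Z'$.

Note 1: If q = k, $\hat\beta_{IV} = \hat\beta_{2SLS}$.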

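Note 2: (Why is 2SLS called 2SLS?)

$y_t = \beta_1 x_{t1} + \beta_2 x_{t2} + \cdots + \beta_k x_{tk} + \varepsilon_t$.
Suppose $x_{t2}$ contains measurement errors. Then, your instrument set would look like $z_t = (x_{t1}, h_{t1}, h_{t2}, \ldots, h_{tr}, x_{t3}, \ldots, x_{tk})'$.
Regress $x_{t2}$ on $z_t$. Then, get the fitted values $\hat{x}_{t2}$.
Estimate the β's by regressing the model: $y_t = \beta_1 x_{t1} + \beta_2\hat{x}_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \text{error}$.
Use the 2SLS residuals $(y_t - \hat\beta_1 x_{t1} - \hat\beta_2 x_{t2} - \cdots - \hat\beta_k x_{tk})$, computed with the original $x_{t2}$ rather than the fitted $\hat{x}_{t2}$, to estimate $\sigma^2$.

Theorem:

$\text{plim}_{T\to\infty}\,\hat\beta_{2SLS} = \beta$;
$\sqrt{T}(\hat\beta_{2SLS} - \beta) \to_d N\left(0_{k\times 1},\; \sigma^2\,\text{plim}\; T\left(X'P(Z)X\right)^{-1}\right)$;
$\text{plim}_{T\to\infty}\, s_{2SLS}^2 = \sigma^2$, where $s_{2SLS}^2 = \dfrac{(y - X\hat\beta_{2SLS})'(y - X\hat\beta_{2SLS})}{T - k}$.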
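The equivalence between the projection formula and the literal two-step procedure in Note 2 can be verified numerically. The following is a minimal sketch: the helper name `two_sls`, the simulated design (one endogenous regressor `x2`, one exogenous regressor `x3`, two extra instruments `h1`, `h2`), and all coefficient values are assumptions made for the illustration.

```python
import numpy as np

def ols(y, A):
    """Least-squares coefficients from regressing y (vector or matrix) on A."""
    return np.linalg.lstsq(A, y, rcond=None)[0]

def two_sls(y, X, Z):
    """2SLS via (X' P(Z) X)^{-1} X' P(Z) y, with s^2 from residuals that use the original X."""
    X_hat = Z @ ols(X, Z)                                # first stage: P(Z) X
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)     # X_hat' X = X' P(Z) X
    resid = y - X @ beta                                 # original X, not X_hat
    T, k = X.shape
    s2 = resid @ resid / (T - k)
    cov = s2 * np.linalg.inv(X_hat.T @ X_hat)            # s^2 (X' P(Z) X)^{-1}
    return beta, cov, resid

rng = np.random.default_rng(2)
T = 50_000
h1, h2, x3, u = rng.normal(size=(4, T))
x2 = 0.5 + 0.7 * h1 - 0.4 * h2 + 0.3 * x3 + u + rng.normal(size=T)   # endogenous via u
eps = u + rng.normal(size=T)
y = 1.0 + 2.0 * x2 - 1.0 * x3 + eps

X = np.column_stack([np.ones(T), x2, x3])
Z = np.column_stack([np.ones(T), x3, h1, h2])            # q = 4 > k = 3

beta_2sls, cov_2sls, resid = two_sls(y, X, Z)
beta_second_stage = ols(y, Z @ ols(X, Z))                # literal second-stage OLS on fitted X
print(beta_2sls, beta_second_stage)
```

The two coefficient vectors should agree up to rounding. The standard errors from a literal second-stage OLS would not be correct, however, because that regression computes its residuals with the fitted values instead of the original regressors.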

[3] Weak Instrumental Variables Problem

(1) What does weak instrumental variables mean?

Consider a simple model: $y_t = x_t\beta + \varepsilon_t$ with $z_t$ such that $E(z_t\varepsilon_t) = 0$.
When $E(z_t x_t) \neq 0$, but the correlation between the two variables is low, we call $z_t$ a weak instrumental variable.

(2) What is the problem of the weak instrumental variables?

The IV or 2SLS estimators are biased toward the OLS estimator, especially when the sample size (T) is small.
The statistical inferences based on IV or 2SLS are unreliable.

(3) How to check weak instrumental variables?

Suppose $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t$, where $x_{t2}$ is an endogenous variable.
Let $z_t = (1, x_{t3}, x_{t4}, \ldots, x_{tk}, h_{t1}, \ldots, h_{tr})'$.
Estimate the following model:

$x_{t2} = \alpha_1 + \alpha_2 x_{t3} + \cdots + \alpha_{k-1} x_{tk} + \gamma_1 h_{t1} + \cdots + \gamma_r h_{tr} + \text{error}$.

Test $H_o$: $\alpha_2 = \cdots = \alpha_{k-1} = \gamma_1 = \cdots = \gamma_r = 0$ (overall significance test).
Test $H_o$: $\gamma_1 = \cdots = \gamma_r = 0$. If you can reject this hypothesis and the F statistic for this hypothesis is greater than 10, you may not have to worry about the weak instrumental variable. [Why 10? See Stock and Watson.]
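The first-stage check on the excluded instruments is a standard restricted-versus-unrestricted F test. The sketch below is one way to compute it; the function name `first_stage_F` and the simulated strong and weak designs are hypothetical, chosen only to show the contrast with the rule-of-thumb threshold of 10.

```python
import numpy as np

def first_stage_F(x_endog, W_exog, H_extra):
    """F statistic for H_o: gamma_1 = ... = gamma_r = 0 in the first-stage regression.

    W_exog : included exogenous regressors (with the constant), T x (k-1)
    H_extra: extra (excluded) instruments, T x r
    """
    Z = np.column_stack([W_exog, H_extra])

    def ssr(A):
        coef = np.linalg.lstsq(A, x_endog, rcond=None)[0]
        e = x_endog - A @ coef
        return e @ e

    ssr_u, ssr_r = ssr(Z), ssr(W_exog)       # unrestricted vs. restricted (gamma = 0)
    r = H_extra.shape[1]
    T, p = Z.shape
    return ((ssr_r - ssr_u) / r) / (ssr_u / (T - p))

rng = np.random.default_rng(4)
T = 500
x3 = rng.normal(size=T)
H = rng.normal(size=(T, 2))
noise = rng.normal(size=T)
W_exog = np.column_stack([np.ones(T), x3])

x_strong = 0.3 * x3 + H @ np.array([0.5, 0.5]) + noise      # instruments matter
x_weak = 0.3 * x3 + H @ np.array([0.02, 0.02]) + noise      # instruments barely matter

F_strong = first_stage_F(x_strong, W_exog, H)
F_weak = first_stage_F(x_weak, W_exog, H)
print(f"F (strong design): {F_strong:.1f}   F (weak design): {F_weak:.2f}")
```

Only the extra instruments enter the tested restriction; the included exogenous regressors stay in both the restricted and unrestricted regressions.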

[4] Testing Exogeneity of Instrumental Variables

(1) Testing Exogeneity of Instrumental Variables (Hansen, 1982, Econometrica; Hausman, 1984, Handbook)

Model: $y_t = x_t'\beta + \varepsilon_t$ or $y = X\beta + \varepsilon$. Assume that $x_t$ contains 1.
$H_o$: $E(z_t\varepsilon_t) = 0_{q\times 1}$.
Instruments: $z_t$, or Z.

Test procedure:

Let $\hat\varepsilon$ be the vector of the 2SLS residuals ($= y - X\hat\beta_{2SLS}$).
Regress $\hat\varepsilon$ on Z (the full set of instruments) and get $R^2$.
Then, under the hypothesis $E(z_t\varepsilon_t) = 0_{q\times 1}$, $J_T \equiv TR^2 \to_d \chi^2(q-k)$.
This test is meaningful only if q > k.
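The test procedure can be sketched in a few lines. This is an illustration under assumptions: the function names, the simulated design with two instruments for one endogenous regressor (so q − k = 1), and the `direct_effect` parameter that makes the second instrument invalid are all hypothetical.

```python
import numpy as np

def j_statistic(y, X, Z):
    """J_T = T * R^2 from regressing the 2SLS residuals on the instrument matrix Z."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)     # 2SLS estimate
    e = y - X @ beta                                      # 2SLS residuals (original X)
    fitted = Z @ np.linalg.lstsq(Z, e, rcond=None)[0]
    ssr = np.sum((e - fitted) ** 2)
    sst = np.sum((e - e.mean()) ** 2)
    return len(y) * (1.0 - ssr / sst)

def simulate(direct_effect, seed):
    """One simulated sample; direct_effect != 0 puts h2 into the error term."""
    rng = np.random.default_rng(seed)
    T = 5_000
    h1, h2, u = rng.normal(size=(3, T))
    x2 = 0.7 * h1 + 0.7 * h2 + u + rng.normal(size=T)
    eps = u + rng.normal(size=T) + direct_effect * h2
    y = 1.0 + 2.0 * x2 + eps
    X = np.column_stack([np.ones(T), x2])                 # k = 2
    Z = np.column_stack([np.ones(T), h1, h2])             # q = 3
    return j_statistic(y, X, Z)

J_valid = simulate(direct_effect=0.0, seed=3)             # E(z_t eps_t) = 0 holds
J_invalid = simulate(direct_effect=1.0, seed=3)           # h2 is correlated with eps_t
print(f"J_T with valid instruments: {J_valid:.2f}; with an invalid instrument: {J_invalid:.1f}")
```

Each statistic is compared with a χ²(q − k) critical value, here χ²(1), whose 5% critical value is 3.84.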

(2) Testing Exogeneity of a Single Regressor Based on 2SLS

Model: $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t = x_t'\beta + \varepsilon_t$.
Wish to test whether $x_{t2}$ is exogenous or not.
$H_o$: $x_{t2}$ is exogenous.
Let $z_t$ be the set of instruments under the assumption that $x_{t2}$ is endogenous.

Test Procedure:

Let $\hat\beta_{2SLS,1}$ be the 2SLS estimator using Z as instruments, with $\hat\Phi_{2SLS,1} = \text{Cov}(\hat\beta_{2SLS,1})$.
Let $\hat\beta_{2SLS,2}$ be the 2SLS estimator using $[Z, X_2]$ as instruments, with $\hat\Phi_{2SLS,2} = \text{Cov}(\hat\beta_{2SLS,2})$.

$H_T = (\hat\beta_{2SLS,1} - \hat\beta_{2SLS,2})'(\hat\Phi_{2SLS,1} - \hat\Phi_{2SLS,2})^{+}(\hat\beta_{2SLS,1} - \hat\beta_{2SLS,2}) \to_d \chi^2(1)$.

Alternative by Newey (1985, JEC) and Eichenbaum, Hansen and Singleton (1988, JPE)

Let $\hat\varepsilon_1$ be the vector of 2SLS residuals using Z as instruments.
Let $\hat\varepsilon_2$ be the vector of 2SLS residuals using $[Z, X_2]$ as instruments.
$J_{T,1} = T \times$ ($R^2$ from the regression of $\hat\varepsilon_1$ on Z).
$J_{T,2} = T \times$ ($R^2$ from the regression of $\hat\varepsilon_2$ on $[Z, X_2]$).
$D_T \equiv J_{T,2} - J_{T,1} \to_d \chi^2(\text{df} = 1)$.
This test is asymptotically identical to $H_T$.

(3) Testing Exogeneity of Two Regressors by the Hausman Test

Model: $y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_k x_{tk} + \varepsilon_t = x_t'\beta + \varepsilon_t$.
Wish to test whether $x_{t2}$ and $x_{t3}$ are exogenous or not.
$H_o$: $x_{t2}$ and $x_{t3}$ are exogenous.
Let $z_t$ be the set of instruments under the assumption that both $x_{t2}$ and $x_{t3}$ are endogenous.

Test Procedure:

Let $\hat\beta_{2SLS,1}$ be the 2SLS estimator using Z as instruments, with $\hat\Phi_{2SLS,1} = \text{Cov}(\hat\beta_{2SLS,1})$.
Let $\hat\beta_{2SLS,2}$ be the 2SLS estimator using $[Z, X_2, X_3]$ as instruments, with $\hat\Phi_{2SLS,2} = \text{Cov}(\hat\beta_{2SLS,2})$.

$H_T = (\hat\beta_{2SLS,1} - \hat\beta_{2SLS,2})'(\hat\Phi_{2SLS,1} - \hat\Phi_{2SLS,2})^{+}(\hat\beta_{2SLS,1} - \hat\beta_{2SLS,2}) \to_d \chi^2(2)$.

Alternative by Newey (1985, JEC) and Eichenbaum, Hansen and Singleton (1988, JPE)

Let $\hat\varepsilon_1$ be the vector of 2SLS residuals using Z as instruments.
Let $\hat\varepsilon_2$ be the vector of 2SLS residuals using $[Z, X_2, X_3]$ as instruments.
$J_{T,1} = T \times$ ($R^2$ from the regression of $\hat\varepsilon_1$ on Z).
$J_{T,2} = T \times$ ($R^2$ from the regression of $\hat\varepsilon_2$ on $[Z, X_2, X_3]$).
$D_T \equiv J_{T,2} - J_{T,1} \to_d \chi^2(\text{df} = 2)$.
This test is asymptotically identical to $H_T$.

[EXAMPLE]

Use mwemp.wf1. You can download this file from my web page.
This is the data set of working married women in 1981 sampled from PSID.
The total number of observations is 923, and 17 variables are observed.

VARIABLES   DEFINITION
LRATE       LOG OF HOURLY WAGE RATE ($)
ED          YEARS OF EDUCATION
URB         URB=1 IF RESIDENT IN SMSA
MINOR       MINOR=1 IF BLACK AND HISPANIC
AGE         YEARS OF AGE
TENURE      MONTHS UNDER THE CURRENT EMPLOYER
EXPP        NUMBER OF YEARS WORKED SINCE AGE 18
REGS        REGS=1 IF LIVES IN THE SOUTH OF U.S.
OCCW        OCCW=1 IF WHITE COLLAR
OCCB        OCCB=1 IF BLUE COLLAR
INDUMG      INDUMG=1 IF IN THE MANUFACTURING INDUSTRY
INDUMN      INDUMN=1 IF NOT IN MANUFACTURING SECTOR
UNION       UNION=1 IF UNION MEMBER
UNEMPR      % UNEMPLOYMENT RATE IN THE RESIDENT'S COUNTY, 1980
LOFINC      LOG OF OTHER FAMILY MEMBER'S INCOME IN 1980 ($)
HWORK       HOURS OF HOMEWORK PER WEEK
KIDS5       NUMBER OF CHILDREN 5 YEARS OF AGE
LHWORK      ln(HWORK+1)

Suppose we wish to estimate the following equation:

LHWORK = γ_12 LRATE + β_12 + β_22 ED + β_32 ED^2 + β_42 AGE + β_52 AGE^2 + β_62 REGS + β_72 MINOR + β_8,2 URB + β_9,2 KIDS5 + β_10,2 LOFINC + ε_2.

Endogenous (potentially): LRATE
Exogenous: C, ED, ED^2, EXPP, EXPP^2, AGE, AGE^2, OCCW, OCCB, UNEMPR, REGS, MINOR, INDUMG, UNION, URB, KIDS5, LOFINC.

First stage OLS Regression

Go to Object/New Object... and choose Equation.
In the equation box, type (names separated by spaces):
LRATE C ED ED^2 EXPP EXPP^2 AGE AGE^2 OCCW OCCB UNEMPR REGS MINOR INDUMG UNION URB KIDS5 LOFINC
Then, click on OK. We will see:

<First-stage OLS>

Dependent Variable: LRATE
Method: Least Squares
Included observations: 923
Regressors: C, ED, ED^2, EXPP, EXPP^2, AGE, AGE^2, OCCW, OCCB, UNEMPR, REGS, MINOR, INDUMG, UNION, URB, KIDS5, LOFINC
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the usual summary statistics; the numeric entries are missing from this copy.]

Wald Test (Equation: Untitled)

Test statistics reported: F-statistic with df (7, 906), and Chi-square. [Values and probabilities are missing from this copy.]
Null Hypothesis Summary: C(4) = C(5) = C(7) = C(9) = C(10) = C(13) = C(14) = 0.
Restrictions are linear in coefficients.

Comment: The instruments are reasonably highly correlated with LRATE. We could expect that the 2SLS estimator for the structural LHWORK equation would have good finite-sample properties.

<OLS result for the equation>

Dependent Variable: LHWORK
Method: Least Squares
Included observations: 923
Regressors: LRATE, C, ED, ED^2, AGE, AGE^2, REGS, MINOR, URB, KIDS5, LOFINC
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the usual summary statistics; the numeric entries are missing from this copy.]

2SLS Estimation.

Go to Object/New Object... and choose Equation.
In the Method box, click on the arrow. Then, choose TSLS.
Estimate the first equation by 2SLS using the following procedure.

In the equation box, type:
LRATE LHWORK C ED ED^2 EXPP EXPP^2 AGE AGE^2 OCCW OCCB UNEMPR REGS MINOR INDUMG UNION URB

In the instrument list box, type:
C ED ED^2 EXPP EXPP^2 AGE AGE^2 OCCW OCCB UNEMPR REGS MINOR INDUMG UNION URB KIDS5 LOFINC

Do the same things to estimate the second equation.

<2SLS>

Dependent Variable: LHWORK
Method: Two-Stage Least Squares
Included observations: 923
Instrument list: C ED ED^2 EXPP EXPP^2 AGE AGE^2 OCCW OCCB UNEMPR REGS MINOR INDUMG UNION URB KIDS5 LOFINC
Regressors: LRATE, C, ED, ED^2, AGE, AGE^2, REGS, MINOR, URB, KIDS5, LOFINC
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the summary statistics; the numeric entries are missing from this copy.]

Comment: Interpretation of the results for ED and ED^2

∂LHWORK/∂ED = (coefficient on ED) + 2*(-0.003)*ED.
∂LHWORK/∂ED > 0 until ED = 16; < 0 for ED > 16.

Genr TSLSE = RESID

<Exogeneity of Instruments>

Dependent Variable: TSLSE
Method: Least Squares
Included observations: 923
Regressors: C, ED, ED^2, EXPP, EXPP^2, AGE, AGE^2, OCCW, OCCB, UNEMPR, REGS, MINOR, INDUMG, UNION, URB, KIDS5, LOFINC
Mean dependent var: -6.22E-15
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the remaining summary statistics; the numeric entries are missing from this copy.]

J_T = R^2 * 923, which is smaller than the critical value c of χ² (df = 6). The exogeneity of the IVs is not rejected.

<Testing Exogeneity of LRATE>

2SLS for the LHWORK Equation Using [X, LRATE] as Instruments

Dependent Variable: LHWORK
Method: Two-Stage Least Squares
Included observations: 923
Instrument list: C ED ED^2 EXPP EXPP^2 AGE AGE^2 OCCW OCCB UNEMPR REGS MINOR INDUMG UNION URB KIDS5 LOFINC LRATE
Regressors: LRATE, C, ED, ED^2, AGE, AGE^2, REGS, MINOR, URB, KIDS5, LOFINC
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the summary statistics; the numeric entries are missing from this copy.]

GENR TSLSES = RESID

<EHS Test for the Exogeneity of LRATE>

Dependent Variable: TSLSES
Method: Least Squares
Included observations: 923
Regressors: C, ED, ED^2, EXPP, EXPP^2, AGE, AGE^2, OCCW, OCCB, UNEMPR, REGS, MINOR, INDUMG, UNION, URB, KIDS5, LOFINC, LRATE
Mean dependent var: -7.47E-15
[Output table with columns Coefficient, Std. Error, t-statistic, Prob., followed by the remaining summary statistics; the numeric entries are missing from this copy.]

J_T* = R^2 * 923.
D_T = J_T* - J_T < 3.84 (c at 95%).
Do not reject the exogeneity of LRATE.

[5] Hausman-and-Taylor Panel Data Model

(1) Model:

$y_i = H_i\delta + u_i = X_i\beta + e_T z_i\gamma + u_i$; $u_i = e_T\alpha_i + \varepsilon_i$,

where $X_i = [X_{1i}, X_{2i}]$ and $z_i = [z_{1i}, z_{2i}]$;
$x_{1it}$ = 1×$k_1$ vector of time-varying regressors that are uncorrelated with $\alpha_i$;
$x_{2it}$ = 1×$k_2$ vector of time-varying regressors that may be correlated with $\alpha_i$;
$z_{1i}$ = 1×$g_1$ vector of time-invariant regressors that are uncorrelated with $\alpha_i$;
$z_{2i}$ = 1×$g_2$ vector of time-invariant regressors that may be correlated with $\alpha_i$;
$\alpha_i$ = iid with $(0, \sigma_\alpha^2)$.

(2) Example:

$y_{it}$: log of earnings.
$\alpha_i$: unobservable talent or IQ.
$x_{1it}$ includes age, health, regional variables, etc.
$x_{2it}$ includes work experience.
$z_{1i}$ includes race, union, etc.
$z_{2i}$ includes schooling.

(3) Assumptions

Basic Assumptions (BA):
$E[x_{is}'(u_{it} - u_{i,t-1})] = 0$ and $E[z_i'(u_{it} - u_{i,t-1})] = 0$, for any t and s:
Regressors are strictly exogenous to $\varepsilon_{it}$. [Note that $u_{it} - u_{i,t-1} = \varepsilon_{it} - \varepsilon_{i,t-1}$.]

Hausman and Taylor Assumption (HTA):
$E(\bar{x}_{1i}'\bar{u}_i) = 0$; $E(z_{1i}'\bar{u}_i) = 0$.

Amemiya and MaCurdy Assumption (AMA):
$E(x_{1it}'\bar{u}_i) = 0$; $E(z_{1i}'\bar{u}_i) = 0$, for any t.

Breusch, Mizon and Schmidt Assumption (BMSA):
$E(x_{1it}'\bar{u}_i) = 0$; $E(z_{1i}'\bar{u}_i) = 0$; $E((x_{2it} - \bar{x}_{2i})'\bar{u}_i) = 0$, for t = 1, 2, ..., T-1.

(4) Estimation

CASE A: The $\varepsilon_{it}$ are iid over i and t

1) Hausman and Taylor (1981):

Define $G_{HT,i} = [Q_T X_i, P_T X_{1i}, e_T z_{1i}]$, $G_{HT} = [Q_V X, P_V X_1, V Z_1]$.

Then, under BA and HTA,
$E(X_i'Q_T u_i) = E[X_i'Q_T(e_T\alpha_i + \varepsilon_i)] = E(X_i'Q_T\varepsilon_i) = 0_{k\times 1}$.
$E(X_{1i}'P_T u_i) = 0_{k_1\times 1}$ ($\Leftarrow E(\bar{x}_{1i}'\bar{u}_i) = 0_{k_1\times 1}$).
$E(z_{1i}'e_T'u_i) = 0_{g_1\times 1}$ ($\Leftarrow E(z_{1i}'\bar{u}_i) = 0_{g_1\times 1}$).

The HT estimator is a 2SLS estimator based on $E(G_{HT,i}'u_i) = 0_{(k+k_1+g_1)\times 1}$.

Observe that
$E(X_i'Q_T\Sigma^{-1/2}u_i) = E(X_i'Q_T(\theta P_T + Q_T)u_i) = E(X_i'Q_T u_i) = 0_{k\times 1}$.
Similarly, you can show:
$E(X_{1i}'P_T\Sigma^{-1/2}u_i) = 0_{k_1\times 1}$; $E(z_{1i}'e_T'\Sigma^{-1/2}u_i) = 0_{g_1\times 1}$.
That is, $E(G_{HT,i}'\Sigma^{-1/2}u_i) = 0_{(k+k_1+g_1)\times 1}$.

HT offers an estimation procedure under their assumptions.

Reconsider the whitened equation:

$\Sigma^{-1/2}y_i = \Sigma^{-1/2}H_i\delta + \Sigma^{-1/2}u_i \;\Leftrightarrow\; \Omega^{-1/2}y = \Omega^{-1/2}H\delta + \Omega^{-1/2}u$.

HT estimate δ by 2SLS using $G_{HT}$ as IV. [$\Sigma^{-1/2} = Q_T + \theta P_T$.]

Procedure

STEP 1: Consider the deviation-from-mean equation: $Q_T y_i = Q_T X_i\beta + Q_T\varepsilon_i$.
a) Do OLS and get $\hat\beta_W$ (consistent).
b) Using the Within residuals, get $s^2$.
c) $e_i = P_T y_i - P_T X_i\hat\beta_W$ is an estimate of $e_T z_i\gamma + e_T\alpha_i + P_T\varepsilon_i$.

STEP 2: Consider the equation $e_i = e_T z_i\gamma + \text{err}$. (7)
a) Do 2SLS on (7) with IV = $[P_T X_{1i}, e_T z_{1i}]$, and get $\hat\gamma_W$. [$\hat\gamma_W$ is consistent but not efficient.]
b) Let $v_i$ be the residuals from 2SLS on (7): $v_i = e_i - e_T z_i\hat\gamma_W$.
c) Define $s_{BB}^2 = \dfrac{1}{N}\sum_i v_i'v_i$ (consistent for $T\sigma_\alpha^2 + \sigma_\varepsilon^2$). Let $\hat\theta = [s^2/s_{BB}^2]^{1/2}$.
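The role of the transformation Q_T + θP_T can be checked numerically for a single cross-sectional unit. The sketch below assumes the usual definitions P_T = e_T e_T'/T and Q_T = I_T − P_T and the random-effects covariance Σ = σ_ε² I_T + σ_α² e_T e_T'; these definitions and the variance values are assumptions brought in for the illustration.

```python
import numpy as np

T = 4
sigma2_eps, sigma2_alpha = 1.5, 0.8        # assumed variance components

e_T = np.ones((T, 1))
P_T = e_T @ e_T.T / T                      # projects onto the unit's time mean (assumed definition)
Q_T = np.eye(T) - P_T                      # deviation-from-mean operator (assumed definition)

# Cov(u_i) for u_i = e_T * alpha_i + eps_i with iid eps_it
Sigma = sigma2_eps * np.eye(T) + sigma2_alpha * (e_T @ e_T.T)

# Population counterpart of theta_hat = [s^2 / s_BB^2]^{1/2}
theta = np.sqrt(sigma2_eps / (T * sigma2_alpha + sigma2_eps))

S = Q_T + theta * P_T                      # candidate for Sigma^{-1/2}, up to the scale sigma_eps
whitened = S @ Sigma @ S                   # should equal sigma2_eps * I_T

print(np.round(whitened, 10))
```

Under these assumptions the transformed errors have covariance σ_ε² I_T, so Q_T + θP_T equals Σ^{-1/2} up to the scalar σ_ε, which does not affect the 2SLS coefficients.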

STEP 3: Consider the following quasi-differenced equation:

$\Sigma^{-1/2}y_i = \Sigma^{-1/2}H_i\delta + \text{error}$. (*)

a) Do 2SLS on (*) using IV = $G_{HT,i} = (Q_T X_i, P_T X_{1i}, e_T z_{1i})$, and get

$\tilde\delta = \begin{pmatrix} \tilde\beta \\ \tilde\gamma \end{pmatrix} = [H'\Omega^{-1/2}P(G_{HT})\Omega^{-1/2}H]^{-1}H'\Omega^{-1/2}P(G_{HT})\Omega^{-1/2}y$,

where $P(G_{HT}) = G_{HT}(G_{HT}'G_{HT})^{-1}G_{HT}'$.

b) $\text{Cov}(\tilde\delta) = s^2[H'\Omega^{-1/2}P(G_{HT})\Omega^{-1/2}H]^{-1}$.

The Main Results in HT:

If $k_1 < g_2$: can't estimate γ.
If $k_1 = g_2$: $\tilde\beta = \hat\beta_W$, and γ can be obtained.
If $k_1 > g_2$: HT is more efficient than Within.
If $k_2 = g_2 = 0$: HT estimator = GLS estimator.

Efficiency of HT estimator:

The HT estimator is not efficient. [See Amemiya and MaCurdy (1986, AM), Breusch, Mizon and Schmidt (1989, BMS).]
BMS is better than AM, and AM is better than HT.

Testing HT specification:

Using $\tilde\beta$ and $\hat\beta_W$, construct the Hausman statistic:

$H_T = (\hat\beta_W - \tilde\beta)'[\text{Cov}(\hat\beta_W) - \text{Cov}(\tilde\beta)]^{+}(\hat\beta_W - \tilde\beta)$,

which is $\chi^2$ with df = Rank[$\text{Cov}(\hat\beta_W) - \text{Cov}(\tilde\beta)$]. [$(\cdot)^{+}$ means the Moore-Penrose generalized inverse.]

Caution!!!
Many textbooks say that df = $k_1$. However, there are many cases in which df < $k_1$. For example, if $x_{1it}$ includes time-dummy variables, df = $k_1$ - # of time dummy variables. In general, if there are time-varying regressors common to everybody, df = $k_1$ - # of such variables.

If $H_T$ > critical value, it implies that the variables which you choose for $x_{1it}$ and $z_{1i}$ may in fact be correlated with $\alpha_i$.
If your model is rejected, then some variables in $x_{1it}$ may have to be moved to $x_{2it}$.
Note that if $k_1 = g_2$, then $H_T = 0$. In this case, you cannot use the Hausman test statistic for testing the specification.

2) AM and BMS estimation:

Define
$G_{AM,i} = [Q_T X_i, e_T x_{AM,1i}, e_T z_{1i}]$, $G_{AM} = [Q_V X, V X_{AM,1}, V Z_1]$, where $x_{AM,1i} = [x_{1i1}, \ldots, x_{1iT}]$.
$G_{BMS,i} = [G_{AM,i}, e_T x_{BMS,2i}]$, $G_{BMS} = [G_{AM}, V X_{BMS,2}]$, where $x_{BMS,2i} = [x_{2i1} - \bar{x}_{2i}, x_{2i2} - \bar{x}_{2i}, \ldots, x_{2i,T-1} - \bar{x}_{2i}]$.

Under the AM assumptions,
$E(X_i'Q_T u_i) = E[X_i'Q_T(e_T\alpha_i + \varepsilon_i)] = E(X_i'Q_T\varepsilon_i) = 0$.
$E(x_{AM,1i}'e_T'u_i) = 0$. $E(z_{1i}'e_T'u_i) = 0$.
$E(G_{AM,i}'u_i) = 0$.
BMS additionally assume $E[x_{BMS,2i}'e_T'u_i] = 0$.

Reconsider the whitened equation

$\Omega^{-1/2}y = \Omega^{-1/2}H\delta + \Omega^{-1/2}u$. (6)

We can estimate δ by using more IV than HT use.

Procedure:

STEP 1: Consider the deviation-from-mean equation: $Q_V y = Q_V X\beta + Q_V\varepsilon$.
a) Do OLS and get $\hat\beta_W$ (consistent).
b) Using the within residuals, get $s^2$.
c) $e_i = P_T y_i - P_T X_i\hat\beta_W$ is an estimate of $e_T z_i\gamma + e_T\alpha_i + P_T\varepsilon_i$.

STEP 2: [Note that $e_{it}$ is the same for all t.]
Set $e_i = e_T z_i\gamma + \text{err}$: [The err contains $\alpha_i$ and is not correlated with $x_{1it}$ and $z_{1i}$.]
a) Do 2SLS on this equation with instruments = $[e_T x_{AM,1}, e_T z_1]$, and get the estimate $\hat\gamma_W$. (For BMS, use $[e_T x_{AM,1}, e_T z_1, e_T x_{BMS,2}]$.)
b) Let $v_i$ be the residuals from this 2SLS: $v_i = e_i - e_T z_i\hat\gamma_W$.
c) $s_{BB}^2 = v'v/N$ (consistent for $T\sigma_\alpha^2 + \sigma_\varepsilon^2$) and $\hat\theta = [s^2/s_{BB}^2]^{1/2}$.

STEP 3: $\Sigma^{-1/2}y = \Sigma^{-1/2}H\delta + \text{error}$. (*)
a) Do 2SLS on (*) using AMIV = $G_{AM,i} = [Q_T X_i, e_T x_{AM,1}, e_T z_1]$, and get $\tilde\delta = \begin{pmatrix} \tilde\beta \\ \tilde\gamma \end{pmatrix}$ [AM estimator]. [BMSIV = $[G_{AM,i}, e_T x_{BMS,2i}]$.]
b) AM is more efficient than HT.

Testing AM specification:

Using $\tilde\beta$ and $\hat\beta_W$, construct the Hausman statistic:

$H_T = (\hat\beta_W - \tilde\beta)'[\text{Cov}(\hat\beta_W) - \text{Cov}(\tilde\beta)]^{+}(\hat\beta_W - \tilde\beta)$,

which is chi-squared with df = min{$k$, $Tk_1 - g_2$}.
If $H_T$ > critical value, it implies that the variables which you choose for $X_{1it}$ and $Z_{1i}$ may in fact be correlated with $\alpha_i$.
[For BMS, df = min{$k$, $Tk_1 + (T-1)k_2 - g_2$}.]

CASE B: The $\varepsilon_{it}$ are not iid over time, but iid over i

Modified Generalized Instrumental Variables (MGIV) Estimator (Im, Ahn, Schmidt and Wooldridge, 1999; Ahn and Schmidt, 1999)

$W_{HT,i} = [Q_T X_i, P_T X_{1i}, e_T z_{1i}]$;
$W_{AM,i} = [Q_T X_i, e_T x_{AM,1i}, e_T z_{1i}]$, $x_{AM,1i} = [x_{1i1}, \ldots, x_{1iT}]$ (1×$k_1 T$).
$W_{BMS,i} = [W_{AM,i}, e_T x_{BMS,2i}]$, $x_{BMS,2i} = [x_{2i1} - \bar{x}_{2i}, \ldots, x_{2i,T-1} - \bar{x}_{2i}]$ (1×$k_2(T-1)$).

Assume that the $\varepsilon_i$ are homoskedastic across i but heteroskedastic or autocorrelated over t.

Digression to 2SLS and 3SLS

Notation:
For a T×p matrix $M_i$ or a T×1 vector $m_i$,

$M = \begin{pmatrix} M_1 \\ M_2 \\ \vdots \\ M_N \end{pmatrix}$; $m = \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_N \end{pmatrix}$.

H denotes the data matrix of NT rows.
With this notation, $y = H\delta + u$.
$W_i$ is a T×q matrix of instruments such that $E(W_i'u_i) = 0$.

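Estimators:

1) 2SLS:

$\hat\delta_{2SLS}(W_i) = [H'W(W'W)^{-1}W'H]^{-1}H'W(W'W)^{-1}W'y$.

2) 3SLS:

$\hat\Sigma = \dfrac{1}{N}\sum_i (y_i - H_i\hat\delta_{2SLS})(y_i - H_i\hat\delta_{2SLS})'$; $\hat\Omega = I_N \otimes \hat\Sigma$;

$\hat\delta_{3SLS}(W_i) = [H'W(W'\hat\Omega W)^{-1}W'H]^{-1}H'W(W'\hat\Omega W)^{-1}W'y$;

$\text{Cov}(\hat\delta_{3SLS}(W_i)) = [H'W(W'\hat\Omega W)^{-1}W'H]^{-1}$.

End of Digression

MGIV estimator:

$\hat\delta_{MGIV,HT} \equiv \hat\delta_{3SLS}([Q_\Sigma Q_T X_i, \Sigma^{-1}P_T X_{1i}, \Sigma^{-1}e_T z_{1i}])$
$\hat\delta_{MGIV,AM} \equiv \hat\delta_{3SLS}([Q_\Sigma Q_T X_i, \Sigma^{-1}e_T x_{AM,1i}, \Sigma^{-1}e_T z_{1i}])$
$\hat\delta_{MGIV,BMS} \equiv \hat\delta_{3SLS}[\Sigma^{-1}(Q_T X_i, e_T x_{AM,1i}, e_T z_{1i}, e_T x_{BMS,2i})]$
$\hat\beta_{MGIV,KR} \equiv \hat\beta_{3SLS}(Q_\Sigma Q_T X_i)$ = Kiefer's GLS for FE model (1980, JEC)

[Here, $Q_\Sigma = \Sigma^{-1} - \Sigma^{-1}e_T(e_T'\Sigma^{-1}e_T)^{-1}e_T'\Sigma^{-1}$.]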

The HT (AM, BMS) MGIV estimator is efficient under the HT (AM, BMS) assumptions.

Hausman test for HT, AM and BMS: Based on the KR-MGIV and MGIV estimators of β.
