Econometrics of Panel Data Jakub Mućk Meeting # 4 Jakub Mućk Econometrics of Panel Data Meeting # 4 1 / 26
Outline 1 Two-way Error Component Model Fixed effects model Random effects model 2 Hausman-Taylor estimator Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 2 / 26
Two-way Error Component Model Two-way Error Component Model y it = α + X itβ + u it i {1,..., N }, t {1,..., T}, (1) the composite error component: u i,t = µ i + λ t + ε i,t, (2) where: µi the unobservable individual-specific effect; λt the unobservable time-specific effect; ui,t the remainder disturbance. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 3 / 26
Fixed effects model Two-way fixed effects model: where y it = α i + λ t + β 1 x 1it +... + β k x kit + u it, (3) αi - the individual-specific intercept; λi - the period-specific intercept; uit N ( 0, σu) 2. As in the case of the one-way fixed effect model, independent variables x 1,..., x k cannot be time invariant (for a given unit). The estimation: 1 the least squares dummy variable estimator (LSDV); 2 the within estimator; 3 combination of the above methods. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 4 / 26
The LSDV estimator In the analogous fashion to the one-way fixed effect model, we define the following dummy variables: for the j-th unit: { 1 i = j D jit = 0 otherwise, (4) for the τ-th period: B τit = { 1 t = τ 0 otherwise. (5) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 5 / 26
The LSDV estimator In the analogous fashion to the one-way fixed effect model, we define the following dummy variables: for the j-th unit: { 1 i = j D jit = 0 otherwise, (4) for the τ-th period: B τit = { 1 t = τ 0 otherwise. (5) The LSDV estimator can be obtained by estimating the following model: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t. (6) j=1 τ=1 where u i,t N ( 0, σu) 2. The parameters in equation (7) can be estimated with the OLS. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 5 / 26
The LSDV estimator: test for poolability The LSDV estimator: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t, (7) j=1 where u i,t N ( 0, σ 2 u). τ=1 We can test whether the individual-specific and period-specific effects are significantly different: H 0 : α 1 =... = α k λ 1 =... = λ T = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSE R ) with the sum of squared errors from the LSDV model (without restrictions, SSE R ). The test statistics F: F = (SSE R SSE U )/(N T 1) SSE U /(NT K T) if null is emph then F F (N T 1,NT K T). (9) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 26
The LSDV estimator: test for poolability The LSDV estimator: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t, (7) j=1 where u i,t N ( 0, σ 2 u). τ=1 We can test whether the individual-specific and period-specific effects are significantly different: H 0 : α 1 =... = α k λ 1 =... = λ T = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSE R ) with the sum of squared errors from the LSDV model (without restrictions, SSE R ). The test statistics F: F = (SSE R SSE U )/(N T 1) SSE U /(NT K T) if null is emph then F F (N T 1,NT K T). One might test only individual or period effects: (9) H 0 : α 1 =... = α k, (10) H 0 : λ 1 =... = λ k. (11) The construction of test for poolability is the same but we use different degrees of freedom. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 26
The within estimator Consider the following transformations: averaging over time (for each unit i): ȳi, averaging over unit (for each period t): ȳt, averaging over time and unit: ȳ, for the dependent variable y it : ȳ i = 1 T T y it, t ȳ t = 1 N N i y it, ȳ = 1 NT T t N y it. i Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 7 / 26
The within estimator Consider the following transformations: averaging over time (for each unit i): ȳi, averaging over unit (for each period t): ȳt, averaging over time and unit: ȳ, for the dependent variable y it : ȳ i = 1 T T y it, t ȳ t = 1 N N i y it, ȳ = 1 NT T t N y it. i Given the above transformations we can eliminate both the individual- and period specific effects: (y it ȳ i ȳ t + ȳ it ) = β (x it x i x t + x it ) + (u it ū i ū t + ū it ). (12) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 7 / 26
The within estimator The within estimator: where ỹ it = β 1 x 1it +... + β k x kit + ũ it (13) ỹ it = y it ȳ i ȳ t + ȳ it x 1it = x 1it x 1i x 1t + x 1it... x kit = x kit x ki x kt + x kit ũ it = u it ū i ū t + ū it. The parameters β 1,..., β k can be estimated by the OLS. The independent variables, i.e., x 1it,..., x kit, cannot be time (unit) invariant. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 8 / 26
Random effects model Two-way random effects model: where the error component is as follows: where y it = α + β 1 x 1it +... + β k x kit + u it, (14) u it = µ i + λ t + ε it, (15) µi is the individual-specific error component and µ i N ( 0, σ 2 µ) ; λt is the period-specific error component and λ t N ( 0, σ 2 λ) ; εit is the idiosyncratic error component and ε it N ( 0, σ 2 ε). Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 9 / 26
Random effects model Two-way random effects model: where the error component is as follows: where y it = α + β 1 x 1it +... + β k x kit + u it, (14) u it = µ i + λ t + ε it, (15) µi is the individual-specific error component and µ i N ( 0, σµ) 2 ; λt is the period-specific error component and λ t N ( 0, σλ) 2 ; εit is the idiosyncratic error component and ε it N ( 0, σε) 2. The independent variables can be time invariant. Individual-specific and period-specific effects are independent: E (µ i, µ j ) = 0 if i j, E (λ s, λ t ) = 0 if s t E (µ i, λ t ) = 0 Estimation method: GLS (generalized least squares). Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 9 / 26
The two-way RE model - the variance-covariance matrix of error term I It is assumed that the error term in the two-way RE model can be described as follow: u it = µ i + λ t + ε it, (16) where µ i N ( 0, σ 2 µ), λt N ( 0, σ 2 λ) and εit N ( 0, σ 2 ε). The variance-covariance matrix of the error term is not diagonal!. The diagonal elements of the variance covariance matrix of the error term: E ( ) uit 2 = E (µ i + λ t + ε it) 2 = E ( ) µ 2 i + E ( ) λ 2 t + E ( ) ε 2 it + 2cov (µ i, λ t) + 2cov (µ i, ε it) + 2cov (λ t, ε it) }{{}}{{}}{{}}{{}}{{}}{{} =0 =0 =0 =σ 2 µ =σ 2 λ = σ 2 µ + σ 2 λ + σ 2 ε. =σ 2 ε Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 10 / 26
The two-way RE model - the variance-covariance matrix of error term II For a given unit, non-diagonal elements of the variance covariance matrix of the error term (t s): cov (u it, u is ) = E [(µ ( i + λ t + ε it ) (µ i + λ s + ε is )], ) = E µ 2 i + E (λ t λ s) + E (ε it ε is ) + COV }{{}}{{}}{{}}{{} 1 =0 =0 =0 = σ 2 µ, =σ 2 µ where COV 1 = cov(λ t, µ i )+cov(λ s, µ i )+cov(λ t, ε it )+cov(λ s, ε it )+2cov(µ i, ε it ). Similarly, for a a given period, non-diagonal elements of the variance covariance matrix of the error term (i j): cov (u it, u jt ) = E [(µ i + λ t + ( ε it (µ j + λ t + ε jt )], ) = E (µ i µ j ) + E λ }{{} 2 t + E (ε it ε jt ) + COV }{{}}{{}}{{} 2 =0 =0 =0 = σ 2 λ. where COV 2 = cov(µ i, λ t )+cov(µ i, λ t )+cov(µ i, ε it )+cov(µ j, ε it )+2cov(λ t, ε it ). =σ 2 λ Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 11 / 26
The two-way RE model - the variance-covariance matrix of error term III Finally, the variance-covariance matrix of the error term can be described: σµ 2 + σλ 2 + σ2 ε if i = j, t = s, σµ E (u it, u js ) = 2 if i = j, t s, σλ 2 (17) if i j, t = s, 0 if i j, t s. Unlike the one-way RE model, the variance-covariance matrix E (u it, u js ) is not block-diagonal because there is correlation between units. This correlation is caused by the the period-specific effects and equals σ 2 λ /(σ2 µ + σ 2 λ + σ2 ε). Akin to the one-way RE model there is equicorrelation which is implied by the unit-specific effects and equals σ 2 µ/(σ 2 µ + σ 2 λ + σ2 ε). Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 12 / 26
The GLS estimation of the two-way RE model: example Consider the following transformation: zit = z it θ 1 z i θ 2 z t + θ 3 z it, (18) where z it {y it, x 1it,..., x kit, u it } and z i = 1 T T z it, t ȳ t = 1 N N i z it, z = 1 NT T t N z it, i and θ 1 = 1 θ 2 = 1 σ ε Tσ 2 µ + σ 2 ε σ ε N σ 2 λ + σ 2 ε. θ 3 = θ 1 + θ 2 + σ ε Tσ 2 µ + N σ 2 λ + σ2 ε 1. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 13 / 26
The GLS estimation of the two-way RE model: Swamy and Aurora (1972) Swamy and Aurora (1972) propose running three least squares regressions to estimate σ 2 ε, σ 2 µ and σ 2 λ which allow us to calculate θ 1, θ 2 and θ 3. 1 The within regression allows to estimate the variance of the idiosyncratic component, i.e., ˆσ 2 ε. 2 The between (individuals) regression allows to estimate the σ µ which is the estimated variance of the error term from the following regression: ȳ i ȳ = β 1 ( x 1i x 1) +... + β k ( x ki x k ) + ū i ū, and ˆσ I 2 is the estimated variance of the error term from the above regression. Then, the estimated variance of the individual specific-error term can be described as follow ˆσ µ 2 = ˆσ2 I ˆσ ε 2. T 3 The between (periods) regression allows to estimate the σ λ which is the estimated variance of the error term from the following regression: ȳ t ȳ = β 1 ( x 1t x 1) +... + β k ( x kt x k ) + ū t ū, and ˆσ T 2 is the estimated variance of the error term from the above regression. Then, the estimated variance of the individual period-error term can be described as follow ˆσ λ 2 = ˆσ2 T ˆσ ε 2. N Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 14 / 26
The GLS estimation of the two-way RE model: general remarks Alternative estimators of σε, 2 σµ 2 and σλ 2 are proposed by Wallace and Hussain (1969), Amemiya (1971), Fuller and Battese (1974) and Nerlove (1971). In general, estimates of σ 2 ε, σ 2 µ and σ 2 λ allow to calculate θ 1, θ 2 and θ 3 and estimate model for transformed variables: y it = β 1 x 1it +... + β k x kit + u it (19) where z it = z it θ 1 z i θ 2 z t + θ 3 z it and z it {y it, x 1it,..., x kit, u 1it }. Computational warning: sometimes estimates of σµ 2 or σλ 2 could be negative. This is due to the fact that we use two-step strategy rather than jointly estimation. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 15 / 26
Two-way Error Component Model y it = α + β 1 x 1it + β k x kit + µ i + λ t + ε it (20) Random ( effects ) Individual µ i N ( 0, σµ 2 ) time λ t N 0, σλ 2 effects drawn from the random sample = we can estimate the parameters of distribution, i.e., σµ 2 and Fixed effects µ i = individual intercepts α i λ t α i and λ t are assumed to be constant over time σλ 2 Assumptions: (i) E (µ i ε it ) = 0 E (λ t ε it ) = 0 (i) E (µ i ε it ) = 0 E (λ t ε it ) = 0 (ii) E (µ i x it ) = 0 E (λ t x it ) = 0 individual-specific and periodspecific effects are independent of the explanatory variable x it Estimation GLS OLS (within or LSDV) Efficiency higher lower Additional: impossible to use time invariant regressors (collinearity with α i ) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 16 / 26
Outline 1 Two-way Error Component Model Fixed effects model Random effects model 2 Hausman-Taylor estimator Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 17 / 26
Hausman-Taylor estimator Let s consider the following one-way RE model: y it = x 1it β 1 + x 2it β 2 + z 1i γ 1 + z 2i γ 2 + µ i + u it (21) where: x 1it are time-varying variables; not correlated with µ i x 2it are time-varying variables; correlated with µ i z 1i are time-invariant variables; not correlated with µ i z 2i are time-invariant variables; correlated with µ i The RE model estimates on γ 2 are inconsistent. The estimator proposed by Hausman and Taylor (1981) takes into account the above correlation. Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 18 / 26
The anatomy of the Hausman-Taylor estimator First step: Within regression for the model including only time-variable regressors, both x 1it and x 2it. Here, the usual differences from the temporal mean are used: (y it ȳ i ) = β 1 (x 1i, x 1i ) + β 2 (x 2it x 2i ) + (u it ū i ) (22) Based on the expression above we can estimate variance of the idiosyncratic error, i.e., ˆσ 2 ε. Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 19 / 26
The anatomy of the Hausman-Taylor estimator Second step: construct the intra-temporal mean of the residuals from (22): ē = [(ē 1, ē 1,..., ē 1 ),..., (ē N, ē N,..., ē N )] (23) }{{}}{{} T T Then make TSLS for ē i using: variables: z 1it (time invariant, not correlated with µ i ), z 2it (time invariant, correlated with µ i ) instruments: z 1it, x 1it (time invariant, not correlated with µ i ) Specifically, 1 Regress z 2it on z 1it as well as x 1it. 2 Use the predicted value from the above regression and create new matrix, i.e., Z = [z 1it, ẑ 21t]. 3 Regress ē i on Z to get estimates of γ 1 and γ 2. 4 Calculate σtsls,ē 2 the variance of the error components from the above regression. Now, we can calculate the variation of the individual-specific error component: σ 2 µ = σ 2 TSLS,ē σ2 ε T. (24) Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 20 / 26
The anatomy of the Hausman-Taylor estimator Based on the estimates of σµ 2 and σε 2 calculate the conventional in the FGLS regression scale parameter θ: θ = σ 2 ε σ 2 ε + T 1 σ 2 µ (25) Finally, do a TSLS regression of y on X with instruments described by V : more specifically: y = y it θy it, (26) X = [x 1it, x 2it, z 1i, z 2i ] θ[x 1it, x 2it, z 1i, z 2i ], (27) V = [(x 1it x 1i ), (x 2it x 2i ), z 1i, x 1i ], (28) 1 Regress X on the instruments (V ) and obtain fitted values, i.e., ˆX, 2 Regress y on the predicted values from the previous step, i.e, ˆX, in order to get the estimates of [β, γ]. The estimates of the variance-covariance of the structural parameters are a little bit more complicated. Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 21 / 26
Empirical example: PSID Consider the following model: lwage it = α + β 1occ it + β 2south it + β 3smsa it + β 4ind it + (29) +β 5exp it + β 6exp2 it + β 7wks it + β 8ms it + (30) +β 9union it + β 10fem i + β 11blk i + β 12ed i + u it (31) Variable lwage exp it wks it occ it ind it south it smsa it ms it fem i union it ed i blk i Description log wage years of full-time work experience weeks worked occupation; occ it = 1 if in a blue-collar occupation industry; ind it = 1 if working in a manufacturing industry residence; south it = 1 if in the South area smsa it = 1 if in the Standard metropolitan statistical area marital status female or male if wage set be a union contract years of education black Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 22 / 26
Empirical example: PSID FE RE occ 0.022 0.050 south 0.002 0.017 smsa 0.043 0.014 ind 0.019 0.004 exp 0.113 0.082 exp2 0.000 0.001 wks 0.001 0.001 ms 0.030 0.075 union 0.033 0.063 fem 0.339 blk 0.210 ed 0.100 cons 4.649 4.264 0.000 0.000 σ µ 0.263 σ ε 0.152 ρ 0.749 Note: Note: the superscripts, and denote the rejection of null about parameters insignificance at 1%, 5% and 10% significance level, respectively. Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 23 / 26
Empirical example: PSID Correlation between the estimated random unit-specific effect and independent variables x jit ρ(ˆµ i, x jit ) p-value occ 0.009 0.546 south 0.001 0.928 smsa 0.091 0.000 ind 0.035 0.023 exp 0.764 0.000 exp2 0.737 0.000 wks 0.071 0.000 ms 0.034 0.026 union 0.039 0.011 fem 0.000 1.000 blk 0.000 1.000 ed 0.000 1.000 Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 24 / 26
Empirical example: PSID Classification of the explanatory variables uncorrelated with µ i correlated with µ i time-variant wks, south, ms, smsa occ, exp, exp2, ind, union time-invariant fem, blk ed Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 25 / 26
Empirical example: PSID FE RE HT occ 0.022 0.050 0.027 south 0.002 0.017 0.007 smsa 0.043 0.014 0.042 ind 0.019 0.004 0.014 exp 0.113 0.082 0.113 exp2 0.000 0.001 0.000 wks 0.001 0.001 0.001 ms 0.030 0.075 0.030 union 0.033 0.063 0.033 fem 0.339 0.131 blk 0.210 0.286 ed 0.100 0.138 cons 4.649 4.264 2.913 0.000 0.000 σ µ 0.263 0.942 σ ε 0.152 0.152 ρ 0.749 0.975 Note: Note: the superscripts, and denote the rejection of null about parameters insignificance at 1%, 5% and 10% significance level, respectively. Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 26 / 26