Econometrics of Panel Data - PDF Free Download

Econometrics of Panel Data Jakub Mućk Meeting # 4 Jakub Mućk Econometrics of Panel Data Meeting # 4 1 / 30

Outline 1 Two-way Error Component Model Fixed effects model Random effects model 2 Non-spherical variance-covariance matrix of the error term Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 2 / 30

Two-way Error Component Model Two-way Error Component Model y it = α + X itβ + u it i {1,..., N}, t {1,..., T }, (1) the composite error component: u i,t = µ i + λ t + ε i,t, (2) where: µi the unobservable individual-specific effect; λt the unobservable time-specific effect; ui,t the remainder disturbance. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 3 / 30

Fixed effects model Two-way fixed effects model: where y it = α i + λ t + β 1 x 1it +... + β k x kit + u it, (3) αi - the individual-specific intercept; λi - the period-specific intercept; uit N ( 0, σu) 2. As in the case of the one-way fixed effect model, independent variables x 1,..., x k cannot be time invariant (for a given unit). The estimation: 1 the least squares dummy variable estimator (LSDV); 2 the within estimator; 3 combination of the above methods. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 4 / 30

The LSDV estimator In the analogous fashion to the one-way fixed effect model, we define the following dummy variables: for the j-th unit: { 1 i = j D jit = 0 otherwise, (4) for the τ-th period: B τit = { 1 t = τ 0 otherwise. (5) The LSDV estimator can be obtained by estimating the following model: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t. (6) j=1 τ=1 where u i,t N ( 0, σu) 2. The parameters in equation (7) can be estimated with the OLS. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 5 / 30

The LSDV estimator: test for poolability The LSDV estimator: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t, (7) j=1 where u i,t N ( 0, σ 2 u). τ=1 We can test whether the individual-specific and period-specific effects are significantly different: H 0 : α 1 =... = α k λ 1 =... = λ T = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSE R ) with the sum of squared errors from the LSDV model (without restrictions, SSE R ). The test statistics F: F = (SSE R SSE U )/(N T 1) SSE U /(NT K T ) if null is emph then F F (N T 1,NT K T ). (9) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 30

The LSDV estimator: test for poolability The LSDV estimator: N N y it = α j D jit + λ τ B τit + β 1 x 1it +... + β k x kit + u i,t, (7) j=1 where u i,t N ( 0, σ 2 u). τ=1 We can test whether the individual-specific and period-specific effects are significantly different: H 0 : α 1 =... = α k λ 1 =... = λ T = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSE R ) with the sum of squared errors from the LSDV model (without restrictions, SSE R ). The test statistics F: F = (SSE R SSE U )/(N T 1) SSE U /(NT K T ) if null is emph then F F (N T 1,NT K T ). One might test only individual or period effects: (9) H 0 : α 1 =... = α k, (10) H 0 : λ 1 =... = λ k. (11) The construction of test for poolability is the same but we use different degrees of freedom. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 30

The within estimator Consider the following transformations: averaging over time (for each unit i): ȳi, averaging over unit (for each period t): ȳt, averaging over time and unit: ȳ, for the dependent variable y it : ȳ i = 1 T T y it, t ȳ t = 1 N N i y it, ȳ = 1 NT T t N y it. i Given the above transformations we can eliminate both the individual- and period specific effects: (y it ȳ i ȳ t + ȳ it ) = β (x it x i x t + x it ) + (u it ū i ū t + ū it ). (12) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 7 / 30

The within estimator The within estimator: where ỹ it = β 1 x 1it +... + β k x kit + ũ it (13) ỹ it = y it ȳ i ȳ t + ȳ it x 1it = x 1it x 1i x 1t + x 1it... x kit = x kit x ki x kt + x kit ũ it = u it ū i ū t + ū it. The parameters β 1,..., β k can be estimated by the OLS. The independent variables, i.e., x 1it,..., x kit, cannot be time (unit) invariant. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 8 / 30

Random effects model Two-way random effects model: y it = α + β 1 x 1it +... + β k x kit + u it, (14) where the error component is as follows: where u it = µ i + λ t + ε it, (15) µi is the individual-specific error component and µ i N ( 0, σµ) 2 ; λt is the period-specific error component and λ t N ( 0, σλ) 2 ; εit is the idiosyncratic error component and ε it N ( 0, σε) 2. The independent variables can be time invariant. Individual-specific and period-specific effects are independent: E (µ i, µ j ) = 0 if i j, E (λ s, λ t ) = 0 if s t E (µ i, λ t ) = 0 Estimation method: GLS (generalized least squares). Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 9 / 30

The two-way RE model - the variance-covariance matrix of error term I It is assumed that the error term in the two-way RE model can be described as follow: u it = µ i + λ t + ε it, (16) where µ i N ( 0, σ 2 µ), λt N ( 0, σ 2 λ) and εit N ( 0, σ 2 ε). The variance-covariance matrix of the error term is not diagonal!. The diagonal elements of the variance covariance matrix of the error term: E ( ) u 2 it = E (µ i + λ t + ε it) 2 = E ( ) µ 2 i + E ( ) λ 2 t + E ( ) ε 2 it + 2cov (µ i, λ t) + 2cov (µ i, ε it) + 2cov (λ t, ε it) }{{}}{{}}{{}}{{}}{{}}{{} =0 =0 =0 =σ 2 µ =σ 2 λ = σ 2 µ + σ 2 λ + σ 2 ε. =σ 2 ε Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 10 / 30

The two-way RE model - the variance-covariance matrix of error term II For a given unit, non-diagonal elements of the variance covariance matrix of the error term (t s): cov (u it, u is ) = E [(µ ( i + λ t + ε it ) (µ i + λ s + ε is )], ) = E µ 2 i + E (λ tλ s) + E (ε }{{}}{{} it ε is ) + COV }{{}}{{} 1 =0 =0 =0 = σ 2 µ, =σ 2 µ where COV 1 = cov(λ t, µ i )+cov(λ s, µ i )+cov(λ t, ε it )+cov(λ s, ε it )+2cov(µ i, ε it ). Similarly, for a a given period, non-diagonal elements of the variance covariance matrix of the error term (i j): cov (u it, u jt ) = E [(µ i + λ t + ( ε it (µ j + λ t + ε jt )], ) = E (µ i µ j ) + E λ }{{} 2 t + E (ε it ε jt ) + COV }{{}}{{}}{{} 2 =0 =0 =0 = σ 2 λ. where COV 2 = cov(µ i, λ t )+cov(µ i, λ t )+cov(µ i, ε it )+cov(µ j, ε it )+2cov(λ t, ε it ). =σ 2 λ Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 11 / 30

The two-way RE model - the variance-covariance matrix of error term III Finally, the variance-covariance matrix of the error term can be described: σµ 2 + σλ 2 + σ2 ε if i = j, t = s, σµ E (u it, u js ) = 2 if i = j, t s, σλ 2 (17) if i j, t = s, 0 if i j, t s. Unlike the one-way RE model, the variance-covariance matrix E (u it, u js ) is not block-diagonal because there is correlation between units. This correlation is caused by the the period-specific effects and equals σ 2 λ /(σ2 µ + σ 2 λ + σ2 ε). Akin to the one-way RE model there is equicorrelation which is implied by the unit-specific effects and equals σ 2 µ/(σ 2 µ + σ 2 λ + σ2 ε). Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 12 / 30

The GLS estimation of the two-way RE model: example Consider the following transformation: zit = z it θ 1 z i θ 2 z t + θ 3 z it, (18) where z it {y it, x 1it,..., x kit, u it } and z i = 1 T T z it, t ȳ t = 1 N N i z it, z = 1 NT T t N z it, i and. θ 1 = 1 θ 2 = 1 θ 3 = θ 1 + θ 2 + σ ε T σ 2 µ + σ 2 ε σ ε Nσ 2 λ + σ 2 ε σ ε T σ 2 µ + Nσ 2 λ + σ2 ε Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 13 / 30 1.

The GLS estimation of the two-way RE model: Swamy and Aurora (1972) Swamy and Aurora (1972) propose running three least squares regressions to estimate σ 2 ε, σ 2 µ and σ 2 λ which allow us to calculate θ 1, θ 2 and θ 3. 1 The within regression allows to estimate the variance of the idiosyncratic component, i.e., ˆσ 2 ε. 2 The between (individuals) regression allows to estimate the σ µ which is the estimated variance of the error term from the following regression: ȳ i ȳ = β 1 ( x 1i x 1) +... + β k ( x ki x k ) + ū i ū, and ˆσ I 2 is the estimated variance of the error term from the above regression. Then, the estimated variance of the individual specific-error term can be described as follow ˆσ µ 2 = ˆσ2 I ˆσ ε 2. T 3 The between (periods) regression allows to estimate the σ λ which is the estimated variance of the error term from the following regression: ȳ t ȳ = β 1 ( x 1t x 1) +... + β k ( x kt x k ) + ū t ū, and ˆσ T 2 is the estimated variance of the error term from the above regression. Then, the estimated variance of the individual period-error term can be described as follow ˆσ λ 2 = ˆσ2 T ˆσ ε 2 N. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 14 / 30

The GLS estimation of the two-way RE model: general remarks Alternative estimators of σε, 2 σµ 2 and σλ 2 are proposed by Wallace and Hussain (1969), Amemiya (1971), Fuller and Battese (1974) and Nerlove (1971). In general, estimates of σ 2 ε, σ 2 µ and σ 2 λ allow to calculate θ 1, θ 2 and θ 3 and estimate model for transformed variables: y it = β 1 x 1it +... + β k x kit + u it (19) where z it = z it θ 1 z i θ 2 z t + θ 3 z it and z it {y it, x 1it,..., x kit, u 1it }. Computational warning: sometimes estimates of σµ 2 or σλ 2 could be negative. This is due to the fact that we use two-step strategy rather than jointly estimation. Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 15 / 30

Two-way Error Component Model y it = α + β 1 x 1it + β k x kit + µ i + λ t + ε it (20) Random ( effects ) Individual µ i N ( 0, σµ 2 ) time λ t N 0, σλ 2 effects drawn from the random sample = we can estimate the parameters of distribution, i.e., σµ 2 and Fixed effects µ i = individual intercepts α i λ t α i and λ t are assumed to be constant over time σλ 2 Assumptions: (i) E (µ i ε it ) = 0 E (λ t ε it ) = 0 (i) E (µ i ε it ) = 0 E (λ t ε it ) = 0 (ii) E (µ i x it ) = 0 E (λ t x it ) = 0 individual-specific and periodspecific effects are independent of the explanatory variable x it Estimation GLS OLS (within or LSDV) Efficiency higher lower Additional: impossible to use time invariant regressors (collinearity with α i ) Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 16 / 30

Spherical variance-covariance matrix In the previous meeting we ve assumed that the variance covariance matrix of the error term is spherical: E (uu ) = σ 2 ui (21) or, at least, block diagonal (in the RE model): 1 ρ... ρ E (u i. u i.) = Σ i,i = ( σµ 2 + σε 2 ) ρ 1. ρ...... ρ ρ... 1, (22) where ρ = σ2 µ σµ 2 + σε 2, where σµ 2 and σε 2 stands for the variance of the individual-specific and idiosyncratic error term. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 18 / 30

Non-spherical variance-covariance matrix More general case: E (uu ) = Σ = Σ 1,1 Σ 1,2... Σ 1,N Σ 2,1 Σ 2,2... Σ 2,N...... Σ N,1 Σ N,2... Σ N,N, (23) where Σ i,j is the variance-covariance matrix of the error term between i-th and j-th (cross-sectional) unit. Implications: the least squares estimator is still consistent (if other assumptions are satisfied) but it is no longer BLUE (best linear unbiased estimator). Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 19 / 30

The Generalized Least Squares estimator To overcome the problem of non-spherical variance-covariance matrix of the error term If we know Σ and the other assumptions are satisfied one might apply the GLS (Generalized Least Squares) estimator: ˆβ GLS = and the variance-covariance estimator: V ar ( ˆβGLS ) = ( X ˆΣ 1 X) 1 X ˆΣ 1 y, (24) ( X ˆΣ 1 X) 1. (25) In the presence of the non-spherical disturbances the GLS estimator is BLUE. Key challenge: Σ!. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 20 / 30

Non-spherical variance-covariance matrix The variance-covariance matrix of the error term can be non-spherical due to: 1 Autocorrelation (serial correlation). 2 Heteroskedasticity. 3 Cross-sectional dependence. 4 Combination of the above cases. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 21 / 30

Autocorrelation - AR(1) example Autocorrelation or Serial correlation is the correlation of the error term with its past values. AR(1) example: u it = ρu it 1 + η it (26) where ρ (0, 1) and η it N (0, σ 2 η). Autocorrelation signals that disturbances displays some memory. One possible explanation for autocorrelation is that relevant factors are omitted = omitted variables bias. Response to a unit shock in period t = 0. 0.0 0.2 0.4 0.6 0.8 1.0 0 5 10 15 20 periods ρ = 0.95, ρ = 0.75, ρ = 0.5. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 22 / 30

Detecting autocorrelation Visual inspection: plotting residuals. Simple regressions. Test proposed by Baltagi and Wu (1999). The null hypothesis is about no autocorrelation: H 0 ρ = 0. (27) The test investigates only the first-order autocorrelation. The test statistics is the Durbin-Watson statistic tailored to the panel data. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 23 / 30

The GLS estimator assuming AR(1) error term General idea: 1 The autocorrelation coefficient is estimated from the OLS (within) residuals. 2 All variables are transformed: (1 ˆρ 2 ) 1 2 z it if t = 1 zit = ( ) 1 ( ) (1 ˆρ 2 ) 1 1 2 (z ) 2 it 1 ˆρ z ˆρ 2 2 it 1 1 ˆρ if t > 1 2 (28) Note that the above transformation is quite similar to the Prais-Winters transformation. 3 The first observation of each panel should be removed and then it is possible to apply within (FE) estimator to transformed data. 4 Baltagi and Wu propose the GLS estimator of the RE model with the AR error term. The main idea is quite similar to basic RE model. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 24 / 30

Heteroskedasticity Heteroskedasticity refers to the situation in which the variance of the error term is not constant. Example for panel data: when σ 2 u,1... σ 2 u,n. E (uu ) = Σ = diag ( Iσ 2 u1,..., Iσ 2 u,n ) Iσ 2 u, (29) General intuition: uncertainty associated with the outcome y (captured by the variance of the error term) is not constant for various values of independent variables x. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 25 / 30

The Robust standard errors When the error term is heteroskedastic the robust estimator of the variancecovariance can be used to obtain consistent estimates of the standard errors. White s heteroscedasticity-consistent estimator: V ar( ˆβ) = (X X) 1 ( X ˆΣX ) (X X) 1 (30) where ˆΣ = diag(û 2 1,..., û 2 N ). The clustered robust standard errors: All observations are divided into G groups: V ar( ˆβ) = ( X X ) ( G ) 1 (X x iû iû ix i X ) 1. (31) i=1 Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 26 / 30

Cross-sectional dependence Cross-sectional dependence takes place when the error terms between individuals at the same time period t are correlated: E (u it u jt ) 0 if i j. (32) The cross-sectional dependence may arise due to the presence of common or/and unobserved factors that become a part of the error term. For instance: common business cycles fluctuations, spillovers, neighborhood effects, herd behavior, and interdependent preferences. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 27 / 30

Testing for the cross-sectional dependence Consider the static panel data model: y it = α + β 1 x 1it +... + β k x kit + u it. (33) The cross-sectional correlation coefficient: T t=1 ˆρ i,j = ûitû jt ( T ) 1 t=1 ûit 2 ( ûjt). (34) 1 T 2 t=1 Note that ρ i,j = ρ j,i. For panel consisting of N unit we get N(N 1)/2 pair-wise correlation coefficients. The hypothesis of interest: The LM statistic (Breusch and Pagan, 1980): H 0 : ρ i,j = 0 if i j. (35) H 1 : ρ i,j 0 if i j (36) LM = T N 1 N i=1 j=i+1 ˆρ 2 i,j. (37) The above statistic is valid for fixed N as T and is asymptomatically distributed as χ 2 with N(N 1)/2 degrees of freedom. The LM statistic exhibits substantial size distortion when N is relatively large due to fact that it is not correctly centered for fixed T. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 28 / 30

Testing for the cross-sectional dependence Pesaran (2004) proposes the following test statistic: N 1 2T N CD = N (N 1) i=1 j=i+1 ˆρ i,j. (38) Under the null about no cross sectional dependence, the CD statistic is normally distributed, i.e., CD N (0, 1), for N and sufficiently large T. The CD statistic can be used in a wide range of panel-data models, e.g., basic static models, homogeneous/heterogeneous dynamic model, nonstationary model. Unbalanced panels: N 1 2 CD = N (N 1) N i=1 j=i+1 Tij ˆρ i,j. (39) Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 29 / 30

Testing for the cross-sectional dependence Friedman s statistics: T 1 F R = (40) (N 1)R ave + 1 where R ave is the average Spearman s cross-sectional correlation. The F R statistic is asymptotically χ 2 distributed with with T 1 degrees of freedom, for fixed T and and sufficiently large N. Frees statistics bases on the sum of the squared rank correlation coefficients R 2 ave: where F RE = N N 1 Rave 2 2 = N (N 1) ( Rave 2 1 ), (41) T 1 N i=1 j=i+1 Tij ˆr i,j. (42) where r i,j is the Spearman s correlation between residuals for i and j unit. The null is rejected when R 2 ave > 1/(T 1) + Q b /N, where Q b is the b-th quantile of the Q distribution (the Q distribution is the weighted sum ot two χ 2 random variables). Both Friedman s and Frees statistic are designed to static panel data models. Jakub Mućk Econometrics of Panel Data Non-spherical variance-covariance Meeting # 4 30 / 30