ECON 4551 Econometrics II, Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova's notes.


15.1 Grunfeld's Investment Data 15.2 Sets of Regression Equations 15.3 Seemingly Unrelated Regressions 15.4 The Fixed Effects Model 15.5 The Random Effects Model Extensions: RCM; dealing with endogeneity when we have static (time-invariant) variables Slide 15-2

The different types of panel data sets can be described as: long and narrow, with a long time dimension and few cross-sectional units; short and wide, with many units observed over a short period of time; long and wide, indicating that both N and T are relatively large. Slide 15-3

INV_{it} = f(V_{it}, K_{it})   (15.1)
The data consist of T = 20 years of data (1935-1954) for N = 10 large firms. V is the value of the firm's stock, a proxy for expected profits; K is the capital stock, a proxy for desired permanent capital stock. Let y_{it} = INV_{it}, x_{2it} = V_{it} and x_{3it} = K_{it}:
y_{it} = β_{1it} + β_{2it} x_{2it} + β_{3it} x_{3it} + e_{it}   (15.2)
Notice the subindices! Slide 15-4

INV_{GE,t} = β_1 + β_2 V_{GE,t} + β_3 K_{GE,t} + e_{GE,t},   t = 1, ..., 20
INV_{WE,t} = β_1 + β_2 V_{WE,t} + β_3 K_{WE,t} + e_{WE,t},   t = 1, ..., 20   (15.3a)
y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + e_{it},   i = 1, 2;  t = 1, ..., 20   (15.3b)
For simplicity we focus on only two firms. GRETL: smpl firm == 3 || firm == 8 --restrict Slide 15-5

INV_{GE,t} = β_{1,GE} + β_{2,GE} V_{GE,t} + β_{3,GE} K_{GE,t} + e_{GE,t},   t = 1, ..., 20
INV_{WE,t} = β_{1,WE} + β_{2,WE} V_{WE,t} + β_{3,WE} K_{WE,t} + e_{WE,t},   t = 1, ..., 20   (15.4a)
y_{it} = β_{1i} + β_{2i} x_{2it} + β_{3i} x_{3it} + e_{it},   i = 1, 2;  t = 1, ..., 20   (15.4b)
Slide 15-6

E(e_{GE,t}) = 0,  var(e_{GE,t}) = σ²_{GE},  cov(e_{GE,t}, e_{GE,s}) = 0
E(e_{WE,t}) = 0,  var(e_{WE,t}) = σ²_{WE},  cov(e_{WE,t}, e_{WE,s}) = 0   (15.5)
Assumption (15.5) says that the errors in both investment functions (i) have zero mean, (ii) are homoskedastic with constant variance, and (iii) are not correlated over time, so there is no autocorrelation. The two equations do, however, have different error variances σ²_{GE} and σ²_{WE}.
GRETL: ols Inv const V K followed by modtest --panel (this command was wrong in the posted notes!) Slide 15-7

Slide 15-8

Let D_i be a dummy variable equal to 1 for the Westinghouse observations and 0 for the General Electric observations. If the variances are the same for both firms then we can run:
INV_{it} = β_{1,GE} + δ_1 D_i + β_{2,GE} V_{it} + δ_2 (D_i × V_{it}) + β_{3,GE} K_{it} + δ_3 (D_i × K_{it}) + e_{it}   (15.6)
Slide 15-9
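A minimal GRETL sketch of (15.6), assuming the long-format Grunfeld data with series named inv, v and k, and assuming Westinghouse is the firm coded 8 and General Electric the firm coded 3 (as in the smpl command above); the series names and firm codes are assumptions, not taken from the posted script.
smpl firm == 3 || firm == 8 --restrict
series D  = (firm == 8)    # dummy: 1 for Westinghouse, 0 for General Electric (assumed coding)
series DV = D * v          # interaction of the dummy with V
series DK = D * k          # interaction of the dummy with K
ols inv const D v DV k DK  # equation (15.6); t-tests on D, DV and DK ask whether the firms differ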

So we have two separate stories Slide 15-10

cov(e_{GE,t}, e_{WE,t}) = σ_{GE,WE}   (15.7)
This assumption says that the error terms in the two equations, at the same point in time, are correlated. This kind of correlation is called contemporaneous correlation. Under this assumption, the joint regression would be better than the separate simple OLS regressions. Slide 15-11

Econometric software includes commands for SUR (or SURE) that carry out the following steps:
(i) Estimate the equations separately using least squares;
(ii) Use the least squares residuals from step (i) to estimate σ²_{GE}, σ²_{WE} and σ_{GE,WE};
(iii) Use the estimates from step (ii) to estimate the two equations jointly within a generalized least squares framework.
Slide 15-12

Slide 15-13

# Open and summarize data from grunfeld2.gdt (which, luckily for us, is already in wide format!)
open "c:\Program Files\gretl\data\poe\grunfeld2.gdt"
system name="Grunfeld"
equation inv_ge const v_ge k_ge
equation inv_we const v_we k_we
end system
estimate "Grunfeld" method=sur --geomean
Slide 15-14

In GRETL the restrict command can be used to impose the cross-equation restrictions on a system of equations that has been previously defined and named. The set of restrictions is started by restrict and terminated with end restrict. Each restriction in the set is expressed as an equation. Put the linear combination of parameters to be tested on the left-hand side of the equality and a numeric value on the right. Parameters are referenced using b[i,j], where i refers to the equation number in the system and j to the parameter number. Slide 15-15

restrict "Grunfeld" b[1,1]-b[2,1]=0 b[1,2]-b[2,2]=0 b[1,3]-b[2,3]=0 end restrict Slide 15-16

There are two situations where separate least squares estimation is just as good as the SUR technique:
(i) when the equation errors are not contemporaneously correlated;
(ii) when the very same explanatory variables appear in each equation.
If the explanatory variables in each equation are different, then a test to see if the correlation between the errors is significantly different from zero is of interest. Slide 15-17

r²_{GE,WE} = σ̂²_{GE,WE} / (σ̂²_{GE} σ̂²_{WE}) = (207.5871)² / [(777.4463)(104.3079)] = 0.53139 (although the text reads 0.729), where
σ̂_{GE,WE} = [1/√((T − K_GE)(T − K_WE))] Σ_{t=1}^{T} ê_{GE,t} ê_{WE,t} = [1/(T − 3)] Σ_{t=1}^{20} ê_{GE,t} ê_{WE,t}
In this case we have 3 parameters in each equation, so K_GE = K_WE = 3. Slide 15-18

Testing for correlated errors for two equations:
H_0: σ_{GE,WE} = 0
LM = T r²_{GE,WE} ~ χ²_(1) under H_0.
LM = 10.628 > 3.84 (Breusch-Pagan test of independence: chi2(1)). Hence we reject the null hypothesis of no correlation between the errors and conclude that there are potential efficiency gains from estimating the two investment equations jointly using SUR. Slide 15-19
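A minimal GRETL sketch of this LM calculation, to be run after the SUR block above; it assumes the $sigma accessor returns the 2x2 cross-equation residual covariance matrix (so any common degrees-of-freedom divisor cancels in r²).
matrix S = $sigma                       # cross-equation residual covariance matrix (assumed accessor)
scalar r2 = S[1,2]^2 / (S[1,1] * S[2,2])
scalar LM = 20 * r2                     # T = 20 time-series observations per equation
pvalue X 1 LM                           # p-value from the chi-square(1) distribution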

Testing for correlated errors for three equations:
H_0: σ_12 = σ_13 = σ_23 = 0
LM = T (r²_12 + r²_13 + r²_23) ~ χ²_(3)
Slide 15-20

Testing for correlated errors for M equations:
LM = T Σ_{i=2}^{M} Σ_{j=1}^{i-1} r²_{ij}
Under the null hypothesis that there are no contemporaneous correlations, this LM statistic has a χ²-distribution with M(M − 1)/2 degrees of freedom, in large samples. Slide 15-21

H_0: β_{1,GE} = β_{1,WE},  β_{2,GE} = β_{2,WE},  β_{3,GE} = β_{3,WE}   (15.8)
Most econometric software will perform an F-test and/or a Wald χ² test; in the context of SUR equations both tests are large-sample approximate tests. The F-statistic has J numerator degrees of freedom and (MT − K) denominator degrees of freedom, where J is the number of hypotheses, M is the number of equations, K is the total number of coefficients in the whole system, and T is the number of time-series observations per equation. The χ²-statistic has J degrees of freedom. Slide 15-22

SUR is OK when the panel is long and narrow, not when it is short and wide. Consider instead
y_{it} = β_{1it} + β_{2it} x_{2it} + β_{3it} x_{3it} + e_{it}   (15.9)
We cannot consistently estimate the 3 × N × T parameters in (15.9) with only NT total observations. But we can impose some more structure:
β_{1it} = β_{1i},  β_{2it} = β_2,  β_{3it} = β_3   (15.10)
We consider only one-way effects and assume common slope parameters across cross-sectional units. Slide 15-23

All behavioral differences between individual firms and over time are captured by the intercept. Individual intercepts are included to control for these firm-specific differences:
y_{it} = β_{1i} + β_2 x_{2it} + β_3 x_{3it} + e_{it}   (15.11)
Slide 15-24

D_{1i} = 1 if i = 1 (0 otherwise),  D_{2i} = 1 if i = 2 (0 otherwise),  D_{3i} = 1 if i = 3 (0 otherwise),  etc.
INV_{it} = β_{11} D_{1i} + β_{12} D_{2i} + ... + β_{1,10} D_{10i} + β_2 V_{it} + β_3 K_{it} + e_{it}   (15.12)
This specification is sometimes called the least squares dummy variable model, or the fixed effects model. Slide 15-25
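A minimal LSDV sketch in GRETL for the full ten-firm panel; it assumes a long-format data file grunfeld.gdt with series inv, v and k, and that genr unitdum creates unit dummies named du_1, ..., du_10 (the file name and the dummy names are assumptions).
open grunfeld.gdt
genr unitdum              # one dummy per cross-sectional unit (names assumed du_1, ..., du_10)
list FIRMDUMS = du_*      # collect the unit dummies in a list
ols inv FIRMDUMS v k      # equation (15.12): no common constant, one intercept per firm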

Slide 15-26

H_0: β_{11} = β_{12} = ... = β_{1N}
H_1: the β_{1i} are not all equal   (15.13)
These N − 1 = 9 joint null hypotheses are tested using the usual F-test statistic. In the restricted model all the intercept parameters are equal. If we call their common value β_1, then the restricted model is
INV_{it} = β_1 + β_2 V_{it} + β_3 K_{it} + e_{it}
So this is just OLS, the pooled model. Slide 15-27

STATA: reg inv v k
Slide 15-28

F = [(SSE_R − SSE_U)/J] / [SSE_U/(NT − K)] = [(1749128 − 522855)/9] / [522855/(200 − 12)] = 48.99
We reject the null hypothesis that the intercept parameters for all firms are equal. We conclude that there are differences in firm intercepts, and that the data should not be pooled into a single model with a common intercept parameter. Slide 15-29
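In GRETL this comparison is reported automatically; a minimal sketch, assuming the long-format panel with series inv, v and k.
panel inv const v k      # fixed effects is the default for the panel command
# the output includes an F test for the joint significance of the differing
# group intercepts, i.e. the pooled-vs-fixed-effects comparison computed above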

y_{it} = β_{1i} + β_2 x_{2it} + β_3 x_{3it} + e_{it},   t = 1, ..., T   (15.14)
Averaging over time,
(1/T) Σ_{t=1}^{T} y_{it} = β_{1i} + β_2 (1/T) Σ_{t=1}^{T} x_{2it} + β_3 (1/T) Σ_{t=1}^{T} x_{3it} + (1/T) Σ_{t=1}^{T} e_{it}
so that
ȳ_i = β_{1i} + β_2 x̄_{2i} + β_3 x̄_{3i} + ē_i   (15.15)
Slide 15-30

y_{it} = β_{1i} + β_2 x_{2it} + β_3 x_{3it} + e_{it}
− ( ȳ_i = β_{1i} + β_2 x̄_{2i} + β_3 x̄_{3i} + ē_i )   (15.16)
y_{it} − ȳ_i = β_2 (x_{2it} − x̄_{2i}) + β_3 (x_{3it} − x̄_{3i}) + (e_{it} − ē_i), or in deviation notation
ỹ_{it} = β_2 x̃_{2it} + β_3 x̃_{3it} + ẽ_{it}   (15.17)
Slide 15-31
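A minimal sketch of the within transformation in GRETL, assuming the data are declared as a panel and that pmean() returns each unit's time mean (the series names are assumptions).
series inv_dm = inv - pmean(inv)
series v_dm   = v - pmean(v)
series k_dm   = k - pmean(k)
ols inv_dm v_dm k_dm   # equation (15.17): slopes match the fixed-effects (within) estimator;
                       # the standard errors still need the degrees-of-freedom correction discussed below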

Slide 15-32

Usually, there is no interest in the intercepts. With all variables in deviations from firm means, the estimated equation is
INV_{it} = 0.1098 V_{it} + 0.3106 K_{it}   (15.18)
(se*)      (0.0116)      (0.0169)
The standard errors labelled se* come from software that uses σ̂²_e* = SSE/(NT − 2); the correct divisor is NT − N − 2, since N intercepts were implicitly estimated, so the reported standard errors must be scaled up by the factor
√[(NT − 2)/(NT − N − 2)] = √(198/188) = 1.02625
Slide 15-33

Some software does report an intercept, though. And if you want them, you should be able to retrieve the individual intercepts: Slide 15-34

ȳ_i = b_{1i} + b_2 x̄_{2i} + b_3 x̄_{3i}, so that
b_{1i} = ȳ_i − b_2 x̄_{2i} − b_3 x̄_{3i},   i = 1, ..., N   (15.19)
Slide 15-35

ONE PROBLEM: even with the trick of using the within estimator, we still implicitly (even if no longer explicitly) include N − 1 dummy variables in our model (not N, since we remove the intercept), so we use up N − 1 degrees of freedom. It might then not be the most efficient way to estimate the common slopes.
ANOTHER PROBLEM: by using deviations from the means, the procedure wipes out all the static (time-invariant) variables, whose effects might be of interest.
In order to overcome these problems, we can consider the random effects, or error components, model. Slide 15-36

In the RE model, the individual firm differences are thought to represent random variation about some average intercept for the individuals in the sample. Rather than a separate fixed effect for each firm, we now estimate an overall intercept that represents this average. Implicitly, the regression functions for the sample firms vary randomly around this average. The variability of the individual effects is captured by a new parameter: the variance of the random effect. The larger this parameter is, the more variation you find in the implicit regression functions for the firms.

β_{1i} = β_1 + u_i, where β_1 is the average intercept   (15.20)
E(u_i) = 0,  cov(u_i, u_j) = 0,  var(u_i) = σ²_u   (15.21)
y_{it} = β_{1i} + β_2 x_{2it} + β_3 x_{3it} + e_{it} = (β_1 + u_i) + β_2 x_{2it} + β_3 x_{3it} + e_{it}   (15.22)
where u_i captures the randomness of the intercept and e_{it} is the usual regression error. Slide 15-38

y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + (e_{it} + u_i) = β_1 + β_2 x_{2it} + β_3 x_{3it} + v_{it}   (15.23)
with the composite error
v_{it} = u_i + e_{it}   (15.24)
Because the random effects regression error has two components, one for the individual and one for the regression, the random effects model is often called an error components model. Slide 15-39

E(v_{it}) = E(u_i + e_{it}) = E(u_i) + E(e_{it}) = 0 + 0 = 0, so v has zero mean.
σ²_v = var(v_{it}) = var(u_i + e_{it}) = var(u_i) + var(e_{it}) + 2 cov(u_i, e_{it}) = σ²_u + σ²_e   (15.25)
so v has constant variance, provided there is no correlation between the individual effects and the error term. Slide 15-40

But now there are several correlations that can be considered. First, the correlation between two individuals, i and j, at the same point in time, t. The covariance for this case is given by
cov(v_{it}, v_{jt}) = E(v_{it} v_{jt}) = E[(u_i + e_{it})(u_j + e_{jt})] = E(u_i u_j) + E(u_i e_{jt}) + E(e_{it} u_j) + E(e_{it} e_{jt}) = 0 + 0 + 0 + 0 = 0
Slide 15-41

Second, the correlation between errors on the same individual i at different points in time, t and s. The covariance for this case is given by
cov(v_{it}, v_{is}) = E(v_{it} v_{is}) = E[(u_i + e_{it})(u_i + e_{is})] = E(u_i²) + E(u_i e_{is}) + E(e_{it} u_i) + E(e_{it} e_{is}) = σ²_u + 0 + 0 + 0 = σ²_u   (15.26)
Slide 15-42

Third, the correlation between errors for different individuals in different time periods. The covariance for this case is
cov(v_{it}, v_{js}) = E(v_{it} v_{js}) = E[(u_i + e_{it})(u_j + e_{js})] = E(u_i u_j) + E(u_i e_{js}) + E(e_{it} u_j) + E(e_{it} e_{js}) = 0 + 0 + 0 + 0 = 0
Slide 15-43

ρ = corr(v_{it}, v_{is}) = cov(v_{it}, v_{is}) / √[var(v_{it}) var(v_{is})] = σ²_u / (σ²_u + σ²_e)   (15.27)
The errors are correlated over time for a given individual, but are otherwise uncorrelated. This correlation does not dampen over time as it would in an AR(1) model. Slide 15-44

The pooled regression is
y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + e_{it}, with least squares residuals ê_{it} = y_{it} − b_1 − b_2 x_{2it} − b_3 x_{3it}, and
LM = √[NT / (2(T − 1))] × [ Σ_{i=1}^{N} ( Σ_{t=1}^{T} ê_{it} )² / ( Σ_{i=1}^{N} Σ_{t=1}^{T} ê²_{it} ) − 1 ]   (15.28)
GRETL shows this Breusch and Pagan Lagrange multiplier test for random effects by default. Slide 15-45
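A minimal GRETL sketch of (15.28), assuming a declared panel with N = 10 firms and T = 20 years, series inv, v and k, and the pmean() per-unit-mean function (all of these are assumptions).
ols inv const v k                # pooled regression; its residuals are the e-hats in (15.28)
series e    = $uhat
series esum = 20 * pmean(e)      # per-firm sum of residuals, repeated over that firm's 20 rows
scalar numer = sum(esum^2) / 20  # so each firm's squared sum is counted once
scalar denom = sum(e^2)
scalar LM = sqrt((10*20) / (2*(20-1))) * (numer/denom - 1)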

GRETL shows by default this Breusch and Pagan Lagrange multiplier test for RE, with the null of no variation of the individual effects about a common mean (i.e. the effects are fixed at that mean). This is xttest0 in Stata. If H0 is not rejected you can use pooled OLS, since the effects are common; use FE if they differ by group.


GRETL also shows the Hausman test of the null hypothesis that the random effects are indeed random. If they are random, then they should not be correlated with any of your other regressors. If they are correlated with other regressors, then you should use the FE estimator to obtain consistent parameter estimates of your slopes

y*_{it} = β_1 x*_{1it} + β_2 x*_{2it} + β_3 x*_{3it} + v*_{it}   (15.29)
y*_{it} = y_{it} − α ȳ_i,  x*_{1it} = 1 − α,  x*_{2it} = x_{2it} − α x̄_{2i},  x*_{3it} = x_{3it} − α x̄_{3i}   (15.30)
α = 1 − σ_e / √(T σ²_u + σ²_e)   (15.31)
is the transformation parameter. Slide 15-49

α̂ = 1 − σ̂_e / √(T σ̂²_u + σ̂²_e) = 1 − 0.1951 / √(5(0.1083) + 0.0381) = 0.7437
is the estimated transformation parameter. Slide 15-50
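A minimal GRETL sketch of the calculation in (15.31), using the variance-component estimates shown above (this example appears to use T = 5; the scalar names are arbitrary).
scalar s2_u  = 0.1083    # estimated variance of the individual effect
scalar s2_e  = 0.0381    # estimated variance of the idiosyncratic error (so sigma_e is about 0.1951)
scalar Tn    = 5
scalar alpha = 1 - sqrt(s2_e) / sqrt(Tn * s2_u + s2_e)
printf "transformation parameter alpha = %.4f\n", alpha   # about 0.7437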

There are different ways to calculate FE (some packages will calculate an intercept, some won't). There are also different ways to calculate the sigma-squared terms, so Stata (as used in the textbook) and GRETL will give you slightly different results!

Pooled OLS vs. different intercepts: test with a Chow-type test after FE, or run RE and test whether the variance of the intercept component of the error is zero (Breusch-Pagan test; xttest0 in Stata). You cannot pool into OLS? Then choose between FE and RE (Hausman test). GRETL summary tests: panel Inv const V K --pooled. Different slopes too, perhaps? Then use SURE or RCM and test for equality of slopes across units. A sketch of the GRETL sequence follows below.
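A minimal sketch of the testing sequence in GRETL, assuming a long-format panel with series Inv, V and K; each command's printout carries the diagnostics mentioned above.
panel Inv const V K --pooled           # pooled OLS benchmark
panel Inv const V K                    # fixed effects (the default); output includes an F test
                                       # for differing group intercepts (the Chow-type test)
panel Inv const V K --random-effects   # random effects; output includes the Breusch-Pagan LM
                                       # test and the Hausman test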

Note that there is within variation versus between variation. The OLS estimator is an unweighted average of the between estimator and the within estimator. The RE estimator is a weighted average of the between estimator and the within estimator. The FE estimator is also a weighted average of the between estimator and the within estimator, with zero as the weight on the between part.

The RE estimator is a weighted average of the between estimator and the within estimator. The FE estimator is also a weighted average of the between estimator and the within estimator, with zero as the weight on the between part. So now you see where the extra efficiency of RE comes from!

The RE estimator uses information from both the cross-sectional variation in the panel and the time-series variation, so it mixes LR and SR effects. The FE estimator uses only information from the time-series variation, so it estimates SR* effects.

With a panel, we can learn about dynamic effects even from a short panel, whereas with a time-series data set we need a long time series on a single cross-sectional unit to learn about dynamics.

If the random error v_{it} = u_i + e_{it} is correlated with any of the right-hand-side explanatory variables in a random effects model, then the least squares and GLS estimators of the parameters are biased and inconsistent. This bias creeps in through the between variation, of course, so the FE model will avoid it. Slide 15-57

y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + (u_i + e_{it})   (15.32)
ȳ_i = (1/T) Σ_{t=1}^{T} y_{it} = β_1 + β_2 (1/T) Σ_{t=1}^{T} x_{2it} + β_3 (1/T) Σ_{t=1}^{T} x_{3it} + u_i + (1/T) Σ_{t=1}^{T} e_{it} = β_1 + β_2 x̄_{2i} + β_3 x̄_{3i} + u_i + ē_i   (15.33)
Slide 15-58

y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + u_i + e_{it}
− ( ȳ_i = β_1 + β_2 x̄_{2i} + β_3 x̄_{3i} + u_i + ē_i )   (15.34)
y_{it} − ȳ_i = β_2 (x_{2it} − x̄_{2i}) + β_3 (x_{3it} − x̄_{3i}) + (e_{it} − ē_i)
The individual effect u_i drops out, which is why the within (FE) estimator is unaffected by correlation between u_i and the regressors. Slide 15-59

t = (b_{FE,k} − b_{RE,k}) / [var(b_{FE,k}) − var(b_{RE,k})]^{1/2} = (b_{FE,k} − b_{RE,k}) / [se(b_{FE,k})² − se(b_{RE,k})²]^{1/2}   (15.35)
We expect to find var(b_{FE,k}) − var(b_{RE,k}) > 0, because
var(b_{FE,k} − b_{RE,k}) = var(b_{FE,k}) + var(b_{RE,k}) − 2 cov(b_{FE,k}, b_{RE,k}) = var(b_{FE,k}) − var(b_{RE,k})
since Hausman proved that cov(b_{FE,k}, b_{RE,k}) = var(b_{RE,k}). Slide 15-60

The test statistic for the coefficient of SOUTH is:
t = (b_{FE,k} − b_{RE,k}) / [se(b_{FE,k})² − se(b_{RE,k})²]^{1/2} = [−0.0163 − (−0.0818)] / [(0.0361)² − (0.0224)²]^{1/2} = 2.3137
Using the standard 5% large-sample critical value of 1.96, we reject the hypothesis that the estimators yield identical results. Our conclusion is that the random effects estimator is inconsistent, and we should use the fixed effects estimator, or we should attempt to improve the model specification. Slide 15-61
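A minimal GRETL sketch of this coefficient-by-coefficient Hausman-type t test, plugging in the SOUTH values quoted above (the scalar names are arbitrary).
scalar b_fe  = -0.0163
scalar b_re  = -0.0818
scalar se_fe =  0.0361
scalar se_re =  0.0224
scalar tstat = (b_fe - b_re) / sqrt(se_fe^2 - se_re^2)
printf "Hausman t statistic = %.4f\n", tstat    # about 2.31; compare with 1.96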

If the random error v_{it} = u_i + e_{it} is correlated with any of the right-hand-side explanatory variables in a random effects model, then the least squares and GLS estimators of the parameters are biased and inconsistent, and we would have to use the FE model. But with FE we lose the static variables. Solutions? Instrumental variables models such as Hausman-Taylor (HT), Amemiya-MaCurdy (AM) and Breusch-Mizon-Schmidt (BMS) could help. Slide 15-62

Further issues. We can generalise the random effects idea and allow for different slopes too: the Random Coefficients Model (RCM). Now it is the slope parameters that differ but, as in the RE model, they are drawn from a common distribution. The RCM is, in a way, to the RE model what the SURE model is to the FE model. Slide 15-63

Further issues Unit root tests and Cointegration in panels Dynamics in panels Slide 15-64

Further issues. Of course, it is not necessary that one of the dimensions of the panel is time as such. Example: i indexes students and t indexes each quiz they take. Of course, we could have a one-way effects model on the time dimension instead, or a two-way model, or a three-way model! But things get a bit more complicated there. Slide 15-65

Further issues. Another way to have more fun with panel data is to consider dependent variables that are not continuous. Logit, probit and count-data models can be considered; Stata has commands for these, based on maximum likelihood and other estimation techniques we have not yet considered. Slide 15-66

Further issues. You can understand the use of the FE model as a solution to omitted variable bias. If the unmeasured variables left in the error term are not correlated with the ones in the model, we would not have a bias in OLS, so we can safely use RE. If the unmeasured variables left in the error term are correlated with the ones in the model, we would have a bias in OLS, so we cannot use RE; we should not leave them out, and we should use FE, which bundles them together in each cross-sectional dummy. Slide 15-67

Further issues. Another criterion to choose between FE and RE: if the panel includes all the relevant cross-sectional units, use FE; if it is only a random sample from a population, RE is more appropriate (as long as it is valid). Slide 15-68

Readings: Wooldridge's book on panel data; Baltagi's book on panel data; Greene's coverage is also good. Slide 15-69

Keywords: balanced panel; Breusch-Pagan test; cluster-corrected standard errors; contemporaneous correlation; endogeneity; error components model; fixed effects estimator; fixed effects model; Hausman test; heterogeneity; least squares dummy variable model; LM test; panel-corrected standard errors; pooled panel data regression; pooled regression; random effects estimator; random effects model; seemingly unrelated regressions; unbalanced panel. Slide 15-70

Slide 15-71

y_{it} = β_1 + β_2 x_{2it} + β_3 x_{3it} + (u_i + e_{it})   (15A.1)
y_{it} − ȳ_i = β_2 (x_{2it} − x̄_{2i}) + β_3 (x_{3it} − x̄_{3i}) + (e_{it} − ē_i)   (15A.2)
σ̂²_e = SSE_DV / (NT − N − K_slopes)   (15A.3)
Slide 15-72

ȳ_i = β_1 + β_2 x̄_{2i} + β_3 x̄_{3i} + u_i + ē_i,   i = 1, ..., N   (15A.4)
var(u_i + ē_i) = var(u_i) + var(ē_i) = var(u_i) + var( (1/T) Σ_{t=1}^{T} e_{it} ) = σ²_u + (1/T²) Σ_{t=1}^{T} var(e_{it}) = σ²_u + σ²_e / T   (15A.5)
Slide 15-73

σ̂²_u + σ̂²_e / T = SSE_BE / (N − K_BE)   (15A.6)
σ̂²_u = (σ̂²_u + σ̂²_e / T) − σ̂²_e / T = SSE_BE / (N − K_BE) − SSE_DV / [T (NT − N − K_slopes)]   (15A.7)
Slide 15-74
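A minimal GRETL sketch of the variance-component calculations in (15A.3)-(15A.7); the within-regression SSE is the value reported on slide 15-29, while the between-regression SSE here is purely hypothetical, for illustration only.
scalar N        = 10
scalar Tn       = 20
scalar K_slopes = 2                  # two slope coefficients (V and K)
scalar K_BE     = 3                  # intercept plus two slopes in the between regression
scalar SSE_DV   = 522855             # SSE from the fixed effects (dummy variable) model
scalar SSE_BE   = 1000000            # hypothetical between-regression SSE (illustration only)
scalar s2_e = SSE_DV / (N*Tn - N - K_slopes)          # (15A.3)
scalar s2_u = SSE_BE / (N - K_BE) - s2_e / Tn         # (15A.7)
printf "sigma2_e = %g, sigma2_u = %g\n", s2_e, s2_u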