Panel Data Model (January 9, 2018)


Ch 11 Panel Data Model

1 Introduction

Data sets that combine time series and cross sections are common in econometrics. For example, the published statistics of the OECD contain numerous series of economic aggregates observed yearly for many countries. The Panel Study of Income Dynamics (PSID) is a study of roughly 6000 families and individuals who have been interviewed periodically from 1968 to the present. Data of this type are often referred to as panel data.

Estimation of relationships that combine time series and cross-sectional data is a problem frequently encountered in econometrics. Typically, one may possess several years of data on a number of firms, households, geographical areas, or biological units. The problem, when using these data to estimate a relationship, is to specify a model that adequately allows for differences in behavior across cross-sectional units as well as differences in behavior over time for a given cross-sectional unit. Once a model has been specified, there are the additional problems of choosing the most efficient estimation procedure and of testing hypotheses about the parameters.

For simplicity, we restrict our attention to the linear regression model

$$Y_{it} = x_{it}'\beta + u_{it}, \quad i = 1, 2, \ldots, N, \quad t = 1, 2, \ldots, T, \tag{11-1}$$

where $x_{it}$ is a $k \times 1$ vector of observations on explanatory variables. There are assumed to be $N$ cross-sectional units and $T$ time periods, for a total of $NT$ observations. If each $u_{it}$ has expectation zero conditional on its corresponding $x_{it}$, we can estimate (11-1) by OLS. But OLS is not efficient if the $u_{it}$ are not iid, and the iid assumption is rarely realistic with panel data.

If certain shocks affect the same cross-sectional unit at all points in time, the error terms $u_{it}$ and $u_{is}$ must be correlated (serially correlated) for all $t \neq s$. Similarly, if certain shocks affect all cross-sectional units at the same point in time, the error terms $u_{it}$ and $u_{jt}$ must be correlated (cross-sectionally dependent) for all $i \neq j$. In consequence, if we use OLS, not only do we obtain inefficient parameter estimates, but we may also obtain inconsistent ones. This happens, for example, when $x_{it}$ contains lagged dependent variables and $u_{it}$ is serially correlated.

© 2016 by Prof. Chingnun Lee, Ins. of Economics, NSYSU, Taiwan

2 Error-Components Model

The two most popular approaches for dealing with panel data are both based on what are called error-components models. The idea is to specify the error term $u_{it}$ in (11-1) as consisting of two or three separate shocks, each of which is assumed to be independent of the others. A fairly general specification is

$$u_{it} = \lambda_t + v_i + \varepsilon_{it}. \tag{11-2}$$

Here $\lambda_t$ affects all observations for time period $t$, $v_i$ affects all observations for cross-sectional unit $i$, and $\varepsilon_{it}$ affects only observation $it$. It is generally assumed that the $\lambda_t$ are independent across $t$ (footnote 1), the $v_i$ are independent across $i$, and the $\varepsilon_{it}$ are independent across $i$ and $t$.

In order to estimate an error-components model, $\lambda_t$ and $v_i$ can be regarded as being either fixed or random. If the $\lambda_t$ and $v_i$ are thought of as fixed effects, or are correlated with the explanatory variables $x_{it}$, then they are treated as parameters to be estimated. It turns out that they can then be estimated by OLS using dummy variables. If they are thought of as random effects, that is, as disturbances that are uncorrelated with $x_{it}$, then we must work out the covariance matrix of $u_{it}$ and use feasible GLS. Each of these approaches can be appropriate in some circumstances but inappropriate in others.

In the rest of this chapter, we simplify the error-components specification (11-2) by eliminating the $\lambda_t$. Thus we assume that there are shocks specific to each cross-sectional unit, or group, but no time-specific shocks. This assumption is often made in empirical work (footnote 2), and it considerably simplifies the algebra.

2.1 Pooled, Fixed and Random Effects Models

With the assumption that there are shocks specific to each cross-sectional unit or group, the fundamental advantage of a panel data set over a cross section is that it allows the researcher greater flexibility in modeling differences in behavior across individuals.

Footnote 1: Although $\lambda_t$ is independent across $t$, the $u_{it}$ are still correlated across units within a period, because $E(u_{it} u_{jt}) \neq 0$.

Footnote 2: That is, the assumptions are oriented toward cross-section analysis; the panels are wide but typically (relatively) short. Heterogeneity across units is an integral part of this analysis.

To decide whether the model should have an overall constant or an individual-specific constant, we consider the following basic framework:

$$Y_{it} = x_{it}'\beta + z_i'\alpha + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-3}$$

There are $k$ regressors in $x_{it}$, not including a constant term. The heterogeneity, or individual effect, is $z_i'\alpha$, where $z_i$ contains a constant term and a set of individual- or group-specific variables. These may be observed, such as race, sex, location, and so on, or unobserved, such as family-specific characteristics, individual heterogeneity in skill or preference, and so on. All of them are taken to be constant over time $t$. The various cases we will consider are:

(a) Pooled Regression: If $z_i$ contains only a constant term, then there are no individual-specific characteristics in the model. All we need is to pool the data,

$$Y_{it} = x_{it}'\beta + \alpha + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T, \tag{11-4}$$

and OLS provides consistent and efficient estimates of the common $\beta$ and $\alpha$.

(b) Fixed Effects: If $z_i'\alpha = \alpha_i$, then the fixed effects approach takes $\alpha_i$ as a group-specific constant term in the regression model,

$$Y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-5}$$

It should be noted that the term "fixed" as used here signifies the correlation of $\alpha_i$ and $x_{it}$, not that $\alpha_i$ is non-stochastic. The $\alpha_i$ could in fact be random (footnote 3); the essential thing is that they must be independent of the error term $\varepsilon_{it}$. They may, however, be correlated with the explanatory variables $x_{it}$.

(c) Random Effects: If the individual heterogeneity is an unobserved random variable and can be assumed to be uncorrelated with the included explanatory variables, then the model may be formulated as

$$Y_{it} = x_{it}'\beta + E(z_i'\alpha) + [z_i'\alpha - E(z_i'\alpha)] + \varepsilon_{it} = x_{it}'\beta + \alpha + v_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-6}$$

Footnote 3: We can regard $\alpha_i$ as an explanatory variable like $x_{it}$.

The random effects approach specifies that $v_i$ is a group-specific random element, similar to $\varepsilon_{it}$ except that for each group there is but a single draw that enters the regression identically in each period.
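The "single draw per group" idea can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the sample sizes and the standard deviations `sigma_v` and `sigma_eps` are hypothetical, and the errors are drawn as normal for convenience. It builds $u_{it} = v_i + \varepsilon_{it}$ and compares the sample correlation of the two periods of each unit with the value $\sigma_v^2/(\sigma_v^2 + \sigma_\varepsilon^2)$ implied by the error-components structure.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 20000, 2                  # many units, two periods (illustrative sizes)
sigma_v, sigma_eps = 1.0, 2.0    # hypothetical standard deviations

v = rng.normal(scale=sigma_v, size=N)            # one draw per unit
eps = rng.normal(scale=sigma_eps, size=(N, T))   # one draw per observation
u = v[:, None] + eps                             # composite error u_it = v_i + eps_it

# Sample correlation between the two periods of the same unit
rho_hat = np.corrcoef(u[:, 0], u[:, 1])[0, 1]
# Correlation implied by the error-components structure
rho_theory = sigma_v**2 / (sigma_v**2 + sigma_eps**2)
print(rho_hat, rho_theory)
```

Because the same $v_i$ enters both periods, the composite errors of one unit are correlated even though the $\varepsilon_{it}$ are independent.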

3 Pooled Regression

We begin the analysis by assuming the simplest version of the model, i.e., the pooled model, in which there is no heterogeneity among the groups:

$$Y_{it} = x_{it}'\beta + \alpha + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T.$$

Stack the $T$ observations for individual $i$ in a single equation (footnote 4),

$$y_i = \tilde{X}_i \tilde{\beta} + \varepsilon_i, \quad i = 1, 2, \ldots, N,$$

or (footnote 5)

$$y = \tilde{X}\tilde{\beta} + \varepsilon,$$

where $\tilde{\beta} = [\alpha, \beta']'$ now includes the constant term. If it is assumed that $\varepsilon_i$ is well behaved,

$$E(\varepsilon_{it} \mid X_i) = 0, \quad t = 1, 2, \ldots, T,$$
$$E(\varepsilon_i \varepsilon_i' \mid X_i) = \sigma^2 I_T; \qquad E(\varepsilon_i \varepsilon_j' \mid X_i) = 0 \ \text{if } i \neq j,$$

then the most efficient estimator of $\tilde{\beta}$ is

$$\hat{\tilde{\beta}} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'y,$$

with estimated variance

$$\widehat{Var}(\hat{\tilde{\beta}}) = s^2 (\tilde{X}'\tilde{X})^{-1}, \qquad s^2 = \frac{e'e}{NT - k - 1}, \qquad e = y - \tilde{X}\hat{\tilde{\beta}}.$$

It is quite likely that there is correlation across the observations of a unit, such that $E(\varepsilon_i \varepsilon_i' \mid X_i) = \Omega_i$. Then, in the same spirit as the White estimator, we can estimate the appropriate asymptotic covariance matrix as

$$\widehat{Asy.Var}(\hat{\tilde{\beta}}) = (\tilde{X}'\tilde{X})^{-1} \left( \sum_{i=1}^{N} \tilde{X}_i' e_i e_i' \tilde{X}_i \right) (\tilde{X}'\tilde{X})^{-1}.$$

Exercise 1. Reproduce the results of Example 9.1 (Table 9.1) on p. 187 of Greene, 6th edition.

Footnote 4: Here $y_i = [Y_{i1}, Y_{i2}, \ldots, Y_{iT}]'$ and $\tilde{X}_i$ stacks the $T$ rows of regressors (including the constant) for unit $i$.

Footnote 5: Here $y = [y_1', y_2', \ldots, y_N']'$ and $\tilde{X} = [\tilde{X}_1', \tilde{X}_2', \ldots, \tilde{X}_N']'$.
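The pooled estimator and the White-style covariance matrix above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not a reproduction of Greene's example: the function name `pooled_ols_cluster`, the data-generating values and the sample sizes are all assumptions made for the sketch. The "meat" of the sandwich sums the outer products of $\tilde{X}_i' e_i$ over units.

```python
import numpy as np

def pooled_ols_cluster(y, X, groups):
    """Pooled OLS with a covariance matrix that allows arbitrary
    correlation within each cross-sectional unit (hypothetical helper)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta                      # pooled OLS residuals
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        s = X[idx].T @ e[idx]             # X_i' e_i, a k-vector
        meat += np.outer(s, s)            # X_i' e_i e_i' X_i
    V = XtX_inv @ meat @ XtX_inv          # sandwich form
    return beta, V

# Simulated panel with a unit-specific shock (illustrative values only)
rng = np.random.default_rng(0)
N, T = 50, 5
groups = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
v = np.repeat(rng.normal(size=N), T)      # induces within-unit correlation
y = 1.0 + 2.0 * x + v + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])  # constant plus one regressor

beta_hat, V_hat = pooled_ols_cluster(y, X, groups)
print(beta_hat, np.sqrt(np.diag(V_hat)))
```

No small-sample degrees-of-freedom correction is applied here; applied software often rescales this matrix, which is a design choice outside the formula in the text.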

4 Fixed-Effects Model

This formulation of the model assumes that differences across units can be captured in differences in the constant term. Each $\alpha_i$ is the unobserved time-invariant individual effect (footnote 6):

$$Y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T.$$

Unlike the random effects model, where the unobserved $\alpha_i$ is independent of $x_{it}$ for all $t = 1, \ldots, T$, the fixed effects (FE) model allows $\alpha_i$ to be correlated with the regressors $x_{it}$. Strict exogeneity with respect to the idiosyncratic error term $\varepsilon_{it}$, however, is still required.

Let $y_i$ and $X_i$ be the $T$ observations for the $i$th unit, let $\mathbf{i}$ be a $T \times 1$ column of ones, and let $\varepsilon_i$ be the associated $T \times 1$ vector of disturbances. Then the matrix form of (11-5) for an individual $i$ can be written as

$$y_i = X_i\beta + \mathbf{i}\alpha_i + \varepsilon_i, \quad i = 1, 2, \ldots, N. \tag{11-7}$$

It is also assumed that the disturbance terms are well behaved, that is,

$$E(\varepsilon_i\varepsilon_i' \mid X_i) = \sigma^2 I_T; \qquad E(\varepsilon_i\varepsilon_j' \mid X_i) = 0 \ \text{if } i \neq j, \tag{11-8}$$
$$E(\varepsilon_{it} \mid X_i, \alpha_i) = 0, \quad t = 1, 2, \ldots, T. \tag{11-9}$$

4.1 Fixed Effects Transformation (Within Transformation)

4.1.1 Estimation of β

Collecting all the cross-sectional units in (11-7) in matrix form, we have

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}\beta + \begin{bmatrix} \mathbf{i} & 0 & \cdots & 0 \\ 0 & \mathbf{i} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{i} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix},$$

or in more compact form

$$y = X\beta + D\alpha + \varepsilon,$$

where $y$ and $\varepsilon$ are $NT \times 1$, $X$ is $NT \times k$, $\beta$ is $k \times 1$, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]'$, and the block-diagonal matrix $D = [d_1 \ d_2 \ \cdots \ d_N]$ is $NT \times N$, with $d_i$ a dummy variable indicating the $i$th unit. This model is usually referred to as the least squares dummy variable (LSDV) model.

Since this model satisfies the ideal conditions, the OLS estimator is BLUE. Using the familiar partitioned regression of Ch 6, the slope estimator is (footnote 7)

$$\hat{\beta}_{FE} = (X'M_DX)^{-1}X'M_Dy, \tag{11-10}$$

where

$$M_D = I_{NT} - D(D'D)^{-1}D'.$$

Lemma (Group-Demeaned Matrix).

$$M_D = I_{NT} - D(D'D)^{-1}D' = \begin{bmatrix} M_{\mathbf{i}} & 0 & \cdots & 0 \\ 0 & M_{\mathbf{i}} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_{\mathbf{i}} \end{bmatrix},$$

where $M_{\mathbf{i}} = I_T - \mathbf{i}(\mathbf{i}'\mathbf{i})^{-1}\mathbf{i}' = I_T - \frac{1}{T}\mathbf{i}\mathbf{i}'$ is the demeaning matrix.

Footnote 6: Unlike $x_{it}$, $\alpha_i$ cannot be directly observed.

Footnote 7: Since $\alpha_i$ is not observable, it cannot be directly controlled for. The FE model eliminates $\alpha_i$ by demeaning the variables using the within transformation.
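The lemma and the partitioned-regression formula (11-10) can be checked numerically before turning to the proof. The following sketch uses a small simulated balanced panel (sizes, coefficients and seed are arbitrary assumptions) and builds $D$ and $M_D$ explicitly, which is only sensible for small $NT$; it then compares (11-10) with a brute-force regression of $y$ on $[X, D]$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, k = 6, 4, 2
groups = np.repeat(np.arange(N), T)            # unit index of each stacked row
alpha = rng.normal(size=N)
# Regressors deliberately correlated with the individual effects
X = rng.normal(size=(N * T, k)) + alpha[groups, None]
y = X @ np.array([1.5, -0.5]) + alpha[groups] + rng.normal(size=N * T)

ones = np.ones((T, 1))
D = np.kron(np.eye(N), ones)                   # NT x N matrix of unit dummies
M_D = np.eye(N * T) - D @ np.linalg.inv(D.T @ D) @ D.T
M_i = np.eye(T) - ones @ ones.T / T            # T x T demeaning matrix

# Partitioned-regression form (11-10)
beta_fe = np.linalg.solve(X.T @ M_D @ X, X.T @ M_D @ y)

# Brute-force LSDV: regress y on [X, D] and keep the slope block
coef = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)[0]
beta_lsdv = coef[:k]

print(np.allclose(M_D, np.kron(np.eye(N), M_i)))   # block-diagonal structure
print(np.allclose(beta_fe, beta_lsdv))             # same slope estimates
```

In practice one would demean by group rather than form the $NT \times NT$ matrix $M_D$; the explicit matrix is used here only to mirror the algebra.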

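Proof. By definition,

$$D'D = \begin{bmatrix} \mathbf{i}'\mathbf{i} & 0 & \cdots & 0 \\ 0 & \mathbf{i}'\mathbf{i} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{i}'\mathbf{i} \end{bmatrix} = \begin{bmatrix} T & 0 & \cdots & 0 \\ 0 & T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & T \end{bmatrix}_{N \times N},$$

and therefore

$$M_D = I_{NT} - D(D'D)^{-1}D' = I_{NT} - \frac{1}{T}DD' = \begin{bmatrix} I_T - \frac{1}{T}\mathbf{i}\mathbf{i}' & 0 & \cdots & 0 \\ 0 & I_T - \frac{1}{T}\mathbf{i}\mathbf{i}' & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & I_T - \frac{1}{T}\mathbf{i}\mathbf{i}' \end{bmatrix} = \begin{bmatrix} M_{\mathbf{i}} & 0 & \cdots & 0 \\ 0 & M_{\mathbf{i}} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & M_{\mathbf{i}} \end{bmatrix}.$$

It is easy to see that the matrix $M_D$ is idempotent and that

$$M_Dy = \begin{bmatrix} M_{\mathbf{i}}y_1 \\ M_{\mathbf{i}}y_2 \\ \vdots \\ M_{\mathbf{i}}y_N \end{bmatrix} = \begin{bmatrix} y_1 - \bar{Y}_1\mathbf{i} \\ y_2 - \bar{Y}_2\mathbf{i} \\ \vdots \\ y_N - \bar{Y}_N\mathbf{i} \end{bmatrix}, \qquad M_DX = \begin{bmatrix} M_{\mathbf{i}}X_1 \\ M_{\mathbf{i}}X_2 \\ \vdots \\ M_{\mathbf{i}}X_N \end{bmatrix},$$

where the scalar $\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}$, $i = 1, 2, \ldots, N$. Write $X_i$ column by column as $X_i = [x_{i1} \ x_{i2} \ \cdots \ x_{ik}]$, where here $x_{ij}$ denotes the $T \times 1$ column of observations on the $j$th regressor (footnote 8). Then

$$M_{\mathbf{i}}X_i = [M_{\mathbf{i}}x_{i1} \ M_{\mathbf{i}}x_{i2} \ \cdots \ M_{\mathbf{i}}x_{ik}], \qquad M_{\mathbf{i}}x_{ij} = x_{ij} - \bar{X}_{ij}\mathbf{i}, \quad j = 1, 2, \ldots, k,$$

with $\bar{X}_{ij} = \frac{1}{T}\sum_{t=1}^{T} X_{itj}$. Denote $\bar{x}_i = [\bar{X}_{i1}, \bar{X}_{i2}, \ldots, \bar{X}_{ik}]'$. The least squares computation of the fixed effects slope estimator $\hat{\beta}_{FE}$ in (11-10), i.e., the regression of $M_Dy$ on $M_DX$, is therefore equivalent to a regression of $[Y_{it} - \bar{Y}_i]$ on $[x_{it} - \bar{x}_i]$ (footnote 9).

The idea for estimating $\beta$ is to transform the equations so as to eliminate the fixed effect $\alpha_i$. The fixed effects transformation above is called the within transformation. To see this, the FE transformation is obtained by first averaging (11-5) over $t = 1, \ldots, T$ to get the cross-sectional equation

$$\bar{Y}_i = \bar{x}_i'\beta + \alpha_i + \bar{\varepsilon}_i, \tag{11-11}$$

where $\bar{x}_i = T^{-1}\sum_{t=1}^{T} x_{it}$ and $\bar{\varepsilon}_i = T^{-1}\sum_{t=1}^{T} \varepsilon_{it}$. Subtracting (11-11) from (11-5) for each $t$ gives the FE transformed equation,

$$Y_{it} - \bar{Y}_i = (x_{it} - \bar{x}_i)'\beta + \varepsilon_{it} - \bar{\varepsilon}_i. \tag{11-12}$$

The time demeaning of the original equation has removed the individual-specific effect $\alpha_i$.

Exercise 2. Show that $(\varepsilon_{it} - \bar{\varepsilon}_i)$ in Eq. (11-12) satisfies the ideal conditions.

4.1.2 Estimation of α

Denote the LSDV residuals by

$$e = y - X\hat{\beta}_{FE} - D\hat{\alpha}; \tag{11-13}$$

then the dummy variable coefficient estimators can be recovered from

$$D'y = D'X\hat{\beta}_{FE} + D'D\hat{\alpha} + D'e,$$

or

$$\hat{\alpha} = (D'D)^{-1}D'(y - X\hat{\beta}_{FE}), \tag{11-14}$$

because $D'e = 0$. This implies that

$$\begin{bmatrix} \hat{\alpha}_1 \\ \hat{\alpha}_2 \\ \vdots \\ \hat{\alpha}_N \end{bmatrix} = \frac{1}{T} D' \begin{bmatrix} y_1 - X_1\hat{\beta}_{FE} \\ y_2 - X_2\hat{\beta}_{FE} \\ \vdots \\ y_N - X_N\hat{\beta}_{FE} \end{bmatrix} = \begin{bmatrix} \bar{Y}_1 - \bar{x}_1'\hat{\beta}_{FE} \\ \bar{Y}_2 - \bar{x}_2'\hat{\beta}_{FE} \\ \vdots \\ \bar{Y}_N - \bar{x}_N'\hat{\beta}_{FE} \end{bmatrix}.$$

Footnote 8: That is,

$$X_i = \begin{bmatrix} x_{i1}' \\ x_{i2}' \\ \vdots \\ x_{iT}' \end{bmatrix} = \begin{bmatrix} X_{i11} & X_{i12} & \cdots & X_{i1k} \\ X_{i21} & X_{i22} & \cdots & X_{i2k} \\ \vdots & & & \vdots \\ X_{iT1} & X_{iT2} & \cdots & X_{iTk} \end{bmatrix},$$

where the rows are the $x_{it}'$, $t = 1, \ldots, T$, and the columns are the $T \times 1$ vectors of observations on each regressor.

Footnote 9: To see this, recall that

$$M_{\mathbf{i}}X_i = \begin{bmatrix} X_{i11} - \bar{X}_{i1} & X_{i12} - \bar{X}_{i2} & \cdots & X_{i1k} - \bar{X}_{ik} \\ X_{i21} - \bar{X}_{i1} & X_{i22} - \bar{X}_{i2} & \cdots & X_{i2k} - \bar{X}_{ik} \\ \vdots & & & \vdots \\ X_{iT1} - \bar{X}_{i1} & X_{iT2} - \bar{X}_{i2} & \cdots & X_{iTk} - \bar{X}_{ik} \end{bmatrix} = \begin{bmatrix} (x_{i1} - \bar{x}_i)' \\ (x_{i2} - \bar{x}_i)' \\ \vdots \\ (x_{iT} - \bar{x}_i)' \end{bmatrix}.$$

4.1.3 Variance of the Fixed Effects Estimators

The variances of the fixed-effects estimators $\hat{\beta}_{FE}$ and $\hat{\alpha}$ are as follows.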

Theorem (Var($\hat{\beta}_{FE}$)). Let the fixed effects model be partitioned as $y = X\hat{\beta}_{FE} + D\hat{\alpha} + e$. The variance of $\hat{\beta}_{FE}$ is

$$Var(\hat{\beta}_{FE}) = \sigma^2 (X'M_DX)^{-1}.$$

Proof. Since $\hat{\beta} = (X'M_DX)^{-1}X'M_Dy = \beta + (X'M_DX)^{-1}X'M_D\varepsilon$, we have

$$Var(\hat{\beta}) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] = E[(X'M_DX)^{-1}X'M_D\varepsilon\varepsilon'M_DX(X'M_DX)^{-1}] \tag{11-15}$$
$$= \sigma^2 (X'M_DX)^{-1}X'M_D I_{NT} M_DX(X'M_DX)^{-1} = \sigma^2 (X'M_DX)^{-1}X'M_DX(X'M_DX)^{-1} = \sigma^2 (X'M_DX)^{-1}. \tag{11-16}$$

With the above results, the appropriate estimator of $Var(\hat{\beta})$ is therefore

$$\widehat{Var}(\hat{\beta}) = s^2 (X'M_DX)^{-1},$$

where the disturbance variance estimator $s^2$ is

$$s^2 = \frac{(y - X\hat{\beta} - D\hat{\alpha})'(y - X\hat{\beta} - D\hat{\alpha})}{NT - N - k} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} (Y_{it} - x_{it}'\hat{\beta} - \hat{\alpha}_i)^2}{NT - N - k}. \tag{11-17}$$

Theorem (Var($\hat{\alpha}$)). The variance of the individual effect estimator is

$$Var(\hat{\alpha}_i) = \frac{\sigma^2}{T} + \bar{x}_i' \, Var(\hat{\beta}) \, \bar{x}_i.$$

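Proof. Write

$$\hat{\alpha}_i = \bar{Y}_i - \bar{x}_i'\hat{\beta} = \frac{1}{T}\sum_{t=1}^{T}(\alpha_i + x_{it}'\beta + \varepsilon_{it}) - \bar{x}_i'\hat{\beta}.$$

The expected value of $\hat{\alpha}_i$ is therefore

$$E(\hat{\alpha}_i) = \frac{1}{T}\sum_{t=1}^{T}(\alpha_i + x_{it}'\beta) - \bar{x}_i'\beta = \alpha_i.$$

The variance of $\hat{\alpha}_i$ is therefore

$$Var(\hat{\alpha}_i) = E(\hat{\alpha}_i - \alpha_i)^2 = E\left[\frac{1}{T}\sum_{t=1}^{T}\varepsilon_{it} - \bar{x}_i'(\hat{\beta} - \beta)\right]^2 = \frac{1}{T^2}E\left(\sum_{t=1}^{T}\varepsilon_{it}\right)^2 + E\left(\bar{x}_i'(\hat{\beta} - \beta)\right)^2$$
$$= \frac{\sigma^2}{T} + E[\bar{x}_i'(\hat{\beta} - \beta)(\hat{\beta} - \beta)'\bar{x}_i] = \frac{\sigma^2}{T} + \bar{x}_i' \, Var(\hat{\beta}) \, \bar{x}_i.$$

The cross term vanishes because $\hat{\beta} - \beta$ depends on the disturbances only through $M_D\varepsilon$, and under (11-8) the group mean $\bar{\varepsilon}_i$ is uncorrelated with the demeaned disturbances.

4.1.4 Testing the Significance of the Group Effects

Consider the null hypothesis

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_N = \alpha.$$

Under this null hypothesis, the efficient estimator is pooled least squares. The F ratio used for the test is

$$F_{N-1,\, NT-N-k} = \frac{(R^2_{LSDV} - R^2_{Pooled})/(N - 1)}{(1 - R^2_{LSDV})/(NT - N - k)}, \tag{11-18}$$

where $R^2_{LSDV}$ indicates the $R^2$ from the dummy variables model and $R^2_{Pooled}$ indicates the $R^2$ from the pooled, or restricted, model with only a single overall constant.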
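As a rough illustration of (11-18), the sketch below computes the F ratio on a simulated balanced panel. The helper `r_squared`, the seed and the data-generating values are assumptions made for the example; the statistic would be compared with a critical value from the $F(N-1, NT-N-k)$ distribution, which is not computed here.

```python
import numpy as np

def r_squared(y, Z):
    """Centered R^2 from an OLS regression of y on Z (hypothetical helper)."""
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Simulated panel with sizeable individual effects (illustrative values)
rng = np.random.default_rng(2)
N, T, k = 20, 5, 2
groups = np.repeat(np.arange(N), T)
alpha = rng.normal(scale=2.0, size=N)
X = rng.normal(size=(N * T, k))
y = X @ np.array([1.0, -1.0]) + alpha[groups] + rng.normal(size=N * T)

D = np.kron(np.eye(N), np.ones((T, 1)))                 # unit dummies
R2_lsdv = r_squared(y, np.hstack([X, D]))               # unrestricted model
R2_pooled = r_squared(y, np.hstack([np.ones((N * T, 1)), X]))  # single constant

df1, df2 = N - 1, N * T - N - k
F = ((R2_lsdv - R2_pooled) / df1) / ((1.0 - R2_lsdv) / df2)
print(F, df1, df2)
```

The same number results from the familiar form based on restricted and unrestricted sums of squared residuals, since both models share the same total sum of squares.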

Example. See Example 9.4 (p. 199) of Greene, 6th edition.

4.1.5 Nonspherical Disturbances and Robust Variance Matrix Estimator

Without assumption (11-8), expression (11-16) gives an improper variance matrix estimator. While heteroscedasticity in $\varepsilon_{it}$ is always a potential problem, the more substantive problem is cross-observation correlation, or autocorrelation. In a longitudinal data set, the group of observations may all pertain to the same individual, so any latent effects left out of the model will carry across all periods. Suppose we assume that, instead of (11-8), the disturbance satisfies

$$E(\varepsilon_i\varepsilon_i') = \Omega_i. \tag{11-19}$$

Following (11-15), the robust variance matrix estimator of $\hat{\beta}_{FE}$ is then

$$\widehat{Asy.Var}(\hat{\beta}) = (X'M_DX)^{-1}\left(\sum_{i=1}^{N} \ddot{X}_i' e_i e_i' \ddot{X}_i\right)(X'M_DX)^{-1}, \tag{11-20}$$

where $M_DX = \ddot{X} = [\ddot{X}_1', \ddot{X}_2', \ldots, \ddot{X}_N']'$ and $e_i$ is the vector of $T$ residuals for the $i$th individual (see (11-13)). This estimator was suggested by Arellano (1987) and follows the general results of White (1984). The robust variance matrix estimator is valid in the presence of any heteroscedasticity and serial correlation in $\varepsilon_{it}$, provided that $T$ is small relative to $N$.

4.2 First Differencing Transformation

We can also use differencing to eliminate the unobserved effect $\alpha_i$. Lagging the model (11-5) one period and subtracting gives

$$\Delta Y_{it} = \Delta x_{it}'\beta + \Delta\varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 2, \ldots, T, \tag{11-21}$$

where $\Delta Y_{it} = Y_{it} - Y_{i,t-1}$, $\Delta x_{it} = x_{it} - x_{i,t-1}$ and $\Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$. As with the fixed effects transformation, this first-differencing transformation eliminates the unobserved effect $\alpha_i$. In differencing we lose the first time period for each cross section.

The first-difference (FD) estimator, $\hat{\beta}_{FD}$, is the pooled OLS estimator from the regression of

$$\Delta Y_{it} \ \text{on} \ \Delta x_{it}, \quad t = 2, \ldots, T; \ i = 1, 2, \ldots, N. \tag{11-22}$$

Denote $\upsilon_{it} = \Delta\varepsilon_{it}$. When we assume that

$$E(\upsilon_i\upsilon_i') = \sigma_\upsilon^2 I_{T-1}, \tag{11-23}$$

the variance of $\hat{\beta}_{FD}$ can be estimated by

$$\widehat{Var}(\hat{\beta}_{FD}) = \hat{\sigma}_\upsilon^2 (\Delta X'\Delta X)^{-1},$$

where $\hat{\sigma}_\upsilon^2$ is a consistent estimator of $\sigma_\upsilon^2$. The simplest estimator is obtained by computing the OLS residuals $\hat{\upsilon}_{it} = \Delta Y_{it} - \Delta x_{it}'\hat{\beta}_{FD}$ from the pooled regression (11-22). A consistent estimate of $\sigma_\upsilon^2$ is

$$\hat{\sigma}_\upsilon^2 = \frac{\sum_{i=1}^{N}\sum_{t=2}^{T} \hat{\upsilon}_{it}^2}{N(T - 1) - k}.$$

If assumption (11-23) is violated, then, as usual, we can compute a robust variance matrix,

$$\widehat{Asy.Var}(\hat{\beta}_{FD}) = (\Delta X'\Delta X)^{-1}\left(\sum_{i=1}^{N} \Delta X_i' \hat{\upsilon}_i \hat{\upsilon}_i' \Delta X_i\right)(\Delta X'\Delta X)^{-1},$$

where $\Delta X$ denotes the $N(T - 1) \times k$ matrix of stacked first differences of $x_{it}$.
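A compact sketch of the FD estimator with both variance estimators might look as follows. The function name `first_difference`, the simulated data and the assumption that rows are stacked unit by unit (ordered by $i$, then $t$) in a balanced panel are choices made for this illustration only.

```python
import numpy as np

def first_difference(y, X, N, T):
    """FD estimator for a balanced panel whose rows are ordered by unit,
    then by time (hypothetical helper)."""
    k = X.shape[1]
    dy = np.diff(y.reshape(N, T), axis=1)        # N x (T-1) differences of Y
    dX = np.diff(X.reshape(N, T, k), axis=1)     # N x (T-1) x k differences of x
    dX2, dy2 = dX.reshape(-1, k), dy.reshape(-1)
    A_inv = np.linalg.inv(dX2.T @ dX2)
    beta = A_inv @ dX2.T @ dy2                   # pooled OLS on differences
    res = dy - dX @ beta                         # residuals, N x (T-1)
    s2 = np.sum(res ** 2) / (N * (T - 1) - k)    # estimate of sigma_upsilon^2
    V_classic = s2 * A_inv                       # valid under (11-23)
    meat = np.zeros((k, k))
    for i in range(N):
        s = dX[i].T @ res[i]                     # Delta X_i' upsilon_hat_i
        meat += np.outer(s, s)
    V_robust = A_inv @ meat @ A_inv              # robust sandwich form
    return beta, V_classic, V_robust

# Simulated panel with individual effects correlated with x (illustrative)
rng = np.random.default_rng(3)
N, T = 40, 4
groups = np.repeat(np.arange(N), T)
alpha = rng.normal(size=N)
X = rng.normal(size=(N * T, 2)) + alpha[groups, None]
y = X @ np.array([1.0, 2.0]) + alpha[groups] + rng.normal(size=N * T)

beta_fd, V_classic, V_robust = first_difference(y, X, N, T)
print(beta_fd)
```

With only two periods ($T = 2$) the FD and within estimators coincide numerically; for longer panels they generally differ.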

4.3 The Within and Between Groups Estimators

We could formulate a pooled regression model in three ways.

(a) First, the original formulation is

$$Y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-24}$$

(b) In terms of deviations from the group means,

$$Y_{it} - \bar{Y}_i = (x_{it} - \bar{x}_i)'\beta + \varepsilon_{it} - \bar{\varepsilon}_i, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-25}$$

(c) In terms of the group means,

$$\bar{Y}_i = \alpha + \bar{x}_i'\beta + \bar{\varepsilon}_i, \quad i = 1, 2, \ldots, N. \tag{11-26}$$

To estimate $\beta$ by pooled OLS in (11-24) (partialling out the effect of $\alpha$), we would use the total sums of squares and cross products,

$$S^{Total}_{xx} = \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{\bar{x}})(x_{it} - \bar{\bar{x}})' \quad \text{and} \quad S^{Total}_{xy} = \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{\bar{x}})(Y_{it} - \bar{\bar{Y}}),$$

where $\bar{\bar{x}} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it}$ and $\bar{\bar{Y}} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} Y_{it}$ are the overall means.

In (11-25), the moment matrices we use are the within-groups (i.e., deviations from the group means) sums of squares and cross products,

$$S^{Within}_{xx} = \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)', \tag{11-27}$$
$$S^{Within}_{xy} = \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(Y_{it} - \bar{Y}_i). \tag{11-28}$$

Finally, for (11-26), the mean of the group means is the overall mean (i.e., $\frac{1}{N}\sum_{i=1}^{N}\bar{Y}_i = \bar{\bar{Y}}$). Therefore the moment matrices are the between-groups sums of squares and cross products,

$$S^{Between}_{xx} = \sum_{i=1}^{N} T(\bar{x}_i - \bar{\bar{x}})(\bar{x}_i - \bar{\bar{x}})' \quad \text{and} \quad S^{Between}_{xy} = \sum_{i=1}^{N} T(\bar{x}_i - \bar{\bar{x}})(\bar{Y}_i - \bar{\bar{Y}}).$$

It is easy to verify that

$$S^{Total}_{xx} = S^{Within}_{xx} + S^{Between}_{xx} \quad \text{and} \quad S^{Total}_{xy} = S^{Within}_{xy} + S^{Between}_{xy}.$$

Therefore, there are three possible least squares estimators of $\beta$ corresponding to these decompositions.

(a) The least squares estimator in the pooled regression (11-24) is

$$\hat{\beta}^{Total} = (S^{Total}_{xx})^{-1}S^{Total}_{xy} = (S^{Within}_{xx} + S^{Between}_{xx})^{-1}(S^{Within}_{xy} + S^{Between}_{xy}).$$

(b) The within-groups estimator from (11-25) is

$$\hat{\beta}^{Within} = (S^{Within}_{xx})^{-1}S^{Within}_{xy} = \left(\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)'\right)^{-1}\left(\sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(Y_{it} - \bar{Y}_i)\right).$$

This is the LSDV estimator computed earlier.

(c) An alternative estimator is the between-groups estimator from (11-26),

$$\hat{\beta}^{Between} = (S^{Between}_{xx})^{-1}S^{Between}_{xy}.$$

This is the least squares estimator based on the $N$ sets of group means.

From the preceding expressions we know that

$$S^{Within}_{xy} = S^{Within}_{xx}\hat{\beta}^{Within} \quad \text{and} \quad S^{Between}_{xy} = S^{Between}_{xx}\hat{\beta}^{Between},$$

and hence

$$\hat{\beta}^{Total} = (S^{Within}_{xx} + S^{Between}_{xx})^{-1}(S^{Within}_{xx}\hat{\beta}^{Within} + S^{Between}_{xx}\hat{\beta}^{Between})$$
$$= (S^{Within}_{xx} + S^{Between}_{xx})^{-1}S^{Within}_{xx}\hat{\beta}^{Within} + (S^{Within}_{xx} + S^{Between}_{xx})^{-1}S^{Between}_{xx}\hat{\beta}^{Between}$$
$$= F^{Within}\hat{\beta}^{Within} + F^{Between}\hat{\beta}^{Between},$$

where

$$F^{Within} = (S^{Within}_{xx} + S^{Between}_{xx})^{-1}S^{Within}_{xx}, \qquad F^{Between} = (S^{Within}_{xx} + S^{Between}_{xx})^{-1}S^{Between}_{xx},$$

and

$$F^{Within} + F^{Between} = (S^{Within}_{xx} + S^{Between}_{xx})^{-1}(S^{Within}_{xx} + S^{Between}_{xx}) = I.$$

That is, the pooled OLS estimator is a matrix weighted average of the within- and between-groups estimators.

4.4 Fixed Effects with Unbalanced Panels

In our treatment of panel data models we have assumed that a balanced panel is available: each cross-sectional unit has the same time periods available. Often, some time periods are missing for some units in the population of interest, and we are left with an unbalanced panel. The preceding analysis assumed equal group sizes and relied on that assumption at several points. A modification to allow unequal group sizes is quite simple. First, the full sample size is $\sum_{i=1}^{N} T_i$ instead of $NT$, which calls for minor modifications in the computation of $s^2$, $Var(\hat{\beta})$ and the F statistic. Second, group means must be based on $T_i$, which now varies across groups. The regressors' moment matrix shown in (11-27),

$$S^{Within}_{xx} = \sum_{i=1}^{N}\sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)',$$

now is

$$S^{Within}_{xx,unbalanced} = \sum_{i=1}^{N}\left(\sum_{t=1}^{T_i} (x_{it} - \bar{x}_{i,un})(x_{it} - \bar{x}_{i,un})'\right), \tag{11-29}$$

where

$$\bar{x}_{i,un} = \frac{1}{T_i}\sum_{t=1}^{T_i} x_{it}.$$

The other moments, $S^{Within}_{xy,unbalanced}$ and $S^{Within}_{yy,unbalanced}$, are computed likewise. No other changes are necessary for the one-factor LSDV estimators.
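The decomposition of Section 4.3 and the group-size adjustment of Section 4.4 can be illustrated together. The sketch below is an assumption-laden example: the helper `moment_matrices`, the simulated unbalanced panel and, in particular, the weighting of the between-groups moments by $T_i$ (which reduces to the factor $T$ used in the text when the panel is balanced) are choices made for this illustration.

```python
import numpy as np

def moment_matrices(y, X, groups):
    """Within, between and total moments for a possibly unbalanced panel
    (hypothetical helper; between moments are weighted by T_i)."""
    k = X.shape[1]
    xbar, ybar = X.mean(axis=0), y.mean()        # overall means
    Sw_xx, Sw_xy = np.zeros((k, k)), np.zeros(k)
    Sb_xx, Sb_xy = np.zeros((k, k)), np.zeros(k)
    for g in np.unique(groups):
        idx = groups == g
        Ti = idx.sum()                           # group size T_i
        xg, yg = X[idx].mean(axis=0), y[idx].mean()
        Xd, yd = X[idx] - xg, y[idx] - yg        # deviations from group means
        Sw_xx += Xd.T @ Xd
        Sw_xy += Xd.T @ yd
        Sb_xx += Ti * np.outer(xg - xbar, xg - xbar)
        Sb_xy += Ti * (xg - xbar) * (yg - ybar)
    St_xx = (X - xbar).T @ (X - xbar)
    St_xy = (X - xbar).T @ (y - ybar)
    return (Sw_xx, Sw_xy), (Sb_xx, Sb_xy), (St_xx, St_xy)

# Simulated unbalanced panel (illustrative values only)
rng = np.random.default_rng(4)
n_units = 25
T_i = rng.integers(2, 7, size=n_units)           # group sizes between 2 and 6
groups = np.repeat(np.arange(n_units), T_i)
n = groups.size
X = rng.normal(size=(n, 2)) + rng.normal(size=n_units)[groups, None]
y = 0.5 + X @ np.array([1.0, 2.0]) + rng.normal(size=n_units)[groups] + rng.normal(size=n)

(Sw_xx, Sw_xy), (Sb_xx, Sb_xy), (St_xx, St_xy) = moment_matrices(y, X, groups)
b_within = np.linalg.solve(Sw_xx, Sw_xy)
b_between = np.linalg.solve(Sb_xx, Sb_xy)
b_total = np.linalg.solve(St_xx, St_xy)
F_within = np.linalg.solve(Sw_xx + Sb_xx, Sw_xx)
F_between = np.eye(2) - F_within

# Pooled OLS slopes as a matrix weighted average of within and between
print(np.allclose(b_total, F_within @ b_within + F_between @ b_between))
```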

5 Random Effects

The fixed effects model allows the unobserved individual effects to be correlated with the included variables. We then modeled the differences between units strictly as parametric shifts of the regression function. This model might be viewed as applying only to the cross-sectional units in the study, not to additional ones outside the sample. For example, an intercountry comparison may well include the full set of countries for which it is reasonable to assume that the model is constant.

If the individual effects are strictly uncorrelated with the regressors, then it might be appropriate to model the individual-specific constant terms as randomly distributed across cross-sectional units. This view would be appropriate if we believed that the sampled cross-sectional units were drawn from a large population. The payoff to this form is that it greatly reduces the number of parameters to be estimated. The cost is the possibility of inconsistent estimates, should the assumption turn out to be inappropriate.

Consider the model

$$Y_{it} = x_{it}'\beta + \alpha + v_i + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T, \tag{11-30}$$

where there are $k$ regressors including a constant, and now the single constant term $\alpha$ is the mean of the unobserved heterogeneity, $E(z_i'\alpha)$ in (11-6). The component $v_i$ is the random heterogeneity specific to the $i$th unit and is constant through time (footnote 10). We assume further that, for $t = 1, 2, \ldots, T$,

$$E(\varepsilon_{it} \mid X_i) = E(v_i \mid X_i) = 0;$$
$$E(\varepsilon_{it}^2 \mid X_i) = \sigma_\varepsilon^2; \qquad E(v_i^2 \mid X_i) = \sigma_v^2;$$
$$E(\varepsilon_{it}v_j \mid X_i) = 0 \ \text{for all } i, t, \text{ and } j;$$
$$E(\varepsilon_{it}\varepsilon_{js} \mid X_i) = 0 \ \text{if } t \neq s \text{ or } i \neq j;$$
$$E(v_iv_j \mid X_i) = 0 \ \text{if } i \neq j.$$

Denote

$$u_{it} = \varepsilon_{it} + v_i,$$

let $y_i$ and $X_i$ (including the constant term) be the $T$ observations of the $i$th unit, and let $u_i = [u_{i1}, u_{i2}, \ldots, u_{iT}]'$. Collecting the $T$ observations of the $i$th unit, we have

$$y_i = X_i\beta + u_i, \quad i = 1, 2, \ldots, N,$$

and the variance of the disturbances is

$$\Sigma = E(u_iu_i') = \begin{bmatrix} \sigma_\varepsilon^2 + \sigma_v^2 & \sigma_v^2 & \cdots & \sigma_v^2 \\ \sigma_v^2 & \sigma_\varepsilon^2 + \sigma_v^2 & \cdots & \sigma_v^2 \\ \vdots & & \ddots & \vdots \\ \sigma_v^2 & \sigma_v^2 & \cdots & \sigma_\varepsilon^2 + \sigma_v^2 \end{bmatrix} = \sigma_\varepsilon^2 I_T + \sigma_v^2 \mathbf{i}_T\mathbf{i}_T',$$

where $\mathbf{i}_T$ is a $T \times 1$ column of ones. The observations of all the cross-sectional units can be collected and rewritten as

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}\beta + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix},$$

or in more compact form

$$y = X\beta + u,$$

where $y$ and $u$ are $NT \times 1$, $X$ is $NT \times k$, $\beta$ is $k \times 1$, and the variance-covariance matrix of the stacked model is

$$\Omega = E(uu') = \begin{bmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{bmatrix} = I_N \otimes \Sigma. \tag{11-31}$$

5.1 Generalized Least Squares Estimation

The generalized least squares estimator of the coefficients is

$$\tilde{\beta} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \left(\sum_{i=1}^{N} X_i'\Sigma^{-1}X_i\right)^{-1}\left(\sum_{i=1}^{N} X_i'\Sigma^{-1}y_i\right), \tag{11-32}$$

if $\Omega$ is known. As with many generalized least squares problems, it is convenient to find a transformation matrix $\Omega^{-1/2} = [I_N \otimes \Sigma]^{-1/2}$ so that OLS can be applied to the transformed model. Fuller and Battese (1973) show that in this case

$$\Sigma^{-1/2} = \frac{1}{\sigma_\varepsilon}\left[I_T - \frac{\theta}{T}\mathbf{i}_T\mathbf{i}_T'\right], \tag{11-33}$$

where

$$\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_v^2}}.$$

The transformation of $y_i$ and $X_i$ for GLS is therefore

$$\Sigma^{-1/2}y_i = \frac{1}{\sigma_\varepsilon}\begin{bmatrix} Y_{i1} - \theta\bar{Y}_i \\ Y_{i2} - \theta\bar{Y}_i \\ \vdots \\ Y_{iT} - \theta\bar{Y}_i \end{bmatrix}, \tag{11-34}$$

and likewise for the rows of $X_i$ (including the constant term). Eq. (11-34) implies that the random effects GLS estimator is identical to the pooled OLS estimator of (11-4) when $\theta = 0$, which happens when $\sigma_v^2 = 0$ (footnote 11), and equals the within-groups, or fixed effects, estimator when $\theta = 1$, which happens when $\sigma_\varepsilon^2 = 0$. (One would interpret $\theta$ as the effect that would remain if $\sigma_\varepsilon^2 = 0$, because the only effect then would be $v_i$. In this case, the fixed and the random effects models would be indistinguishable, so this result makes sense.)

5.2 FGLS When Σ is Unknown

If the variance components $(\sigma_\varepsilon^2, \sigma_v^2)$ are known, the generalized least squares estimator (11-32) can be computed as shown earlier. Of course, this is unlikely, so as usual, we must first estimate the disturbance variances and then use an FGLS procedure. A heuristic approach to estimating the variance components is as follows (footnote 12).

5.2.1 Estimation of $\sigma_\varepsilon^2$

The random effects model is

$$Y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} + v_i, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T,$$

and in terms of group means,

$$\bar{Y}_i = \alpha + \bar{x}_i'\beta + \bar{\varepsilon}_i + v_i, \quad i = 1, 2, \ldots, N.$$

Therefore, taking deviations from the group means removes the heterogeneity:

$$Y_{it} - \bar{Y}_i = (x_{it} - \bar{x}_i)'\beta + \varepsilon_{it} - \bar{\varepsilon}_i, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-35}$$

Footnote 10: At this stage, we assume that $E(v_i^2) = \sigma_v^2$, which means that the $v_i$ capture all the random heterogeneity specific to the $i$th unit, though they share the same variance.

Footnote 11: In that case there is no individual-specific random effect.

Footnote 12: Because the random effects estimator is an FGLS estimator, all that we need are consistent estimators of $\sigma_\varepsilon^2$ and $\sigma_v^2$. There are alternative consistent estimators; see, for example, Hsiao (1986, Section 3.3) and Wooldridge (2002, Section 10.4, p. 260).

Because (footnote 13)

$$E\left(\sum_{t=1}^{T}(\varepsilon_{it} - \bar{\varepsilon}_i)^2\right) = (T - 1)\sigma_\varepsilon^2,$$

if $\beta$ were observable (and therefore $\varepsilon$ were observed), an unbiased and consistent estimator of $\sigma_\varepsilon^2$ based on the $T$ observations in group $i$ would be (from the law of large numbers)

$$\hat{\sigma}_\varepsilon^2(i) = \frac{\sum_{t=1}^{T}(\varepsilon_{it} - \bar{\varepsilon}_i)^2}{T - 1}.$$

Since $\beta$ (without the constant term) must be estimated, we may use the residuals from the LSDV estimator (which is consistent and unbiased in general; see footnote 15) and correct the degrees of freedom to form the estimator

$$s_e^2(i) = \frac{\sum_{t=1}^{T}(e_{it} - \bar{e}_i)^2}{T - k - 1}. \tag{11-36}$$

We have $N$ such estimators, so we average them to obtain

$$\bar{s}_e^2 = \frac{1}{N}\sum_{i=1}^{N} s_e^2(i) = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{\sum_{t=1}^{T}(e_{it} - \bar{e}_i)^2}{T - k - 1}\right] = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}(e_{it} - \bar{e}_i)^2}{NT - Nk - N}.$$

The degrees-of-freedom correction in $\bar{s}_e^2$ is excessive because it assumes that $\alpha$ and $\beta$ are re-estimated for each $i$. The unbiased estimator would be

$$\hat{\sigma}_\varepsilon^2 = s^2_{LSDV} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}(e_{it} - \bar{e}_i)^2}{NT - N - k}, \tag{11-37}$$

where $\bar{e}_i$ should be 0 (footnote 14). This is the same variance estimator as (11-17).

Footnote 13: This can be proved by denoting $\varepsilon = (\varepsilon_{i1}, \varepsilon_{i2}, \ldots, \varepsilon_{iT})'$. If, in addition, $\varepsilon \sim N(0, \sigma_\varepsilon^2 I_T)$, then $\sum_{t=1}^{T}(\varepsilon_{it} - \bar{\varepsilon}_i)^2/\sigma_\varepsilon^2 = \varepsilon'M^0\varepsilon/\sigma_\varepsilon^2 \sim \chi^2$ with $\text{trace}(M^0) = T - 1$ degrees of freedom, and therefore $E\left(\sum_{t=1}^{T}(\varepsilon_{it} - \bar{\varepsilon}_i)^2\right) = (T - 1)\sigma_\varepsilon^2$.

Footnote 14: To see this result, the fixed effects model in terms of deviations from the group means can be derived as follows:

$$Y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T,$$

and in terms of group means,

$$\bar{Y}_i = \alpha_i + \bar{x}_i'\beta + \bar{\varepsilon}_i, \quad i = 1, 2, \ldots, N.$$

Therefore, taking deviations from the group means removes the heterogeneity:

$$Y_{it} - \bar{Y}_i = (x_{it} - \bar{x}_i)'\beta + \varepsilon_{it} - \bar{\varepsilon}_i, \quad i = 1, 2, \ldots, N; \ t = 1, 2, \ldots, T. \tag{11-38}$$

Eq. (11-38) is exactly the same as (11-35). Therefore the variance $\sigma_\varepsilon^2$ of a random effects model can be consistently estimated from a fixed effects model.

Footnote 15: It should be noted that the $e_{it}$ appear to come from (11-35), where no constant term is included. But in the example below, Greene indeed uses the LSDV residuals to estimate $\sigma_\varepsilon^2$.
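A small numerical sketch of (11-37) follows. The simulated data, the seed and the "true" value `sigma_eps` are assumptions for illustration. It computes the LSDV residuals, confirms that their group means are numerically zero (because $D'e = 0$), and contrasts the divisor $NT - N - k$ with the excessive correction $NT - Nk - N$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T, k = 40, 6, 2
groups = np.repeat(np.arange(N), T)
sigma_eps = 1.5                                   # hypothetical true value
X = rng.normal(size=(N * T, k))
y = (X @ np.array([1.0, -2.0])
     + rng.normal(size=N)[groups]                 # individual effects
     + rng.normal(scale=sigma_eps, size=N * T))   # idiosyncratic errors

D = np.kron(np.eye(N), np.ones((T, 1)))           # unit dummies
Z = np.hstack([X, D])
e = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # LSDV residuals

e_bar = e.reshape(N, T).mean(axis=1)              # group means of residuals
s2_lsdv = e @ e / (N * T - N - k)                 # estimator (11-37)
s2_excess = e @ e / (N * T - N * k - N)           # averaged per-group version

print(np.abs(e_bar).max(), s2_lsdv, s2_excess, sigma_eps ** 2)
```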

5.2.2 Estimation of $\sigma_v^2$

It remains to estimate $\sigma_v^2$. Return to the original model:

\[ Y_{it} = \alpha + x_{it}'\beta + \varepsilon_{it} + v_i, \qquad i = 1, 2, \ldots, N;\ t = 1, 2, \ldots, T. \tag{11-39} \]

In spite of the correlation across observations, this is a classical regression model in which the OLS slope estimator and variance estimator are both consistent and, in general, unbiased. Therefore, using the OLS (pooling) residuals $e$ from the model with only a single overall constant, we have

\[ \operatorname{plim}\, s^2_{Pooled} = \operatorname{plim}\, \frac{e'e}{NT - k - 1} = \sigma_\varepsilon^2 + \sigma_v^2. \]

This provides the two estimators needed for the variance components; the second would be

\[ \hat\sigma_v^2 = s^2_{Pooled} - s^2_{LSDV}. \tag{11-40} \]

Example: See Example 9.6 (Table 9.5) on p. 207 of Greene, 6th edition.

5.3 Testing for Presence of Random Effects

The absence of an unobserved random effect in the random effects model

\[ Y_{it} = x_{it}'\beta + \alpha + v_i + \varepsilon_{it}, \qquad i = 1, 2, \ldots, N;\ t = 1, 2, \ldots, T, \]

is statistically equivalent to the null hypothesis $H_0: \sigma_v^2 = 0$. To test $H_0$, we can use a simple test for serial correlation, because the errors $u_{it} = \varepsilon_{it} + v_i$ are serially uncorrelated under $H_0$.[16] Under the null hypothesis, pooled OLS is efficient and all associated pooled OLS statistics are asymptotically valid.

[16] If $\sigma_v^2 \neq 0$, then for $t \neq s$, $E(u_{it}u_{is}) = E[(\varepsilon_{it} + v_i)(\varepsilon_{is} + v_i)] = \sigma_v^2 \neq 0$.

Breusch and Pagan (1980) have devised a Lagrange multiplier test for the random effects model based on the OLS (pooling) residuals.[17] For

\[ H_0: \sigma_v^2 = 0 \ \ (\text{or } \operatorname{Cov}(u_{it}, u_{is}) = 0), \qquad H_1: \sigma_v^2 \neq 0, \]

the test statistic is

\[ LM = \frac{NT}{2(T-1)} \left[ \frac{\sum_{i=1}^{N} (T\bar e_i)^2}{\sum_{i=1}^{N}\sum_{t=1}^{T} e_{it}^2} - 1 \right]^2 = \frac{NT}{2(T-1)} \left[ \frac{e'DD'e}{e'e} - 1 \right]^2, \]

where $e$ is the OLS (pooling) residual vector, $\bar e_i$ is the mean residual of unit $i$, and $D$ is the matrix of dummy variables as defined in the fixed effects model. Under the null hypothesis, LM is asymptotically distributed as $\chi^2$ with one degree of freedom.

[17] Remember that an LM test statistic is a test statistic derived under the null hypothesis.

Example: See Example 13.3 on p. 299 of Greene, 5th edition.

5.4 Hausman's Specification Test for the Random Effects Model

Since the key consideration in choosing between a random effects and a fixed effects approach is whether $\alpha_i$ (or $v_i$) and $x_{it}$ are correlated, it is important to have a method for testing this assumption. From a purely practical standpoint, the dummy variables approach is costly in terms of degrees of freedom lost, and in a wide, longitudinal data set, the random effects model has some intuitive appeal. On the other hand, the fixed effects approach has one considerable virtue. There is no justification for treating the individual effects as uncorrelated with the other regressors, as assumed in the random effects model. The random effects treatment, therefore, may suffer from inconsistency due to omitted variables.

The specification test developed by Hausman (1978) is used to test for orthogonality of the random effects and the regressors. Under the null hypothesis of no correlation (between $\alpha_i$ (or $v_i$) and $x_{it}$), both the OLS estimator $\hat\beta$ in the LSDV model and the GLS estimator $\tilde\beta$ in the random effects model are consistent, but OLS is inefficient,[18] whereas under the alternative (correlation between $\alpha_i$ (or $v_i$) and $x_{it}$), OLS from LSDV is consistent but GLS is not. Therefore, under the null hypothesis, the two estimates should not differ systematically, and a test can be based on the difference.

[18] Referring to the GLS matrix weighted average given earlier, we see that the efficient weight uses $\theta$, whereas OLS sets $\theta = 1$.

The essential ingredient for the test is the covariance matrix of the difference vector $[\hat\beta - \tilde\beta]$:

\[ \operatorname{Var}[\hat\beta - \tilde\beta] = \operatorname{Var}[\hat\beta] + \operatorname{Var}[\tilde\beta] - \operatorname{Cov}[\hat\beta, \tilde\beta] - \operatorname{Cov}[\hat\beta, \tilde\beta]'. \tag{11-41} \]

Hausman's essential result is that the covariance of an efficient estimator with its difference from an inefficient estimator is zero, which implies

\[ \operatorname{Cov}[(\hat\beta - \tilde\beta), \tilde\beta] = \operatorname{Cov}[\hat\beta, \tilde\beta] - \operatorname{Var}[\tilde\beta] = 0, \]

or that

\[ \operatorname{Cov}[\hat\beta, \tilde\beta] = \operatorname{Var}[\tilde\beta]. \]

Inserting this result into (11-41) produces the required covariance matrix for the test:

\[ \operatorname{Var}[\hat\beta - \tilde\beta] = \operatorname{Var}[\hat\beta] - \operatorname{Var}[\tilde\beta] = \Xi. \]

The chi-squared test is based on the Wald criterion:

\[ W = [\hat\beta - \tilde\beta]'\, \hat\Xi^{-1} [\hat\beta - \tilde\beta] \sim \chi^2[k]. \]

For $\hat\Xi$, we use the estimated covariance matrix of the slope estimator in the LSDV model and the estimated covariance matrix in the random effects model, excluding the constant term. Under the null hypothesis, $W$ is asymptotically distributed as chi-squared with $k$ degrees of freedom.

Exercise 3: Reproduce the results of Example 9.7 on p. 209 of Greene, 6th edition.

5.5 Unbalanced Panels and Random Effects

Unbalanced panels add a layer of difficulty in the random effects model. The first problem can be seen in (11-31). The matrix $\Omega$ is no longer $I_N \otimes \Sigma$, because the diagonal blocks $\Sigma_i$ are of different sizes. There is also groupwise heteroscedasticity in (11-34), because the $i$th diagonal block in $\Omega^{-1/2}$ is

\[ \Sigma_i^{-1/2} = \frac{1}{\sigma_\varepsilon} \left[ I_{T_i} - \frac{\theta_i}{T_i}\, i_{T_i} i_{T_i}' \right], \tag{11-42} \]

and

\[ \theta_i = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T_i \sigma_v^2}}. \]

The transformation of $y_i$ and $X_i$ for GLS in an unbalanced panel is therefore

\[ \Sigma_i^{-1/2} y_i = \frac{1}{\sigma_\varepsilon} \begin{bmatrix} Y_{i1} - \theta_i \bar Y_i \\ Y_{i2} - \theta_i \bar Y_i \\ \vdots \\ Y_{iT_i} - \theta_i \bar Y_i \end{bmatrix}, \]

and likewise for the rows of $X_i$ (including the constant term).

5.6 Nonspherical Disturbances and Robust Covariance Estimation

Since the random effects model is already a generalized regression model with a known structure, robust estimation of the covariance matrix for the OLS estimator in this context is not the best use of the data. Can we use OLS to estimate the random effects model with a robust covariance estimator?
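The unbalanced-panel transformation in (11-42) of Section 5.5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the notes: the function names `theta_i` and `gls_transform`, the dictionary layout of the data, and the numerical variance components are all hypothetical, and in practice $\sigma_\varepsilon^2$ and $\sigma_v^2$ would be replaced by estimates.

```python
import math


def theta_i(T_i, sigma_eps2, sigma_v2):
    """theta_i = 1 - sigma_eps / sqrt(sigma_eps^2 + T_i * sigma_v^2)."""
    return 1.0 - math.sqrt(sigma_eps2 / (sigma_eps2 + T_i * sigma_v2))


def gls_transform(groups, sigma_eps2, sigma_v2):
    """Apply Sigma_i^{-1/2} to each unit's series, as in (11-42).

    `groups` maps a unit label to the list of its T_i observations
    (a hypothetical layout; T_i may differ across units).
    """
    sigma_eps = math.sqrt(sigma_eps2)
    transformed = {}
    for unit, values in groups.items():
        T_i = len(values)
        th = theta_i(T_i, sigma_eps2, sigma_v2)
        mean_i = sum(values) / T_i
        # Each element becomes (Y_it - theta_i * Ybar_i) / sigma_eps.
        transformed[unit] = [(y - th * mean_i) / sigma_eps for y in values]
    return transformed


if __name__ == "__main__":
    # Illustrative data: unit A has four observations, unit B only two.
    panel = {"A": [1.0, 2.0, 3.0, 4.0], "B": [5.0, 7.0]}
    for unit, series in gls_transform(panel, 1.0, 0.5).items():
        print(unit, [round(v, 4) for v in series])
```

Because $\theta_i$ increases with $T_i$, units observed for more periods have a larger share of their mean removed, which is the source of the groupwise heteroscedasticity noted in Section 5.5. The same function would be applied column by column to $X_i$.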

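The Breusch and Pagan LM statistic of Section 5.3 depends only on the pooled OLS residuals, so it is easy to compute once those residuals are available. The sketch below assumes a balanced panel and a hypothetical data layout (one list of $T$ residuals per unit); the function name `breusch_pagan_lm` is ours, not the notes'.

```python
def breusch_pagan_lm(residuals):
    """LM statistic for H0: sigma_v^2 = 0 from pooled OLS residuals.

    `residuals` is a list of N lists, each holding the T pooled OLS
    residuals e_it of one cross-sectional unit (balanced panel assumed).
    """
    N = len(residuals)
    T = len(residuals[0])
    if any(len(e_i) != T for e_i in residuals):
        raise ValueError("this sketch requires a balanced panel")
    # Numerator: sum_i (T * ebar_i)^2, which equals e'DD'e.
    num = sum(sum(e_i) ** 2 for e_i in residuals)
    # Denominator: sum_i sum_t e_it^2, which equals e'e.
    den = sum(e * e for e_i in residuals for e in e_i)
    return N * T / (2.0 * (T - 1)) * (num / den - 1.0) ** 2


if __name__ == "__main__":
    # Made-up residuals in which each unit's residuals share a sign,
    # the pattern an individual effect v_i would produce.
    e = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
    print(breusch_pagan_lm(e))
```

The resulting value would be compared with a $\chi^2$ critical value with one degree of freedom (about 3.84 at the 5 percent level).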
6 Dynamic Models

The analysis so far has involved static models and relatively straightforward estimation problems. Dynamic effects in the panel data model, with or without heterogeneity, raise complex new issues in estimation and inference. This section explores a few of the specifications and shows that the familiar estimation techniques (OLS, FGLS, etc.) are not effective in these cases.

6.1 Random Effects Model

Consider a homogeneous dynamic panel data model

\[ Y_{it} = \phi Y_{i,t-1} + x_{it}'\beta + v_i + \varepsilon_{it}, \tag{11-43} \]

where $|\phi| < 1$ and $v_i$ is, as in the preceding sections of this chapter, individual unmeasured heterogeneity that may or may not be correlated with $x_{it}$. We consider methods of estimation for this model when $T$ is fixed and relatively small, and $N$ may be large and increasing. The disturbance $v_i + \varepsilon_{it}$ may be correlated with $x_{it}$, but either way, it is surely correlated with $Y_{i,t-1}$. By substitution,

\[ \operatorname{Cov}[Y_{i,t-1}, (v_i + \varepsilon_{it})] = \sigma_v^2 + \phi \operatorname{Cov}[Y_{i,t-2}, (v_i + \varepsilon_{it})], \]

and so on. By repeated substitution, it can be seen that for moderately large $T$,

\[ \operatorname{Cov}[Y_{i,t-1}, (v_i + \varepsilon_{it})] \approx \frac{\sigma_v^2}{1 - \phi}. \]

Consequently, OLS (pooled) and GLS are inconsistent.

6.2 Fixed Effects Model

The fixed effects approach does not solve the problem either. Let

\[ Y_{it} = \alpha_i + \phi Y_{i,t-1} + x_{it}'\beta + \varepsilon_{it}. \]

Taking deviations from individual means, we have

\[ Y_{it} - \bar Y_i = (x_{it} - \bar x_i)'\beta + \phi (Y_{i,t-1} - \bar Y_i) + \varepsilon_{it} - \bar\varepsilon_i. \]

Anderson and Hsiao (1982) showed that[20]

\[
\begin{aligned}
\operatorname{Cov}[(Y_{i,t-1} - \bar Y_i), (\varepsilon_{it} - \bar\varepsilon_i)]
&= E\left[ \left( Y_{i,t-1} - \frac{Y_{i,1} + Y_{i,2} + \cdots + Y_{i,T}}{T} \right) \left( \varepsilon_{it} - \frac{\varepsilon_{i,1} + \varepsilon_{i,2} + \cdots + \varepsilon_{i,T}}{T} \right) \right] \\
&\approx \frac{-\sigma_\varepsilon^2}{T(1-\phi)^2} \cdot \frac{(T-1) - T\phi + \phi^T}{T} \\
&= \frac{-\sigma_\varepsilon^2}{T(1-\phi)^2} \left[ (1-\phi) - \frac{1 - \phi^T}{T} \right].
\end{aligned} \tag{11-44}
\]

[20] See also the well-known Nickell bias (Nickell, 1981).

This does converge to zero as $T$ increases, but, again, we are considering cases in which $T$ is small or moderate, say 5 to 15, in which case the bias in the OLS estimator could be 15 percent to 60 percent. The implication is that the within transformation does not produce a consistent estimator.

6.3 Instrumental Variables Estimation

The general approach, which has been developed in several stages in the literature, relies on instrumental variables estimators and, most recently, on a GMM estimator. In either the fixed or the random effects case, the heterogeneity can be swept from the model by taking first differences, which produces

\[ Y_{it} - Y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + \phi (Y_{i,t-1} - Y_{i,t-2}) + (\varepsilon_{it} - \varepsilon_{i,t-1}). \]

This model is still complicated by correlation between the lagged dependent variable and the disturbance,

\[ \operatorname{Cov}[(Y_{i,t-1} - Y_{i,t-2}), (\varepsilon_{it} - \varepsilon_{i,t-1})] = -E(Y_{i,t-1}\varepsilon_{i,t-1}) \neq 0. \tag{11-45} \]

But without the group effects, there is a simple instrumental variables estimator available. Assuming that the time series is long enough, one could use the differences $(Y_{i,t-2} - Y_{i,t-3})$, or the lagged levels $Y_{i,t-2}$ and $Y_{i,t-3}$, as one or two instrumental variables for $(Y_{i,t-1} - Y_{i,t-2})$. By this construction, then, the treatment of this model is a standard application of the instrumental variables technique that we developed in Section 3.2 of Chapter 7. There is a question as to whether one should use differences or levels as instruments. Arellano (1989) gives evidence that the latter is preferable.
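To make the contrast between Sections 6.2 and 6.3 concrete, the following Python sketch simulates a dynamic panel and compares the within (LSDV) estimator of $\phi$ with a first-difference IV estimator that uses the lagged level $Y_{i,t-2}$ as the instrument. Everything here is an illustrative assumption rather than part of the notes: the model has no $x_{it}$, the errors are normal, and the function names, sample sizes, and parameter values are hypothetical.

```python
import random


def simulate_panel(N, T, phi, sigma_v=1.0, sigma_eps=1.0, burn=50, seed=0):
    """Simulate Y_it = phi * Y_i,t-1 + v_i + eps_it (no x_it) for N units.

    Returns N series of length T + 1 (Y_i0, ..., Y_iT), generated after
    `burn` discarded periods so that the start is close to stationary.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(N):
        v = rng.gauss(0.0, sigma_v)
        y = v / (1.0 - phi)  # start at the unit's long-run mean
        series = []
        for t in range(burn + T + 1):
            y = phi * y + v + rng.gauss(0.0, sigma_eps)
            if t >= burn:
                series.append(y)
        panel.append(series)
    return panel


def within_estimator(panel):
    """OLS of demeaned Y_it on demeaned Y_i,t-1 (each demeaned over its
    own T usable periods), i.e. the LSDV estimator of phi."""
    num = den = 0.0
    for y in panel:
        cur, lag = y[1:], y[:-1]
        m_cur = sum(cur) / len(cur)
        m_lag = sum(lag) / len(lag)
        for c, l in zip(cur, lag):
            num += (l - m_lag) * (c - m_cur)
            den += (l - m_lag) ** 2
    return num / den


def anderson_hsiao(panel):
    """IV estimator in first differences, with the level Y_i,t-2 as the
    instrument for (Y_i,t-1 - Y_i,t-2)."""
    num = den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            z = y[t - 2]
            num += z * (y[t] - y[t - 1])
            den += z * (y[t - 1] - y[t - 2])
    return num / den


if __name__ == "__main__":
    data = simulate_panel(N=2000, T=6, phi=0.5, seed=1)
    print("within (LSDV):", round(within_estimator(data), 3))
    print("first-difference IV:", round(anderson_hsiao(data), 3))
```

With a short panel such as $T = 6$, the within estimate would be expected to fall well below the true $\phi$, in line with (11-44), while the IV estimate should be close to it when $N$ is large. The design uses a level instrument only because it keeps the estimator a ratio of two sums; difference instruments could be substituted in the same way.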

7 Appendix: Proofs of Hausman Test

The theory underlying the proposed specification tests rests on one fundamental idea. Under the null hypothesis of no misspecification, there will exist a consistent, asymptotically normal, and asymptotically efficient estimator, where efficiency means attaining the asymptotic Cramér-Rao bound. Under the alternative hypothesis of misspecification, however, this estimator (the random effects estimator) will be biased and inconsistent. To construct a test of misspecification, it is necessary to find another estimator which is not adversely affected by the misspecification; but this estimator will not be asymptotically efficient under the null hypothesis. A consideration of the difference between the two estimators, $q = \hat\beta_1 - \tilde\beta_0$, where $\tilde\beta_0$ is the efficient estimator under $H_0$ and $\hat\beta_1$ is a consistent estimator under $H_1$, will then lead to a specification test. If no misspecification is present, the probability limit of $q$ is zero. With misspecification, $\operatorname{plim} q$ will differ from zero.

In constructing tests based on $q$, an immediate problem comes to mind. To develop a test, not only is the probability limit of $q$ required, but the variance of the asymptotic distribution of $\sqrt{T} q$, i.e. $V(q)$, must also be determined. Luckily, the calculation is easy, and in fact

\[ V(q) = V(\hat\beta_1) - V(\tilde\beta_0) \equiv V_1 - V_0 \]

under the null hypothesis of no misspecification. Thus, the construction of specification error tests is simplified, since the estimators may be considered separately: the variance of the difference $\sqrt{T} q = \sqrt{T}(\hat\beta_1 - \tilde\beta_0)$ is the difference of the respective variances. To prove the result formally, consider the following lemma.

Lemma. Consider two estimators $\tilde\beta_0$ and $\hat\beta_1$ which are both consistent and asymptotically normally distributed, with $\tilde\beta_0$ attaining the asymptotic Cramér-Rao bound, so that

\[ \sqrt{T}(\tilde\beta_0 - \beta) \xrightarrow{d} N(0, V_0) \quad \text{and} \quad \sqrt{T}(\hat\beta_1 - \beta) \xrightarrow{d} N(0, V_1), \]

where $V_0$ is the inverse of Fisher's information matrix. Let the difference between the two estimators be $q = \hat\beta_1 - \tilde\beta_0$. Then the limiting distributions of the efficient estimator, $\sqrt{T}(\tilde\beta_0 - \beta)$, and of $\sqrt{T} q$ have zero covariance:

\[ \operatorname{Cov}(\tilde\beta_0, q) = \operatorname{Cov}(\tilde\beta_0, \hat\beta_1 - \tilde\beta_0) = 0, \]

a zero matrix.

Proof. Suppose $\tilde\beta_0$ and $q$ are not orthogonal. Since $\operatorname{plim} q = 0$ (both estimators are consistent), we define a new estimator

\[ \tilde\beta_2 = \tilde\beta_0 + rAq, \]

where $r$ is a scalar and $A$ is an arbitrary $k \times k$ matrix to be chosen. The new estimator is consistent and asymptotically normal with asymptotic variance

\[ V(\tilde\beta_2) = V(\tilde\beta_0) + rA \operatorname{Cov}(\tilde\beta_0, q) + r \operatorname{Cov}(\tilde\beta_0, q)' A' + r^2 A V(q) A'. \]

Now consider the difference between the asymptotic variance of the new estimator and that of the old asymptotically efficient estimator,

\[ F(r) = V(\tilde\beta_2) - V(\tilde\beta_0) = rA \operatorname{Cov}(\tilde\beta_0, q) + r \operatorname{Cov}(\tilde\beta_0, q)' A' + r^2 A V(q) A'. \]

Taking the derivative with respect to $r$ yields

\[ F'(r) = A \operatorname{Cov}(\tilde\beta_0, q) + \operatorname{Cov}(\tilde\beta_0, q)' A' + 2rA V(q) A'. \]

Then choose $A = -\operatorname{Cov}(\tilde\beta_0, q)'$ and note that $\operatorname{Cov}(\tilde\beta_0, q)$ is symmetric, which leads to

\[ F'(r) = -2 \operatorname{Cov}(\tilde\beta_0, q)' \operatorname{Cov}(\tilde\beta_0, q) + 2r \operatorname{Cov}(\tilde\beta_0, q)' V(q) \operatorname{Cov}(\tilde\beta_0, q). \]

Therefore, at $r = 0$,

\[ F'(0) = -2 \operatorname{Cov}(\tilde\beta_0, q)' \operatorname{Cov}(\tilde\beta_0, q) \le 0 \]

in the sense of being nonpositive definite, which means that for $r$ small,[21]

\[ F_j'(0) = \lim_{r \to 0^+} \frac{F_j(r) - F_j(0)}{r - 0} = \lim_{r \to 0^+} \frac{F_j(r) - 0}{r} \le 0, \qquad j = 1, 2, \ldots, k, \]

implying $F_j(r) < 0$ for all $j$. This is a contradiction unless $\operatorname{Cov}(\tilde\beta_0, q) = 0$, since the asymptotic efficiency of $\tilde\beta_0$ already implies that $F(r)$ is nonnegative definite, or $F_j(r) \ge 0$ for all $j$.

[21] $F(0) = 0$.

Once it has been shown that the efficient estimator is uncorrelated with $q$, the asymptotic variance of $q$ is easily calculated.

Corollary. $V(q) = V(\hat\beta_1) - V(\tilde\beta_0) \equiv V_1 - V_0$ is nonnegative definite.

Proof. Since $q + \tilde\beta_0 = \hat\beta_1$, then

\[ V(q) + V(\tilde\beta_0) + 2 \operatorname{Cov}(\tilde\beta_0, q) = V(\hat\beta_1). \]

Because it has been shown that $\operatorname{Cov}(\tilde\beta_0, q) = 0$, it follows that $V(q) = V(\hat\beta_1) - V(\tilde\beta_0)$.[22]

[22] It is to be noted that here $V(q)$ is the variance of $\sqrt{T} q$.

Given the above result, a general misspecification test can be specified by considering the statistic

\[ W = T q' \hat V(q)^{-1} q, \]

where $\hat V(q)$ is a consistent estimator of $V(q)$. This statistic will be shown to be distributed asymptotically as central $\chi^2_k$ under the null hypothesis, where $k$ is the number of unknown parameters in $\beta$, when no misspecification is present.

[Photo: The Seashore-Entrance to NSYSU]

End of this Chapter
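The statistic $W$ above can be sketched in Python as follows. This is a minimal illustration, assuming the user supplies the two coefficient vectors and the estimated covariance matrices of the estimators themselves (that is, already scaled by $1/T$), so that $W = q'[\widehat{\operatorname{Var}}(\hat\beta_1) - \widehat{\operatorname{Var}}(\tilde\beta_0)]^{-1} q$, which is the same quantity as $T q' \hat V(q)^{-1} q$. The function names and the numbers in the example are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (small helper, pure Python)."""
    n = len(b)
    m = [[float(x) for x in row] + [float(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("estimated V(q) is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def hausman_statistic(b1, b0, v1, v0):
    """Return (W, k) for q = b1 - b0 and Var(q) estimated by v1 - v0.

    b1, v1: consistent-but-inefficient estimator and its covariance
            matrix (e.g. LSDV slopes).
    b0, v0: efficient-under-H0 estimator and its covariance matrix
            (e.g. random effects GLS slopes, constant excluded).
    """
    k = len(b1)
    q = [x - y for x, y in zip(b1, b0)]
    vq = [[v1[i][j] - v0[i][j] for j in range(k)] for i in range(k)]
    w = sum(qi * si for qi, si in zip(q, solve(vq, q)))
    return w, k


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    W, k = hausman_statistic(
        [1.0, 2.0], [0.9, 2.2],
        [[0.05, 0.0], [0.0, 0.08]],
        [[0.01, 0.0], [0.0, 0.04]],
    )
    print("W =", round(W, 4), "with", k, "degrees of freedom")
```

The value of $W$ would then be compared with a $\chi^2_k$ critical value (5.99 at the 5 percent level when $k = 2$). The same function applies to the Wald criterion of Section 5.4 with $\hat\beta$ from LSDV and $\tilde\beta$ from random effects GLS. In finite samples the estimated difference of covariance matrices need not be positive definite, in which case $W$ can be negative or the solver can fail; the sketch does not attempt to handle that case beyond raising an error for a singular matrix.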

Topic 10: Panel Data Analysis Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Introduction Panel data combine the features of cross section data and time series. Usually a panel