Short T Panels - Review
We have looked at methods for estimating parameters on time-varying explanatory variables consistently in panels with many cross-section observation units but a small number of time periods, allowing for correlation between the individual-specific time-invariant component of the error term and some or all of the explanatory variables
$$y_{it} = x_{it}\beta + (\eta_i + v_{it}) \quad \text{for } i = 1, 2, \ldots, N \text{ and } t = 1, 2, \ldots, T$$
We distinguish between explanatory variables that are:
- strictly exogenous wrt $v_{it}$: $E(x_{it} v_{is}) = 0$ for all $s, t$
- predetermined wrt $v_{it}$: $E(x_{it} v_{is}) = 0$ for $s \geq t$; $E(x_{it} v_{is}) \neq 0$ for $s < t$
- endogenous wrt $v_{it}$: $E(x_{it} v_{is}) = 0$ for $s > t$; $E(x_{it} v_{is}) \neq 0$ for $s \leq t$
If all the explanatory variables are strictly exogenous, we can estimate $\beta$ consistently using Within Groups or first-differenced OLS.
If all the explanatory variables are predetermined, we can estimate $\beta$ consistently using 2SLS in the first-differenced equations, with $x_{i,t-1}$ or $\Delta x_{i,t-1}$ as instruments for $\Delta x_{it}$.
If all the explanatory variables are endogenous, we can estimate $\beta$ consistently using 2SLS in the first-differenced equations, with $x_{i,t-2}$ or $\Delta x_{i,t-2}$ as instruments for $\Delta x_{it}$.
2SLS estimators using the first-differenced equations can be generalized to allow for a mix of endogenous, predetermined and strictly exogenous covariates.
We have also looked at asymptotic inference for these estimators - this typically uses a cluster-robust estimator of the variance-covariance matrix, allowing for correlation between the errors (in the transformed equations used for estimation) in different time periods for the same cross-section observation unit.
Asymptotic approximations are derived considering only the number of cross-section observation units becoming large - large $N$, fixed $T$ (semi-)asymptotics.
A key assumption is that the data on $(y_i, X_i)$ is independent over $i = 1, 2, \ldots, N$ (cross-sectional independence).
These definitions of predetermined and endogenous explanatory variables are intended for models in which the time-varying error component is serially uncorrelated - they can be generalized to allow for particular forms of low order serial correlation (e.g. MA(1) or AR(1)), but the use of lagged values of endogenous or predetermined variables as instruments is not compatible with general forms of serial correlation.
If $v_{it}$ is serially uncorrelated, the lagged dependent variable $y_{i,t-1}$ is a particular predetermined explanatory variable, provided we have $E(y_{i1} v_{it}) = 0$ for $t = 2, 3, \ldots, T$.
Some Methods for Long T Panels
We now look more briefly at some methods for estimating parameters on time-varying explanatory variables consistently in panels with observations for many time periods, again allowing for correlation between the individual-specific time-invariant component of the error term and some or all of the explanatory variables.
As in time series econometrics, the time series properties of the variables are important for asymptotic approximations that are derived by considering $T \to \infty$.
For simplicity, we focus on linear models that relate stationary variables (in the sense of being integrated of order zero, or I(0)) - there is an extensive literature on unit root tests for long T panels, and a literature which extends cointegration methods to this context.
Initially we maintain the assumptions of common slope parameters and cross-sectional independence.
We consider the dynamic model
$$y_{it} = \alpha y_{i,t-1} + \beta x_{it} + (\eta_i + v_{it}) \quad \text{for } i = 1, 2, \ldots, N \text{ and } t = 2, 3, \ldots, T$$
in which $|\alpha| < 1$, $y_{it}$ and $x_{it}$ are I(0) variables, $E(\eta_i) = E(v_{it}) = E(\eta_i v_{it}) = 0$, the time-varying error component $v_{it}$ is serially uncorrelated, and both $y_{i,t-1}$ and $x_{it}$ are correlated with $\eta_i$ and predetermined wrt $v_{it}$.
Here we can estimate $(\alpha, \beta)$ consistently using Within Groups or Least Squares Dummy Variables.
This is also the case if $x_{it}$ happens to satisfy strict exogeneity, and the distinction between strictly exogenous and predetermined explanatory variables plays a minor role in the context of long T panels.
This consistency property is seen more easily using the augmented LSDV specification (with individual-specific intercepts)
$$y_{it} = \sum_{j=1}^{N} \eta_j I_i^j + \alpha y_{i,t-1} + \beta x_{it} + v_{it}$$
where we have the orthogonality condition
$$E[(I_i^1, I_i^2, \ldots, I_i^N, y_{i,t-1}, x_{it}) v_{it}] = 0$$
OLS thus estimates the slope (and intercept) parameters consistently as $T \to \infty$, but recall that the Within Groups estimator of the slope parameters coincides with the OLS estimator in this augmented specification.
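The numerical equivalence of Within Groups and LSDV for the slope parameters can be checked on simulated data. The following is a minimal sketch in Python with NumPy; the panel dimensions, parameter values and variable names are illustrative assumptions, not part of the original material.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 200                      # hypothetical panel dimensions
alpha, beta = 0.5, 1.0             # hypothetical true slope parameters
eta = rng.normal(size=N)           # individual effects

# x is correlated with eta; y follows the simple dynamic model.
x = eta[:, None] + rng.normal(size=(N, T))
y = np.zeros((N, T))
for t in range(1, T):
    y[:, t] = eta + alpha * y[:, t - 1] + beta * x[:, t] + rng.normal(size=N)

# Observations for t = 1..T-1 (period 0 only supplies the lag).
Y, L, X = y[:, 1:], y[:, :-1], x[:, 1:]

def within(a):
    """Subtract each unit's time mean (the within transformation)."""
    return a - a.mean(axis=1, keepdims=True)

# Within Groups: OLS on unit-demeaned data, no intercept.
Zw = np.column_stack([within(L).ravel(), within(X).ravel()])
b_wg = np.linalg.lstsq(Zw, within(Y).ravel(), rcond=None)[0]

# LSDV: OLS on the original data plus one dummy per unit.
D = np.kron(np.eye(N), np.ones((T - 1, 1)))   # rows ordered unit by unit
Zd = np.column_stack([D, L.ravel(), X.ravel()])
b_lsdv = np.linalg.lstsq(Zd, Y.ravel(), rcond=None)[0][N:]

print("Within Groups (alpha, beta):", b_wg)
print("LSDV          (alpha, beta):", b_lsdv)
```

The two sets of slope estimates agree up to floating-point error, which is the Frisch-Waugh result applied to the individual dummies.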
These estimators are consistent here as $T \to \infty$ with $N$ fixed - indeed under our assumptions they are consistent in the time series model which is the special case with $N = 1$
$$y_t = \eta + \alpha y_{t-1} + \beta x_t + v_t$$
These estimators are also consistent here as $T$ and $N$ both go to infinity.
In this simple dynamic specification
$$y_{it} = \sum_{j=1}^{N} \eta_j I_i^j + \alpha y_{i,t-1} + \beta x_{it} + v_{it}$$
it is straightforward to allow for endogeneity, using (for example) $x_{i,t-1}$ as an instrument for $x_{it}$.
This becomes more challenging in more general dynamic specifications, for example in the autoregressive-distributed lag (ARDL(p,q)) model
$$\begin{aligned}
y_{it} &= \sum_{j=1}^{N} \eta_j I_i^j + \alpha_1 y_{i,t-1} + \alpha_2 y_{i,t-2} + \ldots + \alpha_p y_{i,t-p} + \beta_0 x_{it} + \beta_1 x_{i,t-1} + \ldots + \beta_q x_{i,t-q} + v_{it} \\
&= \sum_{j=1}^{N} \eta_j I_i^j + \sum_{s=1}^{p} \alpha_s y_{i,t-s} + \sum_{s=0}^{q} \beta_s x_{i,t-s} + v_{it}
\end{aligned}$$
In principle, we could use $x_{i,t-q-1}$ as an instrument for $x_{it}$, but this would require the long lag $x_{i,t-q-1}$ to be informative in the first-stage linear projection, which relates $x_{it}$ to
$$(I_i^1, I_i^2, \ldots, I_i^N, y_{i,t-1}, y_{i,t-2}, \ldots, y_{i,t-p}, x_{i,t-1}, x_{i,t-2}, \ldots, x_{i,t-q}, x_{i,t-q-1})$$
In practice, we may require outside instruments that are both uncorrelated with $v_{it}$ and informative in this first-stage linear projection to allow for endogeneity of $x_{it}$ in such general dynamic specifications.
Mean Groups
We now consider relaxing the assumption of common slope parameters (slope parameter homogeneity).
Returning to our simple dynamic specification, we allow the parameters on $y_{i,t-1}$ and $x_{it}$, as well as the intercept, to take different values for each observation unit.
Using the individual-specific dummies $I_i^j$ for $j = 1, 2, \ldots, N$, we can write this in the form
$$y_{it} = \sum_{j=1}^{N} \eta_j I_i^j + \sum_{j=1}^{N} \alpha_j (I_i^j y_{i,t-1}) + \sum_{j=1}^{N} \beta_j (I_i^j x_{it}) + v_{it}$$
To economize on notation, we will write this more simply as
$$y_{it} = \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + v_{it}$$
in which it is understood that $\eta_i$ is an individual-specific intercept parameter, and $(\alpha_i, \beta_i)$ are individual-specific slope parameters.
The Mean Groups estimator treats the average values of these slope parameters over $i = 1, 2, \ldots, N$ as the objects of interest - for example, we may be interested in the relationship for an average firm, or for an average country.
The Mean Groups estimator was proposed by Pesaran and Smith (Journal of Econometrics, 1995).
The Mean Groups estimator is straightforward to compute.
For each individual observation unit, we estimate the parameters $(\eta_i, \alpha_i, \beta_i)$ by OLS, using only the time series data on $(y_{it}, x_{it})$ for that observation unit.
Denote these individual-specific estimated parameters by $(\hat\eta_i, \hat\alpha_i, \hat\beta_i)$.
The Mean Groups estimator of the corresponding average parameters is just the mean of these individual-specific estimates, over the sample of observation units $i = 1, 2, \ldots, N$
$$\hat\eta_{MG} = \frac{1}{N}\sum_{i=1}^{N} \hat\eta_i \qquad \hat\alpha_{MG} = \frac{1}{N}\sum_{i=1}^{N} \hat\alpha_i \qquad \hat\beta_{MG} = \frac{1}{N}\sum_{i=1}^{N} \hat\beta_i$$
This can be implemented by:
- looping over the $N$ individual observation units
- for each individual, running the OLS regression and storing the results
- at the end of the loop, averaging the $N$ estimates of each parameter of interest
In Stata, this procedure is automated in the xtmg command.
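The loop described above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the xtmg implementation: the function names `mean_groups` and `simulate`, the data layout (arrays of shape N by T) and the simulated parameter values are all hypothetical.

```python
import numpy as np

def mean_groups(y, x):
    """Mean Groups for y_it = eta_i + alpha_i*y_{i,t-1} + beta_i*x_it + v_it.

    y, x: arrays of shape (N, T). Returns the (N, 3) array of unit-level OLS
    estimates (eta_i, alpha_i, beta_i) and their average over units.
    """
    N, T = y.shape
    estimates = np.empty((N, 3))
    for i in range(N):                              # loop over observation units
        Z = np.column_stack([np.ones(T - 1), y[i, :-1], x[i, 1:]])
        estimates[i] = np.linalg.lstsq(Z, y[i, 1:], rcond=None)[0]
    return estimates, estimates.mean(axis=0)        # average at the end of the loop

def simulate(eta, alpha, beta, T, rng, sigma=1.0):
    """Simulate a heterogeneous-slope panel (illustrative data only)."""
    N = len(eta)
    x = rng.normal(size=(N, T))
    y = np.zeros((N, T))
    for t in range(1, T):
        shock = sigma * rng.normal(size=N)
        y[:, t] = eta + alpha * y[:, t - 1] + beta * x[:, t] + shock
    return y, x

rng = np.random.default_rng(0)
N, T = 30, 100                                      # hypothetical dimensions
eta = rng.normal(size=N)
alpha = rng.uniform(0.2, 0.6, size=N)
beta = rng.uniform(0.5, 1.5, size=N)
y, x = simulate(eta, alpha, beta, T, rng)

estimates, mg = mean_groups(y, x)
print("MG estimates (eta, alpha, beta):", mg)
print("Average true values:", eta.mean(), alpha.mean(), beta.mean())
```

Storing the full array of unit-level estimates, rather than only the averages, makes it easy to inspect the distribution of the individual estimates afterwards.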
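Maintaining the assumption of cross-sectional independence, it is also straightforward to estimate the variance of these averages.
Since, for example, we have
$$\hat\beta_{MG} = \frac{1}{N}(\hat\beta_1 + \hat\beta_2 + \ldots + \hat\beta_N) = \left(\frac{1}{N}\right)\hat\beta_1 + \left(\frac{1}{N}\right)\hat\beta_2 + \ldots + \left(\frac{1}{N}\right)\hat\beta_N$$
we have
$$Var(\hat\beta_{MG}) = \left(\frac{1}{N^2}\right)Var(\hat\beta_1) + \left(\frac{1}{N^2}\right)Var(\hat\beta_2) + \ldots + \left(\frac{1}{N^2}\right)Var(\hat\beta_N) = \left(\frac{1}{N^2}\right)\sum_{i=1}^{N} Var(\hat\beta_i)$$
and $Var(\hat\beta_i)$ is estimated from the time series regression for each $i = 1, 2, \ldots, N$.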
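The variance formula above can be sketched in code as follows. As an assumption made only for this illustration, each unit's variance is estimated with the conventional homoskedastic OLS formula from its own time series regression; the function names and the simulated data are hypothetical.

```python
import numpy as np

def ols_with_variance(Z, y):
    """OLS coefficients and conventional (homoskedastic) variance estimates."""
    n, k = Z.shape
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    b = ZtZ_inv @ (Z.T @ y)
    resid = y - Z @ b
    s2 = resid @ resid / (n - k)                 # residual variance
    return b, s2 * np.diag(ZtZ_inv)

def mean_groups_with_se(y, x):
    """MG estimates of (eta, alpha, beta) with Var(MG) = sum_i Var_i / N^2."""
    N, T = y.shape
    est = np.empty((N, 3))
    var = np.empty((N, 3))
    for i in range(N):
        Z = np.column_stack([np.ones(T - 1), y[i, :-1], x[i, 1:]])
        est[i], var[i] = ols_with_variance(Z, y[i, 1:])
    mg = est.mean(axis=0)
    se = np.sqrt(var.sum(axis=0)) / N            # sqrt of (1/N^2) * sum of variances
    return mg, se

# Illustrative simulated data with heterogeneous slopes.
rng = np.random.default_rng(0)
N, T = 20, 80
alpha = rng.uniform(0.2, 0.6, size=N)
beta = rng.uniform(0.5, 1.5, size=N)
x = rng.normal(size=(N, T))
y = np.zeros((N, T))
for t in range(1, T):
    y[:, t] = alpha * y[:, t - 1] + beta * x[:, t] + rng.normal(size=N)

mg, se = mean_groups_with_se(y, x)
print("MG estimates:", mg)
print("Standard errors:", se)
```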
To allow for endogeneity of the explanatory variable $x_{it}$, we can replace the OLS estimates of the time series model for each observation unit by 2SLS estimates of this time series model, for example using an outside instrument, or using the lagged level $x_{i,t-1}$, as an instrument for $x_{it}$.
A practical concern with the Mean Groups estimator, particularly when $T$ is not so large relative to the number of variables included in the model, is that the estimated parameters for some of the individual observation units may be very imprecise, and sample means can be strongly influenced by the presence of a small number of outliers.
We can investigate this by comparing the means to the medians of the individual-specific estimated parameters - medians are robust to the presence of a small number of outliers - though constructing standard errors for medians, and hence conducting hypothesis tests, is not so easy.
We can also consider outlier-robust estimates of the mean in this context, which are also linear combinations of the individual-specific estimated parameters, and for which constructing standard errors is again straightforward.
Short-run and Long-run Parameters
In our simple dynamic specification
$$y_{it} = \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + v_{it}$$
the parameter $\beta_i$ measures the change in $y_{it}$ associated with a unit change in $x_{it}$ for individual $i$, holding $y_{i,t-1}$ constant.
This is called the short-run or impact effect of a change in $x_{it}$.
If we consider a step increase in the level of $x_{it}$, from one constant level to another constant level at time $t$, this short-run effect is not the end of the story.
In period $t + 1$, the lagged dependent variable $y_{it}$ is higher (if $\beta_i > 0$) by $\beta_i \Delta x_{it}$, and this is associated with a further increase in the outcome in period $t + 1$ by $\alpha_i \beta_i \Delta x_{it}$.
In period $t + 2$, this is associated with a further increase in the outcome by $\alpha_i^2 \beta_i \Delta x_{it}$.
And so on.
Cumulating the sequence of changes to the outcome following this step change in $x_{it}$, up to $T$ periods after the change, we have
$$y_{i,t+T} - y_{i,t-1} = \beta_i \Delta x_{it}(1 + \alpha_i + \alpha_i^2 + \ldots + \alpha_i^T)$$
with the limit as $T \to \infty$ given by
$$\lim_{T \to \infty}(y_{i,t+T} - y_{i,t-1}) = \left(\frac{\beta_i}{1 - \alpha_i}\right)\Delta x_{it}$$
The parameter $\left(\frac{\beta_i}{1 - \alpha_i}\right)$ measures the eventual change in $y_{it}$ associated with a unit change in $x_{it}$ for individual $i$.
This is called the long-run effect of a change in $x_{it}$.
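The cumulated response to a step change can be traced out numerically. The sketch below, in plain Python, iterates the deviation of the outcome from its old path and compares it with the long-run effect; the function name and the parameter values are illustrative assumptions.

```python
def step_response(alpha, beta, dx, periods):
    """Cumulative change in y, k = 0..periods after a permanent step dx in x.

    The deviation from the old path follows d_k = alpha * d_{k-1} + beta * dx,
    starting from d_{-1} = 0, so d_k = beta * dx * (1 + alpha + ... + alpha**k).
    """
    deviation = 0.0
    path = []
    for _ in range(periods + 1):
        deviation = alpha * deviation + beta * dx
        path.append(deviation)
    return path

alpha, beta, dx = 0.5, 1.0, 1.0      # hypothetical values
path = step_response(alpha, beta, dx, periods=60)
long_run = beta / (1.0 - alpha) * dx

print("First three cumulative changes:", path[:3])
print("Change after 60 periods:", path[-1])
print("Long-run effect beta/(1-alpha):", long_run)
```

With these values the impact effect is 1.0, the cumulative change after one further period is 1.5, and the path converges to the long-run effect of 2.0.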
These kinds of dynamic specifications allow the full effect of the change in $x_{it}$ on the outcome variable to occur gradually.
How long it takes for (almost) all of the adjustment to occur depends on the coefficient $\alpha_i$ on the lagged dependent variable, which is a measure of the inertia in the outcome variable.
For $\alpha_i = 0$, all the adjustment occurs in the first period, and the short-run and long-run effects of the change in $x_{it}$ are the same.
As $\alpha_i \to 1$, it takes more and more periods for any given fraction of the full adjustment to be completed.
We can also derive the long-run effect by considering the relationship between $y_{it}$ and $x_{it}$ in a steady state, in which $y_{it} = y_i$ and $x_{it} = x_i$ for all time periods $t$.
Then we have
$$\begin{aligned}
y_i &= \eta_i + \alpha_i y_i + \beta_i x_i \\
y_i(1 - \alpha_i) &= \eta_i + \beta_i x_i \\
y_i &= \left(\frac{\eta_i}{1 - \alpha_i}\right) + \left(\frac{\beta_i}{1 - \alpha_i}\right)x_i
\end{aligned}$$
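Similarly the long-run parameter in the ARDL(p,q) specification
$$y_{it} = \eta_i + \sum_{s=1}^{p} \alpha_{si} y_{i,t-s} + \sum_{s=0}^{q} \beta_{si} x_{i,t-s} + v_{it}$$
can be shown to be
$$\frac{\sum_{s=0}^{q} \beta_{si}}{1 - \sum_{s=1}^{p} \alpha_{si}} = \frac{\beta_{0i} + \beta_{1i} + \ldots + \beta_{qi}}{1 - \alpha_{1i} - \alpha_{2i} - \ldots - \alpha_{pi}}$$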
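As a small worked example of this formula, the following Python sketch computes the long-run parameter from a list of autoregressive coefficients and a list of distributed lag coefficients. The function name and the coefficient values are hypothetical.

```python
def ardl_long_run(alphas, betas):
    """Long-run parameter of an ARDL(p,q) model for one unit.

    alphas: coefficients on y_{t-1}, ..., y_{t-p}
    betas:  coefficients on x_t, x_{t-1}, ..., x_{t-q}
    """
    denominator = 1.0 - sum(alphas)
    if denominator == 0.0:
        raise ValueError("long-run parameter undefined when the alphas sum to one")
    return sum(betas) / denominator

# ARDL(1,0): reduces to beta / (1 - alpha).
simple = ardl_long_run([0.5], [1.0])

# ARDL(2,2) with illustrative coefficients.
general = ardl_long_run([0.5, 0.25], [0.25, 0.25, 0.5])

print("ARDL(1,0) long-run parameter:", simple)
print("ARDL(2,2) long-run parameter:", general)
```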
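The simple dynamic model can also be written in the error correction (or equilibrium correction) form
$$\begin{aligned}
y_{it} &= \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + v_{it} \\
y_{it} - y_{i,t-1} &= \eta_i + (\alpha_i - 1)y_{i,t-1} + \beta_i x_{it} + v_{it} \\
&= \eta_i + (\alpha_i - 1)\left[y_{i,t-1} - \left(\frac{\beta_i}{1 - \alpha_i}\right)x_{it}\right] + v_{it}
\end{aligned}$$
or
$$\Delta y_{it} = \eta_i - \phi_i[y_{i,t-1} - \theta_i x_{it}] + v_{it}$$
in which $\phi_i = 1 - \alpha_i > 0$ is called the speed of adjustment parameter, and $\theta_i = \left(\frac{\beta_i}{1 - \alpha_i}\right)$ is the long-run parameter.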
The Mean Groups estimator of the average value of this long-run parameter is obtained by:
- estimating $\theta_i$ for each individual, using the time series data for that individual to estimate $\beta_i$ and $\alpha_i$, and combining these estimates to obtain $\hat\theta_i = \hat\beta_i/(1 - \hat\alpha_i)$
- averaging these estimated long-run parameters over the $N$ individuals to obtain $\hat\theta_{MG} = \frac{1}{N}\sum_{i=1}^{N} \hat\theta_i$
Notice that $\hat\theta_{MG} \neq \frac{\hat\beta_{MG}}{1 - \hat\alpha_{MG}}$.
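A two-unit numerical example makes the inequality concrete. The estimates below are made-up numbers chosen for illustration only.

```python
# Hypothetical unit-level estimates for N = 2 observation units.
alpha_hat = [0.25, 0.75]
beta_hat = [1.0, 1.0]
N = len(alpha_hat)

# Mean Groups: average the unit-level long-run parameters.
theta_hat = [b / (1.0 - a) for a, b in zip(alpha_hat, beta_hat)]
theta_mg = sum(theta_hat) / N

# Alternative: form the ratio from the averaged short-run estimates.
alpha_mg = sum(alpha_hat) / N
beta_mg = sum(beta_hat) / N
ratio_of_means = beta_mg / (1.0 - alpha_mg)

print("theta_MG (mean of ratios):", theta_mg)
print("beta_MG / (1 - alpha_MG):  ", ratio_of_means)
```

Here the unit-level long-run parameters are 4/3 and 4, so their mean is 8/3, while the ratio formed from the averaged estimates is 2. The two differ because $\beta_i/(1 - \alpha_i)$ is a nonlinear function of $\alpha_i$.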
By contrast, since the speed of adjustment parameter $\phi_i = 1 - \alpha_i$ is a linear function of $\alpha_i$, we do have $\hat\phi_{MG} = 1 - \hat\alpha_{MG}$.
If we compare the Mean Groups estimate $\hat\alpha_{MG}$ to pooled estimates which impose the restriction $\alpha_i = \alpha$ for all $i = 1, 2, \ldots, N$, a common finding is that the pooled estimates tend to be higher than the Mean Groups estimate, implying slower adjustment.
If the restriction is invalid but we impose it, the model can be rewritten in terms of the common parameters, with the parameter heterogeneity absorbed into the error term
$$\begin{aligned}
y_{it} &= \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + v_{it} \\
&= \eta_i + \alpha y_{i,t-1} + \beta x_{it} + [v_{it} + (\alpha_i - \alpha)y_{i,t-1} + (\beta_i - \beta)x_{it}]
\end{aligned}$$
Since most series used in economic applications tend to be positively autocorrelated, the additional components $(\alpha_i - \alpha)y_{i,t-1}$ and $(\beta_i - \beta)x_{it}$ in the error term of the (misspecified) pooled specification will tend to be a source of upward bias in the OLS estimate of the parameter $\alpha$ on the lagged dependent variable.
Pooled Mean Groups
The Pooled Mean Groups estimator considers a hybrid specification in which the short-run parameters $\alpha_i$ and $\beta_i$ are allowed to take different values for each individual, but the long-run parameter $\theta_i = \left(\frac{\beta_i}{1 - \alpha_i}\right)$ is imposed to be common.
That is, we estimate the dynamic model
$$\begin{aligned}
\Delta y_{it} &= \eta_i + (\alpha_i - 1)\left[y_{i,t-1} - \left(\frac{\beta_i}{1 - \alpha_i}\right)x_{it}\right] + v_{it} \\
&= \eta_i - \phi_i[y_{i,t-1} - \theta_i x_{it}] + v_{it}
\end{aligned}$$
subject to the restriction that $\theta_i = \left(\frac{\beta_i}{1 - \alpha_i}\right) = \theta$, a common parameter for all $i = 1, 2, \ldots, N$.
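The restricted model is
$$\Delta y_{it} = \eta_i - \phi_i[y_{i,t-1} - \theta x_{it}] + v_{it}$$
Note that we have $\left(\frac{\beta_i}{1 - \alpha_i}\right) = \theta$, so we can infer the values of $\beta_i$ from the relation $\beta_i = (1 - \alpha_i)\theta = \phi_i \theta$ after estimating $\phi_i$ and $\theta$.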
In some applications, we may consider that the long-run relationship is more likely to be similar across observation units than the short-run adjustment dynamics - or we may find that the Mean Groups estimate of the average long-run parameter is too imprecise to be very useful.
Pesaran, Shin and Smith (JASA, 1999) proposed a Maximum Likelihood estimator for the common long-run parameters, and heterogeneous short-run parameters, in ARDL/ECM models of this kind.
This is implemented in the xtpmg command in Stata.
Cross-sectional Dependence
We now consider relaxing the assumption of cross-sectional independence.
Pesaran (Econometrica, 2006) allows for a form of cross-sectional dependence by introducing an error component with a factor structure.
For example, in our simple dynamic model, this has the form
$$y_{it} = \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + (\kappa_i f_t + v_{it})$$
in which $f_t$ is an unobserved common factor, and $\kappa_i$ is an individual-specific factor loading parameter.
Note that this specification is more general than one with a common, time-specific error component.
Here the effect of common, time-specific unobservables represented by the common factor $f_t$ on the outcome $y_{it}$ is allowed to be heterogeneous across different observation units.
Moreover, it is hard to allow for a common, time-specific error component in models with heterogeneous slope parameters.
Clearly we cannot allow for time-specific intercept parameters in the time series regression models that are used to obtain the Mean Groups estimator.
In models with common slope parameters, subtracting the time-specific means
$$\bar y_t = \frac{1}{N}\sum_{i=1}^{N} y_{it} \qquad \bar y_{t-1} = \frac{1}{N}\sum_{i=1}^{N} y_{i,t-1} \qquad \bar x_t = \frac{1}{N}\sum_{i=1}^{N} x_{it}$$
from the original variables $y_{it}$, $y_{i,t-1}$ and $x_{it}$ respectively is equivalent to including a full set of time dummies - just as subtracting individual-specific means from the original variables (the within transformation) is equivalent to including a full set of individual dummies (the LSDV specification).
However this equivalence does not hold in models with heterogeneous slope parameters.
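The equivalence for the common-slope case can be verified numerically. The sketch below uses Python with NumPy and, as a simplifying assumption for the illustration, a static model with a single regressor and a common time effect; the dimensions and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 6, 8                                    # hypothetical balanced panel
x = rng.normal(size=(N, T))
time_effect = rng.normal(size=(1, T))          # common, time-specific component
y = 2.0 * x + time_effect + rng.normal(size=(N, T))

# (a) Subtract the time-specific (cross-section) means, then run OLS.
xd = (x - x.mean(axis=0, keepdims=True)).ravel()
yd = (y - y.mean(axis=0, keepdims=True)).ravel()
b_demeaned = (xd @ yd) / (xd @ xd)

# (b) OLS on the original data with a full set of time dummies.
D = np.kron(np.ones((N, 1)), np.eye(T))        # row i*T + t selects period t
Z = np.column_stack([D, x.ravel()])
b_dummies = np.linalg.lstsq(Z, y.ravel(), rcond=None)[0][-1]

print("Slope after time-demeaning:", b_demeaned)
print("Slope with time dummies:   ", b_dummies)
```

The two slope estimates coincide up to floating-point error in this balanced panel with a common slope.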
In this case, demeaning the original variables by subtracting their time-specific means - while often useful in practice - provides only an approximate control for the presence of a common, time-specific error component.
Pesaran (2006) shows that it is comparatively simple to allow for the presence of an unobserved common factor $f_t$ with a heterogeneous loading parameter $\kappa_i$.
We can control for the presence of this error component by:
- augmenting the original specification by including the time-specific means ($\bar y_{t-1}$, $\bar x_t$, and $\bar y_t$) as additional explanatory variables
- and estimating individual-specific parameters on each of these additional explanatory variables
That is, we estimate the augmented specification
$$y_{it} = \eta_i + \alpha_i y_{i,t-1} + \beta_i x_{it} + \gamma_i \bar y_{t-1} + \delta_i \bar x_t + \omega_i \bar y_t + v_{it}$$
This specification can then be estimated using Mean Groups.
This specification is called a cross-sectionally augmented autoregressive-distributed lag model (CS-ARDL), and the Mean Groups estimator in this setting is called the Common Correlated Effects Mean Groups estimator (CCE-MG).
This estimator can be implemented simply by constructing the time-specific means and including them as additional explanatory variables.
In Stata, this is also available as an option with the xtmg command.
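Constructing the time-specific means and adding them to each unit's regression can be sketched as follows in Python with NumPy. This is an illustrative sketch, not the xtmg implementation; the function name `cce_mg`, the data layout and the simulated single-factor design are assumptions.

```python
import numpy as np

def cce_mg(y, x):
    """CCE Mean Groups for the simple dynamic model.

    y, x: arrays of shape (N, T). Each unit's regression includes an intercept,
    y_{i,t-1}, x_it and the cross-section means ybar_{t-1}, xbar_t, ybar_t.
    Returns the (N, 6) unit-level estimates and the MG averages of (alpha, beta).
    """
    N, T = y.shape
    ybar = y.mean(axis=0)                      # time-specific means over units
    xbar = x.mean(axis=0)
    est = np.empty((N, 6))
    for i in range(N):
        Z = np.column_stack([
            np.ones(T - 1), y[i, :-1], x[i, 1:],   # eta_i, alpha_i, beta_i
            ybar[:-1], xbar[1:], ybar[1:],         # gamma_i, delta_i, omega_i
        ])
        est[i] = np.linalg.lstsq(Z, y[i, 1:], rcond=None)[0]
    return est, est[:, 1:3].mean(axis=0)

# Illustrative data with one common factor and heterogeneous loadings.
rng = np.random.default_rng(0)
N, T = 20, 80
f = rng.normal(size=T)                         # unobserved common factor
kappa = rng.uniform(0.5, 1.5, size=N)          # factor loadings
eta = rng.normal(size=N)
alpha = rng.uniform(0.2, 0.6, size=N)
beta = rng.uniform(0.5, 1.5, size=N)
x = rng.normal(size=(N, T)) + 0.5 * kappa[:, None] * f[None, :]
y = np.zeros((N, T))
for t in range(1, T):
    y[:, t] = (eta + alpha * y[:, t - 1] + beta * x[:, t]
               + kappa * f[t] + rng.normal(size=N))

est, mg = cce_mg(y, x)
print("CCE-MG estimates of (alpha, beta):", mg)
print("Average true values:", alpha.mean(), beta.mean())
```

Only the averages of the parameters on $y_{i,t-1}$ and $x_{it}$ are reported here; the parameters on the cross-section means are treated as nuisance parameters in this sketch.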
A practical consideration is that we are now estimating twice as many parameters from the time series regressions for each individual observation unit.
In a more general specification with $K_x$ explanatory variables in a vector $x_{it}$, $p$ lags of the dependent variable and an intercept, we require $T > K = K_x + p + 1$ to be able to estimate the individual-specific parameters required for the original Mean Groups estimator from the time series data for each individual.
In the cross-sectionally augmented specification, we add a further $K_x + p + 1$ individual-specific parameters on the additional time-specific means, and so require $T > 2K$.
This approach extends to considering $f_t$ to be a vector of unobserved common factors, with $\kappa_i$ a vector of individual-specific factor loading parameters (a multi-factor error structure).
Another extension allows for factors which are relevant for only a subset of the observation units.
For example, with long T panel data on countries, we can allow for region-specific factors - this specification augments the original model by including region-specific time means, in addition to the time means for all countries in the sample.
Panel unit root tests which allow for heterogeneous autoregressive parameters and this form of cross-sectional dependence have also been developed.
Spatial Models
A different approach to allowing for cross-sectional dependence is known as spatial econometrics.
Usually the cross-section observation units in this context are geographical units (e.g. cities, counties, regions, countries), and we have data on the distance between them (or their mid-points).
The relationship between the outcomes (or the error terms) for two units $i$ and $j$ is then modelled as a function of this distance.
In economic applications, other measures of the strength of the connection between pairs of observation units can also be considered, such as the volume of trade flows, or travel times.
Two examples of models used in this context are the spatial lag model
$$y_{it} = \alpha(W_i y_t) + x_{it}\beta + v_{it}$$
in which $y_t$ is the vector of observations $(y_{1t}, y_{2t}, \ldots, y_{Nt})$, and $W_i$ is a vector of (scaled, inverse) distances between unit $i$ and each of the observation units $j = 1, 2, \ldots, N$ (with the $i$th element zero).
The spatially lagged dependent variable $W_i y_t$ is analogous to a (temporally) lagged dependent variable in time series econometrics - note that we can write the lagged dependent variable $y_{i,t-1}$ in the form $W_{st} y_i$, where $y_i = (y_{i1}, y_{i2}, \ldots, y_{iT})$ and $W_{st}$ is an indicator vector which selects the observation on period $t - 1$.
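The construction of the spatially lagged dependent variable can be illustrated with a small example in Python with NumPy. The source does not specify how the inverse distances are scaled; row normalization (each row of weights summing to one) is assumed here purely for illustration, and the distances and outcomes are made-up numbers.

```python
import numpy as np

def inverse_distance_weights(dist):
    """Row-normalized inverse-distance weights with a zero diagonal.

    dist: (N, N) matrix of pairwise distances. Row i of the result plays the
    role of W_i. Row normalization is an assumption of this sketch.
    """
    dist = np.asarray(dist, dtype=float)
    W = np.zeros_like(dist)
    off_diagonal = ~np.eye(len(dist), dtype=bool)
    W[off_diagonal] = 1.0 / dist[off_diagonal]
    return W / W.sum(axis=1, keepdims=True)

# Three hypothetical units located at positions 0, 1 and 3 on a line.
dist = np.array([[0.0, 1.0, 3.0],
                 [1.0, 0.0, 2.0],
                 [3.0, 2.0, 0.0]])
W = inverse_distance_weights(dist)

y_t = np.array([1.0, 2.0, 4.0])     # outcomes for all units in period t
spatial_lag = W @ y_t               # element i is W_i y_t

print(W)
print("Spatial lag of y_t:", spatial_lag)
```

For unit 1 the weights on units 2 and 3 are 0.75 and 0.25, so its spatial lag is 0.75 × 2 + 0.25 × 4 = 2.5: a distance-weighted average of the other units' outcomes.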
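And the spatial autoregressive model, which considers a similar structure in the error term
$$y_{it} = x_{it}\beta + v_{it} \qquad v_{it} = \rho(W_i v_t) + \varepsilon_{it}$$
in which $v_t$ is the vector of errors $(v_{1t}, v_{2t}, \ldots, v_{Nt})$.
This is analogous to specifying an AR(1) error component $v_{it} = \rho v_{i,t-1} + \varepsilon_{it}$ in time series econometrics, and has the effect of introducing spatial lags of both $y$ and $x$ into the equations used for estimation
$$y_{it} = x_{it}\beta + \rho(W_i y_t) - \rho(W_i x_t)\beta + \varepsilon_{it}$$
where $x_t$ stacks the observations $x_{jt}$ for $j = 1, 2, \ldots, N$.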
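The step from the spatially autoregressive error to the estimating equation can be spelled out as follows. This is a short derivation sketch, writing $v_t$ for the vector of period-$t$ errors and $x_t$ for the stacked period-$t$ observations on $x_{jt}$, which is notation assumed here for the illustration.

```latex
\begin{aligned}
v_{jt} &= y_{jt} - x_{jt}\beta \quad \text{for each } j
  &&\Rightarrow\quad W_i v_t = W_i y_t - (W_i x_t)\beta \\
y_{it} - x_{it}\beta &= \rho\,(W_i v_t) + \varepsilon_{it}
  &&\text{(substituting } v_{it} = \rho(W_i v_t) + \varepsilon_{it}\text{)} \\
y_{it} &= x_{it}\beta + \rho\,(W_i y_t) - \rho\,(W_i x_t)\beta + \varepsilon_{it}
\end{aligned}
```

This mirrors the quasi-differencing of a time series model with AR(1) errors, where substituting $v_{t-1} = y_{t-1} - x_{t-1}\beta$ introduces lags of both $y$ and $x$.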