SPATIAL PANEL ECONOMETRICS

Size: px

Start display at page:

Download "SPATIAL PANEL ECONOMETRICS"

Bathsheba Jemimah Tucker
6 years ago
Views:

1 Chapter 18 SPATIAL PANEL ECONOMETRICS Luc Anselin University of Illinois, Urbana-Champaign Julie Le Gallo Universit«e Montesquieu-Bordeaux IV Hubert Jayet Universit«e de Lille 1. Introduction Spatial econometrics is a subfield of econometrics that deals with the incorporation of spatial effects in econometric methods (Anselin, 1988a). Spatial effects may result from spatial dependence, a special case of cross-sectional dependence, or from spatial heterogeneity, a special case of cross-sectional heterogeneity. The distinction is that the structure of the dependence can somehow be related to location and distance, both in a geographic space as well as in a more general economic or social network space. Originally, most of the work in spatial econometrics was inspired by research questions arising in regional science and economic geography (early reviews can be found in, among others, Paelinck and Klaassen, 1979, Cliff and Ord, 1981, Upton and Fingleton, 1985, Anselin, 1988a, Haining, 1990, Anselin and Florax, 1995). However, more recently, spatial (and social) interaction has increasingly received more attention in mainstream econometrics as well, both from a theoretical as well as from an applied perspective (see the recent reviews and extensive references in Anselin and Bera, 1998, Anselin, 2001b, Anselin, 2002, Florax and Van Der Vlist, 2003, and Anselin et al., 2004a). The research by Luc Anselin and Julie Le Gallo was supported in part by the U.S. National Science Foundation Grant BCS to the Center for Spatially Integrated Social Science (CSISS).

2 2 The central focus in spatial econometrics to date has been the single equation cross-sectional setting. However, as Arrelano argues in the introduction to his recent panel data econometrics text, the field [of econometrics of panel data] has expanded to cover almost any aspect of econometrics (Arellano, 2003, p. 2). It is therefore not surprising that this has included spatial econometrics as well. For example, the second edition of Baltagi s well known panel data text now includes a brief discussion of spatial panels (Baltagi, 2001, pp ), and an increasing number of papers are devoted to the topic (see the reviews in Anselin, 2001b, Elhorst, 2001, and Elhorst, 2003, as well as the recent papers by Baltagi et al., 2003a, Baltagi et al., 2003b, Kapoor et al., 2003, and Pesaran, 2004, among others). In this chapter, we review and organize this recent literature and emphasize a range of issues pertaining to the specification, estimation and diagnostic testing for spatial effects in panel data models. Since this encompasses a large and rapidly growing literature, we limit our attention to models with continuous dependent variables, 1 and to a design where the cross-sectional dimension (N) vastly exceeds the time dimension (N T ). We also avoid duplication by excluding aspects of the standard treatment of heterogeneity and dependence in panel data models, as well as the case where cross-sectional dependence is modeled by relying on the time dimension (e.g., as in the classic SURE case with fixed N, and some more recent extensions, such as Chen and Conley, 2001). The chapter is organized into five remaining sections. First, we define the notion of spatial effects more precisely and provide a brief outline of how the traditional cross-sectional models can be extended to panel data model specifications. Next, we consider this more closely and develop a taxonomy of space-time models. We then turn to the issues of model estimation and diagnostic testing. We close with some concluding remarks. 2. Spatial Effects As a point of departure, consider a simple pooled linear regression model: y it = x it β + ɛ it, (18.1) where i is an index for the cross-sectional dimension, with i = 1,..., N, and t is an index for the time dimension, with t = 1,..., T. 2 Using customary notation, y it is an observation on the dependent variable at i and t, x it a 1 K vector of observations on the (exogenous) explanatory variables, β a matching K 1 vector of regression coefficients, and ɛ it an error term. Given our interest in spatial effects, the observations will be stacked as successive cross-sections for t = 1,..., T, referred to as y t (a N 1 vector of cross-sectional observations for time period t), X t (a N K matrix of observations on a cross-section of the explanatory variables for time period t) and ɛ t

3 Spatial Panel Econometrics 3 (a N 1 vector of cross-sectional disturbances for time period t). In stacked form, the simple pooled regression then becomes: y = Xβ + ɛ, (18.2) with y as a NT 1 vector, X as a NT K matrix and ɛ as a NT 1 vector. In general, spatial dependence is present whenever correlation across crosssectional units is non-zero, and the pattern of non-zero correlations follows a certain spatial ordering. When little is known about the appropriate spatial ordering, spatial dependence is reduced to simple cross-sectional dependence. For example, the error terms are spatially correlated when E[ɛ it ɛ jt ] 0, for a given t and i j, and the non-zero covariances conform to a specified neighbor relation. Note how the correlation is purely cross-sectional in that it pertains to the same time period t. The neighbor relation is expressed by means of a so-called spatial weights matrix. We will briefly review the concept of spatial weights (and the associated spatial lag operator) and outline two classes of specifications for models with spatial dependence. In one, the spatial correlation pertains to the dependent variable, in a so-called spatial lag model, in the other it affects the error term, a so-called spatial error model. The two specifications can also be combined, resulting in so-called higher order spatial models. While these models and terms are by now fairly familiar in the spatial econometric literature, we thought it useful to briefly review them and to illustrate how they may be incorporated into a panel data setting. 3 The second class of spatial effects, spatial heterogeneity, is a special case of the observed and unobserved heterogeneity which is treated prominently in the mainstream panel data econometrics literature. For example, a heterogeneous panel would relax the constant regression coefficient in equation 18.1, and replace it by: y it = x it β i + ɛ it, where the β i is a K 1 vector of regression coefficients specific to the crosssectional unit i. This heterogeneity becomes spatial when there is a structure to the variability across the i that is driven by spatial variables, such as location, distance or region. In the spatial literature, discrete spatial variability is referred to as spatial regimes (Anselin, 1988a). The continuous case can be modeled as a special form of random coefficient variation (where the covariance shows a spatial pattern), or deterministically, as a function of extraneous variables (socalled spatial expansion, e.g., Casetti, 1997), or as a special case of local regression models (so-called geographically weighted regression, Fotheringham et al., 2002). Neither of these has seen application in panel data contexts. 4 Since most econometric aspects of spatial heterogeneity can be handled by means of the standard panel data methods, we will focus the discussion that

4 4 follows on spatial dependence and will only consider the heterogeneity when it is relevant to the modeling of the dependence. 2.1 Spatial Weights and Spatial Lag Operator A spatial weights matrix W is a N N positive matrix in which the rows and columns correspond to the cross-sectional observations. An element w ij of the matrix expresses the prior strength of the interaction between location i (in the row of the matrix) and location j (column). This can be interpreted as the presence and strength of a link between nodes (the observations) in a network representation that matches the spatial weights structure. In the simplest case, the weights matrix is binary, with w ij = 1 when i and j are neighbors, and w ij = 0 when they are not. By convention, the diagonal elements w ii = 0. For computational simplicity and to aid in the interpretation of the spatial variables, the weights are almost always standardized such that the elements in each row sum to 1, or, w s ij = w ij/ j w ij. 5 A side effect of this standardization is that the sum of all elements in W equals N, the number of cross-sectional observations. Whereas the original weights are often symmetric, the row-standardized form is no longer, which is an unusual complication with significant computational consequences. The specification of the spatial weights is an important problem in applied spatial econometrics. 6 Unless the weights are based on a formal theoretical model for social or spatial interaction, their specification is often ad hoc. In practice, the choice is typically driven by geographic criteria, such as contiguity (sharing a common border) or distance, including nearest neighbor distance (for examples and further discussion, see, e.g., Cliff and Ord, 1981, pp , Anselin, 1988a, Chapter 3). Generalizations that incorporate notions of economic distance are increasingly used as well (e.g., Case et al., 1993, Conley and Ligon, 2002, Conley and Topa, 2002). A slightly different type of economic weights are so-called block weights, where all observations in the same region are considered to be neighbors (and not only the adjoining observations). More formally, if there are N g units in a block (such as counties in a state), they are all considered to be neighbors, and the spatial weights equal 1/(N g 1) for all observations belonging to the same block (see, e.g., Case, 1991, Case, 1992, and, more recently, Lee, 2002). So far, the weights considered were purely cross-sectional. To extend their use in a panel data setting, they are assumed to remain constant over time. 7 Using the subscript to designate the matrix dimension, with W N as the weights for the cross-sectional dimension, and the observations stacked as in equation 18.2, the full NT NT weights matrix then becomes: W NT = I T W N, (18.3)

5 Spatial Panel Econometrics 5 with I T as an identity matrix of dimension T. Unlike the time series case, where neighboring observations are directly incorporated into a model specification through a shift operator (e.g., t 1), this is not unambiguous in a two dimensional spatial setting. For example, observations for irregular spatial units, such as counties or census tracts, typically do not have the same number of neighbors, so that a spatial shift operator cannot be implemented. Instead, in spatial econometrics, the neighboring observations are included through a so-called spatial lag operator, more akin to a distributed lag than a shift (Anselin, 1988a). In essence, a spatial lag operator constructs a new variable that consists of the weighted average of the neighboring observations, with the weights as specified in W. More formally, for a cross-sectional observation i for variable z, the spatial lag would be j w ijz j. In most applications, the bulk of the row elements in w ij are zero (resulting in a sparse structure for W ) so that in effect the summation over j only incorporates the neighbors, i.e., those observations for which w ij 0. In matrix notation, this corresponds to the matrix operation W N y t, in which the N N cross-sectional weights matrix is post-multiplied by a N 1 vector of cross-sectional observations for each time period t = 1,..., T. Spatial variables are included into a model specification by applying a spatial lag operator to the dependent variable, to the explanatory variables, or to the error term. A wide range of models for local and global spatial externalities can be specified in this manner (for a review, see Anselin, 2003). This extends in a straightforward manner to the panel data setting, by applying the NT NT weights from equation 18.3 to the stacked y, X or ɛ from equation More precisely, in the same notation as above, a vector of spatially lagged dependent variables follows as: W y = W NT y = (I T W N )y, (18.4) a matrix of spatially lagged explanatory variables as: W X = W NT X = (I T W N )X, and a vector of spatially lagged error terms as: W ɛ = W NT ɛ = (I T W N )ɛ. The incorporation of these spatial lags into a regression specification is considered next. 2.2 Spatial Lag Model A spatial lag model, or, mixed regressive spatial autoregressive model, includes a spatially lagged dependent variable on the RHS of the regression specification (Anselin, 1988a). While usually applied in a pure cross-sectional setting, it can easily be extended to panel models. Using the stacked equation 18.2

6 6 and the expression for the spatial lag from equation 18.4, this yields: y = ρ(i T W N )y + Xβ + ɛ, (18.5) where ρ is the spatial autoregressive parameter, and the other notation is as before. In a cross-section, a spatial lag model is typically considered as the formal specification for the equilibrium outcome of a spatial or social interaction process, in which the value of the dependent variable for one agent is jointly determined with that of the neighboring agents. 8 This model is increasingly applied in the recent literature on social/spatial interaction, and is used to obtain empirical estimates for the parameters of a spatial reaction function (Brueckner, 2003) or social multiplier (Glaeser et al., 2002). It should be noted that other formulations to take into account social interaction have been suggested as well (e.g., Manski, 2000, Brock and Durlauf, 2001) mostly in the context of discrete choice. The modeling of complex neighborhood and network effects (e.g., Topa, 2001) requires considerable attention to identification issues, maybe best known from the work of Manski on the reflection problem (Manski, 1993). Because of this theoretical foundation, the choice of the weights in a spatial lag model is very important. At first sight, the extension of the spatial lag model to a panel data context would presume that the equilibrium process at hand is stable over time (constant ρ and constant W ). However, the inclusion of the time dimension allows much more flexible specifications, as outlined in section The essential econometric problem in the estimation of 18.5 is that, unlike the time series case, the spatial lag term is endogenous. This is the result of the two-directionality of the neighbor relation in space ( I am my neighbor s neighbor ) in contrast to the one-directionality in time dependence (for details, see Anselin and Bera, 1998). The consequence is a so-called spatial multiplier (Anselin, 2003) which formally specifies how the joint determination of the values of the dependent variables in the spatial system is a function of the explanatory variables and error terms at all locations in the system. The extent of the joint determination of values in the system can be seen by expressing equation 18.5 as a reduced form: y = [ I T (I N ρw N ) 1] Xβ + [ I T (I N ρw N ) 1] ɛ, (18.6) with the subscripts indicating the dimensions of the matrices. The inverse matrix expression can be expanded and considered one cross-section at a time, due to the block-diagonal structure of the inverse. A a result, for each N 1 cross-section at time t = 1,..., T : y t = X t β + ρw N X t β + ρ 2 W 2 NX t β ɛ t + ρw N ɛ t + ρ 2 W 2 Nɛ t...

7 Spatial Panel Econometrics 7 The implication of this reduced form is that the spatial distribution of the y it in each cross-section is determined not only by the explanatory variables and associated regression coefficients at each location (X t β), but also by those at neighboring locations, albeit subject to a distance decay effect (the increasing powers of ρ and W N ). In addition, the unobserved factors contained in the error term are not only relevant for the location itself, but also for the neighboring locations (W N ɛ), again, subject to a distance decay effect. Note that in the simple pooled model, this spatial multiplier effect is contained within each cross-section and does not spill over into other time periods. 9 The presence of the spatially lagged errors in the reduced form illustrates the joint dependence of the W N y t and ɛ t in each cross-section. In model estimation, this simultaneity must be accounted for through instrumentation (IV and GMM estimation) or by specifying a complete distributional model (maximum likelihood estimation). Even without a solid theoretical foundation as a model for social/spatial interaction, a spatial lag specification may be warranted to spatially detrend the data. This is referred to as a spatial filter: [I T (I N ρw N )] y = Xβ + ɛ, (18.7) with the LHS as a new dependent variable from which the effect of spatial autocorrelation has been eliminated. In contrast to time series, a simple detrending using ρ = 1 is not possible, since that value of ρ is not in the allowable parameter space. 10 As a consequence, the parameter ρ must be estimated in order for the spatial filtering to be operational (see Anselin, 2002). 2.3 Spatial Error Model In contrast to the spatial lag model, a spatial error specification does not require a theoretical model for spatial/social interaction, but, instead, is a special case of a non-spherical error covariance matrix. An unconstrained error covariance matrix at time t, E[ɛ it ɛ jt ], i j contains N (N 1)/2 parameters. These are only estimable for small N and large T, and provided they remain constant over the time dimension. In the panel data setting considered here, with N T, structure must be imposed in order to turn the covariance matrix into a function of a manageable set of parameters. Four main approaches have been suggested to provide the basis for a parsimonious covariance structure: direct representation, spatial error processes, spatial error components, and common factor models. Each will be reviewed briefly Direct Representation. The direct representation approach has its roots in the geostatistical literature and the use of theoretical variogram and covariogram models (Cressie, 1993). It consists of specifying the covariance

8 8 between two observations as a direct function of the distance that separates them, i j and t = 1,..., T : E[ɛ it ɛ jt ] = σ 2 f(τ, d ij ), (18.8) where τ is a parameter vector, d ij is the (possibly economic) distance between observation pairs i, j, σ 2 is a scalar variance term, and f is a suitable distance decay function, such as a negative exponential. 11 The parameter space for τ should be such that the combination of functional form and the distance metric ensures that the resulting covariance matrix is positive definite (for further discussion, see, e.g, Dubin, 1988). An extension to a panel data setting is straightforward. With σ 2 Ω t,n as the error covariance matrix that results from applying the function 18.8 to the N 1 cross-sectional error vector in time period t, the overall NT NT error variance-covariance matrix Σ NT becomes a block diagonal matrix with the N N variance matrix for each cross-section on the diagonal. 12 However, as specified, the function 18.8 does not vary over time, so that the result can be expressed concisely as: with Ω t,n = Ω N t. 13 Σ NT = σ 2 [I T Ω N ], Spatial Error Processes. Whereas the direct representation approach requires a distance metric and functional form for the distance decay between a pair of observations, spatial error processes are based on a formal relation between a location and its neighbors, using a spatial weights matrix. The error covariance structure can then be derived for each specified process, but typically the range of neighbors specified in the model is different from the range of spatial dependence in the covariance matrix. This important aspect is sometimes overlooked in empirical applications. In analogy to time series analysis, the two most commonly used models for spatial processes are the autoregressive and the moving average (for extensive technical discussion, see Anselin, 1988a, Anselin and Bera, 1998, Anselin, 2003, and the references cited therein). A spatial autoregressive (SAR) specification for the N 1 error vector ɛ t in period t = 1,..., T, can be expressed as: ɛ t = θw N ɛ t + u t, where W N is a N N spatial weights matrix (with the subscript indicating the dimension), θ is the spatial autoregressive parameter, and u t is a N 1 idiosyncratic error vector, assumed to be distributed independently across the cross-sectional dimension, with constant variance σ 2 u.

9 Spatial Panel Econometrics 9 Continuing in matrix notation for the cross-section at time t, it follows that: ɛ t = (I N θw N ) 1 u t, and hence the error covariance matrix for the cross-section at time t becomes: Ω t,n = E[ɛ t ɛ t] = σ 2 u (I N θw N ) 1 ( I N θw N) 1, or, in a simpler notation, with B N = I N θw N : Ω t,n = σ 2 u(b NB N ) 1. As before, in this homogeneous case, the cross-sectional covariance does not vary over time, so that the full NT NT covariance matrix follows as: Σ NT = σu 2 [ IT (B NB N ) 1]. (18.9) Note that for a row-standardized weights matrix, B N will not be symmetric. Also, even though W N may be sparse, the inverse term (B N B N) 1 will not be sparse and suggests a much wider range of spatial covariance than specified by the non-zero elements of the weights matrix. In other words, the spatial covariance structure induced by the SAR model is global. A spatial moving average (SMA) specification for the N 1 error vector ɛ t in period t = 1,..., T, can be expressed as: ɛ t = γw N u t + u t, where γ is the moving average parameter, and the other notation is as before. In contrast to the SAR model, the variance covariance matrix for an error SMA process does not involve a matrix inverse: Ω t,n = E[ɛ t ɛ t] = σu 2 [ IN + γ(w N + W N) + γ 2 W N W N ], (18.10) and, in the homogenous case, the overall error covariance matrix follows directly as: Σ NT = σ 2 u ( IT [ I N + γ(w N + W N) + γ 2 W N W N]). Aso, in contrast to the SAR model, the spatial covariance induced by the SMA model is local Spatial Error Components. A spatial error components specification (SEC) was suggested by Kelejian and Robinson as an alternative to the SAR and SMA models (Kelejian and Robinson, 1995, Anselin and Moreno, 2003). In the SEC model, the error term is decomposed into a local and a spillover effect.

10 10 In a panel data setting, the N 1 error vector ɛ t for each time period t = 1,..., T, is expressed as: ɛ t = W N ψ t + ξ t, (18.11) where W N is the weights matrix, ξ t is a N 1 vector of local error components, and ψ t is a N 1 vector of spillover error components. The two component vectors are assumed to consist of i.i.d terms, with respective variances σ 2 ψ and σ 2 ξ, and are uncorrelated, E[ψ itξ jt ] = 0, i, j, t. The resulting N N cross-sectional error covariance matrix is then, for t = 1,..., T : Ω t,n = E[ɛ t ɛ t] = σ 2 ψ W NW N + σ 2 ξ I N. (18.12) In the homogeneous model, this is again unchanging across time periods, and the overall NT NT error covariance matrix can be expressed as: Σ NT = σ 2 ξ I NT + σ 2 ψ (I T W N W N). Comparing equations and 18.12, it can be readily seen that the range of covariance induced by the SEC model is a subset that of the SMA model, and hence also a case of local spatial externalities Common Factor Models. In the standard two-way error component regression model, the cross-sectional units share an unobserved component due to the time period effect (e.g., Baltagi, 2001, p. 31). In our notation: ɛ it = µ i + λ t + u it, with µ i as the cross-sectional component, with variance σ 2 µ, λ t as the time component, with variance σ 2 λ, and u it as an idiosyncratic error term, assumed to be i.i.d with variance σ 2 u. Furthermore, the three random components are assumed to be zero mean and to be uncorrelated with each other. The random components µ i are assumed to be uncorrelated across cross-sectional units, and the components λ t are assumed to be uncorrelated across time periods. This model is standard, except that for our purposes, the data are stacked as crosssections for different time periods. Consequently, the N 1 cross-sectional error vector ɛ t for time period t = 1,..., T, becomes: ɛ t = µ + λ t ι N + u t, (18.13) where µ is a N 1 vector of cross-sectional error components, λ t is a scalar time component, ι N is a N 1 vector of ones, and u t is a N 1 vector of idiosyncratic errors. The structure in equation results in a particular form of cross-sectional (spatial) correlation, due to the common time component: E[ɛ t ɛ t] = σ 2 µi N + σ 2 λ ι Nι N + σ 2 ui N,

11 Spatial Panel Econometrics 11 where the subscript N indicates the dimension of the identity matrices. Note that the second term in this expression indicates equicorrelation in the crosssectional dimension, i.e., the correlation between two cross-sectional units i, j equals σλ 2, no matter how far these units are apart. While perfectly valid as a model for general (global) cross-sectional correlation, this violates the distance decay effect that underlies spatial interaction theory. The complete NT 1 error vector can be written as (see also Anselin, 1988a, p. 153): ɛ = (ι T I N )µ + (I T ι N )λ + u, where the subscripts indicate the dimensions, λ is a T 1 vector of time error components, u is a NT 1 vector of idiosyncratic errors, and the other notation is as before. The overal error variance covariance matrix then follows as: Σ NT = σ 2 µ(ι T ι T I N ) + σ 2 λ (I T ι N ι N) + σ 2 ui NT. Note the how the order of matrices in the Kronecker products differs from the standard textbook notation, due to the stacking by cross-section. A recent extension of the error component model can be found in the literature on heterogeneous panels. Here, the time component is generalized and expressed in the form of an unobserved common effect, common shock or factor f t to which all cross-sectional units are exposed (for a recently developed general framework that includes factors as a special case, see Andrews, 2005). However, unlike the standard error component model, each cross-sectional unit has a distinct factor loading on this factor. The simplest form is the so-called one factor structure, where the error term is specified as: ɛ it = δ i f t + u it, with δ i as the cross-sectional-specific loading on factor f t, and u it as an i.i.d zero mean error term. Consequently, cross-sectional (spatial) covariance between the errors at i and j follows from the the inclusion of the common factor f t in both error terms: E[ɛ it ɛ jt ] = δ i δ j σ 2 f. The common factor model has been extended to include multiple factors. In these specifications, a wide range of covariance structures can be expressed by including sufficient factors and through cross-sectional differences among the factor loadings. In general, the error term is: ɛ it = δ i f t + u it, where δ and f are vectors of dimension m, equal to the number of unobserved common factors. After standardizing the idiosyncratic error (e.g., as in Pesaran, 2004), we find the correlation between two cross-sectional terms as: ρ ij = δ i δ j (1 + δ i δ i ) 1/2 (1 + δ j δ j ) 1/2.

12 12 In contrast to the covariance structures considered in sections , the multiple factor model allows for dependencies that are not direct functions of distance or of a spatial weights specification. Provided the data are sufficiently rich (and, typically, in large T contexts), the correlation is determined by the model residuals (for further details, see Driscoll and Kraay, 1998, Coakley et al., 2002, Coakley et al., 2005, Hsiao and Pesaran, 2004, Pesaran, 2005 and Kapetanios and Pesaran, 2005). 3. A Taxonomy of Spatial Panel Model Specifications So far, we have considered the introduction of spatial effects for panel data in the form of spatial lag or spatial error models under extreme homogeneity. The point of departure was the pooled specification, equation 18.1, and lag and error models are obtained as outlined in sections and We now extend this taxonomy by introducing heterogeneity, both over time and across space, as well as by considering joint space-time dependence. It should be noted that a large number of combinations of space-time heterogeneity and dependence are possible, although many of those suffer from identification problems and/or are not estimable in practice. In our classification here, we purposely limit the discussion to models that have seen some empirical applications (other, more extensive typologies can be found in Anselin, 1988a, Chapter 4, Anselin, 2001b, Elhorst, 2001, and Elhorst, 2003). 3.1 Temporal Heterogeneity General Case. Temporal heterogeneity is introduced in the familiar way in fixed effects models, by allowing time-specific intercepts and/or slopes, and in random effects models, by incorporating a random time component or factor (see section ). The addition of a spatially lagged dependent variable or spatially correlated error term in these models is straightforward. For example, consider a pooled model with time-specific intercept and slope coefficient to which a spatially autoregressive error term is added. The cross-section in each period t = 1,..., T, is: with y t = α t + X t β t + ɛ t, (18.14) ɛ t = θ t W N ɛ t + u t where θ t is a period-specific spatial autoregressive parameter, α t is the periodspecific intercept and β t a (K 1) 1 vector of period-specific slopes. Since T is fixed (and the asymptotics are based on N ), this model is a straightforward replication of T cross-sectional models. A spatial lag specification is obtained in a similar way.

13 Spatial Panel Econometrics Spatial Seemingly Unrelated Regressions. A generalization of the fixed effects model that has received some attention in the empirical literature (e.g., Rey and Montouri, 1999) allows the cross-sectional error terms ɛ t to be correlated over time periods. This imposes very little structure on the form of the temporal dependence and is the spatial counterpart of the classic SURE model. It is referred to as the spatial SUR model (see Anselin, 1988a, Chapter 10, and Anselin, 1988b). In matrix form, the equation for the crosssectional regression in each time period t = 1,..., T, is as in equation 18.14, but now with the constant term included in the vector β t : y t = X t β t + ɛ t, (18.15) with the cross-equation (temporal) correlation in general form, as: E[ɛ t ɛ s] = σ ts I N, s t, where σ ts is the temporal covariance between s and t (by convention, the variance terms are expressed as σ 2 t ). In stacked form (T cross-sections), the model is: y = Xβ + ɛ, (18.16) with E[ɛɛ ] = Σ T I N (18.17) and Σ T is the T T temporal covariance matrix with elements σ ts. Spatial correlation can be introduced as a spatial lag specification or a spatial error specification. Consider the spatial lag model first (see Anselin, 1988a for details). In each cross-section (with t = 1,..., T ), the standard spatial lag specification holds, but now with a time-specific spatial autoregressive coefficient ρ t : y t = ρ t W N y t + X t β t + ɛ t. To consider the full system, let β be a T K 1 vector of the stacked timespecific β t, for t = 1,..., T. 15 The corresponding NT KT matrix X of observations on the explanatory variables then takes the form: X X = 0 X (18.18) X T Also, let the spatial autoregressive coefficients be grouped in a T T diagonal matrix R T, as: ρ R T = 0 ρ ρ T

14 14 The full system can then be expressed concisely as: y = (R T W N )y + Xβ + ɛ, (18.19) with the error covariance matrix as in equation In empirical practice, besides the standard hypothesis tests on diagonality of the error covariance matrix and stability of the regression coefficients over time, interest in the spatial lag SUR model will center on testing the hypothesis of homogeneity of the spatial autoregressive coefficients, or, H 0 : ρ 1 = ρ 2 =... = ρ T = ρ. If this null hypothesis can be maintained, a simplified model can be implemented: y = ρ(i T W N )y + Xβ + ɛ. Spatial error autocorrelation can be introduced in the spatial SUR model in the form of a SAR or SMA process for the error terms (see Anselin, 1988a, Chapter 10). For example, consider the following SAR error process for the cross-section in each time period t = 1,..., T : ɛ t = θ t W N ɛ t + u t. (18.20) The cross-equation covariance is introduced through the remainder error term u t, for which it is assumed that E[u t ] = 0, E[u t u t] = σ 2 t I N, and E[u t u s] = σ ts I N, for t s. As a result, the covariance matrix for the stacked NT 1 error vector u becomes the counterpart of equation 18.17: E[uu ] = Σ T I N, with, as before, Σ T as a T T matrix with elements σ ts. The SAR error process in equation can also be written as: ɛ t = (I N θ t W N ) 1 u t, or, using the simplifying notation B t,n = (I N θ t W N ), as: ɛ t = B 1 t,n u t. The overall cross-equation covariance between error vectors ɛ t and ɛ s then becomes: E[ɛ t ɛ s] = B 1 t,n E[u tu s]b 1 s,n = σ tsb 1 t,n B 1 s,n, which illustrates how the simple SUR structure induces space-time covariance as well (the B 1 t,n matrices are not diagonal). In stacked form, the error process for the NT 1 error vector ɛ can be written as: ɛ = B 1 NT u,

15 Spatial Panel Econometrics 15 with B NT as the matrix: B NT = [I NT (Θ T W N )], (18.21) and Θ T as a T T diagonal matrix containing the spatial autoregressive coefficients θ t, t = 1,..., T. The overall error covariance matrix for the stacked equations then becomes: E[ɛɛ ] = B 1 NT (Σ T I N )B 1 NT. (18.22) As in the spatial lag SUR model, specific interest in the spatial error SUR model centers on the homogeneity of the spatial autoregressive parameters, H 0 : θ 1 = θ 2 =... = θ T = θ. If the homogeneity holds, the expression for B NT (equation 18.21) simplifies to: 3.2 Spatial Heterogeneity B NT = [I T (I N θw N )]. We limit our attention in the treatment of spatial heterogeneity to models with unobserved heterogeneity, specified in the usual manner as either fixed effects or random effects. Both have been extended with spatial lag and spatial error specifications Fixed Effects Models. The classic fixed effects model (e.g., Baltagi, 2001, pp , and Arellano, 2003, pp ) includes an individual specific dummy variable to capture unobserved heterogeneity. For each observation i, t this yields, keeping the same notation as before: y i,t = α i + x it β + ɛ it for i = 1,..., N, t = 1,..., T, and with an additional constraint of the form i α i = 0, such that the individual effects α i are separately identifiable from the constant term in β. As is well known, consistent estimation of the individual fixed effects is not possible when N, due to the incidental parameter problem. Since spatial models rely on the asymptotics in the cross-sectional dimension to obtain consistency and asymptotic normality of estimators, this would preclude the fixed effects model from being extended with a spatial lag or spatial error term (Anselin, 2001b). Nevertheless, it has been argued that when the interest is primarily in obtaining consistent estimates for the β coefficients, the use of demeaned spatial regression models may be appropriate, for example, using the standard maximum likelihood estimation approach (Elhorst, 2003, p ).

16 16 There are several slight complications associated with this approach. One is that the demeaning operator takes on a different form from the usual expression in the literature, since the observations are stacked as cross-sections for different time periods. Also, the demeaned models no longer contain a constant term, which may be incompatible with assumptions made by standard spatial econometric software. More importantly, the variance covariance matrix of the demeaned error terms is no longer σ 2 ɛ I, but becomes σ 2 ɛ Q, where Q is the demeaning operator (this aspect is ignored in the likelihood functions presented in Elhorst, 2003, p. 250). To illustrate these points, consider a fixed effects spatial lag model in stacked form, using the same setup as in equation 18.5, with the addition of the fixed effects: y = ρ(i T W N )y + (ι T α) + Xβ + ɛ, (18.23) where α is a N 1 vector of individual fixed effects, with the constraint that α ι N = 0, and, as before, E[ɛɛ ] = σ 2 ɛ I NT. Note the difference with the classic formulation in the Kronecker product for the fixed effects, due to the stacking of cross-sections, rather than individual time series. The demeaned form of equation is obtained by substracting the average for each cross-sectional unit computed over the time dimension, which wipes out the individual fixed effects (as well as the constant term). Formally, this can be expressed as: Q NT y = ρ(i T W N )Q NT y + Q NT Xβ + Q NT ɛ, (18.24) where Q NT is the demeaning operator (and Q NT X and β no longer contain a constant term). The demeaning operator is a NT NT matrix that takes the form: Q NT = I NT (ι T ι T /T I N ) with, as before, ι as a vector of ones and the subscripts denoting the dimension of vectors and matrices. Again, note the difference with the standard textbook notation, due to stacking by cross-section. The matrix Q NT is idempotent, and, as a result, the expression for the variance of the error in becomes: E[ɛɛ ] = σ 2 ɛ Q NT. Since Q NT is idempotent, this variance is singular, which limits the practicality of this approach. The use of a conditional likelihood perspective, standard in the panel data literature (e.g., Arellano, 2003, pp ) does not apply in a straightforward manner to a spatial lag specification, since the average of the dependent variable ȳ i = (1/T ) t y it is spatially correlated among the i. A more careful elaboration of the relative merits of alternative approaches remains largely a topic of ongoing research.

17 Spatial Panel Econometrics Random Effects Models. In the random effects approach to modeling unobserved heterogeneity, interest has centered on incorporating spatial error correlation into the regression error term, in addition to the standard cross-sectional random component. Note that the latter induces serial correlation over time (of the equi-correlated type). The addition of a spatial lag term in these models is straightforward, by specifying the proper error variance covariance structure. We therefore focus attention on the one-way error component specification (e.g, Baltagi, 2001, pp , Arellano, 2003, Ch. 3). 16 In contrast to the fixed effects case, asymptotics along the cross-sectional dimension (with N ) present no problem for random effects models. The standard specification of the error term in this model, is, for each i, t: ɛ it = µ i + ν it, where µ i IID(0, σ 2 µ) is the cross-sectional random component, and ν it IID(0, σ 2 ν) is an idiosyncratic error term, with µ i and ν it independent from each other. In each cross-section, for t = 1,..., T, the N 1 error vector ɛ t becomes: ɛ t = µ + ν t, (18.25) where µ is a N 1 vector of cross-sectional random components. The common approach to introduce spatial error autocorrelation in equation is to specify a SAR process for the error component ν t, for t = 1,..., T (Anselin, 1988a, p. 153, and, more recently, Baltagi et al., 2003b): ν t = θw N ν t + u t, (18.26) with θ as the spatial autoregressive parameter (constant over time), W N as the spatial weights matrix, and u t as an i.i.d idiosyncratic error term with variance σ 2 u. Using the notation B N = I N θw N, we obtain the familiar result: ν t = (I N θw N ) 1 u t = B 1 N u t. In stacked form, the NT 1 error term then becomes: ɛ = (ι T I N )µ + (I T B 1 N )u, (18.27) where u IID(0, σ 2 ui NT ) is a NT 1 vector of idiosyncratic errors, and the variance covariance matrix for ɛ is: Σ NT = E[ɛɛ ] = σ 2 µ(ι T ι T I N ) + σ 2 u[i T (B NB N ) 1 ]. (18.28) Note that the first component induces correlation in the time dimension, but not in the cross-sectional dimension, whereas the opposite holds for the second component (correlation only in the cross-sectional dimension).

18 18 An alternative specification for spatial correlation in this model applies the SAR process first and the error components specification to its remainder error (Kapoor et al., 2003). Consider a SAR process for the NT 1 error vector ɛ: ɛ = θ(i T W N )ɛ + ν, or, using similar notation as the spatially correlated component in equation 18.27: ɛ = (I T B 1 N )ν. Now, the innovation vector ν is specified as a one way error component model (Kapoor et al., 2003): ν = (ι T I N )µ + u, with µ as the N 1 vector of cross-sectional random components, and u IID(0, σ 2 ui NT ). As a result, the error variance covariance matrix takes on a form that is different from 18.28: Σ NT = E[ɛɛ ] = (I T B 1 N )[σ2 µ(ι T ι T I N ) + σui 2 NT ](I T B 1 N ). (18.29) Again, this model combines both time-wise as well as cross-sectional correlation. Statistical inference for error components models with spatial SAR processes can be carried out as a special case of models with non-spherical error covariance. This is addressed in sections 18.4 and Spatio-Temporal Models The incorporation of dependence in both time and space dimensions in an econometric specification adds an additional order of difficulty to the identification of the NT (NT 1)/2 elements of the variance covariance matrix. An important concept in this regard is the notion of separability. Separability requires that a NT NT space-time covariance matrix Σ NT can be decomposed into a component due to space and a component due to time (see, e.g., Mardia and Goodall, 1993), or: Σ NT = Σ T Σ N, where Σ T is a T T variance covariance matrix for the time-wise dependence and Σ N is a N N variance covariance matrix for the spatial dependence. 17 This ensures that the space-time dependence declines in a multiplicative fashion over the two dimensions. It also addresses a central difficulty in space-time modeling, i.e., the lack of a common distance metric that works both in the cross-sectional and the time dimension. The approach taken in spatial panel econometrics is to define neighbors in space by means of a spatial weights matrix and neighbors in time by means of the customary time lags. However, the speed of the dynamic space-time process may not be compatible with these choices, leading to further misspecification.

19 Spatial Panel Econometrics Model Taxonomy. Ignoring for now any space-time dependence in the error terms, we can distinguish four basic forms to introduce correlation in both space and time in panel data models (following Anselin, 2001b, p ). As before, we focus on models where N T and do not consider specifications where the time dimension is an important aspect of the model. 18 Also, we limit the scope of the discussion to the pooled dynamic model, since the incorporation of spatial effects in heterogeneous dynamic panel models has seen little attention to date. To facilitate exposition, we express these models for a N 1 cross-section at time t = 1,..., T. Pure space recursive models, in which the dependence pertains only to neighboring locations in a previous period: y t = γw N y t 1 + X t β + ɛ t, (18.30) with γ as the space-time autoregressive parameter, and W N y t 1 as a N 1 vector of observations on the spatially lagged dependent variable at t 1. Note that this can be readily extended with time and spatial lags of the explanatory variables, X t 1 or W N X t. However, since W N y t 1 already includes W N X t 1, adding a term of this form would create identification problems. This is sometimes overlooked in other taxonomies of dynamic space-time models (e.g., in the work of Elhorst, 2001, p. 121, where space-time lags for both the dependent and the exploratory variables are included in the specification). Consider the space-time multiplier more closely. Start by substituting the equation for y t 1 in 18.30, which yields: or, y t = γw N [γw N y t 2 + X t 1 β + ɛ t 1 ] + X t β + ɛ t, y t = γ 2 W 2 Ny t 2 + X t β + γw N X t 1 β + ɛ t + γw N ɛ t 1. Successive substitution reveals a space-time multiplier that follows from a series of consecutively higher orders of both spatial and time lags applied to the X (and error terms). Also, since the spatial dependence takes one period to manifest itself, this specification becomes quite suitable to study spatial diffusion phenomena (see the early discussion in Upton and Fingleton, 1985, and Dubin, 1995). Time-space recursive models, in which the dependence relates to both the location itself as well as its neighbors in the previous period: y t = φy t 1 + γw N y t 1 + X t β + ɛ t, (18.31) with φ as the serial (time) autoregressive parameter, operating on the crosssection of dependent variables at t 1. Spatially lagged contemporaneous explanatory variables (W N X t ) may be included as well, but time lagged explanatory variables will result in identification problems. This model has particular appeal in space-time forecasting (e.g., Giacomini and Granger, 2004).

20 20 Again, the nature of the space-time multiplier can be assessed by substituting the explicit form for the spatially and time lagged terms: or, y t = φ[φy t 2 + γw N y t 2 + X t 1 β + ɛ t 1 ] +γw N [φy t 2 + γw N y t 2 + X t 1 β + ɛ t 1 ] +X t β + ɛ t, y t = (φ 2 + 2φγW N + γ 2 WN)y 2 t 2 +X t β + (φ + γw N )X t 1 β +ɛ t + (φ + γw N )ɛ t 1, revealing a much more complex form for the effect of space-time lagged explanatory variables (and errors), including the location itself as well as its neighbors. Time-space simultaneous models, which include a time lag for the location itself together with a contemporaneous spatial lag: y t = φy t 1 + ρw N y t + X t β + ɛ t, with ρ as the (contemporaneous) spatial autoregressive parameter. The mulitplier in this model is complex, due to the combined effect of the cross-sectional spatial multiplier (in each period) and the space-time multiplier that follows from the time lag in the dependent variable. First, consider the pure cross-sectional multiplier: y t = (I N ρw N ) 1 [φy t 1 + X t β + ɛ t ]. Next, substitute the corresponding expression for y t 1 : which yields: y t = (I N ρw N ) 1 [φ[(i N ρw N ) 1 (φy t 2 +X t 1 β + ɛ t 1 )] + X t β + ɛ t ], y t = φ 2 (I N ρw N ) 2 y t 2 +(I N ρw N ) 1 X t β + φ(i N ρw N ) 2 X t 1 β +(I N ρw N ) 1 ɛ t + φ(i N ρw N ) 2 ɛ t 1. From this it follows that the inclusion of any spatially lagged X in the original specification will lead to identification problems. Time-space dynamic models, where all three forms of lags for the dependent variable are included: y t = φy t 1 + ρw N y t + γw N y t 1 + X t β + ɛ t.

21 Spatial Panel Econometrics 21 While this model is sometimes suggested as a general space-time specification, it results in complex nonlinear constraints on the parameters, and, in practice, often suffers from identification problems. For example, focusing only on the time lagged terms and substituting their expression for t 1 (and rearranging terms) yields: y t = φ[φy t 2 + ρw N y t 1 + γw N y t 2 + X t 1 β + ɛ t 1 ] +γw N [φy t 2 + ρw N y t 1 + γw N y t 2 + X t 1 β + ɛ t 1 ] +ρw N y t + X t β + ɛ t, or, grouping by time period: y t = ρw N y t + X t β + ɛ t +φρw N y t 1 + γρwny 2 t 1 + φx t 1 β + γw N X t 1 β +φɛ t 1 + γw N ɛ t 1 +φ 2 y t 2 + γφw N y t 2 + γ 2 W 2 Ny t 2. The same types of space-time dependence processes can also be specified for the error terms in panel data models (e.g., Fazekas et al., 1994). However, combinations of both spatially lagged dependent variables and spatially lagged error terms may lead to identification problems unless the parameters of the explanatory variables are non-zero. An alternative form of error space-time dependence takes the error components approach, to which we turn briefly Error Components with Space-Time Dependence. The starting point for including explicit serial dependence (in addition to the equicorrelated form) in random effects models is the spatially autocorrelated form considered in equations However, instead of the indiosyncratic error u t in 18.26, a serially correlated term ζ t is introduced (Baltagi et al., 2003a): ν t = θw N ν t + ζ t (18.32) with ζ t = φζ t 1 + u t, (18.33) where, as before, u t is used to denote the idiosyncratic error, and t = 1,..., T. The counterpart of the N 1 cross-sectional error vector ɛ t in equation becomes: ɛ t = (I N θw N ) 1 ζ t = B 1 N ζ t, with ζ replacing the original error u. In stacked form, this becomes: ɛ = (ι T I N )µ + (I T B 1 N )ζ,

22 22 with µ of dimension N 1 and both ɛ and ζ of dimension NT 1. The serial correlation in ζ will yield serial covariances of the familiar AR(1) form, with: ( ) φ E[ζ i,t ζ i,t k ] = σu 2 k 1 φ 2, for k = 0,..., T 1, and i = 1,..., N, where σ 2 u is the variance of the error term u. Grouping these serial covariances into a T T variance covariance matrix Ω T yields the overall variance covariance matrix for ɛ as (Baltagi et al., 2003a): Σ NT = E[ɛɛ ] = σ 2 µ(ι T ι T I N ) + [Ω T (B NB N ) 1 ]. 4. Estimation of Spatial Panel Models The estimation of panel data models that include spatially lagged dependent variables and/or spatially correlated error terms follows as a direct extension of the theory developed for the single cross-section. In the first case, the endogeneity of the spatial lag must be dealt with, in the second, the non-spherical nature of the error variance covariance matrix must be accounted for. Two main approaches have been suggested in the literature, one based on the maximum likelihood principle, the other on method of moments techniques. We consider each in turn. We limit our attention to models with a parameterized form for the spatial dependence, specified as a spatial autoregressive process. 19 Note that some recent results in the panel econometrics literature have also addressed estimation in models with general, unspecified cross-sectional correlation (see, e.g., Driscoll and Kraay, 1998, Coakley et al., 2002, Coakley et al., 2005, Pesaran, 2005, Kapetanios and Pesaran, 2005). 4.1 Maximum Likelihood Estimation The theoretical framework for maximum likelihood estimation of spatial models in the single cross-section setup is by now well developed (see, among others, Ord, 1975, Mardia and Marshall, 1984, Anselin, 1988a, Cressie, 1993, Anselin and Bera, 1998). While the regularity conditions are non-standard, and require a consideration of triangular arrays (Kelejian and Prucha, 1999), the results for error terms with a Gaussian distribution are fairly well established. In practice, estimation consists of applying a non-linear optimization to the log-likelihood function, which (in most circumstances) yields a consistent estimator from the numerical solution to the first order conditions. Asymptotic inference is based on asymptotic normality, with the asymptotic variance matrix derived from the information matrix. This requires the second order partial derivatives of the log-likelihood, for which analytical solutions exist in many

Outline. Overview of Issues. Spatial Regression. Luc Anselin

Outline. Overview of Issues. Spatial Regression. Luc Anselin Spatial Regression Luc Anselin University of Illinois, Urbana-Champaign http://www.spacestat.com Outline Overview of Issues Spatial Regression Specifications Space-Time Models Spatial Latent Variable Models