splm: econometric analysis of spatial panel data

Giovanni Millo (1), Gianfranco Piras (2)
(1) Research Dept., Generali S.p.A. and DiSES, Univ. of Trieste
(2) REAL, UIUC
useR! Conference, Rennes, July 8th 2009

Introduction
splm = Spatial Panel Linear Models
- Georeferenced data and Spatial Econometrics
- Individual heterogeneity, unobserved effects and panel models

Outline
1 Motivation: Growing theory on Spatial Panel Data; R ideal environment; splm
2 General ML framework: Cross Sectional Models; Computational approach to the ML problem
3 ML estimation: RE Models; FE Models
4 GM estimation: Error Components model; Spatial simultaneous equation model
5 LM tests
6 Demonstration: ML procedure; GM procedure; Tests
7 Conclusions

MOTIVATION

Growing theory on Spatial Panel Data: Motivation
Reasons for developing an R library for spatial panel data:
- Interest in spatial econometrics has grown steadily over the last decade.
- Spatial panel data are probably one of the most promising but at the same time underdeveloped topics in spatial econometrics.
- Recently, a number of theoretical papers have appeared developing estimation procedures for different models. Among these: Anselin et al. (2008); Kapoor et al. (2007); Elhorst (2003); Lee and Yu (2008); Yu et al. (2008); Korniotis (2007); Mutl and Pfaffermayr (2008).

Motivation
- Theoretical papers have also developed test procedures to discriminate among different specifications: Baltagi et al. (2007); Baltagi et al. (2003).
- Although libraries exist in R, Matlab, Stata and Python to estimate cross-sectional models, to the best of our knowledge no software is available to estimate spatial panel data models other than Elhorst's Matlab code and one (particular) Stata example on Prucha's web site.

Why is R the ideal environment?
(Faith) (Reason) ... more specifically, because of the availability of three libraries: spdep, plm and Matrix
- spdep: extremely well-known library for spatial analysis that contains spatial data infrastructure and estimation procedures for spatial cross-sectional models
- plm: recently developed panel data library performing the estimation of most non-spatial models, tests and data management
- Matrix: library containing methods for sparse matrices (and much more). It turns out to be very relevant because of the particular properties of a spatial weights matrix.

The splm package
Taking advantage of these existing libraries, we are working on the development of a library for spatial panel data models.
- From spdep: contiguity matrices and spatial data management infrastructure; specialized object types like nb and listw
- From plm: how to handle the double (space and time) dimension of the data; model object characteristics and printing methods
- From Matrix: efficient specialized methods for sparse matrices
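To make the role of sparsity concrete, here is a minimal sketch in plain Python (not R, and not the spdep or Matrix API) of how a neighbour list can be turned into row-standardised weights and used to compute a spatial lag. The four-region layout and the function names `row_standardise` and `spatial_lag` are assumptions made for illustration only.

```python
# Hypothetical neighbour list for 4 regions arranged on a line: 0-1-2-3.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def row_standardise(nb):
    """Return sparse weights {i: {j: w_ij}} with each non-empty row summing to 1."""
    weights = {}
    for i, js in nb.items():
        # Regions without neighbours get an empty row instead of a division by zero.
        weights[i] = {j: 1.0 / len(js) for j in js} if js else {}
    return weights

def spatial_lag(weights, y):
    """Compute W y while touching only the stored (non-zero) entries of W."""
    return [sum(w * y[j] for j, w in weights[i].items()) for i in range(len(y))]

W = row_standardise(neighbours)
y = [10.0, 20.0, 30.0, 40.0]
Wy = spatial_lag(W, y)  # each entry is the average of the neighbours' values
```

Only the non-zero weights are stored, so the cost of the spatial lag grows with the number of neighbour links rather than with N squared, which is the property that makes sparse matrix methods attractive here.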

General ML Framework

Cross Sectional Models: General ML framework
Anselin (1988) outlines the general procedure for a model with a spatial lag and spatially autocorrelated (and possibly non-spherical) innovations:
$$y = \lambda W_1 y + X\beta + u, \qquad u = \rho W_2 u + \eta \qquad (1)$$
with $\eta \sim N(0, \Omega)$ and $\Omega \neq \sigma^2 I$. Special cases:
- $\lambda = 0$: spatial error model
- $\rho = 0$: spatial lag model

General ML framework
Introducing the simplifying notation
$$A = I - \lambda W_1, \qquad B = I - \rho W_2,$$
model (1) can be rewritten as
$$Ay = X\beta + u, \qquad Bu = \eta. \qquad (2)$$
If there exists $\Omega^{-1/2}$ such that $e = \Omega^{-1/2}\eta$ with $e \sim N(0, \sigma_e^2 I)$, and $B$ is invertible, then $u = B^{-1}\Omega^{1/2} e$.

General ML framework
Model (1) can be written as
$$Ay = X\beta + B^{-1}\Omega^{1/2} e$$
or, equivalently,
$$\Omega^{-1/2} B (Ay - X\beta) = e$$
with $e$ a "well-behaved" error term with zero mean and constant variance. Unfortunately, the error term $e$ is unobservable; to make the estimator operational, the likelihood function therefore needs to be expressed in terms of $y$. This in turn requires calculating the Jacobian of the transformation,
$$J = \det(\partial e / \partial y) = |\Omega^{-1/2} B A| = |\Omega^{-1/2}|\,|B|\,|A|.$$

General ML framework
These determinants enter the log-likelihood directly, which becomes
$$\log L = -\frac{N}{2}\ln \pi - \frac{1}{2}\ln|\Omega| + \ln|B| + \ln|A| - \frac{1}{2} e'e$$
Note that:
- The difference compared to the usual likelihood of the classic linear model is given by the Jacobian terms.
- The likelihood is a function of $\beta$, $\lambda$, $\rho$ and the parameters in $\Omega$.
- The overall error covariance $B'\Omega B$ can in turn be scaled and written as $B'\Omega B = \sigma_e^2 \Sigma$.
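As a worked step (a sketch that uses only the definitions in (1) and (2), and assumes $\Omega$ is symmetric positive definite), substituting $e = \Omega^{-1/2} B (Ay - X\beta)$ expresses the quadratic form in the log-likelihood in terms of the observables:

```latex
e'e = (Ay - X\beta)'\, B'\,\Omega^{-1} B\,(Ay - X\beta),
\qquad A = I - \lambda W_1, \quad B = I - \rho W_2 .
```

For given $\lambda$, $\rho$ and $\Omega$ this is a GLS criterion in $\beta$, minimized by $\hat\beta = (X'B'\Omega^{-1}BX)^{-1} X'B'\Omega^{-1}B\,Ay$, which is why $\beta$ can be concentrated out of the likelihood.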

Computational approach to the ML problem
A general description of the ML estimation problem consists of:
- defining an efficient and parsimonious transformation of the model at hand to (unobservable) spherical errors, and the corresponding log-likelihood for the observables;
- translating the inverse and the determinant of the scaled covariance of the errors, $\Sigma^{-1}$ and $|\Sigma|$, and the determinants $|A|$ and $|B|$ into computationally manageable objects;
- implementing the two-step optimization iterating between maximization of the concentrated log-likelihood and the closed-form GLS solution.
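The two-step iteration can be sketched for the simplest special case, a cross-sectional spatial error model with one regressor and $\Omega = \sigma^2 I$. This is an illustration in plain Python under stated assumptions, not the splm implementation: the toy data, the grid search over $\rho$, the dense determinant routine and all function names are hypothetical choices made to keep the example self-contained.

```python
import math

def mat_vec(W, v):
    """Dense matrix-vector product W v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def log_abs_det(M):
    """log|det M| by Gaussian elimination with partial pivoting (small N only)."""
    A = [row[:] for row in M]
    n, total = len(A), 0.0
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        A[k], A[p] = A[p], A[k]
        total += math.log(abs(A[k][k]))
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
    return total

def spatial_filter(rho, W, v):
    """Apply B = I - rho W to a vector."""
    return [a - rho * b for a, b in zip(v, mat_vec(W, v))]

def loglik(rho, resid, W):
    """Log-likelihood concentrated with respect to sigma^2, constants dropped."""
    n = len(resid)
    e = spatial_filter(rho, W, resid)
    sigma2 = sum(a * a for a in e) / n
    B = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)] for i in range(n)]
    return -0.5 * n * math.log(sigma2) + log_abs_det(B)  # Jacobian term ln|B|

def gls_beta(rho, W, y, x):
    """Closed-form GLS step: OLS of the filtered y on the filtered x."""
    ys, xs = spatial_filter(rho, W, y), spatial_filter(rho, W, x)
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)

def fit(W, y, x, max_iter=50):
    """Alternate between a grid search for rho and the GLS solution for beta."""
    grid = [i / 100 for i in range(-90, 91)]
    rho, beta = 0.0, gls_beta(0.0, W, y, x)  # start from OLS
    for _ in range(max_iter):
        resid = [a - beta * b for a, b in zip(y, x)]
        new_rho = max(grid, key=lambda r: loglik(r, resid, W))
        beta = gls_beta(new_rho, W, y, x)
        if new_rho == rho:  # rho no longer moves on the grid
            break
        rho = new_rho
    return rho, beta

# Hypothetical toy data: 6 regions on a line, row-standardised weights.
W = [[0, 1, 0, 0, 0, 0], [.5, 0, .5, 0, 0, 0], [0, .5, 0, .5, 0, 0],
     [0, 0, .5, 0, .5, 0], [0, 0, 0, .5, 0, .5], [0, 0, 0, 0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.7]
rho_hat, beta_hat = fit(W, y, x)
```

A grid search stands in for a proper one-dimensional optimizer, and the dense determinant would be replaced by sparse or eigenvalue-based methods for realistic N; both substitutions are made only to keep the sketch short.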

RE Models

General Random Effects Panel Model
Let us concentrate on the error model, considering a panel model within a more specific, yet quite general setting that allows for the following features of the composite error term (i.e., parameters describing $\Sigma$):
- random effects ($\phi = \sigma_\mu^2 / \sigma_\varepsilon^2$)
- spatial correlation in the idiosyncratic error term ($\lambda$)
- serial correlation in the idiosyncratic error term ($\rho$)
$$y = X\beta + u$$
$$u = (\iota_T \otimes \mu) + \varepsilon$$
$$\varepsilon = \lambda (I_T \otimes W_2)\varepsilon + \nu$$
$$\nu_t = \rho \nu_{t-1} + e_t$$
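The Kronecker products in this model encode the stacking order: all N units for period 1, then all N units for period 2, and so on. A minimal plain-Python sketch, with hypothetical values for T, N, the weights matrix and the individual effects, shows what $\iota_T \otimes \mu$ and $I_T \otimes W_2$ look like:

```python
def kron(A, B):
    """Kronecker product of two dense matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

T, N = 2, 3                                   # hypothetical panel dimensions
W2 = [[0.0, 1.0, 0.0],                        # hypothetical row-standardised weights
      [0.5, 0.0, 0.5],
      [0.0, 1.0, 0.0]]
mu = [[0.3], [-0.1], [0.7]]                   # hypothetical individual effects (N x 1)

iota_T = [[1.0] for _ in range(T)]            # T x 1 vector of ones
I_T = [[1.0 if s == t else 0.0 for t in range(T)] for s in range(T)]

stacked_mu = [row[0] for row in kron(iota_T, mu)]  # mu repeated once per period
big_W = kron(I_T, W2)                              # block-diagonal NT x NT matrix
```

Because `big_W` is block diagonal with one copy of the weights matrix per period, spatial dependence operates within each period only, while the repeated `stacked_mu` carries the time-invariant individual effect across periods.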

Particular cases of the general model
Error features can be combined, giving rise to the following possibilities:

                    λ, ρ ≠ 0    λ ≠ 0     ρ ≠ 0     λ = ρ = 0
    σ_μ² ≠ 0        SEMSRRE     SEMRE     SSRRE     RE
    σ_μ² = 0        SEMSR       SEM       SSR       OLS

where SEMRE is the usual random effects spatial error panel and SEM the standard spatial error model (here, pooling with $W = I_T \otimes w$).
The likelihoods involved give rise to computational issues that limit the size of the data sets to which the models can be applied.
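The naming scheme of the eight cases is regular enough to be written as a small lookup. The sketch below is plain Python and purely illustrative; the function name and its boolean arguments are hypothetical and do not correspond to the splm interface.

```python
def error_model_name(spatial, serial, random_effects):
    """Map the active error features to the acronym used on this slide.

    spatial        -> lambda != 0 (spatial correlation in the idiosyncratic error)
    serial         -> rho != 0    (serial correlation in the idiosyncratic error)
    random_effects -> sigma_mu^2 != 0
    """
    base = {(True, True): "SEMSR", (True, False): "SEM",
            (False, True): "SSR", (False, False): ""}[(spatial, serial)]
    if random_effects:
        return base + "RE"      # e.g. SEM + RE -> SEMRE, "" + RE -> RE
    return base or "OLS"        # no feature at all: pooled OLS
```

For example, `error_model_name(True, False, True)` gives the usual random effects spatial error panel, SEMRE.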

FE Models

FE Models
A fixed effects spatial lag model can be written in stacked form as:
$$y = \lambda (I_T \otimes W_N) y + (\iota_T \otimes \alpha) + X\beta + \varepsilon \qquad (3)$$
where $\lambda$ is a spatial autoregressive coefficient. On the other hand, a fixed effects spatial error model can be written as:
$$y = (\iota_T \otimes \alpha) + X\beta + u, \qquad u = \rho (I_T \otimes W_N) u + \varepsilon \qquad (4)$$
where $\rho$ is a spatial autocorrelation coefficient.

FE Spatial Panel estimation
The estimation procedure for both models is based on maximum likelihood and can be summarized as follows.
1 Apply OLS to the demeaned model.
2 Plug the OLS residuals into the expression for the concentrated likelihood to obtain an initial estimate of $\rho$.
3 Use the initial estimate of $\rho$ to perform a spatial FGLS, obtaining estimates of the regression coefficients, the error variance and a new set of estimated GLS residuals.
4 Iterate, alternately maximizing the concentrated likelihood for $\rho$ and computing the GLS estimator of $\beta$, until convergence.
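Step 1 relies on the within (demeaning) transformation, which removes the fixed effects $\alpha$ by subtracting each unit's time mean. A minimal plain-Python sketch, assuming a balanced panel stacked period by period as in (3) and (4); the function name and the toy values are hypothetical:

```python
def demean_by_unit(values, n_units):
    """Within transformation for a balanced panel stacked period by period."""
    n_periods = len(values) // n_units
    # Time mean of each unit i over the periods t = 0, ..., T-1.
    means = [sum(values[t * n_units + i] for t in range(n_periods)) / n_periods
             for i in range(n_units)]
    # Position k in the stacked vector belongs to unit k mod N.
    return [v - means[k % n_units] for k, v in enumerate(values)]

# Hypothetical data: N = 3 units observed over T = 2 periods.
y = [1.0, 2.0, 3.0,   # period 1
     3.0, 6.0, 9.0]   # period 2
y_within = demean_by_unit(y, n_units=3)
```

Since the demeaning operator acts on the time dimension and $I_T \otimes W_N$ acts on the cross-section, the two commute: the spatial lag of the demeaned variable equals the demeaned spatial lag.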

GM Approach

Error Components model: GM approach
Consider again the panel model written stacking the observations by cross-section:
$$y_N = X_N \beta + u_N \qquad (5)$$
and
$$u_N = \rho (I_T \otimes W_N) u_N + \varepsilon_N \qquad (6)$$
Kapoor et al. (2007) consider an error component structure for the whole innovation vector in (6), that is:
$$\varepsilon_N = (e_T \otimes I_N)\mu_N + \nu_N \qquad (7)$$
i.e., the individual error components are spatially correlated as well (different economic interpretation).

GM RE Estimation Procedure
The estimation procedure can be summarized by the following steps:
1 Estimate the regression equation by OLS to obtain an estimate of the vector of residuals to employ in the GM procedure.
2 Estimate $\rho$ and the variance components $\sigma_1^2$ and $\sigma_\nu^2$ using one of the three sets of GM estimators proposed.
3 Use the estimators of $\rho$, $\sigma_\nu^2$ and $\sigma_1^2$ to define a corresponding feasible GLS estimator of $\beta$.
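To give a feel for the two variance components in step 2, the sketch below computes them from innovations that are assumed to be already spatially filtered, i.e. with $\rho$ treated as known. It assumes the definition $\sigma_1^2 = \sigma_\nu^2 + T\sigma_\mu^2$ and the within and between quadratic forms $\varepsilon' Q_0 \varepsilon / (N(T-1))$ and $\varepsilon' Q_1 \varepsilon / N$; neither is stated on the slide, so both should be checked against Kapoor et al. (2007). The actual GM procedure estimates $\rho$ jointly with these components, which this plain-Python illustration does not attempt.

```python
def variance_components(eps, n_units):
    """Within/between variance estimates for a balanced panel.

    eps: innovations stacked period by period (all units for t=1, then t=2, ...),
         assumed to be already filtered for spatial correlation.
    Returns (sigma2_nu, sigma2_1), with sigma2_1 = sigma2_nu + T * sigma2_mu
    under the assumed error component structure.
    """
    n_periods = len(eps) // n_units
    means = [sum(eps[t * n_units + i] for t in range(n_periods)) / n_periods
             for i in range(n_units)]
    # eps' Q0 eps: squared deviations from each unit's time mean.
    within_ss = sum((e - means[k % n_units]) ** 2 for k, e in enumerate(eps))
    # eps' Q1 eps: each unit mean appears once per period.
    between_ss = n_periods * sum(m * m for m in means)
    return within_ss / (n_units * (n_periods - 1)), between_ss / n_units

# Hypothetical innovations: N = 2 units, T = 2 periods, pure individual effects.
sigma2_nu, sigma2_1 = variance_components([1.0, -1.0, 1.0, -1.0], n_units=2)
```

In this toy case the innovations do not vary over time within a unit, so the idiosyncratic variance is zero and all variation is attributed to the individual component.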

Spatial simultaneous equation model
The system of spatially interrelated cross-sectional equations corresponding to N cross-sectional units can be represented as (Kelejian and Prucha, 2004):
$$Y = YB + XC + Q\lambda + U \qquad (8)$$
with $Y = (y_1, \ldots, y_m)$, $X = (x_1, \ldots, x_m)$, $U = (u_1, \ldots, u_m)$, $Q = (Q_1, \ldots, Q_m)$ and $q_j = W y_j$, $j = 1, \ldots, m$. Here $y_j$ is an $N \times 1$ vector of cross-sectional observations on the dependent variable in the $j$th equation, $x_l$ is an $N \times 1$ vector of observations on the $l$th exogenous variable, and $u_j$ is the $N \times 1$ disturbance vector in the $j$th equation. Finally, $W$ is an $N \times N$ weights matrix of known constants, and $B$, $C$ and $\lambda$ are corresponding matrices of parameters.

Spatial simultaneous equation model
The model also allows for spatial autocorrelation in the disturbances. In particular, it is assumed that the disturbances are generated by the following spatial autoregressive process:
$$U = MR + E \qquad (9)$$
where $E = (\varepsilon_1, \ldots, \varepsilon_m)$, $R = \mathrm{diag}_{j=1}^{m}(\rho_j)$, $M = (m_1, \ldots, m_m)$ and $m_j = W u_j$. Here $\varepsilon_j$ and $\rho_j$ denote, respectively, the vector of innovations and the spatial autoregressive parameter in the $j$th equation.

TESTS

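LM tests: Baltagi et al. 2003; 2007
Consider the general random effects model with serial and cross-sectional correlation in the error term:
$$y = X\beta + u$$
$$u = (\iota_T \otimes \mu) + \varepsilon$$
$$\varepsilon = \lambda (I_T \otimes W_2)\varepsilon + \nu$$
$$\nu_t = \rho \nu_{t-1} + e_t$$
Baltagi et al. (2003) derive tests for the following hypotheses (for a model where $\rho$ is zero):
- $\lambda, \mu$ jointly (needs OLS estimates of $\hat{u}$)
- $\lambda \mid \mu$, spatial correlation allowing for random effects (needs RE estimates of $\hat{u}$)
- $\mu \mid \lambda$, random effects allowing for spatial correlation (needs SEM estimates of $\hat{u}$)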

Baltagi et al. 2007
Baltagi et al. (2007) derive tests for the following hypotheses:
- $\lambda \mid \rho, \mu$ (needs SSRRE estimates of $\hat{u}$)
- $\rho \mid \lambda, \mu$ (needs SEMRE estimates of $\hat{u}$)
- $\mu \mid \lambda, \rho$ (needs SEMSR estimates of $\hat{u}$)
So a viable and computationally parsimonious strategy for the error model is to test in the three directions by means of conditional LM tests and see whether a simpler model than the general one can be estimated.

DEMONSTRATIONS

Munnell (1990) data set and spatial weights matrix
Munnell's (1990) data on public capital productivity:
- 48 continental US states
- 17 years (1970-1986), but we consider the subset 1970-74
- binary proximity (contiguity) matrix
Is the coefficient on pcap in this production function significantly positive?

ML procedure: ML estimator of the Munnell (1990) model (Random Effects)

ML procedure: Munnell's model (Random Effects and Spatial Dependence)

ML procedure: Munnell's model (Complete Model)

GM procedure: Munnell's model (Error components GM estimator)

Tests: Baltagi et al. 2003

CONCLUSIONS

Recap and Next Steps
The functionalities of the package make it straightforward to conduct specification search and model estimation according to current practice. Most future developments will take place under the hood.
Directions for future work:
- Complete Monte Carlo checks
- Improve interface consistency
- Include different estimators and additional models that are already available in the literature
- Improve the efficiency of ML estimation
- ...complete packaging and put the package on CRAN