Comparing estimation methods for spatial econometrics

Recent Advances in Spatial Econometrics (in honor of James LeSage), ERSA 2012

Roger Bivand, NHH Norwegian School of Economics
Gianfranco Piras, Regional Research Institute at West Virginia University

Thursday, 23 August 2012

Outline: Background

Recent advances in spatial econometrics model fitting techniques have made it more desirable to be able to compare results.

Results should correspond between implementations using different applications.

A broad range of model fitting techniques are provided by the contributed R packages for spatial econometrics.

These model fitting techniques are associated with methods for estimating impacts and some tests, which will also be presented and compared.

Background

The use of spatial econometrics tools was widened by the ease with which methods and examples presented in Anselin (1988) could be reproduced using SpaceStat™, written in Gauss™.

It was rapidly complemented by the Spatial Econometrics toolbox for Matlab™, provided as source code together with extensive documentation.

A suite of commands for spatial data analysis for use with Stata™ was provided by Maurizio Pisati, and macros for Minitab™ and SAS™ were also made available.

The thrust of SpaceStat™ has largely been taken over by GeoDa (Anselin et al. 2006), and more recently by OpenGeoDa.

Today's software

There is now much more software available for spatial econometrics.

Stata™ with sppack and Matlab™ with the Spatial Econometrics Toolbox are mainstream programmes; the Matlab™ toolbox remains in the public domain, and has a community of contributors.

OpenGeoDa and PySAL are open source, with code hosted on Google, binary versions for common platforms, and a community of users.

R with spdep, sphet, McSpatial and other contributed packages is open source, and the packages are cross-platform; the packages also have a community of users and developers.

Why compare?

In the spirit of Rey (2009), this comparison will attempt to examine some features of the implementation of functions for fitting spatial econometrics models.

Firstly, it may be useful to show which kinds of functions for creating spatial weights, for diagnostics, and for model fitting are available.

Next, it is comforting when one can show that fitting the same model on the same data using different implementations gives the same results.

Finally, if the results are not the same, it is helpful to be able to show why they vary, possibly because of different design choices in implementation.

Data set: Ward and Gleditsch 2008

The data set used for comparison here is taken from a Sage volume, Spatial Regression Models by Ward and Gleditsch (2008), with political science data for 158 countries.

The model they explore is the relationship between democracy scores (POLITY IV indicators) and the logarithm of country GDP per capita in 2002.

They treat countries as neighbours with non-zero spatial weights if their borders are closer than 200 km from each other; data and weights are available for download from their site.

They were among the first to examine the effects of feedback in spatial regression models.

Spatial weights

Creating spatial weights is a necessary step in using areal data, perhaps just to check that there is no remaining spatial patterning in residuals.

The Matlab SE toolbox provides functions for creating nearest neighbour and triangulation contiguity weights, and for reading GAL files.

A number of functions are included in the R spdep package to create neighbours, and from them weights; GAL and GWT files may be read and written.

Weights may be constructed using functions in GeoDa, and in PySAL using Python programming; GAL and GWT files may be read and written.

The Stata spmat command provides for the creation of a number of different kinds of weights, and for file import and export.
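To make the step from neighbours to weights concrete, here is a minimal sketch in plain Python. It does not use any of the packages named above; the neighbour lists and the function name `row_standardised_weights` are hypothetical, chosen only for illustration.

```python
# Hypothetical neighbour lists for four regions arranged on a line: 0-1-2-3.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def row_standardised_weights(neighbours):
    """Return a dense N x N list-of-lists W in which each non-empty row sums to 1."""
    ids = sorted(neighbours)
    index = {region: i for i, region in enumerate(ids)}
    n = len(ids)
    W = [[0.0] * n for _ in range(n)]
    for region, nbrs in neighbours.items():
        if not nbrs:
            continue  # a region without neighbours keeps an all-zero row
        share = 1.0 / len(nbrs)
        for nb in nbrs:
            W[index[region]][index[nb]] = share
    return W

W = row_standardised_weights(neighbours)
for row in W:
    print(row)
```

Row standardisation is only one possible weighting scheme; it is used here because the model slides that follow distinguish the row-standardised case.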

The symptom: residual autocorrelation

A map of the residuals of a least squares regression may help us to see whether neighbouring observations appear to have residuals of similar value, indicating positive autocorrelation, or possibly dissimilar values, indicating negative autocorrelation.

[Figure: map of OLS residuals]
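A common numerical companion to such a map is Moran's I, which the slides do not show. The sketch below is a plain Python illustration with a hypothetical four-region weights matrix and made-up residuals; it is not the diagnostic code of any package discussed here.

```python
def morans_i(residuals, W):
    """Moran's I for a residual vector and a dense weights matrix W."""
    n = len(residuals)
    mean = sum(residuals) / n
    e = [r - mean for r in residuals]          # centre, in case the mean is not zero
    s0 = sum(sum(row) for row in W)            # sum of all weights
    cross = sum(W[i][j] * e[i] * e[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in e)

# Hypothetical example: four regions on a line, row-standardised weights.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)    # neighbours have similar residuals
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)  # neighbours have dissimilar residuals
print(clustered, alternating)
```

The clustered pattern gives a positive value (0.5) and the alternating pattern a negative one (-1.0), matching the two cases described for the map.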

Spatial lag and Durbin models

The spatial lag model is not dissimilar to traditional time series models, with the autocorrelation process in the dependent variable controlled by the exogenous matrix of spatial weights $W$ (Ord 1975):

$$y = \rho W y + X\beta + \varepsilon,$$

where $y$ is an $(N \times 1)$ vector of observations on a dependent variable taken at each of $N$ locations, $X$ is an $(N \times k)$ matrix of exogenous variables, $\beta$ is a $(k \times 1)$ vector of parameters, $\varepsilon$ is an $(N \times 1)$ vector of disturbances and $\rho$ is a scalar spatial lag parameter (but called $\lambda$ in Stata).

The spatial Durbin model is the lag model augmented by spatially lagged right-hand side variables:

$$y = \rho W y + X\beta + W X\theta + \varepsilon,$$

where $\theta$ is a $((k-1) \times 1)$ vector of parameters when $W$ is row-standardised (the spatial lag of the intercept is then the intercept itself), and a $(k \times 1)$ vector otherwise.
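The dimension of $\theta$ can be seen by building the lagged regressors by hand. The following NumPy sketch uses a hypothetical five-region ring and a made-up regressor; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical example: five regions on a ring, each with two neighbours.
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])     # one made-up regressor
X = np.column_stack([np.ones(n), x])          # k = 2 columns, intercept first

# Lagging the intercept under a row-standardised W just reproduces it ...
lagged_intercept = W @ X[:, 0]

# ... so only the k - 1 non-constant columns are lagged.
WX = W @ X[:, 1:]
durbin_design = np.column_stack([X, WX])      # columns: 1, x, Wx

print(lagged_intercept)
print(durbin_design)
```

Including the lagged intercept as a fourth column would add no information: the rank of the design matrix stays at three.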

Spatial lag model log-likelihood function

$$\ell(\beta, \rho, \sigma^2) = -\frac{N}{2}\ln 2\pi - \frac{N}{2}\ln\sigma^2 + \ln|I - \rho W| - \frac{1}{2\sigma^2}\left[ y^\top (I - \rho W)^\top \left(I - X(X^\top X)^{-1}X^\top\right)(I - \rho W)\, y \right]$$

and $\hat\beta = (X^\top X)^{-1}X^\top(I - \hat\rho W)y$, where $\hat\rho$ is the ML estimate.

Unlike the time series case, the logarithm of the determinant of the $(N \times N)$ asymmetric matrix $(I - \rho W)$ does not tend to zero with increasing sample size; it constrains the parameter values to their feasible range between the inverses of the smallest and largest eigenvalues of $W$, since for positive autocorrelation, as $\rho \to 1$, $\ln|I - \rho W| \to -\infty$.
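As a rough illustration of how such a likelihood can be evaluated, the NumPy sketch below concentrates $\beta$ and $\sigma^2$ out for a given $\rho$ and then searches over $\rho$. It is not the code of any package compared here: the data are simulated on a hypothetical ring of regions, the function name is made up, and a crude grid search stands in for a proper line search.

```python
import numpy as np

def lag_concentrated_loglik(rho, y, X, W):
    """Log-likelihood of the spatial lag model at rho, with beta and sigma^2 concentrated out."""
    n = y.shape[0]
    A = np.eye(n) - rho * W
    sign, logdet = np.linalg.slogdet(A)
    if sign <= 0:
        return -np.inf                           # rho outside the feasible range
    Ay = A @ y
    beta, *_ = np.linalg.lstsq(X, Ay, rcond=None)
    resid = Ay - X @ beta
    sigma2 = resid @ resid / n                   # ML estimate of sigma^2 given rho
    # With sigma2 = SSE / n, the sum of squares term reduces to -n / 2.
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2) + logdet - 0.5 * n

# Simulated stand-in data (not the Ward and Gleditsch data): a ring of n regions.
rng = np.random.default_rng(0)
n = 100
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - 0.5 * W, X @ np.array([1.0, 2.0]) + eps)

# Feasible range for rho from the extreme eigenvalues of W (symmetric in this example).
omega = np.linalg.eigvalsh(W)
lower, upper = 1.0 / omega.min(), 1.0 / omega.max()

# Crude grid search over the interior of the feasible range.
grid = np.linspace(lower + 0.01, upper - 0.01, 199)
values = np.array([lag_concentrated_loglik(r, y, X, W) for r in grid])
rho_hat = grid[values.argmax()]
print(rho_hat, values.max())
```

At $\rho = 0$ the log determinant vanishes and the function returns the ordinary least squares log-likelihood, which is a convenient check on any implementation.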

ML spatial lag model results

The SE toolbox uses a pre-computed grid of log determinant values, choosing the nearest rather than computing exactly at each call to the log likelihood function; this accounts for slightly different coefficient estimates compared to other implementations. There are two implementations of the spatial lag model in R, in spdep and McSpatial:

| | SE toolbox | R spdep | R McSpatial | Stata spreg | Stata spatreg | OpenGeoDa |
|---|---|---|---|---|---|---|
| (Intercept) | -6.2168 | -6.2034 | -6.2034 | -6.2034 | -6.2034 | -6.2034 |
| log GDP pc | 1.0014 | 0.9988 | 0.9988 | 0.9988 | 0.9988 | 0.9988 |
| ρ | 0.5610 | 0.5632 | 0.5632 | 0.5632 | 0.5632 | 0.5632 |
| SE ρ | 0.0760 | 0.0758 | 0.0703 | 0.0717 | 0.0717 | 0.0758 |
| σ² | 27.1795 | 27.1605 | 27.1605 | 27.1605 | 27.1605 | 27.1605 |
| LL | -436.3166 | -491.0994 | -436.3408 | -491.0994 | -491.0994 | -491.0994 |

ML spatial lag model differences I

There are two major discrepancies in the table of results. The first is that the log-likelihood values at the optimum reported by R McSpatial and the SE toolbox differ from the rest.

The reason appears to be that π in the log likelihood calculation is not multiplied by 2 in these two cases, but is in the remainder. If we convert the R McSpatial value of -436.3408 by adding $\frac{n}{2}\log(\pi)$ (line 65 in file McSpatial/R/sarml.R) and subtracting $\frac{n}{2}\log(2\pi)$, which with $n = 158$ amounts to subtracting $\frac{n}{2}\log 2 \approx 54.7586$, we get -491.0994, the value reported by the other implementations. Similarly, correcting the SE toolbox value of -436.3166 (line 453 file spatial/sar models/sar.m), we get -491.0752; the small remaining gap is consistent with the slightly different SE toolbox estimates. The same kind of difference appears in other reported SE toolbox log likelihood values.
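The conversion can be checked with a few lines of arithmetic. The sketch below uses only the reported log-likelihood values and n = 158 from the data description; the dictionary names are illustrative.

```python
import math

n = 158  # number of countries in the Ward and Gleditsch data set

# Replacing -(n/2) log(pi) by -(n/2) log(2 pi) lowers the log-likelihood by (n/2) log(2).
shift = 0.5 * n * math.log(2.0)

reported = {"R McSpatial": -436.3408, "SE toolbox": -436.3166}
converted = {name: ll - shift for name, ll in reported.items()}

for name, ll in converted.items():
    print(f"{name}: {ll:.4f}")
```

This prints -491.0994 for R McSpatial, which matches the value reported by the other implementations, and -491.0752 for the SE toolbox.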

ML spatial lag model differences II

The other discrepancy is in the coefficient standard errors, given here in full:

| | SE toolbox | R spdep | R McSpatial | Stata spreg | Stata spatreg | OpenGeoDa |
|---|---|---|---|---|---|---|
| SE (Intercept) | 2.0831 | 2.0823 | 2.0579 | 2.0597 | 2.0598 | 2.0823 |
| SE log GDP pc | 0.2784 | 0.2783 | 0.2729 | 0.2734 | 0.2734 | 0.2783 |
| SE ρ | 0.0760 | 0.0758 | 0.0703 | 0.0717 | 0.0717 | 0.0758 |

We see that R spdep and OpenGeoDa agree, with the SE toolbox function very close. The code in spatial/sar models/sar.m following line 202 indicates that for small N (the threshold in the code is 500), asymptotic calculations will be used, based on Anselin (1980, 1988). The same source is used in R spdep and OpenGeoDa (probably after line 861, Regression/smile2.cpp), but what about the others?

ML spatial lag model differences II (continued)

The standard errors reported by R McSpatial are taken from the Hessian returned by the optimization function nlm. The R spdep function lagsarlm can return a similar numerical Hessian, computed either as a finite difference Hessian by fdhess in nlme, or as the Hessian output by the numerical optimization function optim, using by default a quasi-Newton method due to Broyden, Fletcher, Goldfarb and Shanno (BFGS).

Stata spreg ml can also use a bfgs technique, but the default is a modified Newton-Raphson method nr; spatreg uses optimization method lf for easy fitting of maximum likelihood models.

Since the R and Stata BFGS standard errors agree, it is confirmed that Stata is reporting standard errors taken from the Hessian returned by the optimization function, rather than the analytical calculations, even for small n.

| | R nlm | R fdhess | R BFGS | Stata BFGS | Stata NR | Stata lf |
|---|---|---|---|---|---|---|
| SE (Intercept) | 2.0579 | 2.0605 | 2.0598 | 2.0598 | 2.0597 | 2.0598 |
| SE log GDP pc | 0.2729 | 0.2735 | 0.2734 | 0.2734 | 0.2734 | 0.2734 |
| SE ρ | 0.0703 | 0.0717 | 0.0717 | 0.0717 | 0.0717 | 0.0717 |
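The link between a numerical Hessian and reported standard errors can be sketched on a much simpler likelihood. The example below is an illustration only: it uses a Gaussian sample with made-up values rather than a spatial model, and `fd_hessian` is a hypothetical central-difference helper, not the fdhess routine from nlme.

```python
import numpy as np

def neg_loglik(theta, x):
    """Negative Gaussian log-likelihood in (mu, sigma^2)."""
    mu, sigma2 = theta
    return 0.5 * x.size * np.log(2 * np.pi * sigma2) + np.sum((x - mu) ** 2) / (2 * sigma2)

def fd_hessian(f, theta, h=1e-4):
    """Central finite-difference approximation to the Hessian of f at theta."""
    theta = np.asarray(theta, dtype=float)
    k = theta.size
    H = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

x = np.array([1.2, 0.7, 3.1, 2.4, 1.9, 2.8, 0.3, 1.5, 2.2, 1.1])  # made-up sample
n = x.size
mu_hat, s2_hat = x.mean(), x.var()           # ML estimates (variance with divisor n)

H = fd_hessian(lambda t: neg_loglik(t, x), [mu_hat, s2_hat])
se_numeric = np.sqrt(np.diag(np.linalg.inv(H)))

# Analytical standard errors from the information matrix at the ML estimates.
se_analytic = np.array([np.sqrt(s2_hat / n), s2_hat * np.sqrt(2.0 / n)])
print(se_numeric, se_analytic)
```

In this well-behaved case the two sets of standard errors agree closely; how close they are in general depends on the step size and on how the Hessian is approximated, which is one plausible reason why Hessians from different routines need not agree in the last digits.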

Maximum likelihood fitting differences

The differences identified in the spatial lag case follow through for the other model specifications examined here.

The Matlab SE toolbox uses a grid rather than a line search/optimization to fit the spatial coefficient(s), so they usually agree only for the first few digits.

The Matlab SE toolbox also reports a log likelihood value using π rather than 2π.

Stata (both spreg ml and spatreg) reports coefficient standard errors taken from the coefficient covariance matrix (Hessian) used in optimization, rather than the analytical values reported for small n by R spdep functions, Matlab SE toolbox functions, and OpenGeoDa.

ML spatial Durbin model results

Only R and the Matlab SE toolbox provide the spatial Durbin model directly; Stata and OpenGeoDa can fit it after creating the lagged right-hand side variable(s) by hand. Here, all applications agree except the SE toolbox, again because of the gridded log determinant value causing the numerical optimization to exit before finding the exact optimum. This is justified because the SE toolbox is intended to provide for Bayesian methods, and in that context the distribution of values is more important than point estimates:

| | SE toolbox | R spdep | Stata spreg | Stata spatreg | OpenGeoDa |
|---|---|---|---|---|---|
| (Intercept) | -5.4303 | -5.4143 | -5.4143 | -5.4143 | -5.4143 |
| log GDP pc | 1.1639 | 1.1641 | 1.1641 | 1.1641 | 1.1641 |
| lag log GDP pc | -0.2726 | -0.2755 | -0.2755 | -0.2755 | -0.2755 |
| ρ | 0.5720 | 0.5734 | 0.5734 | 0.5734 | 0.5734 |
| SE ρ | 0.0775 | 0.0773 | 0.0741 | 0.0741 | 0.0773 |
| σ² | 27.0420 | 27.0296 | 27.0296 | 27.0296 | 27.0296 |
| LL | -436.1960 | -490.9801 | -490.9801 | -490.9801 | -490.9801 |

Spatial error model

There are a number of alternative forms of spatial regression models; here we will also consider the spatial error model (also known as the simultaneous autoregressive (SAR) model). The model may be written as (Ord 1975):

$$y = X\beta + u, \qquad u = \lambda W u + \varepsilon,$$

where $y$ is an $(N \times 1)$ vector of observations on a dependent variable taken at each of $N$ locations, $X$ is an $(N \times k)$ matrix of exogenous variables, $\beta$ is a $(k \times 1)$ vector of parameters, $\varepsilon$ is an $(N \times 1)$ vector of disturbances and $\lambda$ is a scalar spatial error parameter (except by Kelejian and Prucha, who term it $\rho$), and $u$ is a spatially autocorrelated disturbance vector with constant variance and covariance terms specified by a fixed spatial weights matrix and a single coefficient $\lambda$:

$$u \sim N\left(0,\; \sigma^2 (I - \lambda W)^{-1} (I - \lambda W^\top)^{-1}\right)$$

Spatial error model log-likelihood function

The log-likelihood function for the spatial error model:

$$\ell(\beta, \lambda, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) + \ln|I - \lambda W| - \frac{1}{2\sigma^2}\left[ (y - X\beta)^\top (I - \lambda W)^\top (I - \lambda W)(y - X\beta) \right]$$

As we can see, the problem is one of balancing the log determinant term $\ln|I - \lambda W|$ against the sum of squares term. When $\lambda$ approaches the ends of its feasible range, the log determinant term may swamp the sum of squares term.
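How quickly the log determinant term grows in magnitude near the upper end of the feasible range can be seen numerically. The NumPy sketch below assumes a hypothetical row-standardised ring of ten regions (so the largest eigenvalue of $W$ is 1) and computes the log determinant from the eigenvalues of $W$, checking it against a direct calculation.

```python
import numpy as np

# Hypothetical row-standardised W: ten regions on a ring, two neighbours each.
n = 10
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

omega = np.linalg.eigvalsh(W)   # W is symmetric in this example, so eigvalsh applies

def log_det(lam):
    """ln|I - lam W| computed as the sum of ln(1 - lam * omega_i)."""
    return float(np.sum(np.log(1.0 - lam * omega)))

lams = (0.0, 0.5, 0.9, 0.99, 0.999)
values = [log_det(lam) for lam in lams]
for lam, value in zip(lams, values):
    direct = np.linalg.slogdet(np.eye(n) - lam * W)[1]
    print(f"lambda={lam:5.3f}  eigenvalue form={value:9.4f}  direct={direct:9.4f}")
```

The term is zero at $\lambda = 0$ and falls without bound as $\lambda$ approaches 1, which is what keeps the estimate inside the feasible range.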

ML spatial error model results

Once again, the estimates from R, Stata and OpenGeoDa, which use a line search with exact calculation of the log determinant, agree fully. There are minor differences in the standard errors between R and OpenGeoDa on the one hand and Stata on the other, because of the use of analytical standard errors for small n in OpenGeoDa and R, with Stata using a numerical Hessian. The SE toolbox estimates differ somewhat, with λ truncated to its gridded value:

| | SE toolbox | R spdep | Stata spreg | Stata spatreg | OpenGeoDa |
|---|---|---|---|---|---|
| (Intercept) | -7.4860 | -7.4865 | -7.4865 | -7.4865 | -7.4865 |
| log GDP pc | 1.3870 | 1.3871 | 1.3871 | 1.3871 | 1.3871 |
| λ | 0.5820 | 0.5819 | 0.5819 | 0.5819 | 0.5819 |
| SE λ | 0.0765 | 0.0765 | 0.0737 | 0.0737 | 0.0765 |
| σ² | 27.1388 | 27.1399 | 27.1399 | 27.1399 | 27.1399 |
| LL | -436.7681 | -491.5267 | -491.5267 | -491.5267 | -491.5267 |

ML general model standard error differences

The R spdep function sacsarlm can return a similar numerical Hessian, computed as a finite-difference Hessian by fdhess in nlme. Stata spreg ml can use a bfgs technique, but the default is a modified Newton-Raphson method, nr. Since the R finite-difference values and the Stata BFGS standard errors agree closely, it appears that Stata is reporting standard errors taken from the Hessian returned by the optimization function, rather than from the analytical calculations, even for small n.

                      R asymptotic   R fdhess   Stata BFGS   Stata NR
    SE (Intercept)          1.5972     1.6659       1.6660     1.6651
    SE log GDP pc           0.2246     0.2350       0.2350     0.2348
    SE ρ                    0.0771     0.0805       0.0805     0.0804
    SE λ                    0.1880     0.1939       0.1939     0.1937
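How standard errors follow from a finite-difference Hessian can be sketched on a much simpler likelihood than the general spatial model. The example below is an assumption-laden illustration, not the fdhess or Stata code: it uses a plain normal model with parameters (μ, σ²), hypothetical data, and a central-difference Hessian of the negative log-likelihood evaluated at the ML estimates.

```python
import math

def neg_loglik(theta, y):
    """Negative Gaussian log-likelihood with theta = (mu, sigma2)."""
    mu, sigma2 = theta
    n = len(y)
    sse = sum((v - mu) ** 2 for v in y)
    return 0.5 * n * math.log(2 * math.pi * sigma2) + sse / (2 * sigma2)

def fd_hessian(f, theta, h=1e-4):
    """Central finite-difference Hessian of f at theta."""
    k = len(theta)
    H = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            def shifted(di, dj):
                t = list(theta)
                t[i] += di * h
                t[j] += dj * h
                return f(t)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def std_errors_2x2(H):
    """Square roots of the diagonal of the inverse of a 2x2 Hessian."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return math.sqrt(H[1][1] / det), math.sqrt(H[0][0] / det)

y = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]               # hypothetical data
mu_hat = sum(y) / len(y)                          # ML estimate of mu
s2_hat = sum((v - mu_hat) ** 2 for v in y) / len(y)  # ML estimate of sigma2

H = fd_hessian(lambda t: neg_loglik(t, y), [mu_hat, s2_hat])
se_mu_numeric, se_s2_numeric = std_errors_2x2(H)
se_mu_analytic = math.sqrt(s2_hat / len(y))       # analytical counterpart
print(se_mu_numeric, se_mu_analytic)
```

In this toy case the numerical and analytical values coincide closely; in the general spatial model the two routes need not agree as well, which is the pattern seen in the table.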

The recent introduction of Stata™ and GeoDaSpace functions makes it helpful to compare them with SE toolbox and R functions.

Within R, some functions have been contributed to spdep by Luc Anselin, and modified by the authors, and others are in sphet, which now uses the function wrapper spreg.

The functions use different parts of the literature as bases for implementation, and the consequences of these choices will be made clear here.

Once again, we examine spatial lag, spatial error, and general spatial models.

GMM spatial lag models

Using two stage least squares with Wy instrumented by [WX, WWX], all the functions yield the same coefficient estimates. In the two R functions and the SE toolbox function, the error variance is calculated as $\hat{\sigma}^2 = e'e/(n - k)$, while in the other two implementations it is simply calculated as $e'e/n$ (standard errors in parentheses):

                   SE toolbox    R spdep   R sphet spreg      Stata   GeoDaSpace
    (Intercept)       -5.7466    -5.7466         -5.7466    -5.7466      -5.7466
                     (2.4576)   (2.4576)        (2.4576)   (2.4341)     (2.4341)
    log GDP pc         0.9097     0.9097          0.9097     0.9097       0.9097
                     (0.3783)   (0.3783)        (0.3783)   (0.3747)     (0.3747)
    ρ                  0.6370     0.6370          0.6370     0.6370       0.6370
                     (0.2283)   (0.2283)        (0.2283)   (0.2261)     (0.2261)
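Because the two variance conventions differ only by a scalar, the reported standard errors should differ by a single fixed factor. A short derivation, where V stands for the matrix factor common to both implementations (a symbol introduced here for illustration only):

```latex
\widehat{\mathrm{Var}}_{n-k}(\hat{\delta}) = \frac{e'e}{n-k}\,V,
\qquad
\widehat{\mathrm{Var}}_{n}(\hat{\delta}) = \frac{e'e}{n}\,V
\quad\Longrightarrow\quad
\frac{\mathrm{SE}_{n}}{\mathrm{SE}_{n-k}} = \sqrt{\frac{n-k}{n}}
```

The table is consistent with this: the ratios 2.4341/2.4576, 0.3747/0.3783 and 0.2261/0.2283 are all approximately 0.990.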

The heteroskedastic error case I

White standard errors may be calculated directly in most of the functions, in which the asymptotic VC matrix can be estimated consistently by the sandwich form

$$ (\hat{Z}'\hat{Z})^{-1} (\hat{Z}'\hat{\Sigma}\hat{Z}) (\hat{Z}'\hat{Z})^{-1} $$

where $\hat{\Sigma}$ is a diagonal matrix whose elements are the $e_i^2$; the results are the same in all cases:

                   R spdep   R sphet    Stata   GeoDaSpace
    (Intercept)     2.4417    2.4417   2.4417       2.4417
    log GDP pc      0.3829    0.3829   0.3829       0.3829
    ρ               0.2239    0.2239   0.2239       0.2239
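The sandwich form can be written out in a few lines. This is a minimal sketch using only the Python standard library, not the code of any of the packages above; the helper names and the toy inputs are hypothetical, and Z_hat is simply taken as given rather than built from instruments.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def white_vcov(Z_hat, resid):
    """(Z'Z)^-1 (Z' Sigma Z) (Z'Z)^-1 with Sigma = diag(e_i^2)."""
    Zt = transpose(Z_hat)
    bread = inverse(matmul(Zt, Z_hat))
    # diag(e^2) Z: scale each row of Z by its squared residual
    weighted = [[e * e * z for z in row] for row, e in zip(Z_hat, resid)]
    meat = matmul(Zt, weighted)
    return matmul(matmul(bread, meat), bread)

# Hypothetical inputs: with equal squared residuals the sandwich
# collapses to e^2 * (Z'Z)^-1, the homoskedastic form.
Z_hat = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
resid = [2.0, -2.0, 2.0, -2.0]
V = white_vcov(Z_hat, resid)
print(V)
```

White standard errors are the square roots of the diagonal of V.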

The heteroskedastic error case II

GeoDaSpace and sphet also implement the Kelejian and Prucha (2007) HAC estimator of the variance-covariance matrix. Here we compare standard error estimates using a Triangular kernel with a variable bandwidth of the six nearest neighbours. The kernel functions available in R are the Epanechnikov, Triangular, Bisquare, Parzen, Tukey-Hanning and Quadratic Spectral. Those available in GeoDaSpace are the Uniform, Triangular, Epanechnikov, Quartic and Gaussian. GeoDaSpace only allows adaptive kernels.

                   R sphet spreg   GeoDaSpace
    (Intercept)           2.7512       2.7512
    log GDP pc            0.4445       0.4445
    ρ                     0.2314       0.2314

GMM spatial error models

In the GMM spatial error model, the results depend on the first stage residuals, the implementation of the moment conditions, and the tuning of the optimiser finding the spatial parameter λ, as well as on the definitions used for finding the standard error of λ. We see that there are three cases, the first for the SE toolbox, spdep, and GeoDaSpace-2, using the Kelejian and Prucha (1999) moment conditions (OLS first stage), and the standard error of λ from Prucha (2004) (standard errors in parentheses):

                   SE toolbox    R spdep    R sphet      Stata   GeoDaSpace-1   GeoDaSpace-2
    (Intercept)       -7.5798    -7.5798    -7.5858    -8.8787        -7.5664        -7.5798
                     (3.0403)   (3.0403)   (3.0319)   (3.2603)       (3.0788)       (3.0403)
    log GDP pc         1.3993     1.3993     1.4001     1.5688         1.3976         1.3993
                     (0.3767)   (0.3767)   (0.3758)   (0.4072)       (0.3796)       (0.3767)
    λ                  0.5621     0.5621     0.5604     0.5586         0.5850         0.5621
                     (0.1820)   (0.1820)   (0.0661)   (0.0668)       (0.0660)

GMM spatial error models

In the same table, the next case is for the sphet and GeoDaSpace-1 implementations, which follow Drukker et al. and use a TSLS first stage. The results are close, and the differences seem to come from numerical optimization. The final case is Stata, which appears to use a restricted instrument set; work is continuing to establish why this is chosen.

Heteroskedasticity

Results from sphet and GeoDaSpace are quite similar, and the very minor differences (in the estimated value of the spatial parameter) seem to be due to differences in the optimizers. This confirms our intuition on the error model with homoskedastic errors. The standard error results can be made closer by making the same implementation choices, which differ slightly with regard to simplifications. Stata differs as before:

                   R sphet-spreg      Stata   GeoDaSpace
    (Intercept)          -7.5664    -8.8695      -7.5664
                        (2.9878)   (3.1150)     (2.9888)
    log GDP pc            1.3976     1.5676       1.3976
                        (0.3613)   (0.3664)     (0.3614)
    ρ                     0.5731     0.5703       0.5735
                        (0.0743)   (0.0754)     (0.0742)

GMM implementations of the general (SARAR) model

There are various implementations of the GMM general model. Some of them are based on the Kelejian and Prucha (1999) moment conditions (SE toolbox, gstsls in spdep and GeoDaSpace-2); the others are based on the Drukker, Egger and Prucha moment conditions (sphet, Stata and GeoDaSpace-1). There are big differences in λ (ML: ρ ≈ 0.78, λ ≈ -0.45):

                   SE toolbox    R spdep    R sphet      Stata   GeoDaSpace-1   GeoDaSpace-2
    (Intercept)       -5.8763    -5.1817    -5.1780    -5.1780        -5.1889        -5.1817
                     (2.6631)   (2.3185)   (2.2101)   (2.2101)       (2.0800)       (2.2963)
    log GDP pc         0.7986     0.8199     0.8193     0.8193         0.8210         0.8199
                     (0.3556)   (0.3600)   (0.3314)   (0.3314)       (0.3255)       (0.3566)
    ρ                  0.6938     0.6779     0.6781     0.6781         0.6774         0.6779
                     (0.2033)   (0.2072)   (0.1814)   (0.1814)       (0.1763)       (0.2053)
    λ                 -0.1596    -0.1596    -0.4095    -0.4095        -0.4748        -0.1596

Standard errors of λ are reported for only four of the six implementations; in source order they are (0.0363), (0.2502), (0.2502) and (0.2224).

The general model under heteroskedasticity

Three implementations are available: one from sphet, one from Stata, and one from GeoDaSpace. The results are the same for all of the implementations. Again, sphet and GeoDaSpace have the option of performing step 1.c from Arraiz et al. Even in this case, the results match and are therefore not reported.

                   R sphet-spreg      Stata   GeoDaSpace
    (Intercept)          -5.1889    -5.1889      -5.1889
                        (2.2119)   (2.2119)     (2.2119)
    log GDP pc            0.8210     0.8210       0.8210
                        (0.3527)   (0.3527)     (0.3527)
    ρ                     0.6774     0.6774       0.6774
                        (0.1845)   (0.1845)     (0.1845)
    λ                    -0.4497    -0.4497      -0.4497
                        (0.2562)   (0.2562)     (0.2562)

Interpreting spatial lag, Durbin and general models

It has emerged over time, however, that the spatial dependence in the parameter ρ feeds back.

This feedback comes from the fact that the reduced form model is

$$ y = (I - \rho W)^{-1} X\beta + (I - \rho W)^{-1} \varepsilon $$

In the spatial lag model, $\partial y_i / \partial x_{jr} = ((I - \rho W)^{-1} I \beta_r)_{ij}$, where I is the N × N identity matrix, and $(I - \rho W)^{-1}$ is known to be dense.

In the spatial Durbin model, $\partial y_i / \partial x_{jr} = ((I - \rho W)^{-1} (I \beta_r + W \theta_r))_{ij}$.

Implementing impact measures

The awkward $S_r(W) = ((I - \rho W)^{-1} I \beta_r)$ matrix term needed to calculate impact measures for the lag model, and $S_r(W) = ((I - \rho W)^{-1} (I \beta_r + W \theta_r))$ for the spatial Durbin model, may be approximated using traces of powers of the spatial weights matrix as well as calculated analytically.

The average direct impacts are represented by the sum of the diagonal elements of the matrix divided by N for each exogenous variable.

The average total impacts are the sum of all matrix elements divided by N for each exogenous variable.

The average indirect impacts are the differences between the total and direct impact vectors.
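The three definitions can be sketched for the lag model with a truncated power series, $(I - \rho W)^{-1} \approx \sum_{k=0}^{K} \rho^k W^k$, which is the idea behind the approximations based on powers of W. The code below is an illustration under stated assumptions only: a hypothetical row-standardised ring lattice, made-up values of ρ and β, and dense list-of-lists matrices rather than the sparse trace methods used in practice.

```python
def ring_weights(n):
    """Hypothetical row-standardised ring lattice (n >= 3)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][(i - 1) % n] = 0.5
        W[i][(i + 1) % n] = 0.5
    return W

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lag_impacts(W, rho, beta, order=200):
    """Average direct, indirect and total impacts for the spatial lag model.

    S_r(W) = (I - rho*W)^-1 * beta is approximated by the power series
    sum_{k=0}^{order} rho^k W^k * beta, which converges for |rho| < 1
    when W is row-standardised.
    """
    n = len(W)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for _ in range(order + 1):
        S = [[s + p for s, p in zip(srow, prow)] for srow, prow in zip(S, power)]
        # next term: rho^(k+1) W^(k+1)
        power = [[rho * v for v in row] for row in matmul(power, W)]
    direct = beta * sum(S[i][i] for i in range(n)) / n
    total = beta * sum(sum(row) for row in S) / n
    return direct, total - direct, total

direct, indirect, total = lag_impacts(ring_weights(4), rho=0.5, beta=1.0)
print(direct, indirect, total)
```

With a row-standardised W the total impact reduces to β/(1 - ρ), which gives a convenient check on any implementation.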

Total impacts

Total impacts are defined as the sum of the elements of $S_r(W)$ divided by N. Here we only have a single variable, so the table of total impacts is simple. The SE toolbox implementation uses Monte Carlo simulation to provide a measure of the significance of the impacts, and by default reports the mean of the simulated values, which may differ a little from the computed value.

                      SE toolbox    SE toolbox   R spdep    Stata
                     (simulated)    (computed)
    Spatial lag           2.3384        2.2809    2.2863   2.2863
    Spatial Durbin        2.0998        2.0823    2.0827   2.0827
    General               3.2103        2.8132    2.7803   2.7816

These are only headline values, and do not do justice to the possibilities for interpretation offered by this analytical advance.

Finding total impacts in Stata

In Stata, we use the difference in predictions from the reduced form model when incrementing a chosen right-hand side variable, or even a single observation on that variable:

    . spreg ml democracy x, id(id) dlmat(w)
    . predict y0
    . generate x_orig = x
    . quietly replace x = x + 1
    . predict y1
    . generate deltay = y1-y0
    . mean deltay
    . quietly replace x = x_orig

We can do the same in R, using EXP to increment the variable in the scope of the objects:

    > EXP <- exp(0)
    > form <- formula(democracy ~ log((gdp_2002/population) * EXP))
    > sldv.lag <- lagsarlm(form, data = sldv, listw = lw)
    > p0 <- predict(sldv.lag, newdata = sldv, listw = lw)
    > EXP <- exp(1)
    > p1 <- predict(sldv.lag, newdata = sldv, listw = lw)
    > d <- p1 - p0
    > mean(d)
    [1] 2.286267

Spatial lag model impacts

We use Monte Carlo methods to infer from the impacts, by drawing from the fitted model using the variance-covariance matrix of fitted coefficients:

    > set.seed(120823)
    > imp.lag <- impacts(sldv.lag, tr = trmat, R = 1999)
    > summary(imp.lag, short = TRUE, zstats = TRUE)
    Impact measures (lag, trace):
                                       Direct Indirect    Total
    log((gdp_2002/population) * EXP) 1.088804 1.197463 2.286267
    ========================================================
    Simulation results (asymptotic variance matrix):
    ========================================================
    Simulated z-values:
                                      Direct Indirect    Total
    log((gdp_2002/population) * EXP) 3.70437 2.665298 3.429894

    Simulated p-values:
                                     Direct     Indirect Total
    log((gdp_2002/population) * EXP) 0.00021192 0.007692 0.00060382

The same values follow from the dense inverse:

    > # dense (I - rho W)^{-1}; coef(sldv.lag)[1] is rho
    > invIrW <- invIrW(lw, rho = coef(sldv.lag)[1])
    > n <- nrow(sldv)
    > # S_r(W) for the single covariate; coef(sldv.lag)[3] is its slope
    > Sr <- invIrW %*% (diag(n) * coef(sldv.lag)[3])
    > c(direct = sum(diag(Sr))/n, indirect = sum(Sr)/n - sum(diag(Sr))/n,
    +   total = sum(Sr)/n)
      direct indirect    total
    1.088804 1.197463 2.286267

Spatial Durbin model impacts

The spatial Durbin impacts are less significant than those of the spatial lag model:

    > EXP <- exp(0)
    > sldv.sd <- lagsarlm(form, data = sldv, listw = lw, type = "mixed")
    > imp.sd <- impacts(sldv.sd, tr = trmat, R = 1999)
    > summary(imp.sd, short = TRUE, zstats = TRUE)
    Impact measures (mixed, trace):
                                       Direct Indirect    Total
    log((gdp_2002/population) * EXP) 1.228741  0.85399 2.082731
    ========================================================
    Simulation results (asymptotic variance matrix):
    ========================================================
    Simulated z-values:
                                      Direct Indirect    Total
    log((gdp_2002/population) * EXP) 3.01615 1.033488 2.618732

    Simulated p-values:
                                     Direct    Indirect Total
    log((gdp_2002/population) * EXP) 0.0025601 0.30138  0.0088257

    > invIrW <- invIrW(lw, rho = coef(sldv.sd)[1])
    > # W: dense weights matrix, assumed to have been created earlier
    > Sr <- invIrW %*% ((diag(n) * coef(sldv.sd)[3]) + (W * coef(sldv.sd)[4]))
    > c(direct = sum(diag(Sr))/n, indirect = sum(Sr)/n - sum(diag(Sr))/n,
    +   total = sum(Sr)/n)
       direct  indirect     total
    1.2287411 0.8539901 2.0827312

General model impacts

The general model impacts are similar to those of the spatial lag model:

    > EXP <- exp(0)
    > sldv.sac <- sacsarlm(form, data = sldv, listw = lw)
    > imp.sac <- impacts(sldv.sac, tr = trmat, R = 1999)
    > summary(imp.sac, short = TRUE, zstats = TRUE)
    Impact measures (sac, trace):
                                      Direct Indirect    Total
    log((gdp_2002/population) * EXP) 0.78292 1.997377 2.780297
    ========================================================
    Simulation results (asymptotic variance matrix):
    ========================================================
    Simulated z-values:
                                       Direct Indirect    Total
    log((gdp_2002/population) * EXP) 3.088464 2.049188 2.476218

    Simulated p-values:
                                     Direct    Indirect Total
    log((gdp_2002/population) * EXP) 0.0020119 0.040444 0.013278

    > invIrW <- invIrW(lw, rho = coef(sldv.sac)[1])
    > Sr <- invIrW %*% ((diag(n) * coef(sldv.sac)[4]))
    > c(direct = sum(diag(Sr))/n, indirect = sum(Sr)/n - sum(diag(Sr))/n,
    +   total = sum(Sr)/n)
       direct  indirect     total
    0.7829471 1.9986018 2.7815488

Impact measures

At present, there is no provision for measures of impact in OpenGeoDa or PySAL.

The total impact (emanating effect, equilibrium effect) can be calculated in Stata, but not broken down into direct and indirect.

Only SE toolbox and R provide full support with Monte Carlo simulation for inference.

They draw samples from the fitted model using the coefficient values and covariance matrix, and present summaries of the sample values.
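The simulation step can be sketched as follows. This is not the SE toolbox or spdep code: the coefficient values and covariance entries are hypothetical, only ρ and a single β are drawn, and the total impact is computed as β/(1 - ρ), which holds for the lag model when W is row-standardised.

```python
import math
import random
import statistics

def simulate_total_impacts(rho, beta, var_rho, var_beta, cov,
                           draws=2000, seed=1):
    """Draw (rho, beta) from a bivariate normal and return total impacts.

    All inputs are illustrative assumptions standing in for the fitted
    coefficients and their covariance matrix.
    """
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix
    l11 = math.sqrt(var_rho)
    l21 = cov / l11
    l22 = math.sqrt(var_beta - l21 ** 2)
    totals = []
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        r = rho + l11 * z1
        b = beta + l21 * z1 + l22 * z2
        if abs(r) < 1.0:  # keep only draws inside the feasible range
            totals.append(b / (1.0 - r))
    return totals

totals = simulate_total_impacts(rho=0.5, beta=1.0,
                                var_rho=1e-4, var_beta=1e-4, cov=0.0)
mean_total = statistics.mean(totals)
sd_total = statistics.stdev(totals)
# Simulated z-value: mean of the draws divided by their standard deviation
print(mean_total, sd_total, mean_total / sd_total)
```

Because the impact is a nonlinear function of ρ, the mean of the simulated values need not equal the value computed at the point estimates, which is why simulated and computed totals can differ a little.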

Distributions of general model impact measures

Once one has the samples, it is possible to show how the distributions shift. In this case, the direct impacts lie further from zero than the coefficient, followed by indirect impacts even further from zero, with the total impacts shifted substantially beyond the shape of the distribution of the coefficient.

[Figure: distributions of the simulated Direct, Indirect and Total impacts and of the Coefficient; x-axis: Log GDP pc]

Conclusions

In this case, impact measures were not needed, because the LR and Hausman tests pointed to the spatial error specification; a recent paper by Pace and Zhu (2012) points to an enhanced error Durbin model as being of promise (it didn't help here).

We have not considered Bayesian estimation methods, which will be covered in a separate study, where the SE toolbox is the only alternative so far, but an R GSoC project was carried out in 2012.

The arrival of Stata's sppack opens up the alternatives a lot, but its spatial weights are dense or banded, limiting maximum likelihood estimation to smaller data sets.

Estimating models with maximum likelihood for large data sets is possible in the SE toolbox, OpenGeoDa and R using sparse matrix methods; GM models are not as limited by the size of data sets, given care in avoiding handling n × n matrices.