W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS


An Liu, University of Groningen
Henk Folmer, University of Groningen and Wageningen University
Han Oud, Radboud University Nijmegen

Objective
To evaluate the performance of the W-based and latent-variable spatial modeling approaches by means of Monte Carlo simulations in a more comprehensive setting

Motivated by previous simulation studies:
Two data generation schemes
- on the basis of the structure of a classical W-based spatial autoregressive model using a contiguity or inverse distance weight matrix
- also based on the W-based model structure, but with spatial dependence incorporated as spillover from hotspots weighted by inverse distance
Evaluation of estimators in terms of bias and RMSE
- each approach outperformed the other in some cases, but neither was clearly dominant in both settings

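W-based spatial autoregressive model
$$y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)$$
Maximum likelihood estimation:
$$L = -\frac{N}{2}\ln\pi - \frac{N}{2}\ln\sigma^2 + \ln|A| - \frac{1}{2\sigma^2}(Ay - X\beta)'(Ay - X\beta), \qquad A = I - \rho W$$
Jacobian correction:
$$\ln|I - \rho W| = \ln\prod_i (1 - \rho w_i) = \sum_i \ln(1 - \rho w_i)$$
$w_i$: the eigenvalues of $W$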
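The eigenvalue form of the Jacobian term can be checked numerically. The sketch below is only an illustration, not the authors' code: the function names `log_jacobian` and `sar_loglik` are hypothetical, and the log-likelihood is written up to its additive constant, which does not affect the maximizer.

```python
import numpy as np

def log_jacobian(rho, eigvals):
    """ln|I - rho*W| computed as sum_i ln(1 - rho*w_i) from the eigenvalues of W."""
    # Eigenvalues may come back as a complex array; the imaginary parts cancel
    # in the sum, so only the real part is kept.
    return float(np.sum(np.log(1.0 - rho * eigvals)).real)

def sar_loglik(rho, beta, sigma2, y, X, W, eigvals):
    """Spatial lag log-likelihood, omitting the additive constant."""
    n = len(y)
    resid = y - rho * (W @ y) - X @ beta          # Ay - X*beta with A = I - rho*W
    return (-0.5 * n * np.log(sigma2)
            + log_jacobian(rho, eigvals)
            - resid @ resid / (2.0 * sigma2))
```

Computing the eigenvalues of W once and reusing them for every candidate value of ρ avoids a fresh determinant evaluation at each step of the optimizer.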

Latent variables
- refer to phenomena that are supposed to exist but cannot be directly observed
- can be measured by means of observables
- example: socio-economic status, measured by income, education level and employment status

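Structural equation modeling (SEM)
- structural model: relationships between the latent variables
$$\eta = B\eta + \zeta \quad \text{with} \quad \operatorname{cov}(\zeta) = \Psi$$
- measurement model: relationships between the latent variables and their observable indicators
$$y = \Lambda\eta + \varepsilon \quad \text{with} \quad \operatorname{cov}(\varepsilon) = \Theta$$
Maximum likelihood estimation:
$$l(\theta \mid Y) = -\frac{N}{2}\ln|\Sigma| - \frac{N}{2}\operatorname{tr}(S\Sigma^{-1}) - \frac{pN}{2}\ln 2\pi$$
$\Sigma$: theoretical covariance matrix, $\Sigma = \Lambda(I - B)^{-1}\Psi(I - B')^{-1}\Lambda' + \Theta$
$S$: observed covariance matrix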
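As a small illustration of how the theoretical covariance matrix and the SEM log-likelihood fit together, the following sketch evaluates both for given parameter matrices. The function names `implied_cov` and `sem_loglik` are hypothetical, and the code is not taken from any SEM package.

```python
import numpy as np

def implied_cov(Lam, B, Psi, Theta):
    """Sigma = Lam (I - B)^-1 Psi (I - B')^-1 Lam' + Theta."""
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    # (I - B')^-1 is the transpose of (I - B)^-1
    return Lam @ inv @ Psi @ inv.T @ Lam.T + Theta

def sem_loglik(S, Sigma, N):
    """-N/2 ln|Sigma| - N/2 tr(S Sigma^-1) - pN/2 ln(2 pi)."""
    p = S.shape[0]
    _, logdet = np.linalg.slogdet(Sigma)
    trace_term = np.trace(S @ np.linalg.inv(Sigma))
    return -0.5 * N * (logdet + trace_term + p * np.log(2.0 * np.pi))
```

For a fixed observed matrix S, this function is largest when the model-implied Σ reproduces S, which is why the fit of an SEM is judged by how close Σ comes to S.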

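SEM representation of the observed spatial lag model
Structural equation: $y = \rho\eta + \gamma' x + \zeta$
Measurement equation: $y = \Lambda\eta + \varepsilon$
Jacobian correction:
$$l(\theta \mid y) = \ln|\tilde{A}| - \frac{N}{2}\ln|\Sigma| - \frac{N}{2}\operatorname{tr}(S\Sigma^{-1}) - \frac{pN}{2}\ln 2\pi$$
$$\tilde{A} = I - \frac{\rho}{m\lambda_1}S_1 - \frac{\rho}{m\lambda_2}S_2 - \cdots - \frac{\rho}{m\lambda_m}S_m$$
$S_m$: the selection function for the $m$th indicator
$\lambda_m$: the factor loading of the $m$th indicator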

Simulation study design
Rationale: two types of spatial dependence are considered together because they commonly occur together in practice, e.g. economic activity in a region is usually influenced by both its neighbors and an economic centre (hotspot)
By introducing two spatial lag parameters, $\rho_1$ (for the hotspot) and $\rho_2$ (for neighbors), we get
- a broader and more inclusive definition of spatial dependence in sample generation from a practical perspective
- a more comprehensive comparison of the performance of the classical W-based model and the SEM approach

Simulation study design (cont'd)
Map: regular lattice structures of dimensions 7 × 7, 10 × 10 and 15 × 15 (N = 49, 100, 225)
Samples generated based on the structure of the standard spatial lag model:
$$y = \rho_1 W_1 y + \rho_2 W_2 y + x\beta + \varepsilon$$
$W_1$ is the inverse distance matrix with elements equal to $1/d_{ij}$ for cell $i$ and hotspot $j$ and zero elsewhere;
$W_2$ is a first-order contiguity or inverse distance matrix.
The hotspot needs to be fixed before sample generation: it is chosen according to the values of $x$ (largest value)
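The two weight matrices can be sketched on a regular lattice as follows. Several details here are assumptions for illustration only, since the slides do not state them: rook (shared-edge) contiguity, row-standardization of W2, Euclidean distance between cell indices, and the helper names themselves.

```python
import numpy as np

def lattice_coords(side):
    """Row and column index of every cell on a side x side lattice."""
    return np.array([(r, c) for r in range(side) for c in range(side)], dtype=float)

def contiguity_matrix(coords):
    """First-order rook contiguity: cells at distance exactly 1 are neighbors."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = np.isclose(d, 1.0).astype(float)
    return W / W.sum(axis=1, keepdims=True)   # row-standardized (assumption)

def hotspot_matrix(coords, hotspot):
    """W1: entry (i, hotspot) is 1/d_ij, every other entry is zero."""
    n = len(coords)
    d = np.linalg.norm(coords - coords[hotspot], axis=1)
    W1 = np.zeros((n, n))
    away = d > 0                               # the hotspot gets no weight on itself
    W1[away, hotspot] = 1.0 / d[away]
    return W1
```

For the smallest map, `lattice_coords(7)` gives the 49 cells, and `hotspot_matrix(coords, j)` has a single non-zero column at the hotspot index `j`.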

Sample generation procedure
1. Generate the exogenous variable $x$ by drawing from $U(0, 1)$;
2. Fix $\beta = 1$ for all simulation runs;
3. $\rho_1$ and $\rho_2$ take values 0, 0.1, 0.3, 0.5, 0.7 and 0.9 consecutively under the constraint $|I - \rho_1 W_1 - \rho_2 W_2| > 0$;
   - ML estimation requires $|I - \rho W| > 0$
4. Generate values for the error term $\varepsilon$ by randomly drawing from $N(0, 2)$;
5. Choose the hotspot according to the values of $x$ in step 1 and compute $y$ as:
$$y = (I - \rho_1 W_1 - \rho_2 W_2)^{-1}(x\beta + \varepsilon)$$
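One draw from this data generating process could look like the sketch below. It is an illustration under stated assumptions rather than the authors' implementation: the function name `generate_sample` is hypothetical, the 2 in N(0, 2) is read as a variance, distances are Euclidean, and the caller supplies the cell coordinates and W2.

```python
import numpy as np

def generate_sample(coords, W2, rho1, rho2, rng, beta=1.0, error_var=2.0):
    """Draw x and y once; returns x, y and the hotspot matrix W1."""
    n = len(coords)
    x = rng.uniform(0.0, 1.0, size=n)                 # step 1
    hotspot = int(np.argmax(x))                       # step 5: cell with the largest x
    d = np.linalg.norm(coords - coords[hotspot], axis=1)
    W1 = np.zeros((n, n))
    W1[d > 0, hotspot] = 1.0 / d[d > 0]               # inverse distance to the hotspot
    A = np.eye(n) - rho1 * W1 - rho2 * W2
    sign, _ = np.linalg.slogdet(A)
    if sign <= 0:                                     # step 3: determinant constraint
        raise ValueError("|I - rho1*W1 - rho2*W2| > 0 is violated")
    eps = rng.normal(0.0, np.sqrt(error_var), size=n) # step 4 (variance is an assumption)
    y = np.linalg.solve(A, x * beta + eps)            # step 5, without forming the inverse
    return x, y, W1
```

Solving the linear system gives the same y as multiplying by the inverse matrix, and it is usually the numerically safer choice.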

Estimation and analysis
- Repeat the estimation procedure for the W-based models (I. true model; II. first-order contiguity or inverse distance, depending on the $W_2$ used in sample generation) and for SEM (first three nearest neighbors and spillover from the hotspot as indicators)
- Number of replications: 5
- Compute bias and RMSE of the estimators for $\beta$, the only comparable regression coefficient
- Compare the two approaches over different value combinations of the spatial lag parameters, specifications of the weight matrices and sample sizes
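Bias and RMSE over replications follow their usual definitions. A minimal sketch is given below, with `bias_and_rmse` a hypothetical helper name and the estimates in the usage line invented purely for illustration.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Bias = mean estimate minus truth; RMSE = root of the mean squared error."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    rmse = np.sqrt(np.mean((est - true_value) ** 2))
    return bias, rmse

# Illustrative values only: three made-up estimates of beta, true value 1.
bias, rmse = bias_and_rmse([0.9, 1.1, 1.3], 1.0)
```

RMSE combines bias and sampling variability, so an estimator can have a small bias and still a large RMSE if its estimates vary widely across replications.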

Simulation results in graphs

[Figure: Bias in absolute value (N = 49, W2 = contiguity). abs(bias) for TRUE, CONT and SEM across combinations of rho2 (contiguity) and rho1 (hotspot).]

[Figure: RMSE (N = 49, W2 = contiguity). RMSE for TRUE, CONT and SEM across combinations of rho2 (contiguity) and rho1 (hotspot).]

[Figure: Bias in absolute value (N = 49, W2 = inverse distance). abs(bias) for TRUE, INVD and SEM across combinations of rho2 (inverse distance) and rho1 (hotspot).]

[Figure: RMSE (N = 49, W2 = inverse distance). RMSE for TRUE, INVD and SEM across combinations of rho2 (inverse distance) and rho1 (hotspot).]

[Figure: Mean of bias grouped by rho1 (N = 49, 100, 225)]

[Figure: Mean of RMSE grouped by rho1 (N = 49, 100, 225)]

Conclusions
- SEM frequently has smaller bias and RMSE than the misspecified W-based models
- SEM increasingly outperforms the W-based models as the spatial lag parameter for spillover from the hotspot increases
- Both approaches perform better, and their differences in RMSE get smaller, with larger sample sizes
- The chance that SEM leads grows with sample size
- SEM is also more stable than the misspecified W-based models in terms of variation in bias and RMSE

Discussion
SEM: not all model search options were exploited
- indicators were fixed, whereas the optimal choice and number could be tested and identified for each sample
- using more than one latent variable, i.e. one for spillover from hotspots and one for neighbors, would bring SEM closer to the correctly specified model

Thank you for your attention!