Spatial Autocorrelation


Spatial Autocorrelation Luc Anselin http://spatial.uchicago.edu

- spatial randomness
- positive and negative spatial autocorrelation
- spatial autocorrelation statistics
- spatial weights

Spatial Randomness

The Null Hypothesis
- spatial randomness is the absence of any pattern
- spatial randomness is not very interesting in itself
- if rejected, then there is evidence of spatial structure

Interpreting Spatial Randomness
- the observed spatial pattern of values is as likely as any other spatial pattern
- the value at one location does not depend on values at other (neighboring) locations

Operationalizing Spatial Randomness
- under spatial randomness, the location of values may be altered without affecting the information content of the data
- random permutation or reshuffling of values
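The reshuffling idea can be sketched in a few lines of Python. This is only an illustration: the function name `reshuffle` and the six values are hypothetical and do not come from the slides.

```python
import random

def reshuffle(values, seed=None):
    """Return a random permutation of the observed values across locations."""
    rng = random.Random(seed)
    shuffled = list(values)  # copy, so the observed map is left untouched
    rng.shuffle(shuffled)
    return shuffled

# Hypothetical values observed at six locations (index = location).
observed = [3.1, 2.9, 3.0, 7.8, 8.1, 7.9]
permuted = reshuffle(observed, seed=42)

# The set of values is unchanged; only their assignment to locations differs.
print(sorted(permuted) == sorted(observed))
```

Under spatial randomness, each such reshuffled map is as likely as the observed one.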

[Figure: true map vs. randomly reshuffled map]

Tobler's First Law of Geography
- everything depends on everything else, but closer things more so
- structures spatial dependence
- importance of distance decay

Positive and Negative Spatial Autocorrelation

Rejecting the Null Hypothesis
- rejecting spatial randomness (s.r.)
- like values in neighboring locations occur more frequently than under s.r. = positive spatial autocorrelation
- dissimilar values (e.g., high vs. low) in neighboring locations occur more frequently than under s.r. = negative spatial autocorrelation

Positive Spatial Autocorrelation
- impression of clustering
- clumps of like values
- like values can be either high (hot spots) or low (cold spots)
- difficult to rely on human perception

[Figure: positive spatial autocorrelation vs. random pattern]

Negative Spatial Autocorrelation
- checkerboard pattern
- hard to distinguish from spatial randomness

[Figure: negative spatial autocorrelation vs. random pattern]

Spatial Autocorrelation Statistics

What is a Test Statistic?
- a statistic is any value, calculated from the data, that summarizes characteristics of a distribution
- test statistic: calculated from the data and compared to a reference distribution
- how likely is the value if it had occurred under the null hypothesis (spatial randomness)?
- when unlikely (low p-value), the null hypothesis is rejected

compare value of test statistic to its distribution under the null hypothesis of spatial randomness

Spatial Autocorrelation Statistic
- captures both attribute similarity and locational similarity
- how to construct an index from the data that captures both attribute similarity and locational similarity (i.e., neighbors are alike)

Attribute Similarity
- summary of the similarity (or dissimilarity) of observations for a variable at different locations
- variable $y$
- locations $i$, $j$
- how to construct $f(y_i, y_j)$

Similarity Measure
- cross product: $y_i \cdot y_j$
- under randomness, the cross product is not systematically large or small
- when large values are systematically together, the product will be larger, and vice versa

Dissimilarity Measure
- squared difference: $(y_i - y_j)^2$
- absolute difference: $|y_i - y_j|$
- under randomness, the difference measure will not be systematically large or small
- when small values or large values are systematically together, difference measures will be smaller
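As a small illustration, the three attribute-similarity functions can be written out directly. The function names and the example pairs are hypothetical; only the formulas come from the slides.

```python
def cross_product(yi, yj):
    """Similarity: large when both values are large."""
    return yi * yj

def squared_difference(yi, yj):
    """Dissimilarity: small when the two values are alike."""
    return (yi - yj) ** 2

def absolute_difference(yi, yj):
    """Dissimilarity: small when the two values are alike."""
    return abs(yi - yj)

# Two hypothetical neighbor pairs: like values (8, 9) and unlike values (1, 9).
like_pair = (8, 9)
unlike_pair = (1, 9)

for name, f in [("cross product", cross_product),
                ("squared difference", squared_difference),
                ("absolute difference", absolute_difference)]:
    print(name, f(*like_pair), f(*unlike_pair))
```

For the pair of like high values the cross product is larger and both difference measures are smaller than for the unlike pair, which is the behavior the slides describe.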

Locational Similarity
- formalizing the notion of neighbors = spatial weights ($w_{ij}$)
- when are two spatial units $i$ and $j$ a priori likely to interact?
- not necessarily a geographical notion; can be based on social network concepts or general distance concepts (distance in multivariate space)

General Spatial Autocorrelation Statistic
- general form: sum over all observations of an attribute similarity measure with the neighbors
- $f(x_i, x_j)$ is the attribute similarity between $i$ and $j$ for $x$
- $w_{ij}$ is a spatial weight between $i$ and $j$
- statistic = $\sum_{ij} f(x_i, x_j) \cdot w_{ij}$
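A minimal sketch of the general form, combined with the permutation idea used to build a reference distribution under spatial randomness. Everything specific here is an assumption for illustration: the six-location chain, the centered values, the choice of the cross product as f, the one-sided comparison, and the (count + 1) / (permutations + 1) pseudo p-value convention.

```python
import random

def autocorr_statistic(x, w, f):
    """General form: sum over all pairs i, j of f(x_i, x_j) * w_ij."""
    n = len(x)
    return sum(f(x[i], x[j]) * w[i][j] for i in range(n) for j in range(n))

def pseudo_p_value(x, w, f, permutations=999, seed=0):
    """One-sided pseudo p-value from random reshuffling of the values."""
    rng = random.Random(seed)
    observed = autocorr_statistic(x, w, f)
    values = list(x)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(values)
        if autocorr_statistic(values, w, f) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)

# Hypothetical example: six locations on a line, each neighboring the next.
n = 6
w = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
x = [-3, -2, -1, 1, 2, 3]  # centered values, low on one side and high on the other

observed, p = pseudo_p_value(x, w, lambda a, b: a * b)
print(observed, p)
```

A small pseudo p-value indicates that the observed arrangement yields a larger statistic than most reshuffled arrangements, which would be read as evidence of positive spatial autocorrelation.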

Spatial Weights

Basic Concepts

Why Spatial Weights
- formal expression of locational similarity
- spatial autocorrelation is about interaction
- $n \times (n - 1)/2$ pairwise interactions, but only $n$ observations in a cross-section
- insufficient information to extract the pattern of interaction from a cross-section
- example: North Carolina has 100 counties, hence 4,950 (roughly 5,000) pairwise interactions but only 100 observations
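The arithmetic behind the North Carolina example can be checked directly; the helper name `pairwise_interactions` is just for illustration.

```python
def pairwise_interactions(n):
    """Number of distinct unordered pairs among n locations: n(n-1)/2."""
    return n * (n - 1) // 2

counties = 100  # North Carolina, as in the example above
pairs = pairwise_interactions(counties)
print(pairs, "pairwise interactions vs.", counties, "observations")
```

With 100 counties this gives 4,950 pairs, which the slide rounds to 5,000.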

Solution
- impose structure
- limit the number of parameters to be estimated
- incidental parameter problem = number of parameters grows with sample size
- for spatial interaction, the number of parameters grows with $n^2$

Spatial Weights
- exclude some interactions
- constrain the number of neighbors, e.g., only those locations that share a border
- single parameter = spatial autocorrelation coefficient
- strength of interaction = combined effect of coefficient and weights
  - small coefficient with large weights
  - large coefficient with small weights

[Figure: six polygons; neighbors share a common border]

[Figure: neighbor structure as a graph (nodes and links)]

Spatial Weights Matrix Definition
- $N$ by $N$ non-negative matrix $W$ with elements $w_{ij}$
- $w_{ij}$ non-zero for neighbors
- $w_{ij} = 0$ when $i$ and $j$ are not neighbors
- $w_{ii} = 0$, no self-similarity

[Figure: spatial weights matrix $W$ with elements $w_{ij}$]

Geography-Based Spatial Weights

Binary Contiguity Weights
- contiguity = common border
- if $i$ and $j$ share a border, then $w_{ij} = 1$
- if $i$ and $j$ are not neighbors, then $w_{ij} = 0$
- weights are 0 or 1, hence binary

[Figure: binary contiguity weights matrix for the six-region example]
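A minimal sketch of how a binary contiguity matrix can be built from a neighbor list. The neighbor list below is hypothetical: the actual borders of the six-region example are in a figure that is not reproduced here.

```python
def contiguity_matrix(neighbors):
    """Build a symmetric 0/1 weights matrix from a dict of neighbor lists."""
    ids = sorted(neighbors)
    index = {region: k for k, region in enumerate(ids)}
    n = len(ids)
    w = [[0] * n for _ in range(n)]
    for i, nbrs in neighbors.items():
        for j in nbrs:
            if i != j:  # w_ii = 0: a region is not its own neighbor
                w[index[i]][index[j]] = 1
                w[index[j]][index[i]] = 1  # sharing a border is symmetric
    return ids, w

# Hypothetical borders among six regions (not the ones in the figure).
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4, 5],
             4: [2, 3, 5, 6], 5: [3, 4, 6], 6: [4, 5]}

ids, w = contiguity_matrix(neighbors)
for row in w:
    print(row)
```

Each row sum is the number of neighbors of the corresponding region.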

[Figure: contiguity on a regular 3 x 3 grid with cells numbered 1 to 9, different definitions]
- rook contiguity (edges only): 2, 4, 6, 8 are neighbors of 5
- bishop contiguity (corners only): 1, 3, 7, 9 are neighbors of 5
- queen contiguity (edges and corners): 5 has eight neighbors
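The three grid definitions can be sketched as follows, assuming the cells are numbered row by row (1 2 3 / 4 5 6 / 7 8 9). The function name `grid_neighbors` is hypothetical.

```python
def grid_neighbors(rows, cols, cell, rule="queen"):
    """Neighbors of a cell on a regular grid, cells numbered 1..rows*cols row by row."""
    r, c = divmod(cell - 1, cols)
    edges = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # shared edge
    corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # shared corner only
    offsets = {"rook": edges, "bishop": corners, "queen": edges + corners}[rule]
    result = []
    for dr, dc in offsets:
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:  # stay inside the grid
            result.append(rr * cols + cc + 1)
    return sorted(result)

for rule in ("rook", "bishop", "queen"):
    print(rule, grid_neighbors(3, 3, 5, rule))
```

For the center cell 5 this reproduces the neighbor sets listed above; for a corner cell such as 1, rook contiguity gives only cells 2 and 4.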

[Figure: rook and queen contiguity, Solano county, CA]

Distance-Based Weights
- distance between points
- distance between polygon centroids or central points
- in general, can be any function of distance that implies distance decay, e.g., inverse distance
- in practice, mostly based on a notion of contiguity defined by distance

Distance-Band Weights
- $w_{ij}$ nonzero for $d_{ij} < d$, i.e., less than a critical distance $d$
- potential problem: isolates = no neighbors
- make sure the critical distance is max-min, i.e., the largest of the nearest-neighbor distances across all observations
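A minimal sketch of distance-band weights with the max-min critical distance, using hypothetical point coordinates. One assumption to flag: the slide states a strict inequality, whereas this sketch uses `<=` so that a band set exactly at the max-min distance still gives the most remote point its nearest neighbor.

```python
import math

def max_min_distance(points):
    """Largest of the nearest-neighbor distances across all points."""
    nearest = []
    for i, p in enumerate(points):
        nearest.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return max(nearest)

def distance_band(points, d):
    """Binary weights: w_ij = 1 when i != j and the distance is within d (assumed <=)."""
    n = len(points)
    return [[1 if i != j and math.dist(points[i], points[j]) <= d else 0
             for j in range(n)] for i in range(n)]

# Hypothetical coordinates: three clustered points and one remote point.
points = [(0, 0), (1, 0), (0, 1), (5, 0)]

critical = max_min_distance(points)
print(critical)
print([sum(row) for row in distance_band(points, critical)])  # no isolates
print([sum(row) for row in distance_band(points, 2)])         # remote point is an isolate
```

With a band of 2 the remote point has no neighbors; raising the band to the max-min distance gives every point at least one neighbor, at the cost of more neighbors for the clustered points.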

[Figure: distance-band weights, Solano county, CA, distance-band neighbors, d = 90 mi]

[Figure: distance-band weights, Elko county, NV, distance-band neighbors: d = 80 mi, no neighbors; d = 90 mi, one neighbor]

k-Nearest Neighbor Weights
- k nearest observations, irrespective of distance
- fixes the isolates problem for distance bands
- same number of neighbors for all observations
- in practice, potential problem with ties
- needs a tie-breaking rule (random, include all)
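A minimal sketch of k-nearest neighbor weights on hypothetical coordinates. The tie-breaking rule used here (prefer the lower index) is an assumption chosen to keep the example deterministic; the slides mention random selection or including all tied observations instead.

```python
import math

def knn_weights(points, k):
    """Binary weights: w_ij = 1 when j is among the k nearest points to i."""
    n = len(points)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort the other points by distance; ties are broken by lower index (assumption).
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (math.dist(points[i], points[j]), j))
        for j in others[:k]:
            w[i][j] = 1
    return w

# Hypothetical coordinates: three clustered points and one remote point.
points = [(0, 0), (1, 0), (0, 1), (5, 0)]
w = knn_weights(points, k=2)

print([sum(row) for row in w])  # every point has exactly k neighbors
print(w[3][0], w[0][3])         # the relation need not be symmetric
```

The remote point counts the first point among its two nearest neighbors, but not the other way around, so k-nearest neighbor weights are in general asymmetric.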

[Figure: k-nearest neighbor weights, Elko county, NV: k = 4 and k = 6]

many spatial weights file formats

Spatial Weights Transformations

Row-Standardized Weights
- rescale weights such that $\sum_j w_{ij}^* = 1$
- $w_{ij}^* = w_{ij} / \sum_j w_{ij}$
- constrains parameter space
- makes analyses comparable
- spatial lag = average of the neighbors

[Figure: row-standardized weights matrix]
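Row-standardization and the resulting spatial lag can be sketched as follows. The three-location weights matrix and the values of y are hypothetical; leaving a row of zeros untouched for isolates is an assumption of this sketch.

```python
def row_standardize(w):
    """Divide each weight by its row sum; rows with no neighbors stay all zero."""
    result = []
    for row in w:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result

def spatial_lag(w, y):
    """Weighted sum of the values at neighboring locations, one per location."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]

# Hypothetical example: location 0 neighbors locations 1 and 2.
w = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
y = [10, 20, 40]

ws = row_standardize(w)
lag = spatial_lag(ws, y)
print(ws[0])  # [0.0, 0.5, 0.5]
print(lag)    # [30.0, 10.0, 10.0]
```

With row-standardized weights the lag at location 0 is the average of its neighbors' values, (20 + 40) / 2 = 30.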

Stochastic Weights
- double standardization
- $w_{ij}^* = w_{ij} / \sum_i \sum_j w_{ij}$
- rescaled such that $\sum_i \sum_j w_{ij}^* = 1$
- similar to probability

[Figure: stochastic weights matrix]
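A minimal sketch of the double standardization, using a hypothetical three-location weights matrix. The function name `double_standardize` is for illustration only.

```python
def double_standardize(w):
    """Divide every weight by the sum of all weights, so the matrix sums to 1."""
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]

# Hypothetical example: location 0 neighbors locations 1 and 2.
w = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]

ws = double_standardize(w)
print(ws)
print(sum(sum(row) for row in ws))
```

Unlike row-standardization, the individual rows no longer sum to 1; only the matrix as a whole does.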

[Figure: second order contiguity: neighbor of neighbor]

[Figure: redundancy in higher order contiguity; paths of length 2 between cell 1 and other cells]

Higher Order Weights
- recursive definition
- a k-th order neighbor is a first order neighbor of a (k-1)th order neighbor
- avoid duplication: only unique neighbors of a given order (not both first and second order)
- pure contiguity or cumulative contiguity, i.e., lower order neighbors included in weights

[Figure: second order contiguity, Solano county, CA: exclusive of first order vs. inclusive of first order]
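The recursive definition can be sketched as a breadth-first expansion over a neighbor list. The example assumes rook contiguity on a 3 x 3 grid numbered row by row; the function name and the `cumulative` flag are hypothetical.

```python
def higher_order_neighbors(neighbors, start, order, cumulative=False):
    """Neighbors of `start` at exactly the given order, or up to it if cumulative."""
    visited = {start}
    frontier = {start}
    by_order = {}
    for k in range(1, order + 1):
        # k-th order neighbors are first order neighbors of the (k-1)th order ones,
        # excluding anything already reached at a lower order (no duplication).
        reached = {j for i in frontier for j in neighbors[i] if j not in visited}
        visited |= reached
        by_order[k] = reached
        frontier = reached
    if cumulative:
        return sorted(set().union(*by_order.values()))
    return sorted(by_order[order])

# Rook contiguity on a 3 x 3 grid, cells numbered 1..9 row by row (assumption).
rook = {1: [2, 4], 2: [1, 3, 5], 3: [2, 6], 4: [1, 5, 7], 5: [2, 4, 6, 8],
        6: [3, 5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}

print(higher_order_neighbors(rook, 1, 2))                   # exclusive of first order
print(higher_order_neighbors(rook, 1, 2, cumulative=True))  # inclusive of first order
```

Cell 5 can be reached from cell 1 by two different paths of length 2 (through 2 and through 4), but it appears only once among the second order neighbors.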

Properties of Weights

Connectivity Histogram
- histogram of number of neighbors
- neighbor cardinality
- diagnostic for isolates or neighborless units
- assess characteristics of the distribution
Copyright 2012 by Luc Anselin, All Rights Reserved

Things to Watch for
- isolates
  - need to be removed for proper spatial analysis
  - do not need to be removed for standard analysis
- very large number of neighbors
- bimodal distribution
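A minimal sketch of the connectivity diagnostic: count the neighbors of each observation, tabulate the counts, and flag isolates. The four-location weights matrix is hypothetical.

```python
from collections import Counter

def connectivity_histogram(w):
    """Return {number of neighbors: count of observations} and the list of isolates."""
    cardinalities = [sum(1 for v in row if v != 0) for row in w]
    histogram = dict(sorted(Counter(cardinalities).items()))
    isolates = [i for i, c in enumerate(cardinalities) if c == 0]
    return histogram, isolates

# Hypothetical weights: three mutually connected locations and one isolate.
w = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 0]]

histogram, isolates = connectivity_histogram(w)
print(histogram)  # {0: 1, 2: 3}
print(isolates)   # [3]
```

A nonzero count in the zero-neighbor bin signals isolates that should be dealt with before a spatial analysis.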

[Figure: connectivity histogram, contiguity weights, U.S. counties: rook and queen]

[Figure: connectivity histogram, contiguity weights, U.S. counties: default distance 90 mi and critical distance 80 mi]

[Figure: contiguity histogram for k-nearest neighbors (or, what did you expect?)]