Nature of Spatial Data

Outline
- Spatial is special
- Bad news: the pitfalls of spatial data
- Good news: the potential of spatial data

Spatial Is Special
- Are spatial data special?
- Why do spatial data require spatial analytic techniques, distinct from the standard statistical analysis that might be applied to any ordinary data?
- Bad news and good news

Figure: Number of cases of Lyme disease

Bad News First
Many of the standard techniques and methods documented in statistics textbooks have significant problems when we try to apply them directly to the analysis of spatial distributions and phenomena.

Bad News First (Cont.)
The fundamental requirements of conventional statistical analysis:
- Identifiable study objects
- Independence of cases
- Normal distribution
- Linearity
- Stationarity
- Data accuracy

Bad News First (Cont.)
Spatial data almost always violate several of these fundamental requirements, due to:
- Spatial autocorrelation
- Modifiable areal unit problem
- Ecological fallacy
- Scale
- Nonuniformity of space
- Edge effects

Spatial Autocorrelation
Waldo Tobler's First Law of Geography (1970): "Everything is related to everything else, but near things are more related than distant things."
Examples:
- Housing market
- Elevation change
- Air temperature

Figure: African American population concentration in Buffalo, NY

Spatial Autocorrelation (Cont.)
Violation of the independence assumption:
- The nonrandom distribution of phenomena in space creates dependence among data at different locations, which violates the independence assumption
- Data redundancy (affecting the calculation of confidence intervals)
- Biased parameter estimates

Types of Spatial Autocorrelation
- Positive autocorrelation: observations from nearby locations are likely to be similar to one another
- Negative autocorrelation: observations from nearby locations are likely to be different from one another
- Zero autocorrelation: no spatial effect is discernible, and observations seem to vary randomly through space (the case that standard statistical methods assume)

Types of Spatial Autocorrelation
Figure: example patterns of positive, negative, and zero (random) autocorrelation

Detecting Spatial Autocorrelation
Spatial autocorrelation diagnostic measures:
- Join count statistics
- Moran's I
- Geary's C
- Variogram cloud

Why Spatial Autocorrelation? Types of Spatial Variation
- First-order spatial variation (SV): occurs when observations vary from place to place across a study region due to changes in the underlying properties of the local environment
- Second-order SV: due to local interaction effects between observations
- In practice, it is difficult to distinguish the two
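
Of the diagnostic measures listed above, Moran's I is perhaps the easiest to compute by hand. The following is a minimal sketch, not a reference implementation: it uses the usual textbook form I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. The function names, the six-cell row of locations, and the data values are all made up for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def chain_weights(n):
    """Binary weights for n cells in a row: 1 if two cells are side by side."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


# Hypothetical data: six locations along a line.
w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]      # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5]    # neighbors always differ

i_clustered = morans_i(clustered, w)      # positive autocorrelation
i_alternating = morans_i(alternating, w)  # negative autocorrelation
print(i_clustered, i_alternating)
```

With these toy inputs the clustered pattern gives a value of about 0.6 and the alternating pattern about -1.0, matching the positive and negative cases described earlier. Significance testing is not covered by this sketch.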

Modifiable Areal Unit Problem (MAUP)
Spatial analysis often relies on spatially aggregated data:
- e.g. census data: collected at the household level but reported, for practical and privacy reasons, at various levels of aggregation (block, block group, tract, county, state, etc.)
- e.g. traffic analysis zones (TAZ): rely on census data to predict future traffic demand
- Potential problems in almost every field that uses spatial data
- Violation of the identifiable-study-objects requirement

MAUP (Cont.)
- MAUP: the aggregation units used are often arbitrary with respect to the phenomena under investigation, yet they affect the statistics computed from data reported in this way
- If the spatial units in a particular study were specified differently, very different patterns and relationships might be observed
- Many standard statistical analyses are sensitive to the choice of analysis units

Understanding MAUP
- Modifiable area: units are arbitrarily defined, and a different organization of the units may produce different analytical results

Understanding MAUP
- Scale issue: involves the aggregation of smaller units into larger ones. Generally speaking, the larger the units, the stronger the relationship among variables

Figure: aggregation (smoothed)

Illustration
Openshaw and Taylor (1979) showed that, with the same underlying data, it is possible to aggregate units together in ways that produce correlations anywhere between -1.0 and +1.0.

Special Considerations with MAUP
- Reasons? Problems of data? Problems of spatial units?
- Solutions?
  - Use the most disaggregated data
  - Produce an optimal zoning system
  - Others?
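
The zoning effect can be reproduced in miniature. The sketch below is an illustration only, not Openshaw and Taylor's experiment: it uses a contrived 3 x 3 grid of cells (all values hypothetical) in which the two variables are uncorrelated at the cell level, yet aggregating by rows gives a correlation of +1 and aggregating by columns gives -1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def flatten(grid):
    return [v for row in grid for v in row]


def row_means(grid):
    return [sum(row) / len(row) for row in grid]


def col_means(grid):
    return [sum(col) / len(col) for col in zip(*grid)]


# Hypothetical cell-level data: x rises down and across the grid,
# y rises down the grid but falls across it.
x = [[0, 1, 2],
     [1, 2, 3],
     [2, 3, 4]]
y = [[2, 1, 0],
     [3, 2, 1],
     [4, 3, 2]]

r_cells = pearson(flatten(x), flatten(y))      # nine individual cells
r_rows = pearson(row_means(x), row_means(y))   # zoning 1: three row zones
r_cols = pearson(col_means(x), col_means(y))   # zoning 2: three column zones
print(r_cells, r_rows, r_cols)
```

The same nine observations yield no correlation, a perfect positive correlation, or a perfect negative correlation depending only on how the zones are drawn.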

Edge Effects
- Edge effects arise where an artificial boundary is imposed on a study, often just to keep it manageable
- Entities at the edge of the study area have neighbors in only one direction (towards the middle)
- Spatial interpolation

Ecological Fallacy
A situation that can occur when people make an inference about an individual based on aggregate data for a group (Reference: http://jratcliffe.net/research/ecolfallacy.htm)

Ecological Fallacy
We might observe a strong relationship between income and crime at the county level, with lower-income counties being associated with higher crime rates. Which conclusion?
- Lower-income persons are more likely to commit crime
- Lower-income counties tend to experience higher crime rates

Ecological Fallacy
Two related issues:
- Is identifying associations between aggregate figures itself defective?
- Are inferences drawn about associations between the characteristics of an aggregate population and the characteristics of sub-units within the population wrong?
What should we do?
- Be aware that the process of aggregating or disaggregating data may conceal variations that are not visible at the larger aggregate level (Simpson's paradox)

Ecological Fallacy
- Relationship between ecological fallacy and MAUP?
- Simpson's paradox
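
A small numeric sketch may help show how aggregate and individual relationships can disagree. The data below are entirely hypothetical (three groups of three individuals, not real income or crime figures): within every group the two variables rise together, but the group means, and the pooled data, show a negative relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (x, y) observations for individuals in three groups.
groups = {
    "group_1": [(1, 7), (2, 8), (3, 9)],
    "group_2": [(4, 4), (5, 5), (6, 6)],
    "group_3": [(7, 1), (8, 2), (9, 3)],
}

# Correlation among individuals inside each group.
r_within = {name: pearson([x for x, _ in obs], [y for _, y in obs])
            for name, obs in groups.items()}

# Correlation between group means (the aggregate view).
mean_x = [sum(x for x, _ in obs) / len(obs) for obs in groups.values()]
mean_y = [sum(y for _, y in obs) / len(obs) for obs in groups.values()]
r_aggregate = pearson(mean_x, mean_y)

# Correlation when all individuals are pooled and group membership is ignored.
pooled = [pt for obs in groups.values() for pt in obs]
r_pooled = pearson([x for x, _ in pooled], [y for _, y in pooled])

print(r_within, r_aggregate, r_pooled)
```

Here each within-group correlation is +1 while the correlation between group means is -1, so a conclusion drawn from the aggregate figures would be exactly wrong for the individuals.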

Scale
- Besides MAUP, the geographical scale at which we examine a phenomenon can also affect the observations we make, and it must always be considered prior to spatial analysis
- Multiple representation problem
- Is there an optimal scale?

Non-uniformity of Space
- Space is not uniform (first-order SV), e.g. population is not evenly distributed
- In most cases, we are in fact interested in this non-uniformity
- Is this an area with a high crime rate, or do more people simply live there?

Figure: Crime locations

Why Are We Interested in Spatial Data Then?
- We assume that location matters
- The spatial pattern in the data is of interest: clusters of crime events
- The statistical distribution is of interest: is it different from the population at risk?
- The relationship between them is what counts: Why? Any other factors?

Finally, Good News
Potential insights provided by examining the locational attributes of data:
- Distance
- Adjacency
- Interaction
- Neighborhood
- Proximity polygon

Distance
The distance between the spatial entities of interest can be calculated from spatial data:
- Euclidean distance (Pythagorean theorem): $d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
- Manhattan distance: $d_{ij} = |x_i - x_j| + |y_i - y_j|$
- Network distance
- Others (e.g. travel time)

Distance Matrix

       A    B    C    D    E    F
  A    0   66   68   68   24   41
  B   66    0   51  110   99  101
  C   68   51    0   67   91  116
  D   68  110   67    0   60  108
  E   24   99   91   60    0   45
  F   41  101  116  108   45    0
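
As a minimal sketch of how such a matrix is built, the code below computes Euclidean and Manhattan distance matrices from point coordinates. The slides do not give the coordinates behind the matrix above, so the three points here are hypothetical and chosen only to make the arithmetic easy to check.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Distance measured along the axes (city-block distance)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def distance_matrix(points, metric):
    """Square matrix of pairwise distances under the given metric."""
    return [[metric(p, q) for q in points] for p in points]


# Hypothetical coordinates, not taken from the slides.
points = {"A": (0, 0), "B": (3, 4), "C": (6, 0)}
coords = list(points.values())

d_euclid = distance_matrix(coords, euclidean)
d_manhattan = distance_matrix(coords, manhattan)

for label, row in zip(points, d_euclid):
    print(label, row)
```

Both matrices are symmetric with zeros on the diagonal, as in the table above. Network distance and travel time need a road network or similar data and are outside this sketch.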

Adjacency
- Adjacency can be thought of as the nominal, or binary, equivalent of distance
- Two spatial entities are either adjacent or not: 1 or 0 in an adjacency matrix
- It can be defined in different ways:
  - Example 1: two entities are adjacent if they share a common boundary (e.g. Kentucky and Tennessee)
  - Example 2: two entities are adjacent if they are within a specified distance of each other

Adjacency Matrix
Distance matrix:

       A    B    C    D    E    F
  A    0   66   68   68   24   41
  B   66    0   51  110   99  101
  C   68   51    0   67   91  116
  D   68  110   67    0   60  108
  E   24   99   91   60    0   45
  F   41  101  116  108   45    0

Adjacency matrix (adjacent if d <= 50):

       A    B    C    D    E    F
  A    *    0    0    0    1    1
  B    0    *    0    0    0    0
  C    0    0    *    0    0    0
  D    0    0    0    *    0    0
  E    1    0    0    0    *    1
  F    1    0    0    0    1    *

Queen vs. rook (occasionally bishop)
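
The distance-threshold definition (Example 2) can be sketched directly from the distance matrix on the slide. The function name is made up for illustration, and the diagonal, shown as * on the slide, is coded as 0 here on the assumption that an entity is not treated as its own neighbor.

```python
labels = ["A", "B", "C", "D", "E", "F"]

# Distance matrix from the slide.
dist = [
    [0, 66, 68, 68, 24, 41],
    [66, 0, 51, 110, 99, 101],
    [68, 51, 0, 67, 91, 116],
    [68, 110, 67, 0, 60, 108],
    [24, 99, 91, 60, 0, 45],
    [41, 101, 116, 108, 45, 0],
]


def adjacency_from_distance(dist, threshold):
    """1 where two different entities lie within the threshold, else 0."""
    n = len(dist)
    return [[1 if i != j and dist[i][j] <= threshold else 0
             for j in range(n)]
            for i in range(n)]


adj = adjacency_from_distance(dist, 50)

# Neighbor list for each entity, read off the rows of the matrix.
neighbors = {labels[i]: [labels[j] for j, a in enumerate(row) if a]
             for i, row in enumerate(adj)}

for label, row in zip(labels, adj):
    print(label, row)
print(neighbors)
```

The result matches the adjacency matrix above: only A, E, and F are mutually adjacent, and B just misses C at a distance of 51. Changing the threshold changes the matrix, which is one reason adjacency "can be defined in different ways".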

Figure: A simple example

Interaction
- Interaction may be considered a combination of distance and adjacency. It rests on the intuitively obvious idea that nearer things are more related than distant things (Waldo Tobler's First Law of Geography)
- Many geographic structures are in fact the result of spatial interaction
- $w_{ij} \propto \dfrac{p_i \, p_j}{d^{k}}$

Neighborhood
Different definitions:
- Example 1: the neighborhood of a particular spatial entity is the set of all other entities adjacent to it
- Example 2: a region of space associated with that entity and defined by distance from it
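
The interaction formula on the Interaction slide can be tried out with a few numbers. This sketch rests on several assumptions that the slides do not state: p_i and p_j are read as a size measure such as population, d as the distance between i and j, k as a distance-decay exponent, and the constant of proportionality is set to 1. The populations and distances are hypothetical.

```python
def interaction_weights(sizes, dist, k=2):
    """Interaction weight w_ij = p_i * p_j / d_ij**k for every pair i != j.

    Assumes a proportionality constant of 1; the diagonal is left at 0.
    """
    n = len(sizes)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = sizes[i] * sizes[j] / dist[i][j] ** k
    return w


# Hypothetical places: sizes (e.g. population) and pairwise distances.
sizes = [100, 200, 50]
dist = [
    [0, 10, 20],
    [10, 0, 10],
    [20, 10, 0],
]

w = interaction_weights(sizes, dist, k=2)
for row in w:
    print(row)
```

Larger places interact more and distant places interact less. A larger k makes interaction fall off faster with distance.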

Proximity Polygon
- The proximity polygon of any entity is the region of space that is closer to that entity than to any other
- Often called a Thiessen polygon
- Applications: service area delineation (e.g. schools, hospitals, supermarkets, etc.)
- Dual model: Delaunay triangulation

Delaunay Triangulation
Potential applications:
1. TIN model
2. Others

Review
Bad news:
- Spatial autocorrelation
- Modifiable areal unit problem
- Ecological fallacy
- Scale
- Nonuniformity of space
- Edge effects
Good news:
- Distance
- Adjacency
- Interaction
- Neighborhood

Spatial Is Special!!!