Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Roadmap
Outline:
- What is spatial analysis?
- Spatial joins
- Step 1: Analysis of attributes
- Step 2: Preparing for analyses: working with distance
- Step 3: Spatial patterns analysis
- Step 4: Kernel density analysis
- Summary

The difference between aspatial and spatial analysis
Aspatial analyses assume that where you take your sample shouldn't matter.

Spatial analysis
- Turns raw data into useful information by adding greater informative content and value
- Reveals patterns, trends, and anomalies that might otherwise be missed
- Provides a check on human intuition by helping in situations where the eye might deceive

Spatial analysis
Spatial analysis can be:
- inductive, examining empirical evidence in the search for patterns that might support new theories or general principles, as with disease mapping (cancer maps);
- deductive, focusing on the testing of known theories or principles against data (Sky Train stations as centres of criminal activity);
- normative, using spatial analysis to develop or prescribe new or better designs (geodesign).

Spatial analysis
- A method of analysis is spatial if the results depend on the locations of the objects being analyzed:
  - move the objects or the study boundaries and the results change
  - results are not invariant under relocation
- Spatial analysis uses the locations of objects and, most often, the attributes of those objects
- Spatial analysis is the crux of GIS
[Figure: attribute linkages between spatial data and attribute data (P, L, A, P x NOIR)]

Getting organized: Joins
One of the more powerful features of a GIS is the ability to join attribute tables to spatial layers based on a common geographic location ID (such as the CTUID). ArcMap also has many different forms of spatial joins. (See also 3-D joins, and another reason not to use unprojected data [scroll down to "Getting the Best Result"].)
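As a rough illustration of an attribute join on a common ID, here is a minimal Python sketch. The layer, the table, the tract IDs, and the `median_income` field are all hypothetical; a GIS performs the same matching on the key field (here `CTUID`).

```python
# Hypothetical spatial layer: one record per census tract, keyed by CTUID
tracts = [
    {"CTUID": "T001", "geometry": "polygon A"},
    {"CTUID": "T002", "geometry": "polygon B"},
    {"CTUID": "T003", "geometry": "polygon C"},
]

# Hypothetical attribute table keyed by the same ID (T003 has no record)
income = {
    "T001": {"median_income": 52000},
    "T002": {"median_income": 61000},
}


def join_attributes(layer, table, key="CTUID"):
    """Left join: keep every feature; unmatched features gain no extra fields."""
    return [{**feature, **table.get(feature[key], {})} for feature in layer]


joined = join_attributes(tracts, income)
```

Features without a match are kept rather than dropped, which makes missing attribute records easy to spot before any analysis.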

Step 1: Analysis of attributes
- Attribute table joins
- Scatterplots
- Other types of plots?
- Regression
- Looking for outliers or other unusual patterns in the attribute data
- Problem? Spatial heterogeneity
- Know your data

Step 2a: Preparing for analysis: getting our distances correct
- Pythagorean or straight-line metric
- Shortest distance on a sphere? (Which route?)
- Distance along a route represented in a GIS (a polyline) is often calculated by summing the lengths of each segment of the polyline
- Because polylines tend to short-cut corners, the length of a polyline tends to be shorter than the length of the object it represents
- The length of a three-dimensional line measured off its planimetric representation will also be shorter than its true length
- Unless you are working at very long distances (e.g., continental), work only with projected data (e.g., in metres)

Pythagoras's theorem and the straight-line distance between two points on a plane. What is the length of D?

The effects of the Earth's curvature on the measurement of distance, and the choice of shortest paths: geodesics
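The two distance metrics can be compared with a minimal sketch. It assumes projected coordinates for the planar case and a perfectly spherical Earth with a mean radius of 6371 km for the great-circle case (the haversine formula); the function names are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def planar_distance(x1, y1, x2, y2):
    """Straight-line (Pythagorean) distance between two projected points."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Haversine distance on a sphere between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


d_plane = planar_distance(0, 0, 3, 4)            # a 3-4-5 right triangle
d_sphere = great_circle_distance(0, 0, 0, 90)    # a quarter of the equator
```

A real geodesic calculation would use an ellipsoid rather than a sphere, so treat the spherical result as an approximation.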

The length of a path as traveled on the Earth's surface (red line) may be substantially longer than the length of its horizontal projection as evaluated in a two-dimensional GIS.
The figure shows three paths across part of Dorset in the UK. The green path is the straight route ("as the crow flies"), the red path is the modern road system, and the gray path represents the route followed by the road in 1886. (Courtesy Michael De Smith)

The vertical profiles of all three routes, with elevation plotted against the distance traveled horizontally in each case. 1 ft = 0.3048 m, 1 yd = 0.9144 m. (Courtesy Michael De Smith)

Question: how to determine the true (3D) length of a line?
This used to be a complex process, but you can now achieve this result in two easy steps. You need to have a linear feature and a DEM or TIN.
1. Use the Interpolate Shape tool (3D Analyst) to add the Z values to the line.
2. Use the Add Z Information tool (3D Analyst) to add fields to the linear feature's attribute table.
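Outside ArcGIS, the idea behind the two lengths can be sketched in a few lines of Python. This is only an illustration: the vertices and elevations are made up, standing in for a line whose Z values have already been interpolated from a DEM.

```python
import math


def polyline_length(vertices):
    """Sum of straight segment lengths; accepts (x, y) or (x, y, z) tuples."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


# Hypothetical line in metres: flat first segment, then a steep climb
line_3d = [(0, 0, 0), (30, 40, 0), (60, 80, 120)]
line_2d = [(x, y) for x, y, _ in line_3d]

planimetric_length = polyline_length(line_2d)  # length measured off the map
true_length = polyline_length(line_3d)         # length including elevation
```

The planimetric length can never exceed the 3D length, because dropping the Z term can only shorten each segment.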

Identifying areas of influence: Buffers
- Buffering is a commonly applied distance-based analysis
Buffers (dilations) of constant width drawn around a point, a polyline, and a polygon

Buffers representing 1/2-mile exclusion zones around all schools in part of Los Angeles

Step 2b: Preparing for analysis: getting our neighbours correct
- Many spatial techniques require informative data on spatial relationships (usually 1 to n values).
- How do we formally define the spatial relationships between points, polygons, or grids on the surface of analysis?
- We would like to quantify nearness in some fashion.
- How do we want to quantify that nearness (distance, adjacency)?
- Many approaches require a weight matrix.
- Matrices function like maps that guide our analysis.

Weight Matrices
We can use different types of weight matrices to see if there are different types of spatial relationships. Two broad types of matrices:
- Distance-based (obviously useful for point features, but also used for polygonal features; usually a cut-off distance is defined [e.g., distance between features < 1000 m]). A network can also be used to determine the distance.
- Contiguity-based (a key attribute of polygonal features: do they share a common edge?)
See ArcMap's help file for generating spatial weights.

Weight Matrices: Distance
- Distance-based: creates bands (perhaps 1000 m) around the points (points or polygon centroids) to identify neighbours.
- K-Nearest Neighbor: counts or marks the k closest neighbors (a relative distance measure, since in some areas the k points may be very close while in other areas the k points may be much further away). Think of the k nearest points or polygon centroids, or the k adjacent polygons.
[Figure: K = 3]
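The contrast between the two distance-based definitions can be sketched as follows. The four points and the 1000 m threshold are made up for illustration, and the weights are simple binary values (1 = neighbour, 0 = not).

```python
import math


def distance_band_weights(points, threshold):
    """Binary weights: j neighbours i when j != i and distance <= threshold."""
    n = len(points)
    return [[1 if i != j and math.dist(points[i], points[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]


def knn_weights(points, k):
    """Binary weights: each point's k closest other points are its neighbours."""
    n = len(points)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            w[i][j] = 1
    return w


# Hypothetical projected coordinates in metres; the last point is isolated
points = [(0, 0), (500, 0), (900, 0), (5000, 0)]
band = distance_band_weights(points, 1000)
knn = knn_weights(points, 1)
```

With a fixed band the isolated point has no neighbours at all, whereas k-nearest-neighbour weights always give it k neighbours, however far away. Note also that k-nearest-neighbour weights need not be symmetric.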

Weight Matrices: Contiguity-based weights
- Rook: counts only edge adjacencies
- Queen: counts edges and vertices
For rasters, very easy to visualize. For polygons, the resulting size of the neighbourhoods can vary widely.
[Figure: Rook and Queen neighbourhoods]
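For a raster, the two contiguity rules reduce to a choice of cell offsets. A minimal sketch, assuming a rectangular grid indexed by (row, column); the names are illustrative:

```python
# Rook: the four cells sharing an edge; Queen: add the four sharing a corner
ROOK = [(-1, 0), (1, 0), (0, -1), (0, 1)]
QUEEN = ROOK + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def grid_neighbours(row, col, n_rows, n_cols, offsets):
    """Return the in-bounds neighbouring cells of (row, col)."""
    return [(row + dr, col + dc) for dr, dc in offsets
            if 0 <= row + dr < n_rows and 0 <= col + dc < n_cols]


centre_rook = grid_neighbours(1, 1, 3, 3, ROOK)
centre_queen = grid_neighbours(1, 1, 3, 3, QUEEN)
corner_rook = grid_neighbours(0, 0, 3, 3, ROOK)
corner_queen = grid_neighbours(0, 0, 3, 3, QUEEN)
```

Cells on the edge of the grid have fewer neighbours than interior cells under either rule, which is one source of edge effects.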

Weight Matrices
An example of a weight matrix for polygons where a vertex is not counted as adjacent (2 is not adjacent to 6). Note that polygons are not considered adjacent to themselves.
1 = adjacent, 0 = not adjacent
[Figure: six polygons labelled 1 to 6]

     1  2  3  4  5  6
1    0  1  0  0  1  0
2    1  0  1  1  1  0
3    0  1  0  1  0  0
4    0  1  1  0  0  1
5    1  1  0  0  0  1
6    0  0  0  1  1  0
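The six-polygon matrix above can be written down and checked in a few lines. Row standardisation (dividing each row by its sum) is not part of the example itself; it is included here only as a commonly used follow-on step, and the function names are illustrative.

```python
# Rows and columns correspond to polygons 1..6 in the example above
W = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
]


def is_symmetric(w):
    """Contiguity is mutual, so w[i][j] should equal w[j][i]."""
    n = len(w)
    return all(w[i][j] == w[j][i] for i in range(n) for j in range(n))


def row_standardise(w):
    """Divide each row by its sum; rows with no neighbours stay all zero."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in w]


neighbour_counts = [sum(row) for row in W]
W_std = row_standardise(W)
```

Polygon 2 has four neighbours while polygons 1, 3, and 6 have two each, which shows how unevenly sized polygon neighbourhoods can be.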

Weight Matrices
A weight matrix is often contained in a weights file. We can establish such files in ArcGIS and in programs like GeoDa. For example, in GeoDa weights files include:
- .gal for contiguity-based weights
- .gwt for distance-based weights
In other programs weights might be defined in the GUI rather than through a separate file.

Step 3: Spatial patterns analysis
- Identification of how objects cluster is often important in many different fields: archaeology, criminology, ecology, epidemiology
- Looking at the distribution of spatial objects without considering their attributes
- Point patterns can be identified as clustered, dispersed, or random
- Kinds of processes responsible for point patterns:
  - First-order processes involve points being located independently (rain drops)
  - Second-order processes involve interaction between points (acorns from oaks)
- The K function is an example of a descriptive statistic of pattern

Point pattern analysis
Point pattern of individual tree locations. A, B, and C identify the individual trees analyzed in the following graphs. Here the points represent trees, but they could represent crime incidents, locations of people with a disease, store locations, etc. (Source: Getis A. and Franklin J. 1987. Second-order neighborhood analysis of mapped point patterns. Ecology 68(3): 473-477).

Point pattern analysis: Ripley's K
- Summarizes spatial autocorrelation (point feature clustering or feature dispersion) over a range of distances.
- This is used when you want to see how changing spatial distances impact nearest neighbour counts. It can help identify an appropriate window size.
- In many pattern analysis studies, the selection of an appropriate scale of analysis is required. For example, a distance threshold or distance band is often needed for the analysis (e.g., kernel density analysis).
- When exploring spatial patterns at multiple distances and spatial scales, patterns change, often reflecting the dominance of particular spatial processes at work.
- Ripley's K function illustrates how the spatial clustering or dispersion of feature centroids changes when the neighborhood size changes.
- A local measure, but one that can look at all distances.
See ESRI's description of Ripley's K.

Ripley's K Function
- A: area (e.g., bounding box)
- N: number of points
- d: distance (classes)
- k(i, j): the weight, which is 1 when the distance between i and j is less than or equal to d and 0 when the distance between i and j is greater than d
With the L(d) transformation, the expected value is equal to distance.
What does overdispersion mean?
[Figure: plot of L(d) against distance, with regions labelled "Clustered" and "Overdispersion"] (Source: Getis A. and Franklin J. 1987. Second-order neighborhood analysis of mapped point patterns. Ecology 68(3): 473-477).
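A minimal sketch of the statistic, using the symbols defined above. It assumes one common uncorrected form, K(d) = A / (N(N-1)) times the sum of k(i, j) over all ordered pairs with i != j, and L(d) = sqrt(K(d) / pi). Software implementations often add edge corrections, which are omitted here, and the point sets are made up.

```python
import math


def ripley_k(points, d, area):
    """Uncorrected K(d): scaled count of ordered pairs (i != j) within d."""
    n = len(points)
    pairs = sum(1 for i in range(n) for j in range(n)
                if i != j and math.dist(points[i], points[j]) <= d)
    return area * pairs / (n * (n - 1))


def ripley_l(points, d, area):
    """L(d) transformation; under complete spatial randomness L(d) is about d."""
    return math.sqrt(ripley_k(points, d, area) / math.pi)


AREA = 100 * 100  # hypothetical 100 m x 100 m study area

tight_cluster = [(0, 0), (1, 0), (0, 1), (1, 1)]
spread_out = [(0, 0), (100, 0), (0, 100), (100, 100)]

l_cluster = ripley_l(tight_cluster, 2, AREA)  # far above d = 2: clustered
l_spread = ripley_l(spread_out, 2, AREA)      # below d = 2: dispersed
```

Comparing L(d) with d at each distance class gives the same clustered versus dispersed reading as the plot: values above the expected line suggest clustering at that scale, values below suggest dispersion.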

Pine trees are represented by green dots and other tree species are represented by red dots. The function counts the number of neighboring pine trees found within a given distance from each individual pine tree (Xm). The number of observed neighboring pine trees is then traditionally compared to the number of pine trees one would expect to find based on a completely spatially random point pattern. If the number of pines found within a given distance of each individual pine is greater than that for a random distribution, the distribution is clustered. If the number is smaller, the distribution is dispersed.

Permutations (over and over again)
Spatial autocorrelation measures such as Ripley's K or Moran's I usually compare your data to a theoretical random data set (whether polygon or point) in order to get a p-value. In order to determine if the existing spatial data is statistically dissimilar to the null hypothesis of complete spatial randomness (CSR), we need to simulate / create a spatially random probability distribution (this can't be done mathematically [e.g., looking up a value in a table] since each study is unique). Monte Carlo simulation produces several (e.g., 99) random simulations (or permutations) that the software then compares against your observed data. This can be used to develop a pseudo p-value: the probability that a result at least as extreme as the observed one would arise by chance alone.

Permutations
Each observation is given a set of randomly generated coordinates (selected using a uniform random number generator, not a normal or Gaussian random distribution), which is used to relocate each observation in space. To generate a random reference distribution of Moran's I (or Ripley's K), the statistic is computed each time with a different arrangement of points, for the number of permutations specified (e.g., 99). You can then compare this reference distribution to your observed Moran's I value to determine where it falls in comparison. The upper and lower confidence bands are derived from the random permutations.
[Figure: a uniform distribution for the roll of a single die]
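The permutation idea can be sketched in plain Python. To keep the example short, it uses the mean nearest-neighbour distance as a stand-in statistic rather than Moran's I or Ripley's K; the study-area size, the seeds, and the observed points are all made up, and the test is one-sided (it looks for clustering only).

```python
import math
import random


def mean_nn_distance(points):
    """Average distance from each point to its nearest other point."""
    return sum(min(math.dist(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)) / len(points)


def pseudo_p_value(observed_points, width, height, n_sims=99, seed=0):
    """Share of patterns (simulated plus observed) at least as clustered."""
    rng = random.Random(seed)
    observed = mean_nn_distance(observed_points)
    n = len(observed_points)
    as_extreme = 0
    for _ in range(n_sims):
        # Relocate every observation using uniform random coordinates
        sim = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        if mean_nn_distance(sim) <= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_sims + 1)


# Hypothetical observed pattern: 20 points packed into one corner
rng = random.Random(42)
clustered = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(20)]
p = pseudo_p_value(clustered, width=100, height=100)
```

With 99 simulations the smallest pseudo p-value that can be reported is 1/100 = 0.01, which is why the number of permutations limits the precision of the result.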

Statistical Significance
[Figure: a spatially random distribution (one of many simulations) beside the observed distribution]
Our p-value answers the question: what is the probability that the observed distribution could have occurred by chance?

Permutations to compute confidence envelope.

A clustered distribution: car thefts in Vancouver after 8:00 pm

Step 4: Kernel density analysis
- Kernel Density analysis calculates the density of features in a neighborhood around each feature.
- It can be calculated for both point and line features. While the inputs are either point or line features, the output is a raster, since a field output is being created.
- Possible uses include calculating the density of houses, crime reports, or roads or utility lines, and using that density in a regression analysis, for example.
- You can use a population field to weight some features more heavily than others, depending on their meaning, or to allow one point to represent several observations.
Still looking only at the spatial objects.

(A) A collection of point objects. (B) A kernel function. The kernel's shape depends on a distance parameter: increasing the value of the parameter results in a broader and lower kernel, and reducing it results in a narrower and sharper kernel. When each point is replaced by a kernel and the kernels are added, the result is a density surface whose smoothness depends on the value of the distance parameter.
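The effect of the distance parameter can be seen in a small sketch. It assumes a quartic kernel normalised to integrate to 1 over its circle (one common choice; other kernels behave similarly), and the optional `weights` argument plays the role of a population field. The point location and radii are made up.

```python
import math


def kernel_density(points, x, y, radius, weights=None):
    """Sum of quartic kernels centred on each point, evaluated at (x, y)."""
    if weights is None:
        weights = [1.0] * len(points)
    total = 0.0
    for (px, py), w in zip(points, weights):
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 < radius ** 2:
            total += w * 3 / (math.pi * radius ** 2) * (1 - d2 / radius ** 2) ** 2
    return total


def density_surface(points, xmin, ymin, n_cols, n_rows, cell_size, radius):
    """Raster of densities evaluated at cell centres (list of rows)."""
    return [[kernel_density(points,
                            xmin + (c + 0.5) * cell_size,
                            ymin + (r + 0.5) * cell_size,
                            radius)
             for c in range(n_cols)] for r in range(n_rows)]


points = [(50, 50)]  # a single hypothetical point
peak_narrow = kernel_density(points, 50, 50, radius=20)
peak_broad = kernel_density(points, 50, 50, radius=60)
surface = density_surface(points, 0, 0, n_cols=10, n_rows=10, cell_size=10, radius=60)
```

The larger radius gives a lower peak spread over a wider area, which is the "broader and lower" kernel described above.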

Density estimation using two different distance parameters in the respective kernel functions. (A) The surface shows the density of ozone-monitoring stations in California, using a kernel radius of 150 km. (B) Zoomed to an area of Southern California, a kernel radius of 16 km is too small for this dataset, as it leaves each kernel isolated from its neighbors.

Car thefts in Vancouver after 8:00 pm: 50 m cell size and a 500 m neighbourhood

Step 5: Spatial patterns analyses+
Cluster and Outlier Analysis identifies spatial clusters of features with high or low values, as well as spatial outliers (formally: Anselin's Local Moran's I). This is just one example of the different ways we can analyze spatial patterns. ESRI provides helpful information about this and other methods on their Spatial Statistics Resources page. We are now including both the spatial object and its attribute.

Spatial Patterns of Obesity and Associated Risk Factors in the Conterminous U.S.

Some notes on the lab.

What do these values represent?
- A z-score of ±1.96 leaves 5% of the curve in the tails (two-tailed, p = 0.05)
- A z-score of ±2.58 leaves 1% of the curve in the tails (two-tailed, p = 0.01)
With respect to Lab 3 and the interpretation of Moran's I: the Moran's Index in and of itself isn't important; it is the z-score and the p-value that tell the tale.
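For reference, the index itself can be computed in a few lines. This sketch assumes the usual global form, I = (n / S0) * sum of w_ij z_i z_j divided by the sum of z_i squared, where z_i is the deviation from the mean and S0 is the sum of all weights. The four-cell chain and the attribute values are made up.

```python
def morans_i(values, w):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical chain of four areas, each adjacent only to the next one
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_clustered = morans_i([1, 1, 5, 5], chain)    # similar values side by side
i_alternating = morans_i([1, 5, 1, 5], chain)  # high and low values alternate
```

The clustered arrangement gives a positive index and the alternating one a negative index, but on their own these numbers say nothing about significance; that still requires the z-score and p-value.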

Test Statistic for Normal Frequency Distribution
[Figure: normal curve centred on 0 (technically -1/(n-1)), with critical values at -1.96 and 1.96 (2.5% in each tail; reject null at 5%) and at 2.58 (reject null at 1%)]
- Null hypothesis: no spatial autocorrelation (Moran's I = 0; technically -1/(n-1))
- Alternative hypothesis: spatial autocorrelation exists (and/or dispersion exists): Moran's I > 0 (clustering) or I < 0 (dispersion)
- Reject the null hypothesis if the z-score is greater than or equal to 1.96 (or less than or equal to -1.96)
- Interpretation: less than a 5% chance that the spatial autocorrelation (dispersion) found is random; 95% confident that spatial autocorrelation (dispersion) exists.
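The link between the critical z-scores and the tail areas can be checked directly. A minimal sketch, assuming the test statistic follows a standard normal distribution; the function names are illustrative.

```python
import math


def two_tailed_p(z):
    """Area in both tails of the standard normal curve beyond |z|."""
    return math.erfc(abs(z) / math.sqrt(2))


def expected_morans_i(n):
    """Expected Moran's I under the null hypothesis for n features."""
    return -1.0 / (n - 1)


p_196 = two_tailed_p(1.96)   # close to 0.05
p_258 = two_tailed_p(2.58)   # close to 0.01
e_i = expected_morans_i(101)
```

For large n the expected value -1/(n-1) is very close to 0, which is why the null hypothesis is usually summarised as Moran's I = 0.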

Summary
- These are just a few of the methods available to analyse spatial data.
- You should explore ArcMap's Spatial Analyst toolbox as well as the Spatial Statistics toolbox, since within those toolboxes you can find many additional methods that might be of use in your projects.