Spatial Point Pattern Analysis

Similar documents
Lab #3 Background Material Quantifying Point and Gradient Patterns

Overview of Spatial analysis in ecology

GIST 4302/5302: Spatial Analysis and Modeling Point Pattern Analysis

Point Pattern Analysis

Points. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Outline. Practical Point Pattern Analysis. David Harvey s Critiques. Peter Gould s Critiques. Global vs. Local. Problems of PPA in Real World

School on Modelling Tools and Capacity Building in Climate and Public Health April Point Event Analysis

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

I don t have much to say here: data are often sampled this way but we more typically model them in continuous space, or on a graph

Comparison of point pattern analysis methods for classifying the spatial distributions of spruce-fir stands in the north-east USA

Bayesian Hierarchical Models

An Introduction to Pattern Statistics

Testing of mark independence for marked point patterns

Chapter 6 Spatial Analysis

PART TWO SPATIAL PATTERN IN ANIMAL AND PLANT POPULATIONS

Spatial point processes

K1D: Multivariate Ripley s K-function for one-dimensional data. Daniel G. Gavin University of Oregon Department of Geography Version 1.

Hierarchical Modeling and Analysis for Spatial Data

Spatial Analysis 2. Spatial Autocorrelation

Spatial Clusters of Rates

Ripley s K function. Philip M. Dixon Iowa State University,

Spatial dynamics of chestnut blight disease at the plot level using the Ripley s K function

Practical Statistics

USING CLUSTERING SOFTWARE FOR EXPLORING SPATIAL AND TEMPORAL PATTERNS IN NON-COMMUNICABLE DISEASES

SPACE Workshop NSF NCGIA CSISS UCGIS SDSU. Aldstadt, Getis, Jankowski, Rey, Weeks SDSU F. Goodchild, M. Goodchild, Janelle, Rebich UCSB

Estimating Timber Volume using Airborne Laser Scanning Data based on Bayesian Methods J. Breidenbach 1 and E. Kublin 2

A Spatio-Temporal Point Process Model for Firemen Demand in Twente

Tests for spatial randomness based on spacings

Geographic coordinates (longitude and latitude) of terrestrial and marine fossil collections of

BIO 682 Multivariate Statistics Spring 2008

Spatial Analysis 1. Introduction

Envelope tests for spatial point patterns with and without simulation

GET TO KNO W A NATIO NAL

Introduction to Spatial Analysis. Spatial Analysis. Session organization. Learning objectives. Module organization. GIS and spatial analysis

Probability and Stochastic Processes

Lecture 3: Exploratory Spatial Data Analysis (ESDA) Prof. Eduardo A. Haddad

Application of the Getis-Ord Gi* statistic (Hot Spot Analysis) to seafloor organisms

Universitat Autònoma de Barcelona Facultat de Filosofia i Lletres Departament de Prehistòria Doctorat en arqueologia prehistòrica

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

On tests of spatial pattern based on simulation envelopes

6 Single Sample Methods for a Location Parameter

Sampling: A Brief Review. Workshop on Respondent-driven Sampling Analyst Software

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid

Extensibility of Measurement Results of Point Samples of the Soil Protection Information and Monitoring System by different methods

Spatial Regression. 1. Introduction and Review. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Detection of Clustering in Spatial Data

Algorithms for Sensor-Based Robotics: Potential Functions

CONDUCTING INFERENCE ON RIPLEY S K-FUNCTION OF SPATIAL POINT PROCESSES WITH APPLICATIONS

Class 9. Query, Measurement & Transformation; Spatial Buffers; Descriptive Summary, Design & Inference

POINT PATTERN ANALYSIS OF FIA DATA. Chris Woodall 1

Discriminant Analysis with High Dimensional. von Mises-Fisher distribution and

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Lecture 3: Exploratory Spatial Data Analysis (ESDA) Prof. Eduardo A. Haddad

Wheels Radius / Distance Traveled

Unconstrained Ordination

Multivariate Histogram Comparison

LAB EXERCISE #3 Quantifying Point and Gradient Patterns

TESTING FOR EQUAL DISTRIBUTIONS IN HIGH DIMENSION

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

Spatial Point Pattern Analysis

Estimation of a Plant Density Exercise 4

Sensitivity of FIA Volume Estimates to Changes in Stratum Weights and Number of Strata. Data. Methods. James A. Westfall and Michael Hoppus 1

Spatial and Environmental Statistics

Variance Reduction and Ensemble Methods

10.0 REPLICATED FULL FACTORIAL DESIGN

"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200 Spring 2014 University of California, Berkeley

LAB EXERCISE #3 Neutral Landscape Analysis Summary of Key Results and Conclusions

Temporal vs. Spatial Data

Output: -Observed Mean Distance -Expected Mean Distance - Nearest Neighbor Index -Graphic report - Test variables:

Wavelet methods and null models for spatial pattern analysis

M statistic commands: interpoint distance distribution analysis with Stata

ANALYZING THE SPATIAL STRUCTURE OF A SRI LANKAN TREE SPECIES WITH MULTIPLE SCALES OF CLUSTERING

Rangeland and Riparian Habitat Assessment Measuring Plant Density

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories

POLI 443 Applied Political Research

Interaction Analysis of Spatial Point Patterns

Geostatistical Modeling of Primary and Secondary Automobile Traffic Volume as Ecological Disturbance Proxy Across the Contiguous United States

Lecture 9 Point Processes

Introduction to Statistical Analysis

Testing Statistical Hypotheses

A new metric of crime hotspots for Operational Policing

A Comparison of Three Exploratory Methods for Cluster Detection in Spatial Point Patterns

Estimating basal area, trees, and above ground biomass per acre for common tree species across the Uncompahgre Plateau using NAIP CIR imagery

Lecture 8: PGM Inference

Unit 14: Nonparametric Statistical Methods

ASCALE-SENSITIVETESTOF ATTRACTION AND REPULSION BETWEEN SPATIAL POINT PATTERNS

Cluster investigations using Disease mapping methods International workshop on Risk Factors for Childhood Leukemia Berlin May

Solutions to Practice Test 2 Math 4753 Summer 2005

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Inclusion of Non-Street Addresses in Cancer Cluster Analysis

Understand the difference between truncating and rounding. Calculate with roots, and with integer and fractional indices.

Test of Complete Spatial Randomness on Networks

Multivariate analysis

Department of Mathematics, Koç University, Istanbul, Turkey. Online publication date: 19 February 2011

Mathematics Department

Session 2477: The Role of Advanced Computational Tools in Visualizing Spatiotemporal Complexity

Subject CS1 Actuarial Statistics 1 Core Principles

Calculate the volume of the sphere. Give your answer correct to two decimal places. (3)

Transcription:

Spatial Point Pattern Analysis Jiquan Chen Prof of Ecology, University of Toledo EEES698/MATH5798, UT

Point variables in nature A point process is a discrete stochastic process of which the underlying distribution in not continuous, e.g. Distribution of feeding bark beetles; Tree distribution Inland lakes in terrestrial ecosystems; Cities at regional and/or continental scales; Students in a classroom; etc.

Introduction Point pattern analysis looks for patterns in the spatial location of events Events are assigned to points in space e.g. infection by bird-flu, site where firm operates, place where crime occurs, redwood seedlings Point pattern analysis has the advantage that it is not directly dependent on zone definitions (MAUP) From Steve Gibbons

Types of Point data Univariate, Bivariate, Multivariate 1, 2, and 3 Dimensions (x,y,z) From unknown

Point Pattern Analysis Pattern may change with scale! Test statistic calculated from data vs. expected value of statistic under CSR (complete spatial randomness) From unknown

Types of Point Patterns Random (CSR) Overdispersed (spaced or regular) Underdispersed (clumped or aggregated) From unknown

From unknown

Spatial point patterns Random Aggregated From Steve Gibbons

Spatial point patterns Regular From Steve Gibbons

Spatial data exploration Point (event) based statistics Typically analysis of point-pair distances Points vs events Distance metrics: Euclidean, spherical, L p or network Weighted or unweighted events Events, NOT computed points (e.g. centroids) Classical statistical models vs Monte Carlo and other computational methods 3 rd edition www.spatialanalysisonline.com 1

Spatial data exploration Point (event) based statistics Basic Nearest neighbour (NN) model Input coordinates of all points Compute (symmetric) distances matrix D Sort the distances to identify the 1st, 2nd,...kth nearest values Compute the mean of the observed 1st, 2nd,...kth nearest values Compare this mean with the expected mean under Complete Spatial Randomness (CSR or Poisson) model 3 rd edition www.spatialanalysisonline.com 11

Spatial data exploration Point (event) based statistics NN model r+dr r Area = r 2 Width = dr Area = 2 rdr 3 rd edition www.spatialanalysisonline.com 12

Nearest Neighbor Search Structure 13

Spatial data exploration Point (event) based statistic s NN model Mean NN distance: 2 1 m Variance: 2 (4 ) 4 m NN Index (Ratio): Z-transform: R r / r z ( r e o o e r )/ e 2/ n e ~N(,1),where.261358/ mn 3 rd edition www.spatialanalysisonline.com 14

Spatial data exploration Point (event) based statistics Issues Are observations n discrete points? Sample size (esp. for k th order NN, k>1) Model requires density estimation, m Boundary definition problems (density and edge effects) affects all methods NN reflexivity of point sets Limited use of frequency distribution Validity of Poisson model vs alternative models 3 rd edition www.spatialanalysisonline.com 15

Spatial data exploration Frequency distribution of nearest neighbour distances, i.e. The frequency of NN distances in distance bands, say -1 km, 1-2 km, etc. The cumulative frequency distribution is usually denoted G(d) = #(d i < r)/n F(d) = #(d i < r)/m where d i are the NN distances and n is the number of measurements, or where m is the number of random points used in sampling 3 rd edition www.spatialanalysisonline.com 16

Spatial data exploration Computing G(d) [computing F(d) is similar] Find all the NN distances Rank them and form the cumulative frequency distribution Compare to expected cumulative frequency distribution: G( r) 1 e m r Similar in concept to K-S test with quadrat model, but compute the critical values by simulation rather than table lookup 2 3 rd edition www.spatialanalysisonline.com 17

Complete Spatial Randomness The simplest null hypothesis regarding spatial point patterns The number of events N(A) in any planar region A with area A follows a Poisson distribution with mean: n A A p( N( A) n) e n! Given N(A)= n, the events in A are an independent random sample from the uniform distribution on A Poisson process has constant intensity Intensity is the expected number of events per unit area Also mean = variance See Diggle p.47 A From Steve Gibbons

Kernel intensity/density estimates A simple kernel intensity estimate using a uniform kernel () s N( C( s, r)) 2 r 2 9 ( s).716 2 2 =.716 From Steve Gibbons

Edge effects x R1 R2 y From Steve Gibbons

Correcting Edge Effects Intensity estimated lower at point y than at point x Corrections can be based on % area of circle within R1 % circumference of circle within R1 [circumference easier to calculate] drawing buffer zones From Steve Gibbons

K function The K function is the expected number of events within distance d of an event, divided by mean intensity in the study area (i.e. number of events/ area) Kd E C N x, d E N d or From Steve Gibbons

Ripley s K Ripley s (1976) estimator of K d I s i s j d Where A means area of study area A, and means distance between s_i and s_j Also need to take care of edge effects ˆK A n 2 i n i 1 j i If events uniformly distributed with intensity then expected number of events within distance d is d 2 So expected K(d) under uniform distribution (CSR) is d 2 s i s j From Steve Gibbons

Ripley s K 5m d=1m 5m If uniform K(1) = = 3.14 N(1) 2 3 2 1 2 1 2 1 2 E[N(1)] = 16/1 =1/25=.4 K(1) = 1.6/.4 = 4 From Steve Gibbons

Checking for clustering K d d Under CSR with uniform intensity expect 2 K(d) K 2 d d From Steve Gibbons

Hypothesis tests Sampling distribution of these spatial point process statistics is often unknown Possible to derive analytical point-wise confidence intervals for kernel estimates But more generally use monte-carlo, bootstrap and random assignment methods From Steve Gibbons

Methods Distance to neighbor sample Refined Nearest Neighbor randomization Second-order point pattern analysis From unknown

Second-order Point Pattern Analysis: Ripley s K Used to analyse the mapped positions of events in the plane and assume a complete census From unknown

Chen & Bradshaw 1999

Chen & Bradshaw 1999

Bi-Ripley K Chen & Bradshaw 1999

Coherent Spatial Relationships of Species Distribution and Production in An Old-growth Pseudotsuga-tsuga Forest Jiquan Chen* 1, Bo Song 1, Melinda Moeur 2, Mark Rudnicki 1, Ken Bible 3, Dave C. Shaw 3, Malcolm North 4, Dave M. Braun 3, and Jerry F. Franklin 3 1. Dept. of Earth, Ecological and Environmental Science, University of Toledo, Toledo, OH 43662, USA. 2. USDA Forest Service, Rocky Mountain Research Station, Moscow, ID 83843, USA 3. College of Forest Resources, University of Washington, Seattle, WA 98195, USA 4. USDA Forest Service, Pacific Southwest Research Station, Fresno, CA 9371, USA

Hypotheses 1) all trees have a clumped distribution in small size classes but a regular distribution in large size classes; 2) there is no significant attraction or repulsion between different tree species at a fine scale (<5 m) but; 3) species will be significantly clumped at large scales (i.e., larger than canopy gaps); and 4) species distribution will indicate site production (i.e., above-ground biomass, AGB) because community composition and tree size directly determined AGB and other biomass measures across the stand.

Methods Stem Map Point pattern Analysis (Ripley s K function) Semivariance analysis and Kriging Spatial correlation between composition and production

Elevation Gradient across the 12 ha plot at WRCCRF Canopy Crane

Spatial distribution of tree species on a topographic map in the 12 ha (4 x 3 m) plot facing north.

Ripley s K statistics for six major species in an old-growth Douglas-fir forest based on stem-mapped data. The Monte Carlo envelope (shaded area) was constructed at the 95% confidence level with 1 Monte Carlo simulations. 1 5-5 (a1) PSME vs ABGR 8 4 (b1) PSME vs ABAM 15 (c1) PSME vs THPL 8 (d1) PSME vs TABR 1 5-5 4-1 4 2 (a2) TSHE vs PSME -4 4 2 (b2) TSHE vs ABGR -1 2 1 (c2) TSHE vs ABAM -4 4 2 (d2) TSHE vs THPL -1-2 -2-2 -2-4 L(d) -4 8 4-4 -3 (a3) ABGR vs ABAM (b3) THPL vs ABGR (c3) ABGR vs TABR 6 1 3-6 2 (d3) TSHE vs TABR -4-1 -3-2 -8 1 (a4) ABAM vs THPL -2 6 (b4) ABAM vs TABR -6 1 (c4) THPL vs TABR 2 (d4) all species 5 3 5 1-5 -1 3 6 9 12 15-3 3 6 9 12 15-1 -5 Scale (m) -2 3 6 9 12 15 3 6 9 12 15

Bivariate Ripley s K statistics for all combinations of any two of the six major species (TSHE, PSME, ABGR, ABAM, THPL, and TABR) showing significant (95%) random (L(d) falls within the 95% CI envelope), attractive (L(d) falls above the 95% CI envelope), or repulsive (L(d) falls below the 95% CI envelope) based on 1 Monte Carlo simulations. L(d) 2 1-1 12 8 4 12 6-6 (a) TSHE (b) ABAM (c) ABGR -2 3 6 9 12 15 9 6 3-3 4 3 2 1 4 2 Scale (m) (d) PSME (e) THPL (f ) TABR 3 6 9 12 15

L(d) 2 1-1 2 1-1 -2 1 2 (a) H2 (d) H4 1-1 -2 2 (b) H6 (e) H2 v s H4 1-1 1 (c) H2 v s H6 (f ) H4 v s H6-1 -1 3 6 9 12 15 3 6 9 12 15 Scale (m) Bivariate Ripley s K statistics for all combinations of any two height classes for H2 (<2 m), H4 (2-4 m), and H6 (>4 m) showing significant (95% confidence intervals) showing repulsive, attractive, and random relationships based on 1 Monte Carlo simulations. Tree heights were calculated using models developed by Song (1998) and Ishii et al. (2).

Spatial distribution of infected trees across the stand West North Fig. 6

^ L (d) 12-8 -6-4 -2 2 2 4 6 8 1 12 14 16 distance (m) Fig. 7

V2 Spatial distribution of infected trees across the stand 25 2 15 1 5 2 V1 Fig. 8

^ G (y) ij..2.4.6.8 1. ^ G (y) ij..2.4.6.8 1...2.4.6.8 1. _ G (y) ij..2.4.6.8 1. _ G (y) ij Fig. 9

R resource webpage: http://cran.r-project.org/

Questions? http://research.eeescience.utoledo.edu/lees/

Spatial data exploration Point (event) based statistics clustering (ESDA) Is the observed clustering due to natural background variation in the population from which the events arise? Over what spatial scales does clustering occur? Are clusters a reflection of regional variations in underlying variables? Are clusters associated with some feature of interest, such as a refinery, waste disposal site or nuclear plant? Are clusters simply spatial or are they spatio-temporal? 3 rd edition www.spatialanalysisonline.com 48

Spatial data exploration Point (event) based statistics clustering k th order NN analysis Cumulative distance frequency distribution, G(r) Ripley K (or L) function single or dual pattern PCP Hot spot and cluster analysis methods 3 rd edition www.spatialanalysisonline.com 49

Spatial data exploration Point (event) based statistics Ripley K or L Construct a circle, radius d, around each point (event), i Count the number of other events, labelled j, that fall inside this circle Repeat these first two stages for all points i, and then sum the results Increment d by a small fixed amount Repeat the computation, giving values of K(d) for a set of distances, d K( d) Adjust to provide normalised measure L: L( d) d 3 rd edition www.spatialanalysisonline.com 5

Spatial data exploration Point (event) based statistics comments CSR vs PCP vs other models Data: location, time, attributes, error, duplicates Duplicates: deliberate rounding, data resolution, genuine duplicate locations, agreed surrogate locations, deliberate data modification Multi-approach analysis is beneficial Methods: choice of methods and parameters Other factors: borders, areas, metrics, background variation, temporal variation, non-spatial factors Rare events and small samples Process-pattern vs cause-effect ESDA in most instances 3 rd edition www.spatialanalysisonline.com 51