Spatial Statistics and Analysis Methods (for GEOG 104 class).


Provided by Dr. An Li, San Diego State University.

Types of spatial data
Three ways to represent and thus to analyze spatial data:
- Points: point pattern analysis (PPA; such as nearest neighbor distance, quadrat analysis); Moran's I, Getis G*
- Areas: area pattern analysis (such as join-count statistic); switch to PPA if we use the centroid of each area as the point data
- Lines: network analysis

Spatial arrangement
Three basic ways in which points or areas may be spatially arranged:
- Randomly distributed data: the assumption in classical statistical analysis
- Uniformly distributed data: the most dispersed pattern, the antithesis of being clustered; negative spatial autocorrelation
- Clustered distributed data: Tobler's Law (all things are related to one another, but near things are more related than distant things); positive spatial autocorrelation

Spatial distribution (with p value) [figure]

Nearest neighbor distance
Questions:
- What is the pattern of points in terms of their nearest distances from each other?
- Is the pattern random, dispersed, or clustered?
Example: Is there a pattern to the distribution of toxic waste sites near the area in San Diego (see next slide)? [hypothetical data]


Step 1: Calculate the distance from each point to its nearest neighbor (NN), by calculating the hypotenuse of the triangle:

$$NND_{AB} = \sqrt{(x_A - x_B)^2 + (y_A - y_B)^2}$$

Site   X     Y     NN   NND
A      1.7   8.7   B    2.79
B      4.3   7.7   C    0.98
C      5.2   7.3   B    0.98
D      6.7   9.3   C    2.50
E      5.0   6.0   C    1.32
F      6.5   1.7   E    4.55
Sum                     13.12

$$\overline{NND} = \frac{\sum NND}{n} = \frac{13.12}{6} = 2.19$$
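Step 1 can be sketched in a few lines of Python. This is only an illustration using the six hypothetical site coordinates from the table; the function name `nearest_neighbors` is made up for this sketch and does not come from the slides.

```python
import math

# Site coordinates from the Step 1 table (hypothetical data).
sites = {
    "A": (1.7, 8.7), "B": (4.3, 7.7), "C": (5.2, 7.3),
    "D": (6.7, 9.3), "E": (5.0, 6.0), "F": (6.5, 1.7),
}

def nearest_neighbors(points):
    """Return {site: (nearest site, distance)} using straight-line distance."""
    result = {}
    for name, p in points.items():
        # Distance from this site to every other site; keep the smallest.
        candidates = [
            (math.dist(p, q), other)
            for other, q in points.items()
            if other != name
        ]
        distance, neighbor = min(candidates)
        result[name] = (neighbor, distance)
    return result

nn = nearest_neighbors(sites)
mean_nnd = sum(d for _, d in nn.values()) / len(nn)

for name, (neighbor, d) in nn.items():
    print(f"{name}: NN = {neighbor}, NND = {d:.2f}")
print(f"mean NND = {mean_nnd:.2f}")
```

The brute-force comparison of every pair is fine for a handful of points; a spatial index would be the usual choice for large point sets.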

Step 2: Calculate the distances under varying conditions

The average distance if the pattern were random (based on a Poisson distribution):

$$NND_R = \frac{1}{2\sqrt{\text{Density}}} = \frac{1}{2\sqrt{0.068}} = 1.92$$

where density = number of points / area = 6/88 = 0.068.

If the pattern were completely clustered (all points at the same location), then:

$$NND_C = 0$$

Whereas if the pattern were completely dispersed, then:

$$NND_D = \frac{1.07453}{\sqrt{\text{Density}}} = \frac{1.07453}{0.261} = 4.12$$

Step 3: Let's calculate the standardized nearest neighbor index (R) to know what our NND value means:

$$R = \frac{\overline{NND}}{NND_R} = \frac{2.19}{1.92} = 1.14$$

Scale of R:
2.15          perfectly dispersed
1 to 2.15     more dispersed than random
1             totally random
0 to 1        more clustered than random
0             perfectly clustered

R = 1.14: slightly more dispersed than random.
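Steps 2 and 3 can be reproduced with the short sketch below. It takes the observed mean NND (2.19), the point count (6), and the area (88) from the slides; the helper `describe` and its wording are illustrative assumptions. Because the unrounded density is used here, the results can differ from the slide values in the last digit.

```python
import math

n_points = 6       # number of sites in the example
area = 88.0        # study-area size used on the slide
mean_nnd = 2.19    # observed mean nearest neighbor distance from Step 1

density = n_points / area

# Expected mean NND if the pattern were random.
nnd_random = 1.0 / (2.0 * math.sqrt(density))

# Expected mean NND if the pattern were perfectly dispersed.
nnd_dispersed = 1.07453 / math.sqrt(density)

# Standardized nearest neighbor index.
r_index = mean_nnd / nnd_random

def describe(r):
    """Rough verbal reading of R on the 0 to 2.15 scale (illustrative only)."""
    if r < 1.0:
        return "more clustered than random"
    if r > 1.0:
        return "more dispersed than random"
    return "random"

print(f"density = {density:.3f}")
print(f"NND_R = {nnd_random:.2f}, NND_D = {nnd_dispersed:.2f}")
print(f"R = {r_index:.2f} ({describe(r_index)})")
```

A verbal reading of R is not a significance test; deciding whether 1.14 differs meaningfully from 1 needs a separate test of randomness.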

Hospitals & Attractions in San Diego
The map shows the locations of hospitals (+) and tourist attractions ( ) in San Diego
Questions:
- Are hospitals randomly distributed?
- Are tourist attractions clustered?

Spatial Data (with X, Y coordinates)
Any set of information (some variable z) for which we have locational coordinates (e.g. longitude, latitude; or x, y)
Point data are straightforward, unless we aggregate all point data into areal or other spatial units
Area data require additional assumptions regarding:
- Boundary delineation
- Modifiable areal unit (states, counties, street blocks)
- Level of spatial aggregation = scale

Area Statistics Questions
2003 forest fires in San Diego
Given the map of SD forests:
- What is the average location of these forests?
- How spread are they?
- Where do you want to place a fire station?

What can we do?
Preparations:
- Find or build a coordinate system
- Measure the coordinates of the center of each forest
- Use centroid of area as the point data
[Map: coordinate frame with corners (0, 0), (600, 0), and (0, 763); forest centroids at (580, 700), (380, 650), (480, 620), (400, 500), (500, 350), (300, 250), (550, 200)]

Mean center
The mean center is the average position of the points.

Mean center of X: $X_C = \frac{\sum X_i}{n}$
Mean center of Y: $Y_C = \frac{\sum Y_i}{n}$

$$X_C = \frac{580 + 380 + 480 + 400 + 500 + 550 + 300}{7} = 455.71$$

$$Y_C = \frac{700 + 650 + 620 + 500 + 350 + 250 + 200}{7} = 467.14$$

[Map: forests #1 (580, 700), #2 (380, 650), #3 (480, 620), #4 (400, 500), #5 (500, 350), #6 (300, 250), #7 (550, 200), with the mean center at (456, 467)]
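As a quick check of the arithmetic, the mean center can be computed directly from the seven forest centroids. The function name `mean_center` is just a label chosen for this sketch.

```python
# Forest centroids #1 to #7 from the slides, as (X, Y) pairs.
forests = [
    (580, 700), (380, 650), (480, 620), (400, 500),
    (500, 350), (300, 250), (550, 200),
]

def mean_center(points):
    """Average X and average Y of a list of (X, Y) points."""
    n = len(points)
    x_c = sum(x for x, _ in points) / n
    y_c = sum(y for _, y in points) / n
    return x_c, y_c

x_c, y_c = mean_center(forests)
print(f"mean center = ({x_c:.2f}, {y_c:.2f})")
```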

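Standard distance
The standard distance measures the amount of dispersion
Similar to standard deviation

Formula

Definition:

$$S_D = \sqrt{\frac{\sum (X_i - X_C)^2 + \sum (Y_i - Y_C)^2}{n}}$$

Computation:

$$S_D = \sqrt{\left(\frac{\sum X_i^2}{n} - X_C^2\right) + \left(\frac{\sum Y_i^2}{n} - Y_C^2\right)}$$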

Standard distance

Forest   X     X^2       Y     Y^2
#1       580   336400    700   490000
#2       380   144400    650   422500
#3       480   230400    620   384400
#4       400   160000    500   250000
#5       500   250000    350   122500
#6       300   90000     250   62500
#7       550   302500    200   40000
Sum            1513700         1771900

$X_C = 455.71$, $Y_C = 467.14$

$$S_D = \sqrt{\left(\frac{1513700}{7} - 455.71^2\right) + \left(\frac{1771900}{7} - 467.14^2\right)} = 208.52$$
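The sketch below evaluates both the definition and the computation form of the standard distance for the seven forests and confirms that they agree. It uses unrounded means, which gives about 208.50; the slide's 208.52 results from plugging in the rounded means 455.71 and 467.14. The function name is an illustrative choice.

```python
import math

# Forest centroids #1 to #7 from the slides, as (X, Y) pairs.
forests = [
    (580, 700), (380, 650), (480, 620), (400, 500),
    (500, 350), (300, 250), (550, 200),
]

def standard_distance(points):
    """Return (definition form, computation form) of the standard distance."""
    n = len(points)
    x_c = sum(x for x, _ in points) / n
    y_c = sum(y for _, y in points) / n

    # Definition: root of the mean squared distance to the mean center.
    by_definition = math.sqrt(
        sum((x - x_c) ** 2 + (y - y_c) ** 2 for x, y in points) / n
    )

    # Computation: mean of squares minus square of mean, per axis.
    by_computation = math.sqrt(
        (sum(x * x for x, _ in points) / n - x_c ** 2)
        + (sum(y * y for _, y in points) / n - y_c ** 2)
    )
    return by_definition, by_computation

sd_def, sd_comp = standard_distance(forests)
print(f"S_D (definition)  = {sd_def:.2f}")
print(f"S_D (computation) = {sd_comp:.2f}")
```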

Standard distance
[Map: forests #1 to #7 around the mean center (456, 467), with $S_D = 208.52$]

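Definition of weighted mean center and weighted standard distance
What if the forests with bigger area (the area of the smallest forest as unit) should have more influence on the mean center?

Weighted mean center:

$$X_{wc} = \frac{\sum f_i X_i}{\sum f_i}, \qquad Y_{wc} = \frac{\sum f_i Y_i}{\sum f_i}$$

Weighted standard distance

Definition:

$$S_{WD} = \sqrt{\frac{\sum f_i (X_i - X_{wc})^2 + \sum f_i (Y_i - Y_{wc})^2}{\sum f_i}}$$

Computation:

$$S_{WD} = \sqrt{\left(\frac{\sum f_i X_i^2}{\sum f_i} - X_{wc}^2\right) + \left(\frac{\sum f_i Y_i^2}{\sum f_i} - Y_{wc}^2\right)}$$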

Calculation of weighted mean center
What if the forests with bigger area (the area of the smallest forest as unit) should have more influence?

Forest   f (area)   X     f*X (area*X)   Y     f*Y (area*Y)
#1       5          580   2900           700   3500
#2       20         380   7600           650   13000
#3       5          480   2400           620   3100
#4       10         400   4000           500   5000
#5       20         500   10000          350   7000
#6       1          300   300            250   250
#7       25         550   13750          200   5000
Sum      86               40950                36850

$$X_{wc} = \frac{\sum f_i X_i}{\sum f_i} = \frac{40950}{86} = 476.16, \qquad Y_{wc} = \frac{\sum f_i Y_i}{\sum f_i} = \frac{36850}{86} = 428.49$$
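The weighted mean center can be checked with a small sketch that pairs each forest's area weight with its centroid. The tuple layout `(f, X, Y)` and the function name are choices made for this illustration.

```python
# (area weight f, X, Y) for forests #1 to #7, taken from the table.
forests = [
    (5, 580, 700), (20, 380, 650), (5, 480, 620), (10, 400, 500),
    (20, 500, 350), (1, 300, 250), (25, 550, 200),
]

def weighted_mean_center(rows):
    """Area-weighted average of X and Y."""
    total_f = sum(f for f, _, _ in rows)
    x_wc = sum(f * x for f, x, _ in rows) / total_f
    y_wc = sum(f * y for f, _, y in rows) / total_f
    return x_wc, y_wc

x_wc, y_wc = weighted_mean_center(forests)
print(f"weighted mean center = ({x_wc:.2f}, {y_wc:.2f})")
```

Compared with the unweighted mean center (456, 467), the weighted center moves toward the large forests #5 and #7 in the lower right.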

Calculation of weighted standard distance
What if the forests with bigger area (the area of the smallest forest as unit) should have more influence?

Forest   f (area)   X     X^2      f*X^2      Y     Y^2      f*Y^2
#1       5          580   336400   1682000    700   490000   2450000
#2       20         380   144400   2888000    650   422500   8450000
#3       5          480   230400   1152000    620   384400   1922000
#4       10         400   160000   1600000    500   250000   2500000
#5       20         500   250000   5000000    350   122500   2450000
#6       1          300   90000    90000      250   62500    62500
#7       25         550   302500   7562500    200   40000    1000000
Sum      86                        19974500                  18834500

$$S_{WD} = \sqrt{\left(\frac{19974500}{86} - 476.16^2\right) + \left(\frac{18834500}{86} - 428.49^2\right)} = 202.33$$
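A sketch of the weighted standard distance follows, again evaluating both the definition and the computation form. With unrounded weighted means the result is about 202.32; the slide's 202.33 comes from using the rounded values 476.16 and 428.49. Names and data layout are illustrative assumptions.

```python
import math

# (area weight f, X, Y) for forests #1 to #7, taken from the table.
forests = [
    (5, 580, 700), (20, 380, 650), (5, 480, 620), (10, 400, 500),
    (20, 500, 350), (1, 300, 250), (25, 550, 200),
]

def weighted_standard_distance(rows):
    """Return (definition form, computation form) of the weighted standard distance."""
    total_f = sum(f for f, _, _ in rows)
    x_wc = sum(f * x for f, x, _ in rows) / total_f
    y_wc = sum(f * y for f, _, y in rows) / total_f

    # Definition: weighted mean squared distance to the weighted mean center.
    by_definition = math.sqrt(
        sum(f * ((x - x_wc) ** 2 + (y - y_wc) ** 2) for f, x, y in rows) / total_f
    )

    # Computation: weighted mean of squares minus square of weighted mean.
    by_computation = math.sqrt(
        (sum(f * x * x for f, x, _ in rows) / total_f - x_wc ** 2)
        + (sum(f * y * y for f, _, y in rows) / total_f - y_wc ** 2)
    )
    return by_definition, by_computation

swd_def, swd_comp = weighted_standard_distance(forests)
print(f"S_WD (definition)  = {swd_def:.2f}")
print(f"S_WD (computation) = {swd_comp:.2f}")
```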

Standard distance
[Map: forests #1 to #7 with the mean center (456, 467) and the weighted mean center (476, 428); standard distance = 208.52, weighted standard distance = 202.33]


Spatially clustered?
Given such a map, is there strong evidence that housing values are clustered in space?
- Lows near lows
- Highs near highs

More than this one?
Does household income show more spatial clustering, or less?

Moran's I statistic
Global Moran's I
- Characterize the overall spatial dependence among a set of areal units
- Covariance

Summary
Global Moran's I and local $I_i$ have different equations, one for the entire region and one for a location. But for both of them (I and $I_i$), or the associated scores (Z and $Z_i$):
- Big positive values: positive spatial autocorrelation
- Big negative values: negative spatial autocorrelation
- Moderate values: random pattern
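The slides do not give the equation for global Moran's I, so the sketch below uses the commonly cited form $I = \frac{n}{S_0} \cdot \frac{\sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}$, where $S_0$ is the sum of all weights. The four areal units in a row, their values, and the binary adjacency weights are made up purely for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)

# Hypothetical layout: four areas in a row, neighbors share an edge.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend = [1, 2, 3, 4]        # similar values sit next to each other
alternating = [1, 4, 1, 4]  # neighbors are as different as possible

i_trend = morans_i(trend, adjacency)
i_alternating = morans_i(alternating, adjacency)
print(f"I (trend)       = {i_trend:.3f}")
print(f"I (alternating) = {i_alternating:.3f}")
```

The smooth trend gives a positive I and the alternating pattern a negative one, matching the reading of positive and negative spatial autocorrelation above. Turning I into a Z score requires its expected value and variance under randomness, which this sketch does not cover.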

Network Analysis: Shortest routes
Euclidean distance

$$d_{ij} = \sqrt{(X_i - X_j)^2 + (Y_i - Y_j)^2}$$

For forests #4 (400, 500) and #5 (500, 350):

$$d = \sqrt{(500 - 400)^2 + (350 - 500)^2} = 180.28$$

[Map: forests #1 to #7 and the mean center (456, 467)]
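The Euclidean distance example can be verified with the standard library. The dictionary of forest centroids is taken from the slides; the variable names are illustrative.

```python
import math

# Forest centroids from the slides, keyed by forest number.
forests = {
    1: (580, 700), 2: (380, 650), 3: (480, 620), 4: (400, 500),
    5: (500, 350), 6: (300, 250), 7: (550, 200),
}

# Straight-line distance between forests #4 and #5.
d_45 = math.dist(forests[4], forests[5])
print(f"d(#4, #5) = {d_45:.2f}")

# All pairwise distances, e.g. as input to a shortest-route search.
pairwise = {
    (i, j): math.dist(forests[i], forests[j])
    for i in forests
    for j in forests
    if i < j
}
closest_pair = min(pairwise, key=pairwise.get)
print(f"closest pair: {closest_pair}")
```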

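Manhattan distance

$$d_{ij} = |X_i - X_j| + |Y_i - Y_j|$$

For forests #4 (400, 500) and #6 (300, 250):

$$d = |400 - 300| + |500 - 250| = 350$$

Euclidean median
- Find $(X_e, Y_e)$ such that $d_e = \sum \sqrt{(X_i - X_e)^2 + (Y_i - Y_e)^2}$ is minimized
- Need iterative algorithms
- Location of fire station

Manhattan median

[Map: forests #2 (380, 650), #4 (400, 500), #5 (500, 350), #6 (300, 250), #7 (550, 200), the mean center (456, 467), and the median location $(X_e, Y_e)$]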
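The slides say only that the Euclidean median needs an iterative algorithm. One widely used option is Weiszfeld's algorithm, sketched below as an assumption about how it could be done, not as the method used in the course. The Manhattan median is taken here as the coordinate-wise median, which minimizes total Manhattan distance. All function names are illustrative.

```python
import math
import statistics

# Forest centroids #1 to #7 from the slides, as (X, Y) pairs.
forests = [
    (580, 700), (380, 650), (480, 620), (400, 500),
    (500, 350), (300, 250), (550, 200),
]

def manhattan(p, q):
    """Manhattan (city-block) distance between two points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def manhattan_median(points):
    """Coordinate-wise median: minimizes the total Manhattan distance."""
    return (
        statistics.median(x for x, _ in points),
        statistics.median(y for _, y in points),
    )

def total_euclidean(center, points):
    """Sum of straight-line distances from center to every point."""
    return sum(math.dist(center, p) for p in points)

def euclidean_median(points, tol=1e-9, max_iter=1000):
    """Weiszfeld iteration, started from the mean center."""
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                # Simplification: stop if the estimate lands on a data point.
                return (px, py)
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return (new_x, new_y)
        x, y = new_x, new_y
    return (x, y)

d_46 = manhattan(forests[3], forests[5])   # forests #4 and #6
m_median = manhattan_median(forests)
e_median = euclidean_median(forests)

print(f"Manhattan d(#4, #6) = {d_46}")
print(f"Manhattan median = {m_median}")
print(f"Euclidean median = ({e_median[0]:.1f}, {e_median[1]:.1f})")
```

Either median is a candidate fire station location: each minimizes the total travel distance to the seven forests under its own distance measure.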

Summary
- What are spatial data?
- Mean center
- Weighted mean center
- Standard distance
- Weighted standard distance
- Euclidean median
- Manhattan median
Calculate in GIS environment

Spatial resolution
Patterns or relationships are scale dependent
- Hierarchical structures (blocks, block groups, census tracts)
- Cell size: # of cells vary and spatial patterns masked or overemphasized
[Figure: vegetation types at large (left) and small cells (right)]
How to decide
- The goal/context of your study
- Test different sizes (Weeks et al. article: 50, 500, and 1,000 m)
[Figure: % of seniors at block groups (left) and census tracts (right)]

Administrative units
Default units of study
- May not be the best
- Many events/phenomena have nothing to do with boundaries drawn by humans
How to handle
- Include events/phenomena outside your study site boundary
- Use other methods to reallocate the events/phenomena (Weeks et al. article; see next page)

A. Locate human settlements using RS data
B. Find their centroids
C. Impose grids

Edge effects
What it is
- Features near the boundary (regardless of how it is defined) have fewer neighbors than those inside
- The results about near-edge features are usually less reliable
How to handle
- Buffer your study area (outward or inward), and include more or fewer features
- Varying weights for features near boundary
[Figure: a. Median income by census tracts; b. Significant clusters (Z-scores for $I_i$)]

Different!
[Figure: c. More census tracts within the buffer (between brown and black boxes) included; d. More areas are significant]

Applying Spatial Statistics
Visualizing spatial data
- Closely related to GIS
- Other methods such as histograms
Exploring spatial data
- Random spatial pattern or not
- Tests about randomness
Modeling spatial data
- Correlation and regression analysis