A GEOSTATISTICAL APPROACH TO PREDICTING A PHYSICAL VARIABLE THROUGH A CONTINUOUS SURFACE

Katherine E. Williams, University of Denver
GEOG3010 Geographic Information Analysis
April 28, 2011

Overview
- Data
- Regression model
- Ordinary Least Squares
- Diagnostic Statistics
- Exploratory Regression
- Geographically Weighted Regression
- Interpolation
- Measured vs. Predicted

Pre-Processing Recommendations
- Make sure data is clean and organized
- Use consistent projections
- Work in a File Geodatabase (if possible)
- Explore and understand your data: distributions, variables, etc.
- Read the statistical documentation
- Be ready to analyze outcomes: the software will not do everything for you

Data
- Soil Moisture Spatial Survey (SMSS): 117 points
- Dependent variable: Volumetric Water Content (VWC)
- 21 candidate explanatory variables
- Goal: create a continuous VWC surface that models changes induced by microtopographic features (less than 10 meters)

SMSS Study Area Map

SMSS Plan
1. Find the model, built from my explanatory variables, that best predicts VWC
2. Create an interpolated surface for VWC
- What method should I use?
- How do I find available options?
- How do I choose between methods?

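Ordinary Least Squares Regression
- Describes the relationship between the dependent variable and the explanatory variable(s)
- Global model, single regression equation

$$y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \varepsilon$$

where $y$ is the dependent variable, $\beta_1, \dots, \beta_n$ are the coefficients, $X_1, \dots, X_n$ are the explanatory variables, and $\varepsilon$ is the random error.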

Ordinary Least Squares Regression
- Minimizes the sum of squared distances between observed and predicted values
- Residuals are the portion of the dependent variable that the regression model does not account for
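The least-squares idea above can be sketched in a few lines of NumPy. This is an illustration only, not the ArcGIS OLS tool: the function name `fit_ols`, the synthetic data, and the intercept column (which the slide's equation does not show) are all assumptions of the sketch.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bn*xn by minimizing squared residuals.

    The intercept b0 is an assumption of this sketch.
    Returns (coefficients, residuals).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), X])  # intercept + explanatory variables
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta  # part of y the model does not account for
    return beta, residuals

# Synthetic illustration data (not SMSS values).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(50, 2))
y_demo = 0.3 + 0.5 * X_demo[:, 0] - 0.2 * X_demo[:, 1] + rng.normal(0, 0.01, 50)
beta_demo, resid_demo = fit_ols(X_demo, y_demo)
```

With an intercept in the model, the residuals sum to (numerically) zero, which is a quick sanity check on any OLS fit.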

OLS Pitfalls
Misspecification:
- Missing variables
- Outliers (scatter plot)
- Non-linear relationships (scatter plot)
- Multicollinearity (VIF)
- Normal distribution bias (Jarque-Bera)
- Nonstationarity (Koenker BP)
- Residual spatial autocorrelation (Moran's I)

Exploratory Regression Tool
- Supplementary Spatial Statistics Toolbox: http://blogs.esri.com/dev/blogs/geoprocessing/archive/2010/10/11/supplementary-spatial-statistics-now-available-for-download_2100_.aspx
- Iterates through OLS models
- Identifies properly specified OLS models
- Diagnostic statistics assess model performance, redundancy, bias, and residual distribution

Exploratory Regression for SMSS
- Passing model with 3 explanatory variables: Slope, Vegetation Type, Percent Cover

Scatter Plot
- Simple way to look at data distribution
- Patterns should be linear
- Transformation
- Outliers
- Remove data points and re-run model
[Figure: scatter plots of the dependent variable against an explanatory variable, showing random, linear, and non-linear patterns]

Coefficient of Determination (Adjusted R²)
- Ability of the model to explain variability in the dependent variable
- Value range: 0 to 1; higher values explain more
- Above 0.50 is considered passing for the Exploratory Regression (ER) tool
- SMSS value: 0.61 (final OLS)
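As a rough illustration of how this statistic is formed, the sketch below computes adjusted R² from observed and predicted values using the standard textbook formula. The function name and arguments are assumptions for illustration; the ER tool's own implementation is not shown in the slides.

```python
import numpy as np

def adjusted_r2(y, y_pred, n_explanatory):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1), k = explanatory variables."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y.size
    ss_res = np.sum((y - y_pred) ** 2)    # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_explanatory - 1)
```

The adjustment penalizes extra explanatory variables, so a very poor model can score slightly below 0 even though useful models fall between 0 and 1.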

Variance Inflation Factor (VIF)
- Measures explanatory variable redundancy
- High values indicate that explanatory variables are redundant with one another (multicollinearity)
- Passing VIF value: < 7.5
- SMSS values: Slope (1.23), Vegetation Type (1.46), Percent Cover (1.21)
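A minimal sketch of the usual VIF definition follows: each explanatory variable is regressed on the others, and VIF = 1 / (1 - R²) of that auxiliary regression. The function name is hypothetical, and this is not claimed to be the toolbox's exact procedure.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)  # grows without bound as r2 approaches 1
    return vifs
```

Columns that are exact linear combinations of the others would make the denominator zero, so perfectly redundant variables should be dropped before calling it.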

Corrected Akaike's Information Criterion (AICc)
- Measure of model fit
- Can be used to compare models with the same dependent variable
- Lower values are better
- Interpolation bandwidth
- SMSS value: -122.176 (final OLS)
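For intuition, here is one common textbook form of AICc for a least-squares fit. It is an assumption that this form is appropriate here: ArcGIS may compute AICc with a different formulation, so values from this sketch should not be expected to match the SMSS number. Only differences between models fitted to the same dependent variable are meaningful.

```python
import math

def aicc_least_squares(rss, n, k):
    """Textbook AICc for a least-squares model.

    rss: residual sum of squares; n: number of observations;
    k: number of estimated parameters (a counting convention assumed here).
    """
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for the small-sample correction")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction term
```

For a fixed number of parameters, a smaller residual sum of squares gives a lower (better) score, while adding parameters is penalized.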

Jarque-Bera Statistic
- Tests the numeric distribution of the residuals (bias)
- Null hypothesis: the residuals are normally distributed
- A significant p-value means a non-normal distribution
- SMSS p-value: 0.08
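The standard Jarque-Bera statistic combines the skewness and kurtosis of the residuals. The sketch below is illustrative (the function name is an assumption) and uses the asymptotic chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-JB / 2); that approximation can be rough for small samples.

```python
import numpy as np

def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a 1-D array of residuals."""
    r = np.asarray(residuals, dtype=float)
    n = r.size
    d = r - r.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    kurt = np.mean(d ** 4) / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = float(np.exp(-jb / 2.0))  # chi-square(2) survival function
    return float(jb), p_value
```

A small p-value rejects the null hypothesis of normally distributed residuals.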

Koenker's BP Statistic
- Tests the consistency of the relationship between the dependent and explanatory variables
- Significance indicates the relationship is not consistent (nonstationarity)
- A Geographically Weighted Regression should be considered
- SMSS p-value: 0.000002*
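A minimal sketch of Koenker's studentized Breusch-Pagan statistic, assuming the usual construction: regress the squared OLS residuals on the explanatory variables and take n times the R² of that auxiliary regression. The function name is hypothetical; the statistic is compared against a chi-square distribution with one degree of freedom per explanatory variable.

```python
import numpy as np

def koenker_bp(X, residuals):
    """Return (statistic, degrees_of_freedom) from OLS residuals and predictors X."""
    e2 = np.asarray(residuals, dtype=float) ** 2
    X = np.asarray(X, dtype=float).reshape(e2.size, -1)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    gamma, *_ = np.linalg.lstsq(design, e2, rcond=None)
    fitted = design @ gamma
    # R^2 of the auxiliary regression of squared residuals on X
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return float(n * r2), k
```

A large statistic means the residual variance changes systematically with the explanatory variables.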

Moran's Index
- Measures spatial autocorrelation in the residuals
- Value range: -1 to 1 (-1 = dispersion, 0 = complete spatial randomness (CSR), 1 = clustering)
- Significant values indicate deviation from CSR
- Clustering of under- and over-prediction
- Missing key explanatory variable
- SMSS value: 0.555171

Moran's I Report

Spatial Weights Matrix
- Conceptualization of spatial relationships
- Structure for Moran's I, Hot Spot, and clustering statistics
- Based on neighbors or distance
- Should reflect the real relationships in the data
- SMSS SWM: Inverse Distance Squared
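To make the link between a spatial weights matrix and Moran's I concrete, the sketch below builds inverse-distance-squared weights and evaluates the global Moran's I formula. Function names are assumptions, no row standardization or distance threshold is applied, and significance testing (z-score, p-value) is left out.

```python
import numpy as np

def inverse_distance_squared_weights(coords):
    """w_ij = 1 / d_ij^2 for i != j, and 0 on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    w = np.zeros_like(d2)
    mask = d2 > 0
    w[mask] = 1.0 / d2[mask]
    return w

def morans_i(values, w):
    """Global Moran's I: (n / sum(w)) * (z' W z) / (z' z), z = deviations from mean."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return float((z.size / w.sum()) * (z @ w @ z) / (z @ z))
```

Applied to regression residuals, a clearly positive value suggests that over- and under-predictions cluster in space.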

OLS Diagnostic Summary for SMSS
- Missing variables
- Outliers (scatter plot)
- Non-linear relationships (scatter plot)
- Multicollinearity (VIF)
- Normal distribution bias (Jarque-Bera)
- Nonstationarity (Koenker BP)
- Residual spatial autocorrelation (Moran's I)

Geographically Weighted Regression
- Local regression model
- Calculates an individual equation for each point
- Multilevel model
- Average value
- Random component
- Local coefficients
- Disaggregates local variations from the overall model

Geographically Weighted Regression
- Based on a neighborhood or distance search
- Kernel and bandwidth
- GWR does not provide robust diagnostic statistics
- Relationships should be inspected in OLS first!
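The kernel-and-bandwidth idea can be sketched as a weighted least-squares fit repeated at every point. This is a simplified illustration, not the ArcGIS GWR tool: the fixed Gaussian kernel, the intercept term, the function name, and the synthetic data are all assumptions.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, bandwidth):
    """Fit one weighted regression per point; returns an (n, k + 1) coefficient array."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    X = np.asarray(X, dtype=float).reshape(n, -1)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel: nearer points weigh more
        sw = np.sqrt(w)
        beta_i, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        betas[i] = beta_i
    return betas

# Synthetic data (not SMSS): the slope drifts from 1 in the west to 2 in the east.
rng = np.random.default_rng(42)
coords_demo = np.column_stack([np.linspace(0, 100, 100), np.zeros(100)])
x_demo = rng.uniform(0, 1, 100)
y_demo = (1.0 + coords_demo[:, 0] / 100.0) * x_demo
local_demo = gwr_local_coefficients(coords_demo, x_demo, y_demo, bandwidth=10.0)
```

With a very large bandwidth every point receives nearly equal weight, so the local equations collapse to the single global OLS equation, which is one way to see why the relationships should be inspected in OLS first.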

SMSS Study GWR
- Bandwidth: 266 m
- Residual Squared: 1.26
- Effectiveness Number: 28.5
- Adjusted R²: 0.71
- AICc: -140.9467
- Moran's I p-value: 0.865530

What's Next?
- Have a properly specified OLS/GWR model
- Says something about the "why" in soil moisture variation
- Physical variables that control or are indicators of soil moisture
- Only says something about soil moisture at the sample points, where soil moisture is already known
- Does not inform about locations where soil moisture has not already been measured

Interpolation
- Estimates values for a continuous raster surface from limited input point data
- Works on the concept that values at nearby locations are spatially correlated
- Many types: IDW, Kriging, Natural Neighbor, Spline, Trend, etc.
- The best technique often depends on the data
- SMSS compares IDW and Ordinary Kriging

Inverse Distance Weighting
- Exact interpolator
- Distance determines weight
- Weights decay faster with distance when a higher power is designated
- Search window: shape, bandwidth, neighbors
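A minimal IDW sketch follows. It assumes no search window (every sample point contributes to every prediction), and the function name and `power` default are choices made for this illustration rather than the parameters used for SMSS.

```python
import numpy as np

def idw_predict(coords, values, targets, power=2.0):
    """Predict at each target as a weighted mean with weights 1 / distance**power."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.atleast_2d(np.asarray(targets, dtype=float))
    d = np.linalg.norm(targets[:, None, :] - coords[None, :, :], axis=-1)
    preds = np.empty(len(targets))
    for i, row in enumerate(d):
        hit = row == 0
        if hit.any():
            preds[i] = values[hit][0]  # exact interpolator: honor the measured value
        else:
            w = 1.0 / row ** power
            preds[i] = np.sum(w * values) / np.sum(w)
    return preds
```

For two samples valued 1 and 3 that are 10 units apart, the midpoint prediction is 2 for any power, while a target nearer the first sample moves closer to 1 as the power increases.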

IDW Parameters

Ordinary Kriging
- Uses spatial relationships in the data to build a prediction model
- Steps: build semivariogram, fit model, predict
- Need to explore the data to specify the most appropriate method
- Search window
- Model type
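The prediction step can be sketched by solving the ordinary kriging system for a given semivariogram. Everything specific here is an assumption for illustration: the exponential model with a practical-range parameterization, the parameter values a caller would pass, the function names, and the use of all points rather than a search window. Fitting the semivariogram to the data is not shown.

```python
import numpy as np

def exponential_semivariogram(h, nugget, partial_sill, range_):
    """gamma(h) = nugget + partial_sill * (1 - exp(-3h / range)), with gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0, gamma, 0.0)

def ordinary_kriging(coords, values, target, variogram):
    """Return (prediction, kriging variance, weights) at a single target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = values.size
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))      # last row/column enforce sum(weights) = 1
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]    # mu is the Lagrange multiplier
    return float(weights @ values), float(weights @ b[:n] + mu), weights
```

With a zero nugget, predicting at a sampled location returns the measured value with zero kriging variance; unlike IDW, the weights depend on the fitted spatial structure and not only on distance to the target.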

Semivariogram & Kriging Parameters

IDW Cross Validation

Kriging Cross Validation

SMSS IDW Surface

SMSS OK Surface

SMSS Cross Validation
Field Points, Measured, n=117

| Summary Statistics | Ordinary Kriging | Inverse Distance Weighting |
|---|---|---|
| Predicted Error Mean | 0.00632622 | 0.00467828 |
| Predicted Error RMS | 0.20752391 | 0.20251504 |
| Predicted Regression Function | y = 0.2345 * x + 0.2019 | y = 0.2362 * x + 0.1994 |
| Error Regression Function | y = -0.7655 * x + 0.2019 | y = -0.7638 * x + 0.1994 |
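The summary statistics in this comparison can be reproduced in spirit with leave-one-out cross validation: drop each point, predict it from the rest, and summarize the errors. The sketch below is an assumption-laden illustration: it uses a simple IDW predictor with no search window, hypothetical function names, and synthetic data rather than the SMSS measurements.

```python
import numpy as np

def idw_at(coords, values, target, power=2.0):
    """Simple IDW prediction at one target location using all points."""
    d = np.linalg.norm(coords - target, axis=1)
    if np.any(d == 0):
        return values[d == 0][0]
    w = 1.0 / d ** power
    return np.sum(w * values) / np.sum(w)

def leave_one_out(coords, values, predict=idw_at):
    """Return (predicted values, summary statistics) from leave-one-out cross validation."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = values.size
    predicted = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        predicted[i] = predict(coords[keep], values[keep], coords[i])
    errors = predicted - values
    slope, intercept = np.polyfit(values, predicted, 1)  # predicted vs. measured
    stats = {
        "mean_error": float(errors.mean()),
        "rms_error": float(np.sqrt(np.mean(errors ** 2))),
        "predicted_slope": float(slope),
        "predicted_intercept": float(intercept),
        "error_slope": float(slope - 1.0),  # error = predicted - measured
        "error_intercept": float(intercept),
    }
    return predicted, stats

# Synthetic illustration data (not SMSS values).
rng = np.random.default_rng(7)
coords_demo = rng.uniform(0, 100, size=(40, 2))
values_demo = 0.2 + 0.002 * coords_demo[:, 0] + rng.normal(0, 0.01, 40)
predicted_demo, stats_demo = leave_one_out(coords_demo, values_demo)
```

Because the error is the prediction minus the measurement, the error regression line has the predicted line's slope minus 1 and the same intercept, which matches the table: 0.2345 - 1 = -0.7655 for Ordinary Kriging and 0.2362 - 1 = -0.7638 for IDW.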

Where are we at?
- Continuous interpolated surface for VWC
- Is it sufficient?
- Am I modeling variance related to topography?

Regression Model Predict
- Regression models are equations
- Unknown dependent variables can be computed from known explanatory variables
- Built into the GWR interface as Additional Parameters
- Explanatory variables must be input in the same order as the input data

Predicted Points
- If the explanatory variables are known, then the dependent variable can be computed with OLS or GWR
- Field points
- 10 and 5 meter grids
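Computing the dependent variable at new locations amounts to applying the fitted equation to the known explanatory values. The sketch below uses a single global set of coefficients for simplicity (GWR would use local coefficients at each location); the function name is hypothetical and the intercept-first coefficient layout is an assumption.

```python
import numpy as np

def predict_from_coefficients(beta, X_new):
    """Apply y = b0 + b1*x1 + ... + bn*xn to each row of X_new.

    beta: [b0, b1, ..., bn]; X_new columns must follow the same order
    as the explanatory variables used to fit the model.
    """
    beta = np.asarray(beta, dtype=float)
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    if X_new.shape[1] != beta.size - 1:
        raise ValueError("number of columns must match number of slope coefficients")
    return beta[0] + X_new @ beta[1:]
```

Swapping two columns still runs without error but silently pairs each value with the wrong coefficient, which is why the input order matters.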

Defining Explanatory Variables
- Create continuous surfaces
- Extract values to points
- Input into GWR
- Does not further inform the GWR model
- Re-run interpolation

SMSS 10 & 5 meter IDW Surfaces

SMSS 10 & 5 meter OK Surfaces

Cross Validation

10 m Grid Points, GWR Predicted, n=10557

| Summary Statistics | Ordinary Kriging | Inverse Distance Weighting |
|---|---|---|
| Predicted Error Mean | -0.00005312 | 0.00006528 |
| Predicted Error RMS | 0.12523941 | 0.12139920 |
| Predicted Regression Function | y = 0.6253 * x + 0.0762 | y = 0.6603 * x + 0.0692 |
| Error Regression Function | y = -0.3747 * x + 0.0762 | y = -0.3397 * x + 0.0692 |

5 m Grid Points, GWR Predicted, n=42531

| Summary Statistics | Ordinary Kriging | Inverse Distance Weighting |
|---|---|---|
| Predicted Error Mean | 0.00002023 | 0.00000738 |
| Predicted Error RMS | 0.10854328 | 0.10241228 |
| Predicted Regression Function | y = 0.8153 * x + 0.0409 | y = 0.8419 * x + 0.0353 |
| Error Regression Function | y = -0.1847 * x + 0.0409 | y = -0.1581 * x + 0.0353 |

Now what do we have?
- Modeled surface at high resolution from limited data
- Picking up midslope variability

Summary of SMSS

Lessons Learned
- Spatial statistical methods are powerful
- However, all statistical methods must be carefully evaluated to avoid misspecification
- Think outside the box
- Make sure you are answering the right questions
- Use your resources! ArcGIS has extensive documentation and experts who are willing to help

Additional Resources
- ArcGIS Desktop Help
- ArcGIS Resource Center
- Spatial Statistics Toolbox: http://help.arcgis.com/en/arcgisdesktop/10.0/help/#/an_overview_of_the_spatial_statistics_toolbox/005p00000002000000/
- Geoprocessing Blog: http://blogs.esri.com/dev/blogs/geoprocessing/default.aspx
- Documentation with Supplementary Spatial Statistics Toolbox download