Using Spatial Statistics Social Service Applications Public Safety and Public Health

Using Spatial Statistics: Social Service Applications, Public Safety and Public Health
Lauren Rosenshein

Regression analysis
Regression analysis allows you to model, examine, and explore spatial relationships in order to better understand the factors behind observed spatial patterns or to predict outcomes.
Tools: Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR).
[Figure: scatterplot of observed values against predicted values; diagram in which Population and Income feature classes produce an output feature class: Intercept + Coefficient Surface + Coefficient Surface = Crime]

Regression analysis terms and concepts
Dependent variable (Y): what you are trying to model or predict (residential burglary, for example).
Explanatory variables (X): variables you believe cause or explain the dependent variable (such as income, vandalism, households).
Coefficients (β): values, computed by the regression tool, reflecting the relationship of each explanatory variable to the dependent variable.
Residuals (ε): the portion of the dependent variable that isn't explained by the model; the model's under- and over-predictions.
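The four terms above can be seen together in a small least-squares fit. This is a minimal sketch on synthetic data, not the ArcGIS tool itself; the variable names (`income`, `households`) and all numeric values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical explanatory variables (X); names and values are illustrative only.
income = rng.normal(50.0, 10.0, n)
households = rng.normal(300.0, 50.0, n)

# Synthetic dependent variable (Y) generated from known coefficients plus noise.
true_beta = np.array([5.0, -0.08, 0.02])  # intercept, income, households
X = np.column_stack([np.ones(n), income, households])
y = X @ true_beta + rng.normal(0.0, 0.5, n)

# Coefficients (beta): the least-squares estimates.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals (epsilon): observed minus predicted.
# Positive residual = under-prediction, negative = over-prediction.
predicted = X @ beta
residuals = y - predicted

print("coefficients:", beta)
print("largest under-prediction:", residuals.max())
print("largest over-prediction:", residuals.min())
```

Because the model includes an intercept, the residuals average to zero; what matters for the later checks is how they are distributed, both statistically and on the map.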

DEMO: Mortality Data Analysis

Use OLS to test hypotheses
Why are people dying young in South Dakota?
Do economic factors explain this spatial pattern?
Poverty rates explain 66% of the variation in the dependent variable, average age of death: Adjusted R-Squared [2]: 0.659
However, significant spatial autocorrelation among the model residuals indicates that important explanatory variables are missing from the model.

Build a multivariate regression model
Explore variable relationships using the scatterplot matrix.
Consult theory and field experts.
Look for spatial variables.
Run OLS (this is an iterative, often tedious, trial-and-error process).

Interpreting OLS results
Use the notes on interpretation as a guide to understanding OLS model output.

Coefficient significance
Look for statistically significant explanatory variables.
Consult the robust probabilities when the Koenker test is statistically significant.
Probability    Robust_Prob
0.000000*      0.000000*
0.000000*      0.000000*
0.000000*      0.000000*
0.001219*      0.005990*
0.000035*      0.001994*
0.079514       0.067555
* Statistically significant at the 0.05 level.
Koenker (BP) Statistic [5]: 38.994033  Prob(>chi-squared), (5) degrees of freedom: 0.00000*
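To make the Koenker test less of a black box, here is a minimal sketch of the studentized Breusch-Pagan statistic as it is commonly defined: n times the R-squared from regressing the squared OLS residuals on the explanatory variables. The data are synthetic, the function names are hypothetical, and the closed-form p-value used here is valid only because the example has exactly two explanatory variables (2 degrees of freedom); for other cases a statistics library would be needed.

```python
import math
import numpy as np

def ols_residuals(X, y):
    """Residuals from a least-squares fit; X must include a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def koenker_bp(X, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of e^2 regressed on X."""
    n = len(y)
    e2 = ols_residuals(X, y) ** 2
    aux_resid = ols_residuals(X, e2)
    r2 = 1.0 - (aux_resid @ aux_resid) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2

rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x1, x2])

# Synthetic, deliberately heteroscedastic data: the noise grows with x1.
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(0.0, 1.0, n) * 0.5 * x1

stat = koenker_bp(X, y)
# For a chi-squared distribution with exactly 2 degrees of freedom,
# the upper-tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2.0)
print("Koenker (BP) statistic:", stat, "p-value:", p_value)
```

A significant result, as in this constructed case, is the signal to read the Robust_Prob column rather than the Probability column.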

Multicollinearity
Find a set of explanatory variables that have low VIF values.
In a strong model, each explanatory variable gets at a different facet of the dependent variable.
What did one regression coefficient say to the other regression coefficient? I'm partial to you!
VIF
2.351229
1.556498
1.051207
1.400358
3.232363
[1] Large VIF (> 7.5, for example) indicates explanatory variable redundancy.
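The variance inflation factor can be computed by regressing each explanatory variable on all of the others and taking 1 / (1 - R-squared). The sketch below uses synthetic data and a hypothetical helper named `vif`; it only illustrates the definition and is not the ArcGIS implementation.

```python
import numpy as np

def vif(X):
    """VIF for each column of X, an (n, k) array of explanatory variables
    WITHOUT a constant column."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress variable j on a constant plus all remaining variables.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(2)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)

# Two unrelated variables: VIF values close to 1.
vif_independent = vif(np.column_stack([a, b]))

# Add a third variable that is nearly a + b: it is redundant, so VIF values blow up.
redundant = np.column_stack([a, b, a + b + rng.normal(scale=0.1, size=n)])
vif_redundant = vif(redundant)

print("independent:", vif_independent)
print("redundant:  ", vif_redundant)
```

Note that a redundant variable inflates the VIF of the variables it duplicates as well as its own, which is why removing one variable at a time and rerunning is the usual remedy.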

Model performance
Compare models by looking for the lowest AIC value. As long as the dependent variable remains fixed, the AIC values for different OLS/GWR models are comparable.
Look for a model with a high Adjusted R-Squared value.
[2] Measure of model fit/performance.
Akaike's Information Criterion (AIC) [2]: 524.976
Adjusted R-Squared [2]: 0.864823
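Both measures can be written in terms of the residual sum of squares. The sketch below uses the textbook Gaussian-likelihood form of AIC; a given software package may report a small-sample corrected variant or drop constants, so AIC values should only be compared within one tool. The sums of squares are made-up numbers for two hypothetical models of the same dependent variable.

```python
import math

def adjusted_r2(rss, tss, n, k):
    """Adjusted R-squared; k = number of explanatory variables (no intercept)."""
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))

def gaussian_aic(rss, n, k):
    """AIC = 2p - 2 ln(L) for a least-squares model with Gaussian errors.
    p = k + 2 counts the k slopes, the intercept, and the error variance."""
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    return 2.0 * (k + 2) - 2.0 * log_lik

# Hypothetical values: same dependent variable, so the total sum of squares is shared.
n, tss = 118, 1000.0
rss_a, k_a = 150.0, 4   # model A: four explanatory variables
rss_b, k_b = 340.0, 1   # model B: one explanatory variable

aic_a, aic_b = gaussian_aic(rss_a, n, k_a), gaussian_aic(rss_b, n, k_b)
adj_a, adj_b = adjusted_r2(rss_a, tss, n, k_a), adjusted_r2(rss_b, tss, n, k_b)

print("model A: AIC", aic_a, "Adjusted R-Squared", adj_a)
print("model B: AIC", aic_b, "Adjusted R-Squared", adj_b)
```

Both measures penalize extra variables, so model A is preferred here only because its fit improves by more than the penalty for its three additional variables.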

Model significance
The Joint F-Statistic and Joint Wald Statistic measure overall model significance.
Consult the Joint Wald Statistic when the Koenker test is statistically significant.
Joint F-Statistic [3]: 151.985705  Prob(>F), (4,113) degrees of freedom: 0.000000*
Joint Wald Statistic [4]: 496.057428  Prob(>chi-sq), 5 degrees of freedom: 0.000000*
Koenker (BP) Statistic [5]: 21.590491  Prob(>chi-sq), 5 degrees of freedom: 0.000626*

Model bias
[6] Significant p-value indicates residuals deviate from a normal distribution.
When the Jarque-Bera test is statistically significant:
the model is biased;
results are not reliable;
often this indicates that a key variable is missing from the model.
Jarque-Bera Statistic [6]: 4.207198  Prob(>chi-sq), (2) degrees of freedom: 0.122017
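The Jarque-Bera statistic combines the skewness and kurtosis of the residuals, and with 2 degrees of freedom its chi-squared p-value has the closed form exp(-JB / 2). The sketch below is a plain-Python illustration with a hypothetical function name; it also checks that the statistic on this slide (4.207198) reproduces the reported probability. The skewed sample is synthetic.

```python
import math
import random

def jarque_bera(residuals):
    """Return (statistic, p-value) for the Jarque-Bera normality test."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Chi-squared with 2 degrees of freedom: upper tail is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

# The slide's statistic: not significant at the 0.05 level.
slide_p = math.exp(-4.207198 / 2.0)
print("p-value for JB = 4.207198:", round(slide_p, 6))

# Synthetic, strongly skewed "residuals" for contrast.
rng = random.Random(3)
skewed = [rng.expovariate(1.0) for _ in range(500)]
skewed_stat, skewed_p = jarque_bera(skewed)
print("skewed sample:", skewed_stat, skewed_p)
```

In the slide's example the p-value is above 0.05, so the residuals are not significantly different from a normal distribution and the model passes this check.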

Spatial Autocorrelation
[Map] Statistically significant clustering of under- and over-predictions.
[Map] Random spatial pattern of under- and over-predictions.
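One common way to quantify whether residuals cluster is global Moran's I. The sketch below is a simplified illustration with binary distance-threshold weights on a synthetic 10 x 10 grid; it computes only the index, not the z-score and p-value that a full spatial autocorrelation tool reports. The function name and the test patterns are assumptions made for this example.

```python
import math

def morans_i(coords, values, threshold):
    """Global Moran's I with binary weights: w_ij = 1 when 0 < distance <= threshold."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    cross = 0.0   # sum of w_ij * z_i * z_j
    s0 = 0.0      # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold:
                s0 += 1.0
                cross += z[i] * z[j]
    return (n / s0) * cross / sum(v * v for v in z)

# Synthetic grid of locations; a threshold of 1.0 links each cell to its
# up/down/left/right neighbors only.
grid = [(x, y) for x in range(10) for y in range(10)]

# "Residuals" that trend from west to east: similar values sit together.
clustered = [float(x) for x, y in grid]
# "Residuals" that alternate like a checkerboard: neighbors always differ.
alternating = [1.0 if (x + y) % 2 == 0 else -1.0 for x, y in grid]

i_clustered = morans_i(grid, clustered, 1.0)
i_alternating = morans_i(grid, alternating, 1.0)
print("clustered:", i_clustered, "alternating:", i_alternating)
```

Values well above the expectation under randomness, -1 / (n - 1), point to clustering of under- and over-predictions; values near that expectation are consistent with a random pattern.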

Check OLS results
1. Coefficients have the expected sign.
2. No redundancy among model explanatory variables.
3. Coefficients are statistically significant.
4. Residuals are normally distributed.
5. Strong Adjusted R-Squared value.
6. Relationships do not vary significantly across the study area.

Run Geographically Weighted Regression (GWR)
GWR is a local, spatial regression model.
Global regression methods, like OLS, break down when the strength of model relationships varies across the study area.
GWR variables are the same as for OLS, except:
Do not include spatial regime (dummy) variables.
Do not include variables with little value variation.
Selecting a bandwidth and kernel:
Fixed or Adaptive kernel
AIC, Cross Validation (CV), or bandwidth parameter
Condition numbers
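The core idea of GWR can be sketched as a separate weighted least-squares fit at every location, with nearby observations weighted more heavily. The example below assumes a fixed Gaussian kernel with a hand-picked bandwidth of 1.5 map units and synthetic data whose slope increases from west to east; it does not select the bandwidth by AIC or CV and is not the ArcGIS implementation.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local coefficients at every observation (fixed Gaussian kernel).
    X must include a constant column. Returns an (n, p) array."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = X.T * w  # same as X.T @ diag(w), without building the diagonal
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

rng = np.random.default_rng(4)
n = 300
coords = rng.uniform(0.0, 10.0, size=(n, 2))
x = rng.uniform(0.0, 5.0, n)

# Synthetic relationship that is NOT constant: the slope grows with easting.
local_slope = 1.0 + 0.3 * coords[:, 0]
y = 2.0 + local_slope * x + rng.normal(0.0, 0.1, n)
X = np.column_stack([np.ones(n), x])

# A global model returns a single slope for the whole study area.
global_beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The local model returns one slope per location (a coefficient surface).
betas = gwr_coefficients(coords, X, y, bandwidth=1.5)
west = betas[coords[:, 0] < 2.0, 1].mean()
east = betas[coords[:, 0] > 8.0, 1].mean()
print("global slope:", global_beta[1], "west:", west, "east:", east)
```

The single global slope sits between the western and eastern local slopes, which is the sense in which a global model breaks down when relationships vary across the study area.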

Interpreting GWR results
Compare GWR R2 and AIC values to OLS R2 and AIC values. The better model has a lower AIC and a higher R2.
Residual maps show model under- and over-predictions. They shouldn't be clustered.
Coefficient maps show how modeled relationships vary across the study area.
Model predictions, residuals, standard errors, coefficients, and condition numbers are written to the output feature class.

GWR prediction
Calibrate the GWR model using known values for the dependent variable and all of the explanatory variables.
Provide a feature class of prediction locations containing values for all of the explanatory variables.
GWR will create an output feature class with the computed predictions.
[Figure legend: Observed, Modeled, Predicted]
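Prediction follows the same pattern as calibration: fit a local model centered on each prediction location, then apply it to that location's explanatory values. This is a minimal sketch under the same assumptions as a fixed Gaussian kernel with a hand-picked bandwidth; the function name, the data, and the two prediction locations are all hypothetical.

```python
import numpy as np

def gwr_predict(coords, X, y, new_coords, new_X, bandwidth):
    """Predict at new locations from calibration data (fixed Gaussian kernel).
    X and new_X must both include a constant column."""
    preds = np.empty(len(new_coords))
    for i, (loc, x_row) in enumerate(zip(new_coords, new_X)):
        # Weight calibration observations by their distance to the new location.
        d = np.linalg.norm(coords - loc, axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y)
        preds[i] = x_row @ beta
    return preds

# Calibration data: known dependent and explanatory values (synthetic).
rng = np.random.default_rng(5)
n = 400
coords = rng.uniform(0.0, 10.0, size=(n, 2))
x = rng.uniform(0.0, 5.0, n)
y = 2.0 + (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(0.0, 0.1, n)
X = np.column_stack([np.ones(n), x])

# Prediction locations: explanatory values are known, the dependent value is not.
new_coords = np.array([[2.0, 5.0], [8.0, 5.0]])
new_X = np.array([[1.0, 3.0], [1.0, 3.0]])

preds = gwr_predict(coords, X, y, new_coords, new_X, bandwidth=1.0)
print("predictions:", preds)
```

The two locations have identical explanatory values but receive different predictions, because each is served by the local relationship around it.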

Resources for learning more
The ESRI Guide to GIS Analysis, Vol. 2
Geographically Weighted Regression, by Fotheringham, Brunsdon, and Charlton
911 emergency call analysis demo: http://www.esri.com/software/arcgis/arcinfo/about/demos.html
Virtual campus free web seminar: http://campus.esri.com/
Articles (keyword search: Spatial Statistics): http://www.esri.com/news/arcuser/0405/ss_crimestats1of2.html
ArcGIS 9.3 Web Help: Regression Analysis Basics; Interpreting OLS Results; Interpreting GWR Results
Watch for updates: GP Resource Center
LScott@ESRI.com

QUESTIONS?
Lrosenshein@ESRI.com