Spatial Data, Spatial Analysis and Spatial Data Science

Similar documents
Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign

Spatial Regression. 1. Introduction and Review. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Spatial Analysis 1. Introduction

GIS Spatial Statistics for Public Opinion Survey Response Rates

Mapping and Analysis for Spatial Social Science

SPACE Workshop NSF NCGIA CSISS UCGIS SDSU. Aldstadt, Getis, Jankowski, Rey, Weeks SDSU F. Goodchild, M. Goodchild, Janelle, Rebich UCSB

GIS and Spatial Statistics: One World View or Two? Michael F. Goodchild University of California Santa Barbara

GIS Test Drive What a Geographic Information System Is and What it Can Do. Alison Davis-Holland

Lecture 3: Exploratory Spatial Data Analysis (ESDA) Prof. Eduardo A. Haddad

Michael Harrigan Office hours: Fridays 2:00-4:00pm Holden Hall

Spatial Analysis and Modeling (GIST 4302/5302) Guofeng Cao Department of Geosciences Texas Tech University

The Case for Space in the Social Sciences

Lecture 3: Exploratory Spatial Data Analysis (ESDA) Prof. Eduardo A. Haddad

What s special about spatial data?

Everything is related to everything else, but near things are more related than distant things.

Question 1. Feedback Lesson 4 Quiz. You submitted this quiz on Tue 27 May :04 PM MYT. You got a score of out of

A spatial literacy initiative for undergraduate education at UCSB

GIST 4302/5302: Spatial Analysis and Modeling Lecture 2: Review of Map Projections and Intro to Spatial Analysis

Welcome. C o n n e c t i n g

GIST 4302/5302: Spatial Analysis and Modeling

Exploratory Spatial Data Analysis and GeoDa

CS : Spatial Data Modeling and Analysis. Geovisualization

Outline ESDA. Exploratory Spatial Data Analysis ESDA. Luc Anselin

Spatial Analysis I. Spatial data analysis Spatial analysis and inference

Nature of Spatial Data. Outline. Spatial Is Special

Spatial Variation in Hospitalizations for Cardiometabolic Ambulatory Care Sensitive Conditions Across Canada

Techniques for Science Teachers: Using GIS in Science Classrooms.

Spatial Analysis in CyberGIS

GIS = Geographic Information Systems;

Geovisualization. Luc Anselin. Copyright 2016 by Luc Anselin, All Rights Reserved

The Changing Landscape of Land Administration

Finding Hot Spots in ArcGIS Online: Minimizing the Subjectivity of Visual Analysis. Nicholas M. Giner Esri Parrish S.

SPATIAL ANALYSIS. Transformation. Cartogram Central. 14 & 15. Query, Measurement, Transformation, Descriptive Summary, Design, and Inference

Spatial Data Science. Soumya K Ghosh

Combining Incompatible Spatial Data

Introduction to Spatial Big Data Analytics. Zhe Jiang Office: SEC 3435

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence

SOCI 20253/GEOG 20500, SOCI 30253, MACS Introduction to Spatial Data Science SYLLABUS

Where Do Overweight Women In Ghana Live? Answers From Exploratory Spatial Data Analysis

Exploring Digital Welfare data using GeoTools and Grids

Medical GIS: New Uses of Mapping Technology in Public Health. Peter Hayward, PhD Department of Geography SUNY College at Oneonta

Spatial Analysis with Web GIS. Rachel Weeden

Introduction to Spatial Statistics and Modeling for Regional Analysis

Exploring the Impact of Ambient Population Measures on Crime Hotspots

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns

Outline. 15. Descriptive Summary, Design, and Inference. Descriptive summaries. Data mining. The centroid

From Research Objects to Research Networks: Combining Spatial and Semantic Search

The Use of Spatial Weights Matrices and the Effect of Geometry and Geographical Scale

GIST 4302/5302: Spatial Analysis and Modeling

A Micro-Spatial Analysis of Violent Crime

Exploring the Patterns of Human Mobility Using Heterogeneous Traffic Trajectory Data

Geographic Data Science - Lecture II

Spatial Modeling, Regional Science, Arthur Getis Emeritus, San Diego State University March 1, 2016

GIS for ChEs Introduction to Geographic Information Systems

Web Visualization of Geo-Spatial Data using SVG and VRML/X3D

Geog 469 GIS Workshop. Data Analysis

Exploratory Spatial Data Analysis Using GeoDA: : An Introduction

An Introduction to Geographic Information System

Combining Geospatial and Statistical Data for Analysis & Dissemination

Finding Hot Spots in ArcGIS Online: Minimizing the Subjectivity of Visual Analysis. Nicholas M. Giner Esri Parrish S.

Detecting Origin-Destination Mobility Flows From Geotagged Tweets in Greater Los Angeles Area

GEOG 3340: Introduction to Human Geography Research

Implications for the Sharing Economy

Advanced Algorithms for Geographic Information Systems CPSC 695

Roger S. Bivand Edzer J. Pebesma Virgilio Gömez-Rubio. Applied Spatial Data Analysis with R. 4:1 Springer

GIS and the Built Environment

Oak Ridge Urban Dynamics Institute

What are the five components of a GIS? A typically GIS consists of five elements: - Hardware, Software, Data, People and Procedures (Work Flows)

Introduction to GIS I

GIST 4302/5302: Spatial Analysis and Modeling

The Building Blocks of the City: Points, Lines and Polygons

WHAT IS GIS? Source: Longley et al (2005) Geographic Information Systems and Science. 2nd Edition. John Wiley and Sons Ltd.

Understanding the modifiable areal unit problem

1. INTRODUCTION 2. BIG DATA IN URBAN STUDIES 3. LESSONS FROM BIG DATA

Migration Clusters in Brazil: an Analysis of Areas of Origin and Destination Ernesto Friedrich Amaral

EXPLORATORY SPATIAL DATA ANALYSIS OF BUILDING ENERGY IN URBAN ENVIRONMENTS. Food Machinery and Equipment, Tianjin , China

GIST 4302/5302: Spatial Analysis and Modeling

Big Data Discovery and Visualisation Insights for ArcGIS

ArcGIS Pro: Analysis and Geoprocessing. Nicholas M. Giner Esri Christopher Gabris Blue Raster

GEOGRAPHIC INFORMATION SYSTEMS

A GEOSTATISTICAL APPROACH TO PREDICTING A PHYSICAL VARIABLE THROUGH A CONTINUOUS SURFACE

Statistics for cities

Types of spatial data. The Nature of Geographic Data. Types of spatial data. Spatial Autocorrelation. Continuous spatial data: geostatistics

Multidimensional Poverty in Colombia: Identifying Regional Disparities using GIS and Population Census Data (2005)

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System

Exploratory Spatial Data Analysis (ESDA)

Introduction to ArcGIS GeoAnalytics Server. Sarah Ambrose & Noah Slocum

CSISS Resources for Research and Teaching

ENV208/ENV508 Applied GIS. Week 1: What is GIS?

Datahoods or Data-Driven Neighborhoods. Using GIS to better understand neighborhoods

NEW YORK DEPARTMENT OF SANITATION. Spatial Analysis of Complaints

DATA SOURCES AND INPUT IN GIS. By Prof. A. Balasubramanian Centre for Advanced Studies in Earth Science, University of Mysore, Mysore

Tools to Assess Local Health Needs. Richard Leadbeater, Esri NACo 2011 Healthy Counties Forum December 1, 2011

A FOSS Web Tool for Spatial Regression Techniques and its Application to Explore Bike Sharing Usage Patterns

GIS Quick Facts. CIVL 1101 GIS Quick Facts 1/5.

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

1. Richard Milton 2. Steven Gray 3. Oliver O Brien Centre for Advanced Spatial Analysis (UCL)

Orbital Insight Energy: Oil Storage v5.1 Methodologies & Data Documentation

Why Is It There? Attribute Data Describe with statistics Analyze with hypothesis testing Spatial Data Describe with maps Analyze with spatial analysis

Transcription:

Spatial Data, Spatial Analysis and Spatial Data Science Luc Anselin http://spatial.uchicago.edu 1

spatial thinking in the social sciences spatial analysis spatial data science spatial data types and research questions pitfalls examples 2

Spatial Thinking in the Social Sciences 3

Motivation - Substantive from atomistic decision units to social-spatial interaction peer-effects, copy-catting, diffusion spatial imprint of social networks/interaction spatial externalities spatial spillovers, spatial multipliers 4

Motivation - Substantive (continued) social facts are located (Abbott 1997) spatial mismatch spatial disparities spatial context neighborhood effects 5

Motivation - Data geo-located observations street addresses of crimes, sensor data, social media data mismatch between the spatial scale of the process and the spatial scale of the observations administrative units (e.g., census tracts) are not behavioral units (e.g., neighborhoods) 6

Motivation - Data (continued) error terms show systematic patterns neighborhood effects in individual house price models distance decay (precision decreases with distance from sensors) change of (spatial) support problem data at different spatial scales, nested or overlapping census block groups into census tracts school districts and census tracts 7

Some Examples 8

Abbott (1997) Chicago School 9

spatially integrated social science Goodchild et al (2000), Goodchild and Janelle (2004) 10

spatial turn in health research Richardson et al (2013) 11

OMB Circular M-09-28 Effective Place-Based Policies 12

Spatial Analysis 13

Dr. John Snow Map of 1854 London Cholera 14

Modern-Day Version (Chicago Tribune) 15

What is Spatial Analysis beyond mapping added value transformations, manipulations and application of analytical methods to spatial (geographic) data (Goodchild et al, Geospatial Analysis) (geospatial) knowledge discovery specialized form of KDD, knowledge discovery from data(bases) from data to information to knowledge to wisdom 16

Spatial Analytics Questions where do things happen: patterns, clusters, hot spots, disparities why do they happen where they happen: location decisions how does where things happen affect other things (context, environment) and how does context affect what happens: interaction where should things be located: optimization 17

When is Analysis Spatial? geo-spatial data: location + value (attribute) non-spatial analysis: location does NOT matter = locational invariance spatial analysis: when the location changes, the information content of the data changes 18

Spatial Distribution 19

A-Spatial Distribution Histogram 20

Spatial Analysis Global Spatial Autocorrelation Moran Scatter Plot 21

Components of Spatial Data Analytics mapping and geovisualization showing interesting patterns exploratory spatial data analysis discovering interesting patterns spatial modeling explaining interesting patterns optimization, simulation, prediction 22

Spatial Data Science 23

Big Data 24

The Big Data Phenomenon ill-defined, you know it when you see it cannot be handled with current x x = hardware, memory, software, methodology, etc. the five v volume, velocity, variety, value, veracity 6 25

26

Big Data Issues sample size = population or is N = 1 size of data set compensates for imprecision, lack of sampling framework, measurement error, etc., or does it? correlation, not causation prediction rather than explanation 6 27

Big Data for the Social Sciences new and big (or not so big) data sources ubiquitous sensors - smart cities open data portals - administrative data social media data - Twitter analytics 311 calls cell phone data geo-located and time stamped 28

Some Examples 29

City of Chicago open data portal https://data.cityofchicago.org 30

311 calls - abandoned buildings 31

array of things sensor network https://arrayofthings.github.io 32

array of things sensor locations 33

bike share locations 34

pulse of Rome - movement of cell phone calls http://senseable.mit.edu/realtimerome 35

relative intensity of Foursquare check ins - NYC neighborhoods Anselin and Williams (2016) Journal of Urbanism 36

Analytic Paradigm 37

Computational Social Science computation as the third approach to scientific discovery new techniques simulation machine learning, data mining visual data exploration 38

Data Driven Science The Fourth Paradigm massive new data sets many social science applications driven by industry web marketing, recommender systems, etc. 39

40

Data Science Nolan - Tempe Lang, Data Science in R (2015) data science consists of statistical computing and how to access, transform, manipulate, explore, visualize and reason about data Source: Drew Conway data science Venn diagram 41

selected operational data science texts 42

Spatial Data Science explicit treatment of spatial aspects integration of geocomputation, spatial statistics, spatial econometrics, exploratory spatial data analysis, visual spatial analytics, spatial data mining, spatial optimization 80% effort is data preparation (Dasu and Johnson 2003) algorithms, data structures, workflow 43

data science process Grolemund and Wickham (2017) 44

What s Involved in Spatial Data Science? data manipulation (munging, wrangling) data integration data exploration, pattern recognition, associations visualization modeling (prediction and explanation), classification, simulation, optimization lots of different software tools 45

Example 46

Digital Neighborhoods (with Sarah Williams) twitter and foursquare locations in NYC first week of Feb 2014 573,278 tweets and 589,091 foursquare check-ins 6 47

Data Manipulation parse Twitter JSON files and convert to csv 5760 files, more than 5 million messages > 20Gb of memory (Python or R) convert text file to spatial data base (PostGIS) spatially aggregate points to block group totals (n = 6454) (PostGIS or R) run local Moran statistics + visualize (GeoDa) 6 48

Williamsburg Flushing Smith and Court Street 49

Spatial Data Types and Research Questions 50

Spatial Data Structures formal representation of geographic features abstracted to points, lines and polygons spatial databases spatial index: speed up search 51

OGC Simple Features Specification 52

Spatial Data Types for Analysis points location as a random event locations of crimes, accidents, grocery stores surfaces continuous spatial field air quality surface, noise surface, price surface 53

Spatial Data Types for Analysis (2) discrete spatial data - lattice data areal units census tracts, counties, countries networks nodes and links street network, river network, social network 54

Space-Time Data fixed spatial locations over time time-in-space panel data = pooled cross-section and time series e.g., crime by neighborhood over time 55

Space-Time Data (2) changing spatial locations over time space-in-time moving objects e.g., taxi with GPS, cell phone calls, animal tracking 56

Data Types and Data Analysis the type of data determines what analysis can be carried out types of research questions are traffic accidents located randomly in space or clustered > point pattern analysis given sensor measurements on air quality, what is an air quality surface for a region > spatial interpolation where are hot spots of mortgage foreclosure in the city > cluster detection how are house prices affected by unobserved neighborhood effects > spatial regression 57

Some Important Characteristics are the data sampled (e.g., sensor locations) or is it the population (e.g., all the census tracts) are the spatial units discrete (areal units) or continuous (surface) are the locations given (e.g., areal units) or themselves random (e.g., location of events) 58

Pitfalls 59

Ecological Fallacy individual behavior cannot be explained at the aggregate level issue of interpretation e.g., county homicide rates do not explain individual criminal behavior model aggregate dependent variables with aggregate explanatory variables alternative: multilevel modeling 60

Modifiable Areal Unit Problem (MAUP) what is the proper spatial scale of analysis? a million spatial autocorrelation coefficients (Openshaw) spatial heterogeneity - different processes at different locations/scales both size and spatial arrangement of spatial units matter 61

Source: Scholl and Brenner (2012) 62

Change of Support Problem (COSP) variables measured at different spatial scales nested, hierarchical structures non-nested, overlapping solutions aggregate up to a common scale interpolate/impute - Bayesian approach 63

Examples 64

Events: Car thefts in San Francisco 65

cluster detection San Francisco car thefts - Heat Map (KDE) 66

flu hot spots and flows from tweets 67

LA Basin air quality monitoring stations 68

Interpolated air quality surface (Kriging) - pm10 69

Franklin County census tract foreclosure rate outlier map 70

Local Moran cluster map Franklin county census tract foreclosure rate 71

uber Boston ridership flows 72

uber ridership cartogram (NYC) 73

74