SOCI 20253/GEOG 20500, SOCI 30253, MACS Introduction to Spatial Data Science SYLLABUS

Similar documents
GIST 4302/5302: Spatial Analysis and Modeling

GIST 4302/5302: Spatial Analysis and Modeling

GIST 4302/5302: Spatial Analysis and Modeling

Biology 559R: Introduction to Phylogenetic Comparative Methods Topics for this week:

GIST 4302/5302: Spatial Analysis and Modeling Lecture 1: Overview

GEOG 508 GEOGRAPHIC INFORMATION SYSTEMS I KANSAS STATE UNIVERSITY DEPARTMENT OF GEOGRAPHY FALL SEMESTER, 2002

Spatial Analysis and Modeling (GIST 4302/5302) Guofeng Cao Department of Geosciences Texas Tech University

Spatial Data, Spatial Analysis and Spatial Data Science

Geographic Systems and Analysis

SYLLABUS SEFS 540 / ESRM 490 B Optimization Techniques for Natural Resources Spring 2017

GTECH 380/722 Analytical and Computer Cartography Hunter College, CUNY Department of Geography

ENVIRONMENT AND NATURAL RESOURCES 3700 Introduction to Spatial Information for Environment and Natural Resources. (2 Credit Hours) Semester Syllabus

GIS FOR PLANNING. Course Overview. Schedule. Instructor. Prerequisites. Urban Planning 792 Thursday s 5:30-8:10pm SARUP 158

CSISS Tools and Spatial Analysis Software

AGRY 545/ASM 591R. Remote Sensing of Land Resources. Fall Semester Course Syllabus

GTECH 380/722 Analytical and Computer Cartography Hunter College, CUNY Department of Geography

ME DYNAMICAL SYSTEMS SPRING SEMESTER 2009

Introduction to Geographic Information Systems

GEOG 410: SPATIAL ANALYSIS SPRING 2016

Cornell University CSS/NTRES 6200 Spatial Modelling and Analysis for agronomic, resources and environmental applications. Spring semester 2015


Chemistry 883 Computational Quantum Chemistry

CEE461L Chemical Processes in Environmental Engineering CEE561L/ENV542L Environmental Aquatic Chemistry Fall 2017

ORGANIC CHEMISTRY 1 CHEM A FALL 2004 SYLLABUS

STATISTICAL COMPUTING USING R/S. John Fox McMaster University

LEHMAN COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF ENVIRONMENTAL, GEOGRAPHIC, AND GEOLOGICAL SCIENCES CURRICULAR CHANGE

GIS and Forest Engineering Applications FE 357 Lecture: 2 hours Lab: 2 hours 3 credits

GEO 6166 (Spring 2018) Advanced Quantitative Methods for Spatial Analysis OFFICE HOURS: Pre-requisites Course Description Selected Topics include

ICPSR Training Program McMaster University Summer, The R Statistical Computing Environment: The Basics and Beyond

Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign

GEOG 3340: Introduction to Human Geography Research

Welcome to Physics 161 Elements of Physics Fall 2018, Sept 4. Wim Kloet

HEAT AND THERMODYNAMICS PHY 522 Fall, 2010

HISTORY 1XX/ DH 1XX. Introduction to Geospatial Humanities. Instructor: Zephyr Frank, Associate Professor, History Department Office: Building

Chemistry Physical Chemistry I Fall 2018

Engineering Physics 3W4 "Acquisition and Analysis of Experimental Information" Part II: Fourier Transforms

PHYS 480/580 Introduction to Plasma Physics Fall 2017

State and National Standard Correlations NGS, NCGIA, ESRI, MCHE

Exploratory Spatial Data Analysis Using GeoDA: : An Introduction

Geological Foundations of Environmental Sciences

Course information. Required Text

Labs: Chelsea Ackroyd Office Location: FMAB 005 Office Hours: M 08:45 11:45 AM or by appointment

Stat 406: Algorithms for classification and prediction. Lecture 1: Introduction. Kevin Murphy. Mon 7 January,

The GeoDa Book. Exploring Spatial Data. Luc Anselin

COURSE OUTLINE GEOL204 MINING COMPUTING 3 CREDITS

General Chemistry I (CHE 1401)

Geog418: Introduction to GIS Fall 2011 Course Syllabus. Textbook: Introduction to Geographic Information Systems edited by Chang (6th ed.

GTECH 380/722 Analytical and Computer Cartography Hunter College, CUNY Department of Geography

Welcome to Chemistry 376

Science : Introduction to Astronomy Course Syllabus

Multivariable Calculus

ESS 102 Space and Space Travel

Spatial Analysis 1. Introduction

STRUCTURAL BIOINFORMATICS I. Fall 2015

Roger S. Bivand Edzer J. Pebesma Virgilio Gömez-Rubio. Applied Spatial Data Analysis with R. 4:1 Springer

Geoinformation in Environmental Modelling

Department of Civil and Environmental Engineering. CE Surveying

Chapter 0 Introduction. Advanced Urban Analysis E. - Spatial Analysis and GIS Educational background. 0.1 Introducing myself

University Studies Natural Science Course Renewal

Astro 32 - Galactic and Extragalactic Astrophysics/Spring 2016

MATH 341, Section 001 FALL 2014 Introduction to the Language and Practice of Mathematics

PS 150 Physics I for Engineers Embry-Riddle Aeronautical University Fall 2018

Geography 3410: Urban Applications of GIS Spring 2015 Boettcher West #125 Mon & Wed 2:00p 3:50p

Dr. LeGrande M. Slaughter Chemistry Building Rm. 307E Office phone: ; Tues, Thurs 11:00 am-12:20 pm, CHEM 331D

Syllabus: CHEM 4610/5560 Advanced Inorganic Chemistry I Fall Semester credit hours; lecture only

Tuesday 6:30 9:30 (First/Last classes) Home Phone: SYLLABUS. I. Focus of Course

Development of Integrated Spatial Analysis System Using Open Sources. Hisaji Ono. Yuji Murayama

Texts 1. Required Worboys, M. and Duckham, M. (2004) GIS: A Computing Perspective. Other readings see course schedule.

Chemistry Physical Chemistry I Fall 2017

Physics 162a Quantum Mechanics

GIS and Forest Engineering Applications FE 257 Lecture and laboratory, 3 credits

GEOG 100E Introduction to Geography (5 credits)

URP 4273 Section 8233 Introduction to Planning Information Systems (3 Credits) Fall 2017

Economics 390 Economic Forecasting

Monday May 12, :00 to 1:30 AM

Introductory Geographic Information Systems #John R. Jensen, Ryan R. Jensen

SYLLABUS Stratigraphy and Sedimentation GEOL 4402 Fall, 2015 MWF 10:00-10:50AM VIN 158 Labs T or W 2-4:50 PM

CAS GE 365 Introduction to Geographical Information Systems. The Applications of GIS are endless

Sul Ross State University Syllabus for General Chemistry I: CHEM 1311 (Fall 2017)

SYLLABUS Sedimentology GEOL 3402 Spring, 2017 MWF 9:00-9:50AM VIN 158 Labs W 2-4:50 PM

GTECH Advanced GIS Fall 2013 Wednesday, 5:35 9:15 PM

Arcgis Tutorial Manual READ ONLINE

Chemistry Syllabus Fall Term 2017

[DOC] GIS TRANING OPERATION MANUAL EBOOK

Dr. Stephen J. Walsh Department of Geography, UNC-CH Fall, 2007 Monday 3:30-6:00 pm Saunders Hall Room 220. Introduction

USP/PLSI 493: Methods of Planning Data Analysis (4 credits) San Francisco State University Spring 2011

ESM Geographic Information Systems - Spring 2014

MATH 345 Differential Equations

AMSC/MATH 673, CLASSICAL METHODS IN PDE, FALL Required text: Evans, Partial Differential Equations second edition

Emerging Issues in Geographic Information Science (GEP680): Projections, Scale, Accuracy, and Interpolation Lehman College, Spring 2017

Calculus from Graphical, Numerical, and Symbolic Points of View Overview of 2nd Edition

MATH 122 SYLLBAUS HARVARD UNIVERSITY MATH DEPARTMENT, FALL 2014

CHEM 102 Fall 2012 GENERAL CHEMISTRY

KAAF- GE_Notes GIS APPLICATIONS LECTURE 3

Introduction to Engineering Analysis - ENGR1100 Course Description and Syllabus Monday / Thursday Sections. Fall '10.

Health and Medical Geography (GEOG 222)

Ph 1a Fall General Information

Master Syllabus Department of Geography GEOG 448: Geographic Information System Design Course Description

EXPLORATORY SPATIAL DATA ANALYSIS OF BUILDING ENERGY IN URBAN ENVIRONMENTS. Food Machinery and Equipment, Tianjin , China

Transcription:

University of Chicago Department of Sociology Autumn 2017 SOCI 20253/GEOG 20500, SOCI 30253, MACS 54000 Introduction to Spatial Data Science Luc Anselin Meet: Office: E-Mail: Office Hours: Prerequisite: Mon, Wed 1:30-2:50pm Mondays: lab in Biological Sciences Learning Center 018 and Crerar Library CSIL 3-4 (Linux Lab) Wednesdays: lecture in Stuart Hall 101 254 Saieh Hall anselin@uchicago.edu Mon, Wed 3:00-4:00pm and by appointment STAT 22000 (or equivalent), familiarity with GIS is helpful, but not necessary SYLLABUS Course Description Spatial data science is an evolving field that can be thought of as a collection of concepts and methods drawn from both statistics/spatial statistics and computer science/geocomputation. These techniques deal with accessing, transforming, manipulating, visualizing, exploring and reasoning about data where the locational component is important (spatial data). The course introduces the types of spatial data relevant in social science inquiry and reviews a range of methods to explore these data. We will primarily focus on data gathered for aggregate units, such as census tracts or counties (e.g., unemployment rates, disease rates by area, crime rates), and will only briefly consider data measured at spatially located sampling points (such as air quality monitoring stations and urban sensors) and observations at the point level (e.g., locations of crimes, commercial establishments, traffic accidents). Specific topics covered include the special nature of spatial data, geovisualization and visual analytics, spatial autocorrelation analysis, cluster detection and regionalization. An important aspect of the course is to learn and apply open source geospatial software tools, primarily GeoDa, but also R.

Objectives 1. Learn principles of spatial data science and its application to social science research questions 2. Learn to distinguish which methods are appropriate for a given research question 3. Gain an appreciation for the assumptions and limitations associated with each technique 4. Learn how to interpret and present the results of a spatial data analysis in a coherent fashion 5. Learn how to use appropriate open source software tools to carry out spatial data analytical applications Organization The class will meet twice a week, alternating as a lecture and a lab. The lab sessions are on Mondays, the lecture is on Wednesdays. The computer lab will have all required software installed. However, you are strongly encouraged to use your own laptop and install the software yourself (everything is cross-platform, free and open source). The course will use Canvas as the main communication mechanism. All materials, including software guides and data will be available from the course site. Note that the Canvas course number used is SOCI 20253. All assignments, papers etc. must be submitted as a pdf digital file to Canvas: NO PAPER copy and no Word docs, no exceptions. Requirements The main goal for the course is for you to complete a final project/paper that carries out an in-depth spatial data analysis of a research problem of your choice. You will apply a subset of the methods covered in class and use your own data or data provided by the instructor (your own data is preferred). This paper will be due at the end of the semester, December 2. More details will be provided in class and on the Canvas course web site as the quarter progresses. In addition, there will be seven weekly assignments, each consisting of short computational exercises that use specific spatial analytical software (primarily GeoDa, and sometimes R). The assignments are graded pass/fail and you must pass all seven (and complete on time) in order to be able to receive an A grade in the 2

course. If you don t pass at least six, you will fail the course. You can re-do an assignment as many times as necessary in order to reach the required six, but only on-time assignments can result in an A grade. Software The class uses only open source software (free and cross-platform). The software will be installed in the computer labs, but you are strongly encouraged to install it on your own machine. Everything can be readily downloaded from the web. GeoDa, available from http://geodacenter.github.io/download.html. Make sure to have the latest version (after October 1) R (3.4.1. or later) and its associated spatial data analysis packages, especially foreign, spdep and gstat, and the tidyverse for data management, everything available from http://cran.r-project.org Recommended: RStudio, a graphical user interface to R, available from https://www.rstudio.com/products/rstudio/download3/ See also the separate installation note handout for more details. Readings There is no text for the course. There are many excellent books on data science, but to date the treatment of spatial aspects is still in its infancy. The lecture slides provide the formal background. Specific readings will be assigned each week and made available on the course Canvas web site. These readings are complementary and provide additional background on the material. They are not required, but if you are serious about spatial data science, you may want to take a look at them. General background can be found in the following annotated bibliography (these are not required reading, but provided as a guide to the literature) GeoDa https://spatial.uchicago.edu/geoda o a brief description of the GeoDa functionality http://geodacenter.github.io/documentation.html o slides with an overview of the functionality in Version 1.8 (may be updated to the latest version in the course of the quarter) Luc Anselin (2005). Exploring Spatial Data with GeoDa, A Workbook. (available on the course web site) o a bit dated in terms of the interface, but the substance hasn t changed; new version in the works, parts of which may be made available during the course of the semester 3

Data Science Cathy O Neil and Rachel Schutt (2013). Doing Data Science, Straight Talk from the Frontline. O Reilly. o very readable introduction to data science (among many others) Garrett Grolemund and Hadley Wickham (2017). R for Data Science. O Reilly. o also available online from the book s web site http://r4ds.had.co.nz o a collection of data science skills using R, with an emphasis on data munging using specialized R packages highly recommended if you are serious about data science using R David Donoho (2015). 50 Years of Data Science. o available from http://courses.csail.mit.edu/18.337/2015/docs/50yearsdatascience. pdf o a critical look at the data science hype from a statistician s perspective; excellent perspective on historical precedents in the work of Tukey (EDA), Chambers (computing with data), Cleveland (graphics), etc. Quick introduction to R W.N. Venables, D.M. Smith and the R Core Team (2017). An Introduction to R. Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.4.1 (June 2017) https://cran.r-project.org/doc/manuals/r-release/r-intro.pdf o the classic introduction and overview of the R language and its use for statistical analysis Paul Torfs and Claudia Brauer (2014). A (very) short introduction to R. http://cran.r-project.org/doc/contrib/torfs+brauer-short-r-intro.pdf o an introductory overview and quick start (only 12pp.) General references for R (admittedly biased, my favorites) Robert Kabacoff (2015). R in Action (2 nd Edition). Manning Publications. o a very readable introductory overview of R functionality and R programming o associated web site for Quick R http://www.statmethods.net Michael J. Crawley (2013). The R Book (2 nd Edition). Wiley o the comprehensive guide to R (must have if you are going to do any serious work in R) R Bloggers o compilation of daily blogs pertaining to the use of R in statistics and data science o https://www.r-bloggers.com RStudio blog o for the latest on R for data science o https://blog.rstudio.org 4

Microsoft R Application Network (Microsoft s release of R and supporting materials) o https://mran.microsoft.com guides to R books, tutorials, etc. o https://cran.r-project.org/other-docs.html o https://www.r-project.org/other-docs.html for hard-core R programmers, Hadley Wickham s Advanced R web site (as well as his books on ggplot2, packages and advanced R) o http://adv-r.had.co.nz Deborah Nolan and Duncan Temple Lang (2015). Data Science in R. A Case Studies Approach to Computational Reasoning and Problem Solving. CRC Press. o worked real-live data science case studies (advanced) Spatial data analysis in R The R ecosystem for spatial data analysis o http://cran.r-project.org/web/views/spatial.html Robin Lovelace, James Cheshire, Rachel Oldroyd and others (2015). Introduction to visualizing spatial data in R o http://github.com/robinlovelace/creating-maps-in-r o a good introductory overview to GIS operations in R Guy Lansley and James Cheshire (2016). An Introduction to Spatial Data Analysis and Visualization in R o a more extensive introduction to mapping, spatial data manipulation and visualization in R o http://www.spatialanalysisonline.com/an%20introduction%20to%2 0Spatial%20Data%20Analysis%20in%20R.pdf Roger Bivand, Edzer Pebesma and Virgilio Gomez-Rubio (2013). Applied Spatial Data Analysis with R (2 nd Edition). Springer, New York, NY. o more advanced, in-depth coverage of spatial statistical packages in R (assumes quite a bit of R expertise) Chris Brunsdon and Lex Comber (2015). An Introduction to R for Spatial Analysis and Mapping. Sage. o a more in-depth intermediate overview of GIS operations and some spatial analysis in R with lots of illustrations Grading Class participation: 10% Assignments: 40% Project Paper: 50% 5

Tentative Course Outline (subject to change) Introduction and overview (first meeting 9/25 in Stuart Hall 101) Introduction and logistics Overview of the software Spatial Data Science (9/27) Spatial data science o Important concepts o Spatial data and spatial analysis Lab (10/2): Spatial Data Handling o Overview of GeoDa functionality for data handling Assignment 1: Data acquisition Visual Analytics (10/4) Visual Analytics (1) o Principles of visual analytics, EDA, ESDA o Linking and brushing o EDA basics o Scatter plot (smoothing, brushing) o Scatter plot matrix o Parallel Coordinate Plot (PCP) o Conditional plots Lab (10/9): Visual Analytics in GeoDa (1) Assignment 2: Visualizing relationships among multiple variables (10/11) Visual Analytics (2) o Map types (choropleth, outlier maps) o Principles of map design o Cartogram o Co-location maps o Conditional maps o Mapping rates Lab (10/16): Visual Analytics in GeoDa (2) Assignment 3: Assessing broad patterns, outliers, spatial heterogeneity 6

Spatial Autocorrelation (10/18) Spatial Autocorrelation Principles o Spatial randomness o Positive and negative spatial autocorrelation o Spatial weights Lab (10/23): Spatial Weights in GeoDa and R spdep Assignment 3: Properties of spatial weights (10/25) Spatial Autocorrelation Statistics o Join count, Moran s I, Geary s c o Moran scatter plot o Variogram o Nonparametric spatial correlogram Lab (10/30): Global Spatial Autocorrelation in GeoDa and R spdep, gstat Assignment 4: Sensitivity analysis of spatial autocorrelation measures (11/1) Local Spatial Autocorrelation o Local Moran cluster maps o Local Geary, Local join counts o Multivariate extensions Lab (11/6): Local Spatial Autocorrelation in GeoDa Assignment 5: Identifying local clusters and spatial outliers Cluster Detection (11/8) Unsupervised Learning o Curse of dimensionality o PCA and MDS maps o K-means clustering o Hierarchical clustering Lab (11/13): Cluster Detection with GeoDa and R Assignment 6: Cluster detection 7

(11/15) Spatially Constrained Clustering o Spatial constraints o Weighted clustering o Skater, Redcap o Max-p Lab (11/20): Spatially Constrained Clustering in GeoDa (1) (11/22) No Class Thanksgiving Lab (11/27): Spatially Constrained Clustering in GeoDa (2) Assignment 7: Spatial clusters Next Steps (11/29) More Spatial Data Science o Overview of topics not covered in the class (point patterns, flows, trajectories, scan statistics, space-time, spatial data mining) 8