Combing Open-Source Programming Languages with GIS for Spatial Data Science. Maja Kalinic Master s Thesis

Similar documents
Combining Open-source Programming Languages with GIS for Spatial Data Science

DATA SCIENCE SIMPLIFIED USING ARCGIS API FOR PYTHON

Among various open-source GIS programs, QGIS can be the best suitable option which can be used across partners for reasons outlined below.

Crime Analysis. GIS Solutions for Intelligence-Led Policing

Integrating Open-Source Statistical Packages with ArcGIS

DP Project Development Pvt. Ltd.

Karsten Vennemann, Seattle. QGIS Workshop CUGOS Spring Fling 2015

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson

Approaches to Spatial Analysis. Flora Vale, Linda Beale, Mark Harrower, Clint Brown Esri Redlands

Hot spot Analysis with Clustering on NIJ Data Set The Data Set:

Crime Analyst Extension. Christine Charles

YYT-C3002 Application Programming in Engineering GIS I. Anas Altartouri Otaniemi

Geospatial Intelligence

FIRE DEPARMENT SANTA CLARA COUNTY

A Review: Geographic Information Systems & ArcGIS Basics

Lecture 2. Introduction to ESRI s ArcGIS Desktop and ArcMap

ArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics

Lecture 3 GIS outputs. Dr. Zhang Spring, 2017

ArcGIS GeoAnalytics Server: An Introduction. Sarah Ambrose and Ravi Narayanan

GIS for Crime Analysis. Building Better Analysis Capabilities with the ArcGIS Platform

ENV208/ENV508 Applied GIS. Week 1: What is GIS?

ArcGIS & Extensions - Synergy of GIS tools. Synergy. Analyze & Visualize

Kernel Density Estimation (KDE) vs. Hot-Spot Analysis - Detecting Criminal Hot Spots in the City of San Francisco

Welcome! Power BI User Group (PUG) Copenhagen

Animating Maps: Visual Analytics meets Geoweb 2.0

Integrating ARCGIS with Datamining Software to Predict Habitat for Red Sea Urchins on the Coast of British Columbia.

Spatial Pattern Analysis: Mapping Trends and Clusters

No. of Days. Building 3D cities Using Esri City Engine ,859. Creating & Analyzing Surfaces Using ArcGIS Spatial Analyst 1 7 3,139

ArcGIS for Desktop. ArcGIS for Desktop is the primary authoring tool for the ArcGIS platform.

Introduction to ArcGIS GeoAnalytics Server. Sarah Ambrose & Noah Slocum

The CrimeStat Program: Characteristics, Use, and Audience

Analyzing the Earth Using Remote Sensing

Effective Visualization Tool for Job Searching

Esri Production Mapping: Map Automation & Advanced Cartography MADHURA PHATERPEKAR JOE SHEFFIELD

ArcGIS Web Tools, Templates, and Solutions for Defence & Intelligence. Renee Bernstein Esri Solutions Engineer

UNIT 4: USING ArcGIS. Instructor: Emmanuel K. Appiah-Adjei (PhD) Department of Geological Engineering KNUST, Kumasi

Spatial Analysis of Crime Report Datasets

Geog 469 GIS Workshop. Data Analysis

WHAT YOU WILL LEARN TODAY

Time Series Analysis with SAR & Optical Satellite Data

Python Raster Analysis. Kevin M. Johnston Nawajish Noman

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Shreyas Shinde

ArcGIS Pro Q&A Session. NWGIS Conference, October 11, 2017 With John Sharrard, Esri GIS Solutions Engineer

No. of Days. ArcGIS 3: Performing Analysis ,431. Building 3D cities Using Esri City Engine ,859

No. of Days. ArcGIS Pro for GIS Professionals ,431. Building 3D cities Using Esri City Engine ,859

ArcGIS Pro: Analysis and Geoprocessing. Nicholas M. Giner Esri Christopher Gabris Blue Raster

GEOGRAPHIC INFORMATION SYSTEMS SPECIALIST 3 DEFINITION:

Visualize and interactively design weight matrices

Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS

Online visualization of multi-dimensional spatiotemporal

Development of Univ. of San Agustin Geographic Information System (USAGIS)

ARMY ITAM GIS: Automating Standard Army Training Map Production

Implications for the Sharing Economy

K. Zainuddin et al. / Procedia Engineering 20 (2011)

Design and implementation of a new meteorology geographic information system

Geometric Algorithms in GIS

Visualizing Big Data on Maps: Emerging Tools and Techniques. Ilir Bejleri, Sanjay Ranka

Arcmap Manual Esri 10 READ ONLINE

Arcmap Manual Esri 10 READ ONLINE

Geospatial Fire Behavior Modeling App to Manage Wildfire Risk Online. Kenyatta BaRaKa Jackson US Forest Service - Consultant

ArcGIS Pro 3D Workflows. Zena Pelletier

Esri Training by Microcenter Prepare to Innovate. Microcenter Course Catalog

Multiple Representations of Geospatial Data: A Cartographic Search for the Holy Grail

Geog183: Cartographic Design and Geovisualization Spring Quarter 2017 Lecture 14: Visual analytics and data exploration

HASSET A probability event tree tool to evaluate future eruptive scenarios using Bayesian Inference. Presented as a plugin for QGIS.

Arcgis And Spatial Analysis

Geog 469 GIS Workshop. Managing Enterprise GIS Geodatabases

Esri Defense Mapping: Cartographic Production. Bo King

Assessment of Physical status and Irrigation potential of Canals using ArcPy

Teaching GIS for Land Surveying

ArcGIS Platform For NSOs

Point data in mashups: moving away from pushpins in maps

7 GEOMATICS BUSINESS SOLUTIONS - ANNUAL REPORT 2006

WHAT YOU WILL LEARN TODAY

ANDREW MCMILLAN 888 West 18 th Avenue Vancouver, BC V5Z 1W3 Tel. (604) OBJECTIVE SUMMARY

CARTOGRAPHY in a Web World

Selecting the optimal opensource GIS software for local authorities by combining the ISO 9126 standard and AHP approach

The Geodatabase Working with Spatial Analyst. Calculating Elevation and Slope Values for Forested Roads, Streams, and Stands.

The Pace of Change Is Accelerating Creating Many Challenges

The Changing Landscape of Land Administration

GIS solutions for dredging and subsurface mapping

Abstract: Traffic accident data is an important decision-making tool for city governments responsible for

Lecture 2. A Review: Geographic Information Systems & ArcGIS Basics

Land-Use Land-Cover Change Detector

Incorporating ArcGIS Pro in your Curriculum

Preprint.

GEOG 508 GEOGRAPHIC INFORMATION SYSTEMS I KANSAS STATE UNIVERSITY DEPARTMENT OF GEOGRAPHY FALL SEMESTER, 2002

Visualization of Commuter Flow Using CTPP Data and GIS

Chicago Crime Category Classification (CCCC)

Spatial Analysis with Web GIS. Rachel Weeden

Spatial Analysis in CyberGIS

Understanding Geographic Information System GIS

GEOGRAPHIC INFORMATION SYSTEM (GES203)

Outline. Introduction to SpaceStat and ESTDA. ESTDA & SpaceStat. Learning Objectives. Space-Time Intelligence System. Space-Time Intelligence System

Evaluating Physical, Chemical, and Biological Impacts from the Savannah Harbor Expansion Project Cooperative Agreement Number W912HZ

Analyzing Nepal earthquake epicenters

Leveraging Your Geo-spatial Data Investments with Quantum GIS: an Open Source Geographic Information System

Enabling ENVI. ArcGIS for Server

Chapter 1. GIS Fundamentals

Data Visualization (CSE 578) About this Course. Learning Outcomes. Projects

Transcription:

Combing Open-Source Programming Languages with GIS for Spatial Data Science Maja Kalinic Master s Thesis International Master of Science in Cartography 14.09.2017

Outline Introduction and Motivation Research Objectives Related Work Methodology and Case Study Outputs and Discussion Conclusion Limitations and Future Research 2

Introduction and Motivation There is GIS, specifically designed software for spatial data handling, there are programming languages, a set of instructions forwarded to a machine for getting desired outputs, and it is a challenge to know how, but important to link them. Information visualization and communication is, on the other hand, challenge for Cartographers and GIS professionals since their aim is to visually deliver useful information to aid humankind and their sustainability. 3

Research Objectives Which open-source programming language(s) is (are) the most prominent one(s) to be combined with GIS for the analysis and interpretation of spatial data? Which dataset should be used and which methodology is the best to develop a real world example scenario? How can these language(s) be combined with GIS? What is the most efficient way of dividing the work between open-source programming language and GIS? 4

Related Work GIS software ranking by: Programming languages ranking by: Popularity ArcGIS Desktop QGIS PYPL Java Python Employer ArcGIS Desktop Map Info GIS scripting Python R Research ArcGIS Desktop QGIS Data processing and analyses Python R Community ArcGIS Desktop QGIS Geospatial libraries Python R Sources: (1), (2), (3), (4), (5), (6), (7), (8) 5

Data Open-source, spatial data GIS in crime prevention Programming languages in crime prediction San Francisco criminal data 878049 rows, 9 columns Dates/ Descript/ Category/ DayOfWeek/ PdDistrict/ Resolution/ Address/ X / Y Size 0.119GB 6

Methodology Visualisation Data Collection Data organisation Exploratory analysis Inferential analyses Strategic analyses Tactical analyses Predictive policing Figure 1: Methodology 7

Case Study Goals Analyse and visualize the spatial and temporal relationship of crimes Determine when and where arrests frequently occur and analyse how variables interact with each other Identify where crimes are emerging Identify areas with unusually high number of crimes Investigate factors that might influence unlawful behavior Predict where higher numbers of crimes are likely to occur 8

Steps and Tools Data organization Strategic analyses Tactical analyses Predictive policing - Rename columns - Find and delete uncomplete entries - Adjust the Timestamp - Add new columns - Spatial and temporal distribution of crimes - Crime resolutions - Dataset variable dependency - Trends over space and time - Crime and population relationship - Influencers of unlawful behavior - When - Where - Which crimes are most likely to occur Tools Python Python R, R-ArcGIS Bridge Python Package pandas, numpy matpolotlib data.table, ggplot2, corrplot scikit -learn 9

Results Task Tools Outputs Strategic analyses (criminal activity over time) Python packages: Pandas, numpy, matplotlib Graphs showing where and when (year, month, day of the week, hour) crimes occur Tactical analyses (trends and patterns) Predictive policing (where crimes are likely to occur) R libraries: ggplot2 Emerging Hot Spot Analyses Optimized Hot Spot Analyses R libraries: corrplot Python model: scikit-learn Heat maps Hot and Cold spots map Map with unusually high number of crimes Correlation matrix Classifiers accuracy: Logistic, AdaBoost Logistic Random Forest 10

Python matplotlib package Figure 2: Bar chart visualization (Number of crimes per district) 11

Python matplotlib package Figure 3: Line diagram (Relative number of top crimes per weekday) Figure 4: Line diagram (Number of crimes per Hour) 12

Python matplotlib package Figure 5: Line diagrams (Top Crimes occurrence by Hour) 13

R ggplot2 library Figure 6: Heat map (Arrest counts per Day of Week and Time) 14

R ggplot2 library Figure 7: Heat map (Day of the Week influence on crime type and volume) Figure 8: Heat map (Relation between crime types and day of the week) 15

ArcGIS Pro Emerging Hot Spot Analyses Figure 9: San Francisco Crimes Hot and Cold Spots visualization 16

ArcGIS Pro Optimized Hot Spot Analyses Figure 10: Areas with unusually high number of crimes 17

R corrplot and hmisc package Figure 11: Correlation Matrix 18

Discussion Python Used packages: pandas, numpy and matplotlib Stand alone script Easy and quick data pre-processing and handling Easy to understand and manipulate package Customizing and scalable graphics 19

Discussion R Used packages: ggplot2, corrplot Stand alone script and R-ArcGIS integration Easy and quick data visualization (heat maps) Efficient statistical performance (correlation matrix) Easy to understand and manipulate package Customizing and scalable graphics Corrplot interpretation 20

Discussion ArcGIS Pro Fast and clear performance (run the tool) Default legend Scalable Basemap Customising style and symbology 21

Conclusions Python and R libraries (matplotlib, ggplot2, corrplot) showed efficient performance in identifying and visualizing crime spatiotemporal relationships, crime trends and patterns which gave insights and more information about the whole dataset. ArcGIS Pro has very useful tools for crime analysis in its basic functionality, such as Emerging Hot Spot Analyses and Optimised Hot Spot Analyses. Combined, these tools leverage information extraction and visualisation out of criminal data. 22

Limitations and Future Work Data Scientist feedback Crime Analyst involvement Programming language choice GIS software choice Focus on prediction models Different data October = Danger; December = Safety. Why? Investigate why Larceny/Theft kept being highest criminal activity over the years in San Francisco? 23

Thank you for your attention. 24

Sources (1) GIScomunity.com, Accessed: 20.05.2017. (2) Popularity of Programming Languages, Accessed: 02.09.2017. (3) Ciolobac, F. (2016, 11 16). What are the Top Programming Languages in the GIS World. Geomatic Canada Magazine. (4) Robinson, A., Hardisty, F., Dutton, J., & Chaplin, G. (n.d.). Overview of Programming Languages for GIS. Retrieved 05/19/2017, Penn State University, https://www.e-education.psu.edu/geog583/node/67 (5) Piatetsky, G. (2017, 05). KDnuggets. Retrieved 06 06, 2017, from New Leader, Trends, and Surprises in Analytics, Data Science, Machine Learning Software Poll: http://www.kdnuggets.com/2017/05/pollanalytics-data-science-machine-learning-software-leaders.html (6) Szczepankowska, K., Pawliczuk, K., & Z ro bek, S. (2013). The Python programming language and its capabilities in the GIS environment. International Geographic Information Systems Conference and Exhibition GIS Odyssey (pp. 213-222). Zagreb, Croatia: Croatian Information Technology Society GIS Forum. (7) De Sarkar, A., Biyahut, N., Kritika, S., & Singh, N. (2012). An environment monitoring interface using GRASS GIS and Python. Third International Conference on Emerging Applications of Information Technology (pp. 235-238). Kolkota, India: IEEE. (8) Anselin, L. (2012). From SpaceStat to CyberGIS. International Regional Science Review, 35(2), 131-157. 25

Questions? 26