Predictive Analytics: Using Exploratory Tools to Predict Incidences Probability Grid Method for Crime Analysis in ArcGIS 9

Similar documents
Crime Analysis. GIS Solutions for Intelligence-Led Policing

The CrimeStat Program: Characteristics, Use, and Audience

GIS CONCEPTS ARCGIS METHODS AND. 2 nd Edition, July David M. Theobald, Ph.D. Natural Resource Ecology Laboratory Colorado State University

Thales Canada, System Division. BattleView: Integrating ArcGIS Into Canadian Army s Command And Control Application

GIS for Crime Analysis. Building Better Analysis Capabilities with the ArcGIS Platform

GIS CONCEPTS ARCGIS METHODS AND. 3 rd Edition, July David M. Theobald, Ph.D. Warner College of Natural Resources Colorado State University

Are You Maximizing The Value Of All Your Data?

Lecture 2. A Review: Geographic Information Systems & ArcGIS Basics

Geodatabase An Overview

ArcGIS Pro: Essential Workflows STUDENT EDITION

Introduction To Raster Based GIS Dr. Zhang GISC 1421 Fall 2016, 10/19

Geodatabase An Introduction

Geometric Algorithms in GIS

GIS Level 2. MIT GIS Services

Chapter 1 Introducing GIS for police work

Geospatial Intelligence

ArcGIS Pro: Analysis and Geoprocessing. Nicholas M. Giner Esri Christopher Gabris Blue Raster

Geodatabase Management Pathway

Bentley Map Advancing GIS for the World s Infrastructure

Data Aggregation with InfraWorks and ArcGIS for Visualization, Analysis, and Planning

Introducing GIS analysis

ArcGIS Enterprise: What s New. Philip Heede Shannon Kalisky Melanie Summers Sam Williamson

Geospatial Data Solutions: Site and Corridor Siting Projects. Rachel Turney-Work

SPATIAL MODELING GIS Analysis Winter 2016

GIS Data Production and Editing Pathway

Geospatial Intelligence

Canadian Board of Examiners for Professional Surveyors Core Syllabus Item C 5: GEOSPATIAL INFORMATION SYSTEMS

Spatial Analysis with ArcGIS Pro STUDENT EDITION

Using a GIS to Calculate Area of Occupancy. Part 1: Creating a Shapefile Grid

Among various open-source GIS programs, QGIS can be the best suitable option which can be used across partners for reasons outlined below.

Map your way to deeper insights

DP Project Development Pvt. Ltd.

Karsten Vennemann, Seattle. QGIS Workshop CUGOS Spring Fling 2015

FIRE DEPARMENT SANTA CLARA COUNTY

Geodatabase An Introduction

UNIT 4: USING ArcGIS. Instructor: Emmanuel K. Appiah-Adjei (PhD) Department of Geological Engineering KNUST, Kumasi

Major Crime Map Help Documentation

ArcGIS 9 ArcGIS StreetMap Tutorial

PC ARC/INFO and Data Automation Kit GIS Tools for Your PC

ArcGIS. for Server. Understanding our World

Getting Started with Military Analyst for ArcGIS 9.2

GIS FOR PLANNING. Course Overview. Schedule. Instructor. Prerequisites. Urban Planning 792 Thursday s 5:30-8:10pm SARUP 158

4. GIS Implementation of the TxDOT Hydrology Extensions

Geog 469 GIS Workshop. Data Analysis

Visualization of Origin- Destination Commuter Flow Using CTPP Data and ArcGIS

Esri UC2013. Technical Workshop.

This lab exercise will try to answer these questions using spatial statistics in a geographic information system (GIS) context.

Geodatabase Essentials Part One - Intro to the Geodatabase. Jonathan Murphy Colin Zwicker

EEOS 381 -Spatial Databases and GIS Applications

Paper PO-27 Does SAS Distance Measurement Differ from ArcGIS Distance Measurement?

94-802Z: Geographic Information Systems Summer 2018

In this exercise we will learn how to use the analysis tools in ArcGIS with vector and raster data to further examine potential building sites.

Exelis and Esri Technologies for Defense and National Security. Cherie Muleh

Tutorial 8 Raster Data Analysis

GIS Lecture 5: Spatial Data

Outline. Chapter 1. A history of products. What is ArcGIS? What is GIS? Some GIS applications Introducing the ArcGIS products How does GIS work?

Visualization of Commuter Flow Using CTPP Data and GIS

ArcMap - EXPLORING THE DATABASE Part I. SPATIAL DATA FORMATS Part II

How does ArcGIS Server integrate into an Enterprise Environment? Willy Lynch Mining Industry Specialist ESRI, Denver, Colorado USA

Popular Mechanics, 1954

John Laznik 273 Delaplane Ave Newark, DE (302)

GIS Needs Assessment. for. The City of East Lansing

Crime Analyst Extension. Christine Charles

Working with the Geodatabase

Abstract: Contents. Literature review. 2 Methodology.. 2 Applications, results and discussion.. 2 Conclusions 12. Introduction

GIS Boot Camp for Education June th, 2011 Day 1. Instructor: Sabah Jabbouri Phone: (253) x 4854 Office: TC 136

How might you use visibility to map an ancient civilization's political landscape?

Introduction to the 176A labs and ArcGIS Purpose of the labs

GEOG 508 GEOGRAPHIC INFORMATION SYSTEMS I KANSAS STATE UNIVERSITY DEPARTMENT OF GEOGRAPHY FALL SEMESTER, 2002

Introduction. Project Summary In 2014 multiple local Otsego county agencies, Otsego County Soil and Water

Oakland County Parks and Recreation GIS Implementation Plan

Modeling Incident Density with Contours in ArcGIS Pro

Acknowledgments xiii Preface xv. GIS Tutorial 1 Introducing GIS and health applications 1. What is GIS? 2

GIS (GEOGRAPHIC INFORMATION SYSTEMS)

ArcGIS Web Tools, Templates, and Solutions for Defence & Intelligence. Renee Bernstein Esri Solutions Engineer

GIS ANALYSIS METHODOLOGY

Introduction to the 176A labs and ArcGIS

Geographic Systems and Analysis

ARMY ITAM GIS: Automating Standard Army Training Map Production

Key Steps for Assessing Mission Critical Data for An ebook by Geo-Comm, Inc.

Esri Exam EADP10 ArcGIS Desktop Professional Version: 6.2 [ Total Questions: 95 ]

EBA Engineering Consultants Ltd. Creating and Delivering Better Solutions

October 2011 ArcGIS 10 for Server Functionality Matrix

Data Aggregation with InfraWorks and ArcGIS for Visualization, Analysis, and Planning

Time Series Analysis with SAR & Optical Satellite Data

Spatial Data Analysis with ArcGIS Desktop: From Basic to Advance

December 2009 ArcGIS Server Functionality Matrix

CHAPTER 22 GEOGRAPHIC INFORMATION SYSTEMS

ArcGIS for the Military: Analyzing the Operational Environment

GENERALIZATION IN THE NEW GENERATION OF GIS. Dan Lee ESRI, Inc. 380 New York Street Redlands, CA USA Fax:

An ESRI Technical Paper June 2007 Understanding Coordinate Management in the Geodatabase

NR402 GIS Applications in Natural Resources

Fundamentals of ArcGIS Desktop Pathway

Introduction to ArcGIS Server - Creating and Using GIS Services. Mark Ho Instructor Washington, DC

ArcGIS for Local Government

Tips and Tricks for Using ArcGIS for Fire Pre-Incident Planning Version II By: Chris Rogers Firefighter Kirkland Fire Department Kirkland Washington

Law Enforcement Solutions and Applications

What are the five components of a GIS? A typically GIS consists of five elements: - Hardware, Software, Data, People and Procedures (Work Flows)

GIS and Forest Engineering Applications FE 357 Lecture: 2 hours Lab: 2 hours 3 credits

ArcGIS Pro Q&A Session. NWGIS Conference, October 11, 2017 With John Sharrard, Esri GIS Solutions Engineer

Transcription:

Predictive Analytics: Using Exploratory Tools to Predict Incidences Probability Grid Method for Crime Analysis in ArcGIS 9 By Richard Heimann & Bryan Hill The Office of the Chief Technology Officer, DC GIS group, has developed a comprehensive and streamlined predictive workflow within the ArcGIS geoprocessing framework called the Probability Grid Method (PGM). With multiple versions of the model, it have achieved a scalable tool that permits crime analysts to perform predictive modeling at all ESRI license levels. With variables set as model parameters, this is a complex yet customizable process that yields optimal results. The tool has multiple output statistical tables in addition to the final probability grid that is derived from multiple combined and weighed datasets. PGM is a methodology that has been in the industry for nearly a decade and has yielded a remarkably high-accuracy rate for predicting next crimes in a series. This has been a joint effort to bridge the gap between proven PGM methods and proven spatial statistics. 1

Overview Both law enforcement and military are using these applications for practical purposes Predictive analytics has seen a spike in its use in the last few years. The benefit of expanding geographic data coupled with expansive tools has offered users the ability to see further into their data and identify spatial trends, patterns, and now future events. A large quantity of tools and methodologies now exist. These tools range from geographic profiling to spatial statistics. Both law enforcement and military are using these applications for practical purposes. One such methodology is the Probability Grid Method (PGM), developed by Paul Catalano, Brennan Long, and the co-author, Bryan Hill. These efforts have resulted in publication in the Security Journal concerning predictive efforts in several robbery series in Phoenix, Arizona. The actual particulars of that paper will not be covered in detail in this paper, however the probability grid was developed is based on the logic and research described in that paper by Paul Catalano. This paper will cover the core concepts of the Probability Grid Method. In addition, it will discuss the process by which this methodology was produced inside of the ArcGIS product using the ModelBuilder graphic programming interface. Introduction These problems can be placed into the category of too much land and not enough resources. Incident mapping has led to new and innovative ways to analyze patterns in the past several years. The practicing analyst is often faced with problems in tactical analysis that require creative methods to overcome. These problems can be placed into the category of too much land and not enough resources. When doing tactical analysis, it is common to have minimum rectangular or elliptical predictions that cover a 25-50 square mile area where the suspect may commit a new crime in a series. Consumers of this information have a difficult time framing this inside of sensible intelligence. In order to narrow the focus for the consumer, an analyst must apply and use several types of data and methods to find the one that is the most operationally effective. Many of these methods are statistical, however the analyst s experience, knowledge of the series being analyzed, and his or her gut feelings are sometimes called upon. A probability grid is a natural evolution to this problem and allows sound statistical methods to be combined with field experts to identify potential suspects and next hit areas. 2

What is PGM Simply, PGM is a tactical tool for predicting where the next incident in a series will occur. This is accomplished by combining data and methods from many proven and useful analyses into one broad workflow. These derived datasets are then weighed (or scored) and combined into one dataset that highlights target areas. The spatial layers that are created are as follows: Standard Deviational Ellipses ((SDE) Mean, 1, & 2 SD)) Convex Hull Last Crime Buffer (Near Distance Value) Intersect of All Crime Buffer (Near Distance Value) These derived datasets are weighed both individually and collectively. Finally, intersect these values into a vector based grid. Once the grid has been scored, a more useful product can be provided to the investigators and users of the information. Developing the Probability Grid Automating that workflow, so others can leverage the experience of domain experts Many analysts have utilized a grid surface to predict the location of the next crime in a series. This approach is not new to the crime analysis field and this paper does not suggest that PGM is innovative or vastly superior to any other method currently being employed. Combining several methods to achieve a better predictive model has been discussed widely in the field in the past and the PGM process is just one method that has proven easy to implement in two police departments in Arizona. The goal of this paper is to describe a procedure that appears reliable in at least one agency, and encourages others to try the PGM model and report on their success and failures. Another goal is to encourage open discussion on making a more easy to use, reliable, and useful operational product for police and military to do this type of analysis. The PGM is largely users workflow, it is fundamentally intuitive. The associated model is automating that workflow, so others can leverage the experience of domain experts without being domain experts themselves. 3

Case Study Few would argue that crime is without a spatial component Few would argue that crime is without a spatial component. The influx in geographic profiling applications is a product of the belief that this spatial element can not be overlooked. So, why PGM? In order to demonstrate the capabilities of the probability grid method, a case study in which the PGM methodology was deployed successfully will be revisited. In addition to the robbery series case study, PGM has been used in arson, burglary, and theft crime series in both Phoenix and Glendale jurisdictions. In this documented case study we observe the Video Bandit case; there was a visible pattern in the series. The suspect seemed to return to a particular geography every 1 to 3 crimes committed elsewhere. For this series, the mean distance traveled between hits was 3.36 miles with a standard deviation of 1.08 miles. Since the mean was 3.36 miles, and the greatest distance from one hit to the next was 4.755 miles, 2.28 (Mean-1StDev) miles was chosen as the buffer distance and the three buffer rings were created around the last crime in the series. A maximum distance of 6.84 miles from the last hit is well within the maximum distance this offender traveled form any of his hits (mean plus 2 Standard Deviations or 3.36+ (2*1.08) or 5.52 miles). Provide actionable intelligence In addition to the buffer from the last hit, it was necessary to determine where likely targets are located in the area. This is an intuitive process that requires the user to use their experience with both the series and geographic data. At this point, the analyst can also look to identify relationships that may exist between crimes and land use, census tracts and other political or cadastral data. This could provide actionable intelligence when the targets do not seem to be related, dissimilar to this case study where the targets were exclusively video stores. The Scoring Process to Predict Next in Series The probability grid is scored dependant on where the grids are located in relationship to the selected analytical datasets. The spatial items scored at this point determine whether a grid is: Inside the convex hull polygon Within the probability SDE Within the last hit buffer areas Within all crime buffer intersect 4

Once the grid has been scored, a ranking classification is applied, providing a more useful map product for both investigators and users. An interesting fact is that the number of targets in the high chance and very high chance areas are now reduced to 12 potential targets instead of the 27 or 21 targets identified with the SDE or rectangles alone. This is still a great deal of targets, but much less than with either method alone. The actual number of potential targets can be reduced even further by adding scores for: Analysis yielded just two potential targets Targets available in a grid Repeat victimization within a grid We recreated the map by adding these two fields. The final map demonstrates that the potential target area for the next hit is reduced even more. The analysis yielded just two potential targets. Investigators could operationally deal with 2 stores for undercover deployment activities, and perhaps even the 12 stores in the highest area. Using the probability grid process and combining logically obtained geographic information with statistical methods, the potential targets in the analysis were reduced form 26 to 12. In addition, this analysis made one area much more observable as a potential area of concern for the next hit in the series. This grid contains the video store that the suspect hit in his last armed robbery. This video store is in the list of twelve potential targets revealed by the probability grid method and is one of two stores identified in the highest probability grid. The tests of this method in actual scenarios have provided a satisfactory success rate at predicting the next target in a crime series. Methodology Conclusion It can also be made to adapt to changes in the suspect s pattern as needed, because it doesn t rely on just one method Although this is only one example and demonstration of the probability grid method, the grid has proven to be easy to create, more efficient than one conventional method alone. The probability grid method has flaws as does any other method currently in use by analysts across the country. The probability grid method utilizes two common spatial statistical methods, combines them with basic geographic relationships and intuitive knowledge of the investigators and crime analysts involved in a case. PGM also allows for easy modifications as needed, dependant on the circumstances surrounding a series, but is a combination of methods and analytical thinking based on spatial relationships and crime patterns or behavior. This customization of series based crime exposes itself as model parameters and a weighed overlay function inside of the PGM model. 5

What is ModelBuilder? Models are data flow diagrams that link together a series of tools and data to create advanced procedures and work flows The ModelBuilder interface provides a graphical modeling framework for designing and implementing geoprocessing models that can include tools, scripts, and data. Models are data flow diagrams that link together a series of tools and data to create advanced procedures and work flows. ModelBuilder is a productive mechanism to share methods and procedures with others within, as well as outside, your organization. These models can be run from inside of the ModelBuilder interface or from a dialog box. By simply exposing model variables as model parameters, the user interacts only with a polished, user friendly frontend, while retaining the ability to run and rerun the model. ModelBuilder is integrated inside of the Geoprocessing framework. Why ModelBuilder? ModelBuilder s true value comes when it can either automate complex repeated geoprocessing workflow or document and record unique workflow. Both were requirements of the PGM team. In addition to the afore mentioned advantages, ModelBuilder is simple and does not require any special skill sets. Solution Overview In the late 1990 s, many crime shops were creating custom tools on the Avenue based ArcView. ArcView offered a user friendly interface that was not previously available. Users could create maps, add data, and connect to existing databases and geocode their crime data. Digital pin mapping evolved to spatial analysis. The analytical tools that spawned change were many. One such tool was created by the co-author, Bryan Hill, and his tool provided a singular approach to the PGM. The extension that he created for ArcView was named CA Tools or Crime Analysis Tools (http://arcscripts.esri.com/details.asp?dbid=12423). This extension to the ArcView provided many of the elements needed to properly perform the probability grid method. The extension was a popular download and a critical component for many crime shops daily workflow with over 4,600 downloads. 6

The Spatial Statistics toolbox in ArcGIS 9 contains an arsenal of tools for analyzing the distribution of geographic features With the evolution to the ArcGIS platform, the Avenue Language was forsaken for Visual Basic. Those popular tools would no longer operate on the new 8x technology. Many shops made a decision to either keep ArcView and resist the change in technology or migrate only partially, having a crime fusion center with both ArcGIS and ArcView platforms. ESRI does listen The ArcGIS 9 release offered users much of the functionality that they had been asking for since the migration to 8x. The new platform offered many tools for advanced spatial statistical analysis. The Spatial Statistics toolbox in ArcGIS 9 contains an arsenal of tools for analyzing the distribution of geographic features Spatial statistics differ from traditional statistics in that space and spatial relationships are an integral and implicit component of analysis. These tools are completely integrated into the geoprocessing framework and can be used inside of a dialog box, command line, scripts, and ModelBuilder. Add this to ModelBuilder and the ability to extend the functionality of geoprocessing with Python and you have all of the ingredients to create custom tools that mimic unique workflow and the capacity to distribute the tools to others. For more information on the Spatial Statistics toolbox, refer to the following URL http://www.esri.com/news/arcuser/1104/spatial_statistics.html Solution Description Since the methodology had already been fully documented, we began by identifying which tools would be useful to gain the necessary functionality. This process is often a balancing act between what is needed and what is preferred. An overly zealous team can easily create a model that simulates too many scenarios, creating a model that has 10 s or 100 s of tools, variables and derived datasets. Next, it was simply a matter of plugging in the tools in a logical order that would duplicate the documented probability grid. This was an evolutionary process of producing the hundred percent answer or as close as can be achieved. Programmatically step through noteworthy steps of the workflow What follows is a detailed look at the probability grid methodology inside of ArcGIS 9. We will programmatically make our way through noteworthy steps of the workflow, highlighting the purpose and function of each section of code followed by an overview of the model and flow direction. This is an important aspect when automating large workflow; the geoprocesses tend to have an undirected flow, causing tools to be run in a manner that is unexpected. 7

Process Name Process Explanation The Make Feature Layer tool is used to create a feature layer from an input feature class or layer file. Incident Feature Layer Model Parameter This is the main entry point into the model. Many variables here are exposed as model parameters including SQL expression & Output Location Add XY Coordinates Adds the fields POINT_X and POINT_Y to the point input features and calculates their values. Adds X, Y coordinates to your input data. Point Distance Model Parameter Determines the distances between point features in the Input Features to all points in the Near Features, within the Search Radius. Point Distance calculates linear distance from one incident to every other incident. Calculates summary statistics for field(s) in a table. Summary Statistics Model Parameter Calculates Statistics on distance from incident to incident (previous process), such as Min, Max, Mean, Range, STD, etc. Creates a raster of random values from a normal distribution. Create Normal Raster Model Parameter Creates a Normal Raster for the eventual Vector Grid Project Raster Model Parameter Projects a raster dataset into a new spatial reference using a single polynomial fit to compute the adjustment between coordinate systems. Reclassify Int Raster to Polygon Reclassifies (or changes) the values in a raster. *Raster to Vector Grid Conversion Converts each cell value of a raster to an integer by truncation. *Raster to Vector Grid Conversion Converts a raster dataset to polygon features. *Raster to Vector Grid Conversion Convex Hull Model Parameter Creates a convex hull feature class for all or selected input features. Central Feature Model Parameter Identifies the most centrally located feature in a point, line, or polygon feature class. Identifies the geographic center (or the center of concentration) for a set of features. Mean Center Model Parameter Adds latitude and longitude coordinates to the attribute table of point, polyline or polygon features. Add Lat/Long Coordinates Near Function Determines the distance from each point in the Input Features to the nearest point or polyline in the Near Features, within the Search Radius. Summary Statistics X,Y Coordinate Statistics Table Model Parameter Calculates summary statistics for field(s) in a table. Measures whether a distribution of features exhibits a directional trend (whether features are farther from a specified point in one direction than in another direction). Directional Distribution (Standard Deviational Ellipse) Mean, 1SD, & 2SD. Model Parameter Crime Buffer (both Last Crime & All Crime) Model Parameter Creates buffer polygons to a specified distance around the Input Features. An optional dissolve can be 8

performed to remove overlapping buffers. Distance Value used is funneled in from the mean Near distance output. This value is dynamic and illustrates the mean value from one crime to every other crime. Buffer Near Distance Intersect All Crime Buffer Computes a geometric intersection of the Input Features. Features or portions of features which overlap in all layers and/or feature classes will be written to the Output Feature Class. Union Computes a geometric intersection of the Input Features. All features will be written to the Output Feature Class with the attributes from the Input Features, which it overlaps. Overlays several rasters using a common measurement scale and weights each according to its importance. Calculate Field *Weigh Datasets Model Parameter Figure 1 PGM Model 9

Input Incident Data Default Workspace Selection of Last Incident in Series Complete Documentation Spatial Extent (Vector Grid) Vector Grid Cell Size Default 250 Meters Create Spatial Reference *Derived Layers -Summary Stats for Point Distance -Point Distance Table -X,Y Statistics -Mean Center Feature -Central Feature -Convex Hull -Deviational Directional Ellipse (Mean, 1SD, & 2SD) -Union Dataset Feature Figure 2 PGM Dialog Box These derived datasets can be used for standalone analysis or to show there influence in the final Probability Grid This is one example of the probability grid method and has proven to be more efficient that one conventional method alone. The associated model has provided end users to leverage the experience of analyst while still offering the power user the ability to customize weighting factors to offer a more robust tool. Future generations of the PGM tool will be exclusively offered at the ArcView seat, giving all analysts the opportunity to create and provide operational estimates to the next hit in a series. 10

Source Information Paul Catalano is a University of Western Australia graduate conducting research on the geographic behavior of serial offenders. paulwcat@altavista.net Brennan Long is a Criminal Intelligence Analyst with the Organized Crime Bureau. Analyz007@hotmail.com Bryan Hill retired from the Phoenix Police Department after 21 years of service and is currently working as a civilian Crime Analyst with the Crime Analysis Unit, Support Services Bureau, and Glendale Police Department. bhill@ci.glendale.az.us Catalano, P.W., Hill, B and Long, B, 2001, Geographical analysis and serial crime investigation: A case study of armed robbery in Phoenix, Arizona, Security Journal, 14(3):27-41.,, Environmental Systems Research Institute (ESRI) is a vendor of the ArcView GIS software and other related GIS mapping application. www.esri.com http://www.esri.com/software/arcgis/about/desktop.html#modelbuilder http://www.esri.com/news/arcuser/1104/spatial_statistics.html Richard Heimann (Author) Separated from Active Duty Army in August 05 and currently in the NGA reserve. Worked for the DC GIS group (www.dcgis.dc.gov) while creating and documenting the PGM model and has since moved on to the Naval Research Laboratory in Washington DC. www.nrl.navy.mil 11