Predicting and Preventing Emerging Outbreaks of Crime

Daniel B. Neill
Event and Pattern Detection Laboratory
H.J. Heinz III College, Carnegie Mellon University
neill@cs.cmu.edu

Joint work with Seth Flaxman, Amrut Nagasunder, Wil Gorr (CMU); Brett Goldstein (City of Chicago).

This work was partially supported by NSF grants IIS-0916345, IIS-0911032, and IIS-0953330.

Background: Crime Prediction in Chicago

Since 2009, we have been working with the Chicago Police Department (CPD) to predict and prevent emerging clusters of violent crime. Our new crime prediction methods have been incorporated into our CrimeScan software, run twice a day by CPD and used operationally for deployment of patrols.

From the Chicago Sun-Times, February 22, 2011: "It was a bit like Minority Report, the 2002 movie that featured genetically altered humans with special powers to predict crime. The CPD's new crime-forecasting unit was analyzing 911 calls and produced an intelligence report predicting a shooting would happen soon on a particular block on the South Side. Three minutes later, it did."

CrimeScan

The key insight of our method is to use detection for prediction: we can detect emerging clusters of various leading indicators (minor crimes, 911 calls, etc.) and use these to predict that a cluster of violent crime is likely to occur nearby.

Some advantages of the CrimeScan approach: Advance prediction (up to 1 week) with high accuracy. High spatial and temporal resolution (block x day). Predicting emerging hot spots of violence, as opposed to just identifying bad neighborhoods.

How to detect leading indicator clusters? How to use these for prediction? Which leading indicators to use?

CrimeScan: Cluster Detection

We aggregate daily counts for each leading indicator at the block level, and search for clusters of nearby blocks with recent counts that are significantly higher than expected.

Imagine moving a circular window around the city, allowing the center, radius, and temporal duration to vary. Is there any spatial window and duration T such that counts have been significantly higher than expected for the last T days?

[Figure: time series of past counts, with actual and expected counts for the last 3 days.]

CrimeScan: Cluster Detection

We find the highest-scoring space-time regions, where the score of a region is computed by the likelihood ratio statistic:

$$F(S) = \frac{\Pr(\text{Data} \mid H_1(S))}{\Pr(\text{Data} \mid H_0)}$$

Alternative hypothesis $H_1(S)$: cluster in region $S$. Null hypothesis $H_0$: no clusters.

These are the most likely clusters; we compute the p-value of each cluster by randomization, and report clusters with p-values < $\alpha$, the chosen significance level.
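The randomization step can be sketched as follows. This is a minimal illustration, assuming the maximum region score has already been computed for the observed data and for each replica dataset simulated under the null hypothesis. The function name `randomization_p_value`, the example scores, and the (count + 1) / (R + 1) convention are assumptions of this sketch, not details given on the slide.

```python
def randomization_p_value(observed_score, replica_max_scores):
    """Monte Carlo p-value: share of null replicas scoring at least as high."""
    # Count replicas (simulated under H_0) whose best score ties or beats the observed one.
    beat = sum(1 for s in replica_max_scores if s >= observed_score)
    # The +1 terms treat the observed data as one more draw from the null.
    return (beat + 1) / (len(replica_max_scores) + 1)


# Hypothetical maximum scores from 9 replica datasets.
replicas = [1.0, 0.5, 4.2, 3.9, 5.1, 2.2, 0.0, 1.7, 3.3]
p = randomization_p_value(4.2, replicas)
print(p)
```

A cluster would then be reported only if `p` falls below the chosen significance level.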

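Expectation-Based Scan Statistic

Counts are Poisson distributed: $c_i^t \sim \text{Poisson}(q_i^t b_i^t)$, where $q_i^t$ is the relative risk and $b_i^t$ is the expected count under $H_0$, estimated by time series analysis of historical data.

Under the null hypothesis $H_0$, we expect counts to be equal to baselines: $q_i^t = 1$ everywhere.

Under the alternative hypothesis $H_1(S)$, we expect increased risk in space-time region $S$: $q_i^t = q_{in}$ in $S$, for $q_{in} > 1$, and $q_i^t = 1$ outside.

[Figure: illustration with $q_{in} = 1.3$.]

This gives a simple and efficiently computable likelihood ratio statistic:

$$F(S) = \left(\frac{C}{B}\right)^{C} e^{B - C}, \quad \text{where } C = \sum_{S} c_i^t \text{ and } B = \sum_{S} b_i^t.$$

Many other statistics can be used (see Kulldorff, 1997; Neill, 2006).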
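As a rough sketch of how this statistic could be evaluated, the code below computes log F(S) from an aggregate count C and baseline B, then scans the last T days of a single spatial window for the best-scoring duration. The function names, the example data, and the choice to score regions with C <= B as zero are assumptions made for illustration; the deployed system also varies the window's center and radius, which is not shown here.

```python
import math


def ebp_log_score(C, B):
    """Log of F(S) = (C/B)^C * exp(B - C); 0 when counts do not exceed baseline."""
    if C <= B or B <= 0:
        return 0.0  # assumption: no evidence of increased risk scores as log F = 0
    return C * math.log(C / B) + B - C


def best_duration(counts, baselines, max_days):
    """Score the last T days of one spatial window for T = 1..max_days."""
    best_T, best_score = 0, 0.0
    C = B = 0.0
    for T in range(1, min(max_days, len(counts)) + 1):
        C += counts[-T]      # add the day T steps back from the present
        B += baselines[-T]
        score = ebp_log_score(C, B)
        if score > best_score:
            best_T, best_score = T, score
    return best_T, best_score


# Hypothetical daily totals for one window (oldest first); the last 3 days are elevated.
counts = [2, 3, 2, 6, 7, 8]
baselines = [2.5, 2.5, 2.5, 2.5, 2.5, 2.5]
print(best_duration(counts, baselines, max_days=5))
```

Working in log space avoids overflow when C is large, and it does not change which region scores highest.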

CrimeScan: Prediction

The currently deployed version of CrimeScan uses a simple rule for prediction of violent crime (VC) clusters: areas which are closer to a significant cluster of any of the monitored leading indicators (LI) are assumed more likely to have a spike in VC within the next 1 week.

Total proximity to leading indicator clusters is computed using kernel density estimation:

$$\text{score} = \sum_i \exp(-d_i^2 / 2)$$

where $d_i$ is the distance to the $i$th leading indicator cluster.

We are also investigating the use of logistic regression for prediction (results not shown).
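The proximity rule can be illustrated with a short sketch. It assumes each detected cluster is summarized by a single center point, that distance is Euclidean, and that coordinates are already expressed in the distance unit the kernel is scaled to; none of these details are stated on the slide, and `proximity_score` is a hypothetical name.

```python
import math


def proximity_score(block_xy, cluster_centers):
    """Sum of Gaussian kernel weights exp(-d_i^2 / 2) over detected clusters."""
    total = 0.0
    for cx, cy in cluster_centers:
        d2 = (block_xy[0] - cx) ** 2 + (block_xy[1] - cy) ** 2  # squared distance d_i^2
        total += math.exp(-d2 / 2.0)
    return total


# Hypothetical cluster centers and two candidate blocks.
clusters = [(0.0, 0.0), (3.0, 4.0)]
near = proximity_score((0.0, 1.0), clusters)
far = proximity_score((10.0, 10.0), clusters)
print(near > far)
```

Blocks would then be ranked by score, with the highest-scoring blocks flagged as most at risk for the coming week.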

CrimeScan: Preliminary Results

Key result: at the block level, CrimeScan predicts 60% of the clustered* VC which will occur in the next week, at a 15% false positive rate.

* At least 3 VC in that beat, and 1.5 standard deviations more than expected.

Prediction accuracy is significantly higher than competing methods.

Which Predictors to Use?

Challenge #1: hundreds of possible predictors, including minor crimes, 911 emergency calls, 311 calls for service, etc.

Challenge #2: different data sources, or combinations of sources, may be predictive in different areas of the city.

We wish to learn which combinations of sources are predictive, and where, using cross-correlation analysis of historical data.

Typical formulation: given an independent variable time series $X$ and a dependent variable time series $Y$, maximize correlation between $X$ and lagged $Y$, over a range of lags $L = L_{min}..L_{max}$.

For which subset of leading indicators, and which subset of locations, is cross-correlation maximized?

Maximizing cross-correlation

Given monitored locations $s_i$ ($i = 1..N$), we observe the multiple independent variable time series $x_{i,m}^t$ ($m = 1..M$) and the dependent variable time series $y_i^t$ at each location.

Our goal is to maximize the correlation $r(X, Y)$ over all subsets of leading indicators, all proximity-constrained subsets of locations, and all lags $L = L_{min}..L_{max}$:

$$\max_{S \subseteq \{s_1..s_N\},\; D \subseteq \{d_1..d_M\},\; L \in \{L_{min}..L_{max}\}} r(X, Y)$$

where $X^t = \sum_{d_m \in D} \sum_{s_i \in S} x_{i,m}^t$ is the aggregated independent variable time series and $Y^t = \sum_{s_i \in S} y_i^{t+L}$ is the aggregated, lagged dependent variable time series.
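For a fixed choice of locations and indicators, the search over lags can be sketched as below. The sketch assumes the two series have already been aggregated, have equal length, and that lags are non-negative with the dependent series shifted forward in time; the function names and example data are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def best_lag(X, Y, lag_min, lag_max):
    """Return (lag, r) maximizing the correlation of X at time t with Y at time t + L."""
    best = (None, -2.0)
    for L in range(lag_min, lag_max + 1):
        x_part = X[: len(X) - L]   # drop the last L points of X
        y_part = Y[L:]             # drop the first L points of Y
        if len(x_part) < 2:
            continue
        r = pearson(x_part, y_part)
        if r > best[1]:
            best = (L, r)
    return best


# Hypothetical series: Y repeats X two steps later.
X = [1, 4, 2, 7, 3, 8, 2, 5, 1, 6]
Y = [0, 0] + X[:-2]
lag, r = best_lag(X, Y, 0, 4)
print(lag, r)
```

The full problem repeats this evaluation inside a search over subsets of locations and indicators, which is what makes an efficient subset search necessary.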

Maximizing cross-correlation

How to efficiently maximize correlation $r(D, S, L)$ over $2^N \times 2^M$ subsets of locations and predictors?

Iterative framework (outer loop):
1) Randomly initialize subset of streams $D$.
2) Optimize over locations: $S = \arg\max_S r(D, S, L)$.
3) Optimize over streams: $D = \arg\max_D r(D, S, L)$.
4) Repeat steps 2-3 until convergence.
5) Repeat steps 1-4 for R random restarts.
6) Repeat steps 1-5 for each lag L.
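The outer loop can be sketched as a plain alternating search. To stay self-contained, this illustration replaces the efficient inner optimizers with brute-force enumeration of subsets, which is only feasible for toy sizes, and it ignores the proximity constraint on locations. The names `alternating_search`, `nonempty_subsets`, and `toy_score` are hypothetical, and `toy_score` is a stand-in for the real correlation.

```python
import itertools
import random


def nonempty_subsets(n):
    """All non-empty subsets of range(n), as frozensets (exponential; toy sizes only)."""
    for k in range(1, n + 1):
        for combo in itertools.combinations(range(n), k):
            yield frozenset(combo)


def alternating_search(score, n_locs, n_streams, lags, restarts=5, seed=0):
    """Steps 1-6: alternate over S and D, with random restarts, for each lag."""
    rng = random.Random(seed)
    best = (float("-inf"), None, None, None)  # (r, D, S, L)
    for L in lags:                                        # step 6
        for _ in range(restarts):                         # step 5
            # Step 1: random non-empty initial subset of streams.
            D = frozenset(m for m in range(n_streams) if rng.random() < 0.5)
            D = D or frozenset([rng.randrange(n_streams)])
            S, prev = None, None
            while (D, S) != prev:                         # step 4
                prev = (D, S)
                # Step 2: best locations for the current streams (brute force here).
                S = max(nonempty_subsets(n_locs), key=lambda s: score(D, s, L))
                # Step 3: best streams for the current locations.
                D = max(nonempty_subsets(n_streams), key=lambda d: score(d, S, L))
            r = score(D, S, L)
            if r > best[0]:
                best = (r, D, S, L)
    return best


def toy_score(D, S, L):
    """Hypothetical stand-in for r(D, S, L), peaking at D={0,2}, S={1}, L=1."""
    return -len(D ^ {0, 2}) - len(S ^ {1}) - abs(L - 1)


result = alternating_search(toy_score, n_locs=3, n_streams=3, lags=[0, 1, 2])
print(result)
```

Because each inner step can only keep or raise the score, the loop settles at a local optimum; the random restarts reduce the chance of reporting a poor one.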

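Optimizing over subsets of streams

Given fixed $S$ and $L$, we want to find a set $D$ to maximize $r(D, S, L)$.

We write: $X = \sum_{d_m \in D} X_m$; $X_m = \sum_{s_i \in S} x_{i,m}$; and $Y = \sum_{s_i \in S} y_i$. Then we maximize

$$r(D \mid S, L) = r(X, Y) = \frac{X \cdot Y}{\|X\|\,\|Y\|} = \frac{\sum_{d_m \in D} (X_m \cdot Y)}{\left\|\sum_{d_m \in D} X_m\right\| \|Y\|}$$

Now we would like to write this expression as a convex function of two additive sufficient statistics, $r(D \mid S, L) = F(C, B)$, where $C = \sum_{d_m \in D} C_m$ and $B = \sum_{d_m \in D} B_m$.

If we can do this, we can show that the optimal $D$ consists of the $k$ streams with highest ratio $C_m / B_m$, for some $k \in \{1..M\}$.

This linear-time subset scanning (LTSS) property allows us to find the exact maximum over the $2^M$ subsets in $O(M \log M)$.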
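A minimal sketch of the LTSS scan follows: sort the streams by the ratio C_m / B_m, then evaluate the score on each top-k prefix. It assumes every B_m is positive and that the score function has the LTSS property. The function names and the example statistics are hypothetical, and the example score C / sqrt(B) drops the constant factor that depends only on Y.

```python
import math


def ltss_maximize(C, B, F):
    """Maximize F(sum C_m, sum B_m) over subsets by scanning ratio-sorted prefixes."""
    # Sort stream indices by priority C_m / B_m, highest first (requires B_m > 0).
    order = sorted(range(len(C)), key=lambda m: C[m] / B[m], reverse=True)
    best_score, best_k = float("-inf"), 0
    c_sum = b_sum = 0.0
    for k, m in enumerate(order, start=1):
        c_sum += C[m]
        b_sum += B[m]
        score = F(c_sum, b_sum)
        if score > best_score:
            best_score, best_k = score, k
    return best_score, sorted(order[:best_k])


def corr_like(c, b):
    """F(C, B) = C / sqrt(B); the constant factor 1/||Y|| is omitted."""
    return c / math.sqrt(b)


# Hypothetical per-stream statistics for three streams.
C = [4.0, 1.0, 3.0]
B = [2.0, 2.0, 1.0]
score, subset = ltss_maximize(C, B, corr_like)
print(score, subset)
```

The sort dominates the cost, so only M prefixes are scored instead of all 2^M subsets.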

Optimizing over subsets of streams

We can write

$$r(D \mid S, L) = \frac{C}{\|Y\| \sqrt{B}}$$

$C$ is an additive sufficient statistic: $C = \sum_{d_m \in D} C_m$, with $C_m = X_m \cdot Y$.

$B$ is not an additive sufficient statistic!

$$B = \sum_{d_m \in D} (X_m \cdot X_m) + \sum_{d_i, d_j \in D,\; i \neq j} (X_i \cdot X_j)$$

Solution: we can approximate the all-pairs computation using the average dot product of stream $d_m$ with an arbitrary set of streams.

Iterative average dot product (IADP)

Since the optimal subset $D$ is unknown, we compute the average dot product of each stream $d_m$ with an arbitrary subset of streams $D'$:

$$Q_m = \frac{1}{|D'|} \sum_{d_j \in D'} X_m \cdot X_j$$

Then $B \approx \sum_{d_m \in D} B_m$, where $B_m = X_m \cdot X_m + (|D| - 1)\, Q_m$.

We have approximated $r(D \mid S, L)$ with a function which can be exactly and efficiently optimized using the LTSS property!

However, the approximation may be poor when $D'$ is far from $D$. Our solution is to iterate: at each step, we set $D'$ equal to the best subset $D$ found on the previous step, and repeat until convergence.
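The iteration can be sketched as follows. This is an illustration of the idea rather than the authors' implementation, and it makes several assumptions that the slides do not state: the series are mean-centered so that correlation is a normalized dot product, D' starts as the set of all streams, a stream is excluded from its own average, the size of D' stands in for the unknown size of D, and streams whose approximate B_m is not positive are skipped.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def iadp_streams(X, Y, max_iters=50):
    """Iteratively pick streams D whose summed series correlates most with Y.

    X[m] is stream m already summed over the chosen locations; X[m] and Y are
    assumed mean-centered.
    """
    M = len(X)
    C = [dot(X[m], Y) for m in range(M)]          # additive: C_m = X_m . Y
    D_ref = list(range(M))                        # initial D' (assumption: all streams)
    for _ in range(max_iters):
        B = []
        for m in range(M):
            others = [j for j in D_ref if j != m]
            # Q_m: average dot product of stream m with the other streams in D'.
            Q = sum(dot(X[m], X[j]) for j in others) / len(others) if others else 0.0
            B.append(dot(X[m], X[m]) + (len(D_ref) - 1) * Q)
        # LTSS step: sort usable streams by C_m / B_m, score each prefix with C / sqrt(B).
        usable = sorted((m for m in range(M) if B[m] > 0),
                        key=lambda m: C[m] / B[m], reverse=True)
        best, D, c_sum, b_sum = float("-inf"), [], 0.0, 0.0
        for k, m in enumerate(usable, start=1):
            c_sum, b_sum = c_sum + C[m], b_sum + B[m]
            if c_sum / math.sqrt(b_sum) > best:
                best, D = c_sum / math.sqrt(b_sum), sorted(usable[:k])
        if not D or D == D_ref:
            break                                 # converged (or nothing usable)
        D_ref = D
    # Report the exact correlation of the selected subset, not the approximation.
    total = [sum(X[m][t] for m in D_ref) for t in range(len(Y))]
    denom = math.sqrt(dot(total, total) * dot(Y, Y))
    r = dot(total, Y) / denom if denom > 0 else 0.0
    return D_ref, r


# Hypothetical mean-centered streams; Y is the sum of streams 0 and 1.
X = [[1, -1, 0, 0], [0, 0, 1, -1], [1, 1, -1, -1]]
Y = [1, -1, 1, -1]
D, r = iadp_streams(X, Y)
print(D, r)
```

In this toy case the streams are mutually orthogonal, so the approximation is exact and the loop stops after the second pass.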

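Optimizing over subsets of locations

Given fixed $D$ and $L$, we want to find a set $S$ to maximize $r(D, S, L)$.

We write: $X = \sum_{s_i \in S} X_i$; $X_i = \sum_{d_m \in D} x_{i,m}$; and $Y = \sum_{s_i \in S} y_i$. Then we maximize

$$r(S \mid D, L) = r(X, Y) = \frac{\left(\sum_{s_i \in S} X_i\right) \cdot \left(\sum_{s_i \in S} y_i\right)}{\left\|\sum_{s_i \in S} X_i\right\| \left\|\sum_{s_i \in S} y_i\right\|}$$

This expression is more difficult to approximate by a function that satisfies LTSS because we have summations both over $X$ and $Y$, resulting in all-pairs computations both in the numerator and in the denominator.

The iterative average dot product method can also be applied in this setting, but now we must make five approximations instead of one. Details are provided in the full paper (Flaxman and Neill, 2012, submitted).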

Results: Comparison of Methods

For IADP and several competing methods, we maximized cross-correlation over subsets of predictors (and locations) for each of the 77 Chicago neighborhoods. We then computed the average cross-correlation found by each method.

Method                                                                   Average cross-correlation
IADP, searching over subsets of census tracts within each neighborhood   0.546
IADP, treating each neighborhood as a single location                    0.423
Google Correlate                                                         0.404
LASSO                                                                    0.325

By jointly optimizing over subsets of locations and streams, we find areas with much stronger cross-correlations between independent and dependent variables.

Improved feature selection: searching over subsets of streams for each neighborhood, we find significantly higher correlations than previous methods.

Results: Exploratory Analysis

Considering all subsets of census tracts within each of the 77 neighborhoods of Chicago, 28 different potential predictors, and a 1-week lag, we found a correlation of r = 0.786 between violent crime and a subset of 12 leading indicators, for 10 census tracts in the West Englewood neighborhood.

Total run time for all 77 neighborhoods was 2.1 hours.

Conclusions and Ongoing Work

CrimeScan is a new and powerful methodology for crime prediction which has been very successful in practice.

We are in the process of extending CrimeScan by developing novel methods to choose an optimal set of spatially varying leading indicators for prediction.

Our results suggest that different subsets of leading indicators have high predictive accuracy in different areas, and that our new methods can efficiently optimize cross-correlation over subsets of locations and streams.

Our next step is to determine whether the optimized, spatially varying subset of leading indicators can be used to improve the overall predictive accuracy of CrimeScan.

From CrimeScan to CityScan

Working with the City of Chicago's Chief Data Officer, we are currently using our new event detection methods for analysis of many other data sources relevant to the city.

Most interestingly, we have some promising initial results for prediction of emerging patterns of 311 calls. Examples: abandoned buildings, graffiti cleanup, sanitation complaints, rodent removal, garbage carts.

Our CrimeScan software has been renamed CityScan and is being incorporated into WindyGrid, the city's new spatial database, which will enable real-time monitoring of crime, 311, and many other data sources.