Estimating Large Scale Population Movement ML Dublin Meetup

Deutsche Bank COO - Chief Data Office
Estimating Large Scale Population Movement
ML Dublin Meetup
John Doyle PhD, Assistant Vice President, CDO Research & Development, Science & Innovation
john.doyle@db.com
https://www.db.com/ireland/

Estimating Large Scale Population Movement: Presentation Outline
Introduction: Research Motivation & Data
Mobility: Trajectories & Large Scale Movement
Population: Density Estimates
Application: How to Use the Data
Conclusions: Summary of the Research

Research Motivation
Measuring the movement of people is a fundamental activity in modern society. Movement data is used by transportation services, planning authorities and governmental departments. It is also the primary data source used in the delivery of mobile communications and location-based services. This research documents novel algorithms and techniques for the estimation of movement from mobile telephony data, addressing practical issues related to sampling, privacy and spatial uncertainty.

Mobile Telephony Data: Call Detail Records (CDR)
A CDR is a data log of the call, SMS and data activities that occur on a mobile operator's telephony network. Approximately 1 million customers generating over 1.5 billion records.
[Figure: users U1 and U2, base stations BS1 and BS2, mobile operator CDR collection server]

CDR Data Mining
CDR spatiotemporal data types: trajectory information, user social / cell network, cell activities.

Subscriber Trajectories
Trajectories from CDR only capture the cell locations of individuals when they record mobile phone activity.

Trajectory Issues
Sampling rate: user activity follows a burst mentality.
Spatial resolution: location estimates are fixed to cell tower coverage areas (Voronoi cells).
[Figure: visible proportion of population (0 to 1) against time of day]

Scaling Cells to Regions
[Figure: two maps on Easting and Northing axes (x 10^5): cell coverage (10721 cells) and spatial regions of interest (500 regions)]

Uniform Sampling
Within each 15-minute temporal window, the location estimate is the last servicing cell tower recorded for that subscriber during that period.
[Figure: CDR trajectory state sequence sampling, giving the output sequence S = {S1, S1, S3, S3, S4}. Smaller yellow circles represent actual regional transitions within a sample period and larger yellow circles represent the observed output transition sequence before resampling.]
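The windowing rule can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `resample_last_cell`, the event timestamps and the rule that an empty window repeats the previous label are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def resample_last_cell(events, start, n_windows, window=timedelta(minutes=15)):
    """Return one region label per window: the last region seen in that window.

    events: (timestamp, region) pairs sorted by timestamp, all at or after start.
    Assumption: a window with no activity repeats the previous window's label.
    """
    states = []
    last_seen = None  # stays None until the first event is observed
    i = 0
    for w in range(n_windows):
        window_end = start + (w + 1) * window
        # Consume every event that falls before the end of this window;
        # the final one consumed is the last servicing region recorded.
        while i < len(events) and events[i][0] < window_end:
            last_seen = events[i][1]
            i += 1
        states.append(last_seen)
    return states

day = datetime(2024, 1, 1)  # arbitrary illustrative date
events = [
    (day.replace(hour=8, minute=5), "S1"),
    (day.replace(hour=8, minute=32), "S2"),  # overwritten within the same window
    (day.replace(hour=8, minute=40), "S3"),
    (day.replace(hour=9, minute=10), "S4"),
]
states = resample_last_cell(events, day.replace(hour=8), n_windows=5)
print(states)  # ['S1', 'S1', 'S3', 'S3', 'S4']
```

The visit to S2 happens inside a window that ends in S3, so it never appears in the sampled sequence. This is the kind of unobserved transition the smaller circles in the figure describe.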

Regional Flows of Subscribers
By observing the flow of people between clustered regions and the geographical areas covered, a proxy for the flow of people between individual population centres can be established. These results can be summarised in an aggregated transition matrix T(k).
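One way to build such a matrix is sketched below. It assumes that T(k) counts the subscribers who move from region i at time step k to region j at step k + 1; the slide does not give the exact definition, so treat this as an illustration. The function name and the toy sequences are hypothetical.

```python
def aggregate_transitions(sequences, k, n_regions):
    """Count subscribers moving from region i at step k to region j at step k + 1.

    sequences: one list of region indices per subscriber (None = unknown).
    Assumed definition of T(k); the original formula is not reproduced here.
    """
    T = [[0] * n_regions for _ in range(n_regions)]
    for seq in sequences:
        if k + 1 >= len(seq):
            continue  # this subscriber has no observation at step k + 1
        origin, destination = seq[k], seq[k + 1]
        if origin is None or destination is None:
            continue  # skip steps where the location is unknown
        T[origin][destination] += 1
    return T

# Toy resampled sequences for three subscribers over five time steps.
sequences = [
    [0, 0, 2, 2, 3],
    [1, 0, 2, 2, 2],
    [0, 1, 1, 3, 3],
]
T1 = aggregate_transitions(sequences, k=1, n_regions=4)
for row in T1:
    print(row)
```

At k = 1, two subscribers move from region 0 to region 2 and one stays in region 1, so the only non-zero entries are T1[0][2] = 2 and T1[1][1] = 1.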

Average Intensity of Subscribers Between Regions

Temporal Flow of Subscribers

Population Estimation
A census is the primary tool used by national governments to gather information on population metrics, including population count, religious status, marital status and household occupancy. The knowledge obtained informs future policy decisions on the planning of infrastructure and public services. While the information gathered is extremely important for the delivery of such services, carrying out a census is very expensive. As a result, a census may only be carried out every 5-10 years. Censuses therefore provide poor temporal resolution and cannot describe the current status of a population. This motivates the need for low-cost alternatives.

Modelling User Movement
We can model individual user movement with Markov chains. Homogeneous Markov chains are useful when the state sequence, S(k), k = 0, 1, 2, ..., is directly observable. By extracting a subscriber's CDR trajectory, it is possible to directly observe that subscriber's cell tower state sequence. Markov chains may therefore be used to model a mobile subscriber's transient movement between the symbolic locations represented by the clustered cell regions.
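Because the state sequence is observed directly, the transition matrix of a homogeneous chain can be estimated by counting transitions and normalising each row. The sketch below shows that standard maximum-likelihood estimate; the function name and the toy sequence are illustrative assumptions, not taken from the talk.

```python
def estimate_transition_matrix(states, n_states):
    """Estimate P[i][j] = Pr(next state = j | current state = i) from one sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1

    P = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # State never left in the data: no evidence, so leave the row at zero.
            P.append([0.0] * n_states)
        else:
            P.append([c / total for c in row])
    return P

# Region indices for one subscriber, e.g. the sampled sequence S1, S1, S3, S3, S4.
states = [0, 0, 2, 2, 3]
P = estimate_transition_matrix(states, n_states=4)
for row in P:
    print(row)
```

Rows for regions the subscriber was never seen leaving stay at zero here. A sparse, incomplete matrix of this kind is one reason a chain learnt from a single subscriber may fail to be ergodic.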

Subscriber Regions of Interest
If a Markov chain is ergodic, its long-run behaviour is described by a matrix W with identical rows w, where all components of w sum to 1. The fixed row vector, w, of a mobile subscriber's mobility Markov chain conveys the probability of observing that subscriber at a region in space over a long period of time. As not all mobility Markov chains are ergodic, a regularisation weight α is introduced to give a modified Markov chain Q, where R is the number of states, J is an R x R matrix of ones, and α balances the learnt mobility patterns summarised by P with the influence of random transition probabilities introduced by the term J/R.
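The sketch below shows how such a regularised chain and its fixed vector could be computed. It assumes the regularisation takes the common form Q = αP + (1 - α)J/R, which matches the description of α, J and R but is not confirmed by the text. The value α = 0.85, the toy matrix P and the function names are also assumptions for the example.

```python
def regularise(P, alpha):
    """Assumed form: Q = alpha * P + (1 - alpha) * J / R, with J an R x R matrix of ones."""
    R = len(P)
    return [[alpha * P[i][j] + (1 - alpha) / R for j in range(R)] for i in range(R)]

def fixed_row_vector(Q, tol=1e-12, max_iter=10000):
    """Power iteration: repeat w <- w Q until w stops changing."""
    R = len(Q)
    w = [1.0 / R] * R
    for _ in range(max_iter):
        nxt = [sum(w[i] * Q[i][j] for i in range(R)) for j in range(R)]
        if max(abs(a - b) for a, b in zip(nxt, w)) < tol:
            return nxt
        w = nxt
    return w

# Toy chain that is NOT ergodic: region 2 can never be reached from regions 0 and 1.
P = [
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
Q = regularise(P, alpha=0.85)  # alpha value chosen only for illustration
w = fixed_row_vector(Q)
print(w)
```

Every entry of Q is strictly positive, so the modified chain has a unique fixed vector w with positive weight on every region. Ranking regions by their weight in w gives a per-subscriber regional ranking of the kind described here.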

The Q of a Randomly Selected Subscriber
Low transition probabilities are not illustrated, for visual clarity. The observed regional ranking suggests that the subscriber tends to travel in County Meath, with occasional trips into Dublin City.

Population Density

ED Population
[Figure: two panels, with correlations of 86.61% and 84.38%]

Population Estimation
The correlation between census data and the maximum weighting approach is approximately 98.4%. The correlation between census data and the aggregated approach is approximately 97.7%. However, these techniques measure population proportions in different areas rather than estimating counts, so their effectiveness for inferring census-type data needs further research and is the subject of future work.
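The distinction between proportions and counts can be illustrated with the Pearson correlation coefficient, which does not change when one variable is rescaled. The sketch below uses made-up numbers, not the census or CDR data from the talk, and it assumes (the slides do not say) that the reported correlations are Pearson correlations.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for five areas (illustration only).
census_counts = [1200, 850, 4300, 2100, 600]
estimated_share = [0.14, 0.08, 0.47, 0.24, 0.07]  # proportions summing to 1

r_share = pearson(census_counts, estimated_share)
# Rescaling the proportions by any positive constant leaves r unchanged.
r_scaled = pearson(census_counts, [10000 * s for s in estimated_share])
print(round(r_share, 4), round(r_scaled, 4))
```

A high correlation therefore shows that the estimated proportions track the census pattern across areas. It says nothing about the scaling factor needed to turn those proportions into head counts.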

Application Areas
Mobile network operators are beginning to see profit margins fall due to tighter regulation, increasing demand for data services, and falling revenues from call and SMS traffic. In this context, network operators are increasingly focusing their efforts on new revenue generation schemes, lower subscriber churn, and higher customer satisfaction rates. However, this shift in focus has unearthed significant gaps in their knowledge of how subscribers use and perceive the mobile services on offer to them.

Transportation Planning
[Figure: kernel density estimate of journey trajectories identified as travelling along (a) road and (b) rail travel paths.]

High Mobile Traffic Regions of Interest
Combine the vector weights of high data usage subscribers. Better understanding of the areas they occupy on a daily basis. Design more efficient networks. Identify coverage black spots. Better data for marketing.

Identify Event Mobility

Geographically Weighted Amenities

Catchment Area

Acknowledgements
This research was funded by a Strategic Research Cluster grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan and by the Irish Research Council under their Embark Initiative in partnership with ESRI Ireland. I would also like to gratefully acknowledge the support of Meteor for providing the data used in this research, in particular John Bathe and Adrian Whitwham.

Questions?