From Viral Product Spreading to Poverty Prediction: Data Science from a Telecom Perspective


Pål Sundsøy, Telenor Group Research
Oct 22nd, 2015
International Conference on Big Data for Official Statistics, Abu Dhabi

Figures on the title slide:
o The pulse of Bangladesh (base station activity)
o Mobility patterns derived from CDR data, Grameenphone + Norway
o Mobility patterns of Pakistan

o Among the major mobile operators in the world
o Approaching 200 million mobile subscriptions
o 33 000 employees
o Present in markets with 1.6 billion people

The Big Data Analytics team at Telenor Group Research works across all markets
o A team of 9 data scientists
o Collaboration partners at leading academic research institutions
o Bridge between academic research and all business units
o Explore and develop new ways to utilize customer data across markets

Our customers generate an increasing amount of information in our systems.
What's in it for Telenor?

Billions of data points collected each day
o A number: caller
o B number: receiving party
o Date & time
o Type: call, SMS, data, etc.
o Data volume
o Cell_ID: location
o IMSI: SIM card
o TAC: handset
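The record fields listed above can be pictured as one small data structure. The sketch below is only an illustration: the comma-separated layout, the field names, the `parse_cdr` helper and the sample values are assumptions, not Telenor's actual record format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CallDetailRecord:
    a_number: str               # caller (de-identified in a research setting)
    b_number: str               # receiving party
    timestamp: datetime         # date & time of the event
    event_type: str             # "call", "sms", "data", ...
    data_volume: Optional[int]  # only present for data events (unit assumed)
    cell_id: str                # serving cell, used as a proxy for location
    imsi: str                   # identifies the SIM card
    tac: str                    # Type Allocation Code, identifies the handset model


def parse_cdr(line: str) -> CallDetailRecord:
    """Parse one line of a hypothetical comma-separated CDR export."""
    a, b, ts, kind, volume, cell, imsi, tac = line.strip().split(",")
    return CallDetailRecord(
        a_number=a,
        b_number=b,
        timestamp=datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"),
        event_type=kind.lower(),
        data_volume=int(volume) if volume else None,
        cell_id=cell,
        imsi=imsi,
        tac=tac,
    )


# Made-up sample line: an SMS, so the data volume field is empty.
record = parse_cdr("u001,u002,2015-10-22 09:30:00,SMS,,cell_17,sim_9,35332506")
print(record.event_type, record.cell_id)
```

Keeping the record immutable (`frozen=True`) is a design choice for this sketch: downstream indicators can then be computed without any risk of modifying the raw events.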

Each BU generates huge amounts of data
o Customer interactions
o Top-up/payment
o Call center
o Usage
o Network performance
o Other data sources
o Inventory & sales
o DWH & BI
The data enables us to make better decisions across markets

Data Science: Learning from data
o Studying mobile phone data can give us new insight into human sociology
o Telenor Research collaborates with MIT, Harvard, Northeastern University, Flowminder and others
o We use a large database with de-identified data for research

Testing advanced behavioral indicators
o Our partners at MIT have successfully predicted mobile phone users' personality class ("Big 5" personality traits) using CDR data
o Together with MIT we explore the use of advanced derived variables to predict telco-relevant behaviour
Example variables:
o Percentage of initiated contacts
o Variations in response time
o Entropy of contacts
o Entropy of visited places
o Recharge data
>1000 indicators, 6 categories
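Two of the example variables above, the percentage of initiated contacts and the entropy of contacts or visited places, can be sketched in a few lines. This is a minimal illustration under assumptions: the event layout (direction, contact) and the function names are hypothetical, and the actual indicator definitions used in the research may differ.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in nats) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def percent_initiated(events):
    """Share of events the user initiated; events are (direction, contact) pairs."""
    if not events:
        return 0.0
    outgoing = sum(1 for direction, _ in events if direction == "out")
    return outgoing / len(events)


# Made-up events for one user: three outgoing, one incoming.
events = [("out", "A"), ("out", "A"), ("in", "B"), ("out", "C")]
# Made-up sequence of cells the user was seen in.
visited_cells = ["c1", "c1", "c2", "c2"]

entropy_contacts = shannon_entropy(contact for _, contact in events)
entropy_places = shannon_entropy(visited_cells)
initiated = percent_initiated(events)
print(initiated, entropy_contacts, entropy_places)
```

A user who spreads activity evenly over many contacts or places gets a high entropy; a user concentrated on one contact or place gets an entropy near zero.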

Gender prediction
o Support vector machine (SVM) and Random Forest (RF) are the best candidates
o Sampling (test+train): South Asia 50K, Europe 500K

Total accuracy   Linear SVM   Kernel SVM   RF
Europe           72.2%        73.5%        73.1%
South Asia       73.1%        74.1%        74.8%

The top 20% segment gives around 85-90% accuracy

Top 6 features (SVM)
1. Interactions_per_contact text median sem
2. Number_of_interactions call mean
3. Interactions_per_contact text median mean
4. Duration_of_calls call mean mean
5. Interactions_per_contact call std mean
6. Number_of_interactions call sem

1. Duration_of_calls call mean sem
2. Interactions_per_contact call mean mean
3. Entropy_of_contacts call mean
4. Duration_of_calls call median mean
5. Interactions_per_contact call std mean
6. Number_of_interactions call mean

Note: Radius of gyration is among the top predictors in the RF ranking

"A cross country study of gender prediction", E. Jahani, P. Sundsøy, Y. Montjoye, A. Pentland, J. Bjelland, NetMob 2015, MIT
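The remark that the top 20% segment is more accurate than the total can be read as follows: rank the cases by how confident the classifier is, and score only the most confident fifth. That reading is an assumption, as are the function names and the toy numbers below; the sketch shows the calculation, not the study's evaluation code.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def top_segment_accuracy(labels, probabilities, fraction=0.2):
    """Accuracy on the `fraction` of cases the model is most confident about.

    probabilities holds the predicted probability of class 1 for each case;
    confidence is measured as the distance of that probability from 0.5.
    """
    ranked = sorted(
        zip(labels, probabilities),
        key=lambda pair: abs(pair[1] - 0.5),
        reverse=True,
    )
    keep = max(1, int(len(ranked) * fraction))
    top = ranked[:keep]
    return accuracy(
        [y for y, _ in top],
        [1 if p >= 0.5 else 0 for _, p in top],
    )


# Made-up labels and predicted probabilities for ten users.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
probs = [0.95, 0.05, 0.6, 0.55, 0.4, 0.45, 0.7, 0.3, 0.52, 0.48]

overall = accuracy(labels, [1 if p >= 0.5 else 0 for p in probs])
top20 = top_segment_accuracy(labels, probs, fraction=0.2)
print(overall, top20)
```

In this toy data the two errors both sit among the low-confidence cases, so restricting to the confident segment raises the accuracy, which is the pattern the slide reports.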

"The greatest value of a picture is when it forces us to notice what we never expected to see."
John Tukey, American mathematician

Data-Driven Development: using data for social good
o Disaster mobility and behavior: collaboration between Grameenphone, ICCCAD, Flowminder & Telenor Group (Research + Corporate Responsibility), assessing mobility patterns and changes in economic behavior during Cyclone Mahasen (May 2013)
o Improved models for infectious disease spread: using mobile phone data to understand the spread of dengue fever in Pakistan, in collaboration with epidemiologists from the Harvard T.H. Chan School of Public Health
o Poverty prediction: using telecom data in assessing poverty (Flowminder, MIT)
o Privacy: collaboration with UN Global Pulse

Heartbeat of Bangladesh
Assessing mobility patterns and changes in economic behavior during Cyclone Mahasen
Article in submission

Using the power of mobile data to model the spread of dengue fever
o Human mobility plays a crucial role in the spread of the disease.
o Incorporating human mobility data from CDRs improves the spatial and temporal prediction of potential outbreaks, and it can also anticipate potential outbreaks in new areas.
o The method can be operationalized through better risk maps for government and health practitioners.

Privacy and mobile data for the dengue project
We follow the guidelines in the Telenor Privacy Toolbox developed by the Group Privacy Officer. For Pakistan there were additional national considerations:
o All involved were under Non-Disclosure Agreements
o Personal information (CDRs) was processed within the data warehouse and on location in Islamabad, Pakistan, and no personal information was exported
o The processing resulted in anonymous summations/aggregations
o Only the aggregate data was used in the study and the epidemic modeling of dengue

Article published in PNAS (2015): "Impact of human mobility on the emergence of dengue epidemics in Pakistan", A. Wesolowski, T. Qureshi, M. Boni, P. Sundsøy, M. Johansson, S. Rasheed, K. Engø-Monsen, C. O. Buckee
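The step from personal records to anonymous aggregates can be illustrated with a small origin-destination count. This is a sketch under stated assumptions, not the project's pipeline: the record layout (user, day, area), the rule of taking the most frequent area per day, and the `min_count` suppression threshold are all hypothetical.

```python
from collections import Counter


def daily_area_per_user(records):
    """Map (user, day) to the area that user was seen in most often that day.

    records is an iterable of (user_id, day, area) tuples.
    """
    seen = {}
    for user, day, area in records:
        seen.setdefault((user, day), Counter())[area] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in seen.items()}


def od_flows(records, min_count=2):
    """Count users moving between areas on consecutive days.

    Only aggregate counts leave this function; flows with fewer than
    min_count users are dropped (the threshold is an assumption).
    """
    area_of = daily_area_per_user(records)
    flows = Counter()
    for (user, day), origin in area_of.items():
        destination = area_of.get((user, day + 1))
        if destination is not None and destination != origin:
            flows[(origin, destination)] += 1
    return {pair: n for pair, n in flows.items() if n >= min_count}


# Made-up observations for four users over two days.
records = [
    ("u1", 1, "Karachi"), ("u1", 2, "Lahore"),
    ("u2", 1, "Karachi"), ("u2", 2, "Lahore"),
    ("u3", 1, "Karachi"), ("u3", 2, "Islamabad"),
    ("u4", 1, "Lahore"), ("u4", 2, "Lahore"),
]

flows = od_flows(records)
print(flows)
```

The returned table contains area pairs and counts only, with no user identifiers, which is the kind of aggregate that could feed an epidemic model.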

Introducing mobile phone data in poverty prediction
Current research: improving accuracy, resolution and regularity
o CDRs can produce rapidly updateable and spatially detailed metrics on consumption, social network structure and mobility, all shown to be related to poverty
o CDRs can improve resolution in urban areas where satellite-only approaches lack detail
o Spatial resolution: phone data has only been used for coarse mapping (large admin areas)
o Technology comparison: it has not been evaluated what mobile data adds compared with remote sensing and other spatial layers; for example, urban areas are problematic for remote sensing
o Metric comparison: different economic welfare measures (income vs wealth vs consumption) have not been included in mapping

Introducing mobile phone data in poverty prediction
Inputs:
o Survey data: telco surveys, DHS, PPI
o Mobile phone data: basic phone usage, advanced phone usage, social network, mobility, top-up, revenue, handset
o Satellite layers: population, aridity index, evapotranspiration, various animal densities, night time lights, elevation, vegetation, distance to roads/waterways, urban/rural, land cover, pregnancy data, births, ethnicity, precipitation, annual temperature, global human settlement layer
Prediction output:
o Number of poor per km²
o Prediction maps

Introducing mobile phone data in poverty prediction
Poverty prediction map
Methods:
1. Spatial prediction: Bayesian geostatistical modelling, prediction maps
2. Individual classification using machine learning methods: RF, GBM, SVM, deep learning
Quantification of uncertainty
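To make "quantification of uncertainty" concrete, the sketch below attaches an interval to a prediction by refitting a model on bootstrap resamples. It is deliberately a stand-in: a one-variable least-squares line with bootstrap resampling, not the Bayesian geostatistical model or the machine learning methods named above. The feature (average top-up per area), the poverty rates and all function names are made up for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x equal): fall back to the mean of y.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_prediction(xs, ys, x_new, n_boot=500, seed=0):
    """Return (mean, lower, upper) of the prediction at x_new.

    lower and upper are the 2.5th and 97.5th percentiles of the
    predictions from models refitted on resampled areas.
    """
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    predictions.sort()
    lower = predictions[int(0.025 * n_boot)]
    upper = predictions[int(0.975 * n_boot) - 1]
    return statistics.fmean(predictions), lower, upper


# Hypothetical training areas: average top-up amount vs surveyed poverty rate.
top_up = [1, 2, 3, 4, 5, 6, 7, 8]
poverty_rate = [0.80, 0.74, 0.65, 0.61, 0.50, 0.46, 0.35, 0.31]

mean, lower, upper = bootstrap_prediction(top_up, poverty_rate, x_new=4.5)
print(round(mean, 3), round(lower, 3), round(upper, 3))
```

A wide interval flags areas where the map should be read with caution. A Bayesian geostatistical model would additionally account for spatial correlation between neighbouring areas, which this sketch ignores.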

Partnerships and principles on Big Data for social good
o BIG impact & shared value is our goal
o Our DATA & EXPERTISE is our contribution
o Handling data RESPONSIBLY is key
o COLLABORATION is the name of the game

Kenth Engø-Monsen, kenth.engo-monsen@telenor.com
Geoffrey Canright, Geoffrey.canright@telenor.com
Pål Sundsøy, pal-roe.sundsoy@telenor.com