Mobility Analytics through Social and Personal Data Pierre Senellart Session: Big Data & Transport Business Convention on Big Data Université Paris-Saclay, 25 novembre 2015
Analyzing Transportation and Mobility Data Consider an analytics task on transportation and mobility: Macro-level: How is transportation infrastructure affected by the creation of Université Paris-Saclay? Micro-level: What are the chances I will be late if I have a meeting with John today? 2 / 16 Télécom ParisTech Pierre Senellart
Current approaches Current approaches to transportation and mobility analytics: Deploying sensor networks (LIDAR, cameras, etc.) Leveraging on connected devices used for other means (car GPS systems, speed cameras, smart card validators for public transportation, mobile cell towers, toll gates,etc.) Personalized surveys Potential issues: Cost of deploying new sensors Data not publicly available, sometimes preciously guarded by companies as valuable assets Privacy concerns, data may not be easily redistributed Latency in installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level, inaccessible at the micro-level 3 / 16 Télécom ParisTech Pierre Senellart
Current approaches Current approaches to transportation and mobility analytics: Deploying sensor networks (LIDAR, cameras, etc.) Leveraging on connected devices used for other means (car GPS systems, speed cameras, smart card validators for public transportation, mobile cell towers, toll gates,etc.) Personalized surveys Potential issues: Cost of deploying new sensors Data not publicly available, sometimes preciously guarded by companies as valuable assets Privacy concerns, data may not be easily redistributed Latency in installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level, inaccessible at the micro-level 3 / 16 Télécom ParisTech Pierre Senellart
Current approaches Current approaches to transportation and mobility analytics: Deploying sensor networks (LIDAR, cameras, etc.) Leveraging on connected devices used for other means (car GPS systems, speed cameras, smart card validators for public transportation, mobile cell towers, toll gates,etc.) Personalized surveys Potential issues: Cost of deploying new sensors Data not publicly available, sometimes preciously guarded by companies as valuable assets Privacy concerns, data may not be easily redistributed Latency in installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level, inaccessible at the micro-level 3 / 16 Télécom ParisTech Pierre Senellart
Social Web: Abundant Source of Data Hundreds of millions of messages posted each day on platforms such as Twitter Widespread use of social media in city inhabitants (68% of Singaporeans regularly use social media) Crowdsourcing platforms (e.g., Amazon Mechanical Turk) can be used to obtain new data or to annotate existing data. Can be combined with user-contributed and open data of high quality (e.g., Open Street Map, RATP open data with all locations of bus stops) A large part of this is fully accessible, free or cheap to obtain Give access to both objective and subjective information Complementary to traditional data collection The Social Web, the Crowd as virtual sensors 4 / 16 Télécom ParisTech Pierre Senellart
Personal Information: Rich Information at the Individual s Level Individuals interested in analytics on issues mattering to them On these issues, wealth of personal information: Trajectories tracked by mobile phones (Google Timeline, Apple Location Services) Calendar information describing appointments, including addresses Contacts Emails with information about appointments, trips Virtual assistant services (Wipolo, Tripit) with detailed information about trips Own and relations social media information, with timestamped and geolocated events, pictures, messages etc. Digital traces extremely valuable, especially for mobility analytics, but only exploited by Google, Facebook, Apple,..., for their own purposes Give digital traces back to users, and the way to exploit them! 5 / 16 Télécom ParisTech Pierre Senellart
This talk 3 use cases for urban data collection from the Web (current research activities at Télécom ParisTech): User Trajectories and Mobility Networks Social Sensing for Trajectories of Moving Objects Asking the Right Questions to the Crowd 6 / 16 Télécom ParisTech Pierre Senellart
Outline User Trajectories and Mobility Networks Social Sensing for Trajectories of Moving Objects Asking the Right Questions to the Crowd 7 / 16 Télécom ParisTech Pierre Senellart
Semantics from users GPS tracks Users often equipped with a smartphone with capability of recording regular timestamped locations from GPS data, WiFi network or cell tower triangulation Indeed, such information often already recorded to support some applications (Google Now, Frequent Locations, etc.) Tracks are noisy, incomplete, and not matched with actual transit networks Little added value for the user so far 8 / 16 Télécom ParisTech Pierre Senellart
Smarter urban mobility [Montoya et al., 2015] A Time Multimodal J ourney B 3-axis Location Dynamics Context Smartphone Time Series Path generation Dynamic Bayesian Network Particle Filter Projection Transportation Network,,, 8:45 8:50 9:18 9:23 9:30 9:31 9:40 9:48 9:55 9:56 10:00 10:10 Multimodal Itinerary Clean-up Transit Line Recognition Transit Schedules, 8:45 8:50 9:18 9:23 9:40 9:48 10:00 10:10 A 123 Multimodal Itinerary Problem: Smart and adaptive recommendations for mobility in cities (transit, bike rental, car, etc.) Challenge: Should take into account personal information (calendar, etc.), past trajectories, public information about transit networks and traffic Methodology: Map GPS tracks to routes of public transport, learn route patterns, connect to personal information 9 / 16 Télécom ParisTech Pierre Senellart
Outline User Trajectories and Mobility Networks Social Sensing for Trajectories of Moving Objects Asking the Right Questions to the Crowd 10 / 16 Télécom ParisTech Pierre Senellart
Geolocation from the Social Web Much social Web content annotated with location information: Twitter or Facebook smartphone apps can add location to posts Pictures uploaded on Flickr or Instagram from GPS-equipped cameras (e.g., smartphones) often have geolocation data Possible to use information extraction techniques to extract location information from text and semi-structured information 11 / 16 Télécom ParisTech Pierre Senellart
Mobility in smart cities from social data Problem: Infer patterns of mobilities of populations within a city from social Web data (e.g., Instagram) Challenge: Partial and noisy data Methodology: Aggregate information from numerous users, to identify hotspots (e.g., tourist attractions) and paths between hotspots; cluster populations according to their mobility history 12 / 16 Télécom ParisTech Pierre Senellart
Social sensing of moving objects [Ba et al., 2014] Problem: Infer trajectories and meta-information of moving objects from Web and social Web data (e.g., Flickr) Challenge: Uncertainty and inconsistency in extracted information Methodology: Data cleaning by filtering incorrect locations, and truth discovery to identify reliable sources 13 / 16 Télécom ParisTech Pierre Senellart
Outline User Trajectories and Mobility Networks Social Sensing for Trajectories of Moving Objects Asking the Right Questions to the Crowd 14 / 16 Télécom ParisTech Pierre Senellart
Optimizing crowd queries [Amsterdamer et al., 2013a,b] 15 / 16 Télécom ParisTech Pierre Senellart Problem: How to answer a specific mining need (e.g., mobility patterns) by asking a sequence of questions on crowdsourcing platforms Challenge: Crowd accesses are costly; individual questions asked (e.g., How often do you use a bike?, What is your most common transportation option to your workplace? ) are not independent of each other Methodology: Maintain at all times a current knowledge of the world and estimate what is the best question(s) to ask given this knowledge.
Merci. Contributions from, and joint work with: Télécom ParisTech. Talel Abdessalem, Antoine Amarilli, Mouhamadou Lamine Ba, Anis Bouriga, Sébastien Montenez ENS Cachan & INRIA Saclay. Serge Abiteboul INRIA Saclay & ENGIE. David Montoya Tel Aviv University. Yael Amsterdamer, Yael Grossman, Tova Milo National Univȯf Singapore. Stéphane Bressan A*STAR, Singapore. Huayu Wu
Bibliography I Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart. Crowd mining. In Proc. SIGMOD, pages 241 252, New York, USA, June 2013a. Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart. Crowd miner: Mining association rules from the crowd. In Proc. VLDB, pages 241 252, Riva del Garda, Italy, August 2013b. Demonstration. M. Lamine Ba, Sébastien Montenez, Talel Abdessalem, and Pierre Senellart. Monitoring moving objects using uncertain web data. In Proc. SIGSPATIAL, pages 565 568, Dallas, USA, November 2014. Demonstration. David Montoya, Serge Abiteboul, and Pierre Senellart. Hup-me: Inferring and reconciling a timeline of user activity with smartphone and personal data. In Proc. SIGSPATIAL, Seattle, USA, November 2015.