Assessing pervasive user-generated content to describe tourist dynamics

Similar documents
Place this Photo on a Map: A Study of Explicit Disclosure of Location Information

keywords spatio-temporal data analysis, geovisualization, locationdisclosure,

Leveraging Explicitly Disclosed Location Information to Understand Tourist Dynamics: A Case Study

Spatial Data Science. Soumya K Ghosh

Extracting Touristic Information from Online Image Collections

Exploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales. ACM MobiCom 2014, Maui, HI

A route map to calibrate spatial interaction models from GPS movement data

An Ontology-based Framework for Modeling Movement on a Smart Campus

Generalisation and Multiple Representation of Location-Based Social Media Data

The Challenge of Geospatial Big Data Analysis

Quantifying urban attractiveness from the distribution and density of digital footprints

Geographical Bias on Social Media and Geo-Local Contents System with Mobile Devices

Mobility Analytics through Social and Personal Data. Pierre Senellart

Clustering Analysis of London Police Foot Patrol Behaviour from Raw Trajectories

Space-adjusting Technologies and the Social Ecologies of Place

DEVELOPMENT OF GPS PHOTOS DATABASE FOR LAND USE AND LAND COVER APPLICATIONS

From Research Objects to Research Networks: Combining Spatial and Semantic Search

How local communities could profit from Habitats infrastructure

Geographic Data Science - Lecture II

When Map Quality Matters

Climate Risk Visualization for Adaptation Planning and Emergency Response

Kentucky Collaborates in GeoMAPP Project: The Advantages and Challenges of Archiving in a State with a Centralized GIS

GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research

Putting the U.S. Geospatial Services Industry On the Map

Reproducible AGILE

ESRI Survey Summit August Clint Brown Director of ESRI Software Products

Detecting Origin-Destination Mobility Flows From Geotagged Tweets in Greater Los Angeles Area

GIS = Geographic Information Systems;

Course Syllabus. Geospatial Data & Spatial Digital Technologies: Assessing Land Use/Land Cover Change in the Ecuadorian Amazon.

Museumpark Revisit: A Data Mining Approach in the Context of Hong Kong. Keywords: Museumpark; Museum Demand; Spill-over Effects; Data Mining

Spatial Information Retrieval

TOWARDS ESTIMATING THE PRESENCE OF VISITORS FROM THE AGGREGATE MOBILE PHONE NETWORK ACTIVITY THEY GENERATE

3D Urban Information Models in making a smart city the i-scope project case study

Community GIS over the Web:

DM-Group Meeting. Subhodip Biswas 10/16/2014

Web Visualization of Geo-Spatial Data using SVG and VRML/X3D

Metropolitan Wi-Fi Research Network at the Los Angeles State Historic Park

From Social Sensor Data to Collective Human Behaviour Patterns Analysing and Visualising Spatio-Temporal Dynamics in Urban Environments

Developing 3D Geoportal for Wilayah Persekutuan Iskandar

GIScience & Mobility. Prof. Dr. Martin Raubal. Institute of Cartography and Geoinformation SAGEO 2013 Brest, France

Using spatial-temporal signatures to infer human activities from personal trajectories on location-enabled mobile devices

Arboretum Explorer: Using GIS to map the Arnold Arboretum

City, University of London Institutional Repository

How a Media Organization Tackles the. Challenge Opportunity. Digital Gazetteer Workshop December 8, 2006

VGIscience Summer School Interpretation, Visualisation and Social Computing of Volunteered Geographic Information (VGI)

Geo-Enabling Mountain Bike Trail Maintenance:

The INSPIRE Community Geoportal

A Simple Tags Categorization Framework Using Spatial Coverage to Discover Geospatial Semantics. TARDY, Camille, MOCCOZET, Laurent, FALQUET, Gilles

The Concept of Geographic Relevance

Geographic Information Systems and Science: Today and Tomorrow. Michael F. Goodchild University of California Santa Barbara

Supporting movement patterns research with qualitative sociological methods gps tracks and focus group interviews. M. Rzeszewski, J.

DATA SOURCES AND INPUT IN GIS. By Prof. A. Balasubramanian Centre for Advanced Studies in Earth Science, University of Mysore, Mysore

The linguistic landscape of social media: A case study of the Senate Square, Helsinki

Spatial Data Availability Energizes Florida s Citizens

Exploring spatial decay effect in mass media and social media: a case study of China

Crime Analysis. GIS Solutions for Intelligence-Led Policing

MACHINE LEARNING FOR MOBILITY STUDIES IN URBAN AND NATURAL AREAS TUULI TOIVONEN / DIGITAL GEOGRAPHY LAB

Discovering Urban Spatial-Temporal Structure from Human Activity Patterns

A method of Area of Interest and Shooting Spot Detection using Geo-tagged Photographs

The Social Life of Location. David Sonnen September 2008

Animating Maps: Visual Analytics meets Geoweb 2.0

Smart Data Collection and Real-time Digital Cartography

ArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics

* Abstract. Keywords: Smart Card Data, Public Transportation, Land Use, Non-negative Matrix Factorization.

8/28/2011. Contents. Lecture 1: Introduction to GIS. Dr. Bo Wu Learning Outcomes. Map A Geographic Language.

A General Framework for Conflation

A Technique for Importing Shapefile to Mobile Device in a Distributed System Environment.

Massachusetts Institute of Technology Department of Urban Studies and Planning

A Cloud Computing Workflow for Scalable Integration of Remote Sensing and Social Media Data in Urban Studies

Spatial Data Infrastructure Concepts and Components. Douglas Nebert U.S. Federal Geographic Data Committee Secretariat

The Global Statistical Geospatial Framework and the Global Fundamental Geospatial Themes

Three-Dimensional Visualization of Activity-Travel Patterns

Contribution of Swiss mountain lakes to the well-being of tourists and day-trippers

LISBON Fragments of a complex City

Data Mining II Mobility Data Mining

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence

Understanding individual and collective mobility patterns from smart card records: A case study in Shenzhen

GIS for Crime Analysis. Building Better Analysis Capabilities with the ArcGIS Platform

Spatial Extension of the Reality Mining Dataset

Belfairs Academy GEOGRAPHY Fundamentals Map

Cutting Edge Engineering for Modern Geospatial Systems Rear Admiral Dr. S Kulshrestha, retd

Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective

Think Global, Map Local! Introducing the. Open Green Map

Dynamic Maps and Historical Context

Research Group Cartography

Cell-based Model For GIS Generalization

Explore. history landscapes family.

TRAITS to put you on the map

2013 AND 2025 THE FUTURE OF GIS

The Future of Geography in an. Society. Michael F. Goodchild University of California Santa Barbara

Towards Geographic Information Observatories

Global 3D Models with Local Content

An Implementation of Mobile Sensing for Large-Scale Urban Monitoring

Exploring Urban Areas of Interest. Yingjie Hu and Sathya Prasad

From the Venice Lagoon Atlas towards a collaborative federated system

The GapVis project and automated textual geoparsing

The Recognition of Temporal Patterns in Pedestrian Behaviour Using Visual Exploration Tools

Gistat: moving towards a location information management system

European Solar Days May Report Spain

A Simple Procedure for Estimating Loss of Life from Dam Failure. Wayne J. Graham, P.E. 1

Transcription:

Assessing pervasive user-generated content to describe tourist dynamics Fabien Girardin, Josep Blat Universitat Pompeu Fabra, Barcelona, Spain {Fabien.Girardin, Josep.Blat}@upf.edu Abstract. In recent years, the large deployment of mobile devices has led to a massive increase in the volume of records of where people have been and when they were there. The analysis of these spatio-temporal data can supply highlevel human behavior information valuable to urban planners, local authorities, and designers of location-based services. This paper proposes the study of publicly available people generated geo-referenced content to provide novel perspectives on tourist dynamics. Our initial works analyzed these digital footprints people leave behind them as a historic of physical presence when they visit a city. The results provided insights on the density of tourists, the points of interests they visit as well as the most common trajectories they follow. Yet to be able to fully analyze these newly available data, there is a need to understand the diverse circumstances they were generated from. We believe that the understanding of the human practice behind these data and their relation to the urban space could open a new perspective in analyzing tourism. Introduction The recent emergence of location technologies and techniques favoured the development of new approaches to capture and analyze people s mobility (see [1], 2004 for a survey). One opportunity has been to replace traditional travel diaries, paper-and-pencil interview, computer-assisted telephone interviews, and computerassisted-self-interview, by automatically collecting mobility data. Yet these automatic location sensing techniques open new privacy, scalability and longevity issues that limit their deployment. In this paper, we show that volunteered geographic information [2] can be at the source of the solution to these issues to capture tourists mobility and behaviours during their visits. Recent research showed the potential of the geographically annotated material available on the Web. For instance, some scholars showed that the location and time metadata associated with photos and their tags enable the extraction of place and event semantics [3]. In this work we exploit a similar dataset in the aim of validating a new approach to describe tourist dynamics. A first part of this paper report on our preliminary work on the problematic of revealing the presence of tourists during their visits of a city. In the second part of this paper, we suggest that these results must be first assessed with the analysis of how

2 Fabien Girardin, Josep Blat volunteers handle and annotate their measurements and observations (e.g semantic description in geotagging, granularity in georeferencing). Understanding tourist dynamics Our approach takes advantage of the unprecedented amount of digital data linked to the physical world generated by the recent explosion in the use of capture devices (e.g. digital cameras) and collaborative web platforms to share their content (see [4] for a review). We produced statistics, geovisualizations and animations to reveal the promises of exploiting user-generated content in urban tourism [5]. Our early case studies took place in the Province of Florence (Figure 1) as well as in the cities of Rome, Italy and Barcelona, Spain. The local authorities of these regions aim at better understanding the tourist flows traveling across their boundaries. So far, they have been using classical survey-based hotel and museums frequentation data to know where tourists of different nationalities prefer to spend their time, hence money. However, they lack observations of the mobility, nationality and quantity of the day trippers, that is the tourists who visit but are "invisible" in the data, as they do not sleep in town. Fig. 1. Heatmaps revealing the presence of photographers from their accumulated georeferenced photos, in the Northern part of central Italy, in downtown Florence and around the Basilica di Santa Maria dal Fiore. Photographic imagery copyright Telespazio. We retrieved large amounts of georeferenced photos taken by thousands photographers and shared via the popular photo-sharing web platform Flickr 1 (see Table 1 for the figures). Our approach took advantage of open and freely-available resources and combines them using de facto standards often based on the extensible Markup Language (XML). We used Google Earth for interactive visual synthesis of encodings generated using a combination of MySQL for data storage and querying to select and aggregate. We developed a software named Urban Dynamics to access, process, transform, cluster, sample, filter the raw data stored in the database and to generate geovisualizations. 1 http://www.flickr.com

Assessing pervasive user-generated content to describe tourist dynamics Region Barcelona Province of Florence Rome Number of photos 154,106 81,017 144,501 3 Number of photographers 5818 4280 6018 Table 1. Accumulated number of georeferenced photos and number of photographers over a 2 years period (2005-2007) in the three regions of the case study. Based on the time and the disclosed location of the photos, we extracted records of the people presence and movements such as density of tourist, inbound and outbound trajectories, patterns of flow between points of interest; performed statistical analysis and designed geovisualizations. This exploratory visual analysis was used as a mean of preliminary investigation. In fact, the results go far beyond the initial expectations of collecting clues on day trippers activities. The study disclosed new insights for tourism officials, such as the density of the flows of different types of visitors among the main attractions of the city (Figure 2). Fig. 2. Geovisualiation of the main paths tourists took to visit the points of interests of the city. 753 Italian visitors (top) are activity off the beaten tracks with a large amount of points of interets visited but little flow. 675 American visitors (bottom) stay on a narrow desire line. Photographic imagery copyright Telespazio. A user-centered approach to assess the data At this stage, it can be hypothesized that uploading, tagging and making public the location of a photo can be interpreted as a report of a physical presence in space and

4 Fabien Girardin, Josep Blat time. However, the data collected suffer similar problems as automatic sensors when it comes to the fluctuating quality in the data and trust. For instance, the city and type of urban landscape impact the use of coarse and fine-grained location information [6]. In addition, at a very general and caricatural level, our early results suggest that German users tend to provide more accurate location information to their photos than Spaniards or that Europeans use more tags than North Americans. These observations highlight the need to understand how people employ the accuracy of location information to link digital information with the physical world. Understanding the practice In consequence, prior to assess the significance of user-generated content for tourism, one must first understand people s practices of annotating content with geographical information. We are currently working on providing answers in that direction to the question: Which are the people different georeferencing and geotagging behaviours? We investigate this domain with the following sub-questions and motivations. Domain: Sub-question Semantic: Which ways do people use to label (i.e. geotag and describe) information? Granularity: How do people distinguish and define the levels of granularity to georeference the information? Disclosure: How does automatic positioning influence location disclosure? Co-evolution: How do the practices of geotagging and georeferencing evolve over time? Motivation and approach Identify and describe the different strategies of geotagging, by correlating the use of main tags categories to profile the users and the space Identify and describe the different strategies of georeferencing by comparing the categories of tags used and the location accuracy. Profile the types of traces and their accuracy. Describe the impact of automatic georeferencing of information on the geotagging behaviour. For instance, the automatic georeference through GPS could generate a poorer labelling of the information. Practices might not be static. Describe how they change over time. Users might move from one type of behaviour to another. In that case we would need to describe the circumstances of these changes. This analysis of the user practice will assess a theoretical framework that describes the human agency in the production of geo-annotated photos. It will be used to study the significance of this user-generated content in urban, tourism, mobility and travel research.

Assessing pervasive user-generated content to describe tourist dynamics 5 Method The georeferenced photos stored in Flickr come with an extensive amount of empirical data about themselves (e.g. nationality, type of camera) the language and words the users employ to make public the information location as well as their strategies (e.g. photos uploaded in one batch, accuracy) to do so. The analysis builds on these qualitative data with the generation of categories (e.g. users' perspectives, process, activity, strategy) and positioning it within a theoretical model to describe practices to georeference information. Statistical, data mining and information visualization techniques are used to extract a deeper meaning from individual and collective perspectives in terms of semantic, granularity, disclosure and co-evolution. The validation of the observations of categories and patterns takes place with the comparison within multiple cities and cultures. Calibration of user-generated content The accumulated and aggregated records of where and when people were seem to lead to an improved understanding of different aspects of mobility and travel. However these new insights only mirror a specific community of Flickr users. Therefore, there is a strong need to examine their significance in comparison with existing mobility and activity data available in cities. We are currently developing an extension to our current Urban Dynamics application to explore multiple datasets to calibrate and validate the results with secondary order presence, survey and mobility data (e.g. cellphone network usage, tickets sold, manual counting, and Bluetooth scanning). Due to the large difference in the nature of the activity producing the data we compare, it might be that correlating different datasets does not only reinforce observations, but also reveals additional dimensions of user behavior that we might not yet have accounted for. For instance, the presence of photos in one area suggest a space for sightseeing. On the other and a strong cellphone network activity of roamers indicate another type of tourist activity such as relaxing. A current challenge is to understand more precisely the user-activity that is reflected in each of these types of datasets. Conclusion The explosion in the use of mobile and captures devices (e.g. mobile phone, digital cameras) and the emergence of content sharing platforms is leading to the emergence of a wealth of publicly available user-generated geospatial data. Our first case study specifically featured the value of Flickr and its geographically reference photos with the goal of performing urban and mobility analysis of visitors. This exploratory analysis enabled to quantify by the amount of photos taken and the presence of individuals the attractiveness over time of the major points of interests of an urban space. This work shows that there might be potential in taking advantage of the digital traces people constantly leave behind them. It could, for instance, reveal the temporal

6 Fabien Girardin, Josep Blat character of a space, its attractiveness among a certain group of people, and its level of connectivity with other spaces. This type of insight is limited in most travel survey and sensing infrastructure by privacy-sensitive or aggregated information. Yet these insights have a fluctuating quality as they rely on data generated and disclosed by people. We have observed differences in the accuracy of the location information depending on the region they describe or the cultural background of the volunteers who generated the content. In this paper we have suggested that a detailed study of the individual and social practices behind these data is necessary in order to assess the relevance of the user-generated content to inform on tourist dynamics. References 1. Wolf, J. Applications of new technologies in travel surveys. In 7th International Conference on Travel Survey Methods, Costa Rica. (2004). 2. Goodchild, M. F. Citizens as voluntary sensors: Spatial data infrastructure in the world of web 2.0. International Journal of Spatial Data Infrastructures Research 2 (2007), 24 32. 3. Rattenbury, T., Good, N., and Naaman, M. Towards automatic extraction of event and place semantics from flickr tags. In SIGIR 07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA, 2007), ACM Press, pp. 103 110. 4. Torniai, C., Battle, S., and Cayzer, S. Sharing, discovering and browsing geotagged pictures on the web. Tech. rep., HP Labs, 2007. 5. Girardin, F., Fiore, F. D., Ratti, C., and Blat, J. Leveraging explicitly disclosed location information to understand tourist dynamics: A case study. Journal of Location-Based Services. In print (2008). 6. Girardin, F., and Blat, J. Place this photo on a map: A study of explicit disclosure of location information. Late Breaking Result at Ubicomp 2007, September 2007.