Exploring Urban Areas of Interest
Yingjie Hu and Sathya Prasad
What are Urban Areas of Interest (AOIs)?
- Urban AOIs exist in people's minds and are defined by people's behaviors.
- They are essentially fuzzy.
- Different people may have different opinions.
- Can we identify urban AOIs that most people generally agree on? How?
One possible data source: remote sensing data
Unfortunately, remote sensing data don't record people's interests.
[Figures: a remote sensing (RS) image of Shanghai; a photo of Shanghai]
Another possible data source: questionnaire survey
- Labor-intensive and time-consuming
- Sample question: "Please tell us the areas you consider interesting in the city, and draw them on the map."
Social media data
Social media data record people's interactions with the urban environment.
They can be retrieved efficiently from public APIs.
Many social media data contain location information, for example:
- Geotagged Tweets
- Geotagged Flickr photos
- Foursquare check-ins
Project
- Develop an automated workflow to extract AOIs from social media
- Show the evolution of AOIs in different cities in different years
- Understand AOIs
- Explore potential applications of AOIs
Why Flickr data?
- Reflect locations people consider interesting
- Cover a timespan of the past 10 years
- Publicly available through APIs
- Large number of users (around 100 million)
Which parts of Flickr data are used in this project?
- General metadata: locations, time, photo id, owner id, server id, etc.
- Text tags: what are people talking about here?
- Photos: what are people looking at here?
Project Stage 1: data retrieval
Cities: New York, London, Paris, Shanghai, Mumbai, Dubai
Timespan: 2004-2014
Method: Flickr public API

City        # Users    # Photos
New York      2,751   2,761,542
London        2,357   2,876,013
Paris         3,019   1,456,298
Shanghai      1,775     254,123
Mumbai        1,901      55,532
Dubai         2,176      89,457
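The slides only state that photos were retrieved through the Flickr public API. As a minimal sketch of what one such request could look like, the snippet below builds a URL for the `flickr.photos.search` REST method restricted to a bounding box and one year. The choice of this method, the helper name `build_search_url`, the bounding box, and the API key placeholder are all assumptions for illustration, not the authors' actual retrieval code.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, bbox, year, page=1, per_page=250):
    """Build a flickr.photos.search URL for geotagged photos taken in one year.

    bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    This helper is hypothetical; it only assembles the query string.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": ",".join(f"{v:.6f}" for v in bbox),
        "min_taken_date": f"{year}-01-01 00:00:00",
        "max_taken_date": f"{year}-12-31 23:59:59",
        "has_geo": 1,
        # Ask for coordinates, tags, and capture time with each photo record.
        "extras": "geo,tags,date_taken",
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)
```

A caller would loop over pages and years, for example `build_search_url("YOUR_API_KEY", (2.22, 48.81, 2.47, 48.91), 2014, page=1)` for a rough, assumed bounding box around Paris, and fetch each URL with an HTTP client.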
Project Stage 2: extracting AOIs from Flickr data
Goal: identify AOIs (clusters) from photo locations (points)
Data: Flickr metadata, including locations, time, user id, etc.
Challenges of Flickr data:
- Biased: not representative of the entire population; active users vs. inactive users.
- Noisy: errors exist in the user-specified locations.
- Varied: different years and cities may have very different numbers of photos.
Project Stage 2: extracting AOIs from Flickr data
How we handle these challenges:
- Bias: reduce bias by removing additional photos taken by the same user within a radius of 200 meters, so that each user counts once per location.
- Noise: choose a clustering algorithm that is robust to noisy data.
- Variation: detect AOIs based on the percentage of people rather than on absolute photo counts.
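The bias-reduction step can be sketched as a greedy filter: a photo is kept only if the same user has no already-kept photo within 200 meters. This is one possible reading of the rule on the slide, not necessarily the authors' implementation; the function names, the `(user_id, lat, lon)` record layout, and the use of haversine distance are assumptions.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def thin_photos(photos, radius_m=200.0):
    """Keep at most one photo per user within any radius_m neighborhood.

    photos: iterable of (user_id, lat, lon) tuples (assumed layout).
    A photo is dropped when the same user already has a kept photo
    no farther than radius_m away. Input order decides which one survives.
    """
    kept_by_user = defaultdict(list)
    kept = []
    for user_id, lat, lon in photos:
        if all(haversine_m(lat, lon, klat, klon) > radius_m
               for klat, klon in kept_by_user[user_id]):
            kept_by_user[user_id].append((lat, lon))
            kept.append((user_id, lat, lon))
    return kept
```

After this step, each remaining point stands roughly for one user's presence at one location, which is what lets the later clustering threshold be expressed as a share of users.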
Project Stage 2: extracting AOIs from Flickr data
Method: DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
Advantages of DBSCAN for this problem:
- Doesn't require a predetermined number of clusters k
- Clusters can have arbitrary shapes
- Robust to noise
[Figures: K-means vs. DBSCAN clustering results, two examples]
Project Stage 2: extracting AOIs from Flickr data
Two parameters of DBSCAN:
- Search radius:
  - A larger radius produces larger clusters
  - We choose 200 meters to extract neighborhood-level AOIs
- Minimum number of points within the radius:
  - The minimum requirement for a cluster to form
  - A larger number produces fewer clusters
  - We choose 2% of all Flickr users, based on experiments
- Together, the two parameters determine the meaning of an AOI
  - In this project, AOIs are city regions that have been visited by at least 2% of all people who took photos in that city in that year.
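To make the two parameters concrete, here is a small pure-Python sketch of DBSCAN with the search radius and minimum point count described above. It is an illustration, not the project's code: it assumes point coordinates are already projected to meters, it uses a brute-force neighbor search that is only practical for small inputs, and the names `dbscan`, `region_query`, and `min_pts_for_share` are made up for this example.

```python
import math

NOISE = -1


def min_pts_for_share(n_users, percent=2):
    """Minimum points per neighborhood as a percentage of all users (at least 1)."""
    return max(1, math.ceil(n_users * percent / 100))


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if math.hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps=200.0, min_pts=4):
    """Label each (x, y) point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point of a cluster
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels
```

With 200 users in a city and year, `min_pts_for_share(200)` gives 4, so a location becomes part of an AOI only if at least 4 thinned points fall within 200 meters of it. Raising the percentage yields fewer, denser AOIs.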
Project Stage 2: extracting AOIs from Flickr data
- Apply DBSCAN to extract clusters
- Use the chi-shape algorithm to form a concave hull around each cluster
[Figure labels: User Percentage, DBSCAN, Chi-shape]
Project Stage 2: extracting AOIs from Flickr data
Convex hull vs. concave hull
[Figures: convex hull; concave hull]
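One way to see why a concave hull is preferred is to measure how much empty space a convex hull adds around a non-convex cluster. The sketch below computes a convex hull with Andrew's monotone chain algorithm and compares its area with that of an L-shaped outline. It does not implement the chi-shape algorithm named in the slides; the function names and the toy coordinates are assumptions for illustration.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Toy L-shaped cluster outline (hypothetical coordinates).
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
hull = convex_hull(l_shape)
```

For this toy outline the L-shaped region has area 7.0, while its convex hull has area 11.5, so the convex hull would claim a sizeable area where no photos were taken. A concave hull follows the cluster boundary more closely.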
Project Stage 3: understanding AOIs
What are people talking about in these AOIs?
Data: text tags attached to photos
Challenge: some tags are common to many AOIs
- E.g., "Paris" and "France" are very common in AOIs in the city of Paris
Goal: highlight the tags that characterize a local AOI while down-weighting the common ones
Method: term frequency-inverse document frequency (TF-IDF)
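A minimal sketch of the TF-IDF idea, treating each AOI as one "document" whose words are the tags of its photos. The weighting used here (relative tag frequency times the natural log of the number of AOIs over the number of AOIs containing the tag) is one common variant and may differ from the one used in the project; the function names and the sample tags are hypothetical.

```python
import math
from collections import Counter


def tfidf_by_aoi(tags_by_aoi):
    """Return {aoi: {tag: tf-idf score}} for a mapping of AOI name to tag list."""
    n_aois = len(tags_by_aoi)
    # Document frequency: in how many AOIs does each tag appear at least once?
    doc_freq = Counter()
    for tags in tags_by_aoi.values():
        doc_freq.update(set(tags))
    scores = {}
    for aoi, tags in tags_by_aoi.items():
        counts = Counter(tags)
        total = len(tags)
        scores[aoi] = {
            tag: (count / total) * math.log(n_aois / doc_freq[tag])
            for tag, count in counts.items()
        }
    return scores


def top_tags(scores, aoi, k=3):
    """The k highest-scoring tags of one AOI (ties broken alphabetically)."""
    return sorted(scores[aoi], key=lambda t: (-scores[aoi][t], t))[:k]


# Hypothetical tags for three AOIs in Paris.
sample = {
    "eiffel": ["paris", "france", "eiffel", "tower", "eiffel"],
    "louvre": ["paris", "france", "louvre", "museum"],
    "notredame": ["paris", "france", "cathedral", "notredame"],
}
sample_scores = tfidf_by_aoi(sample)
```

Because "paris" and "france" occur in every AOI, their inverse document frequency is log(1) = 0 and they drop out, while a tag such as "eiffel" that is frequent in only one AOI ranks first there.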
Project Stage 3: understanding AOIs
What are people talking about in these AOIs?
Two examples produced by the algorithm:
[Figures: Eiffel Tower area, Paris, 2014; Times Square area, New York, 2005]
Project Stage 3: understanding AOIs
What are people looking at in these AOIs?
Data: Flickr photos
Challenges:
- Photos taken in an area vary unpredictably in quality
- A huge number of photos to process
  - 1,456,298 in Paris
  - 2,761,542 in New York
Project Stage 3: understanding AOIs
What are people looking at in these AOIs?
Goal: automatically select photos that represent the views most people prefer, while removing more personal photos.
Method:
- Human face detection
- Image similarity comparison
- Image clustering
Project Stage 3: understanding AOIs
What are people looking at in these AOIs?
An example of how the algorithms work
[Figure labels: face detection, representative image, noisy image]
Demo http://maps.esri.com/sp_demos/urbanaois
Project Summary
- Developed a reusable, automated software program that can be applied to point datasets in different domains
- The derived AOIs are objective and based on crowdsourced data (for different cities in different years)
- Revealed the historical evolution of AOIs over space and time
- The derived AOIs can be used in geodesign, location analytics, and other applications
- The developed application is available online