Geographic Knowledge Discovery Using Ground-Level Images and Videos

First Workshop on GeoAI: AI and Deep Learning for Geographic Knowledge Discovery, November 7, 2017
Shawn Newsam, Associate Professor and Founding Faculty, Electrical Engineering & Computer Science, University of California, Merced
Ling Xie, Daniel Leung, Yi Zhu, Eduardo Hernandez, Aaron San Jose, Xueqing Deng
This work was funded in part by a DOE Early Career award, an NSF CAREER award (#IIS-1150115), and a PECASE award. We gratefully acknowledge NVIDIA for their donated hardware.

Images (Pixels) and Location Overhead imagery Ground-level images and videos

Geographic Knowledge Discovery Using Ground-Level Images and Videos Alignment of the geo-problem with a computer vision problem. Is there sufficient training data? Evaluation: often no ground truth is available, since the goal is to map unmapped phenomena. My projects span the emergence of deep learning as the prominent computer vision approach.

Rewind to ~2009 Volunteered Geographic Information (VGI) + Georeferenced Social Multimedia = (Georeferenced) Social Multimedia as VGI

Volunteered Geographic Information Wikipedia: Volunteered Geographic Information (VGI) is the harnessing of tools to create, assemble, and disseminate geographic data provided voluntarily by individuals (Goodchild, 2007).

VGI: OpenStreetMap Sept. 2009 OpenStreetMap is a free editable map of the whole world. It is made by people like you.

GoogleMaps Sept. 2009

VGI: Pop vs. Soda

VGI: Citizen Science: Christmas Bird Count

VGI: Citizen Science: Did You Feel It?

VGI: Geograph Great Britain and Ireland The Geograph Britain and Ireland project aims to collect geographically representative photographs and information for every square kilometre of Great Britain and Ireland, and you can be part of it. Example photo: "Drainage Ditch, Seal Sands Road": View east along a drainage ditch alongside the road to the Seal Sands petrochemical works. Since 2005, 12,781 contributors have submitted 5,563,983 images covering 277,440 grid squares, or 83.5% of the total squares.

VGI: Flickr 178,016,405 geotagged items (Mar. 14, 2012)

Social Multimedia as VGI I stipulate that georeferenced social multimedia can be considered a form of VGI. It is often a serendipitous form, since its original purpose might not have been geographic discovery. The research challenge is how to extract useful geographic information in an automated fashion.

Social Multimedia as VGI Leveraging large collections of georeferenced, community contributed photographs can help solve three knowledge-discovery problems: 1. annotating novel images, 2. annotating geographic locations, and 3. performing geographic discovery. S. Newsam, Crowdsourcing what is where: Community-contributed photos as volunteered geographic information, IEEE Multimedia: Special Issue on Mining Community-Contributed Multimedia, 17(4), pp. 36-45, 2010.

Social Multimedia as VGI: 1a. Annotating Novel Images

Social Multimedia as VGI: 1b. Annotating Novel Images

Social Multimedia as VGI: 2. Annotating Geographic Locations

Social Multimedia as VGI: 3. Geographic Discovery

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 1: Mapping Developed and Undeveloped Regions

Project 1: Mapping Developed and Undeveloped Regions Can community-contributed photos (Geograph Britain and Ireland project) be used to derive a map like the Land Cover Map 2000 (UK Centre for Ecology & Hydrology)?

Project 1: Mapping Developed and Undeveloped Regions
Training: training images -> label images -> feature extraction -> train classifier.
Mapping: target images -> feature extraction -> classify target images -> aggregate labels in 1x1 km tiles.
Outputs: fraction developed map, binary classification map.
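The aggregation step of this pipeline can be sketched as follows. This is a minimal illustration, not the project's code: it assumes each classified image comes with projected coordinates in metres, and the function names and the 0.5 threshold used to turn the fraction map into a binary map are hypothetical choices.

```python
import math
from collections import defaultdict


def tile_index(easting_m, northing_m, tile_size_m=1000.0):
    """Return the (column, row) of the square tile containing a point."""
    return (math.floor(easting_m / tile_size_m),
            math.floor(northing_m / tile_size_m))


def fraction_developed(labeled_images, tile_size_m=1000.0):
    """Aggregate per-image labels into a per-tile fraction developed.

    labeled_images: iterable of (easting_m, northing_m, is_developed).
    Tiles with no images are simply absent from the result.
    """
    counts = defaultdict(lambda: [0, 0])  # tile -> [developed, total]
    for easting, northing, is_developed in labeled_images:
        tile = counts[tile_index(easting, northing, tile_size_m)]
        tile[0] += 1 if is_developed else 0
        tile[1] += 1
    return {tile: dev / total for tile, (dev, total) in counts.items()}


def binary_map(fractions, threshold=0.5):
    """Label a tile developed when its fraction reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    return {tile: frac >= threshold for tile, frac in fractions.items()}


if __name__ == "__main__":
    images = [(100, 200, True), (900, 950, False), (500, 500, True),
              (1500, 200, False)]
    fractions = fraction_developed(images)
    print(fractions)
    print(binary_map(fractions))
```

Tiles without any images get no estimate at all, which is one reason the uneven spatial distribution of photos matters for this kind of map.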

Project 1: Mapping Developed and Undeveloped Regions Image Dataset 1: Flickr Downloaded 920K Flickr images. Distribution for 1x1 km tiles shown to left (log10 scale). 5,420 tiles contain no Flickr images. 4,580 tiles contain an average of 200, a median of 10, and a maximum of 53,840 images.

Project 1: Mapping Developed and Undeveloped Regions Image Dataset 2: Geograph Downloaded 120K images from the Geograph Britain and Ireland project. Distribution for 1x1 km tiles shown to left (log10 scale). Only 614 tiles are without images. 9,386 tiles contain an average of 13, a median of 5, and a maximum of 1,458 images.
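The per-tile statistics quoted for the two datasets (empty tiles, then mean, median, and maximum over occupied tiles) could be computed along these lines. Both datasets' tile counts sum to 10,000 (5,420 + 4,580 and 614 + 9,386), so the sketch takes the total number of tiles as an input. The function name and the input layout are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def tile_count_summary(image_tiles, total_tiles):
    """Summarise how images are spread over tiles.

    image_tiles: one tile identifier per image (any hashable value).
    total_tiles: number of tiles in the study area, including empty ones.
    Mean, median, and max are taken over occupied tiles only.
    """
    per_tile = list(Counter(image_tiles).values())
    summary = {
        "empty_tiles": total_tiles - len(per_tile),
        "occupied_tiles": len(per_tile),
    }
    if per_tile:
        summary.update(mean=mean(per_tile),
                       median=median(per_tile),
                       max=max(per_tile))
    return summary


if __name__ == "__main__":
    tiles = ["a"] * 5 + ["b"] + ["c"] * 3
    print(tile_count_summary(tiles, total_tiles=10))
```

A mean far above the median, as in the Flickr figures (200 vs. 10), indicates that a few tiles hold a large share of the images.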

Project 1: Mapping Developed and Undeveloped Regions Image Features

Project 1: Mapping Developed and Undeveloped Regions Results Figure panels: ground truth maps; maps generated using Flickr images; maps generated using Geograph images.

Project 1: Mapping Developed and Undeveloped Regions Investigated: Photographer intent (Flickr vs. Geograph images). Manual vs. unsupervised learning. Filtering (removing) uninformative photos.

Project 1: Mapping Developed and Undeveloped Regions D. Leung and S. Newsam, Proximate Sensing: Inferring What-Is-Where From Georeferenced Photo Collections, IEEE International Conference on Computer Vision and Pattern Recognition, 2010 (oral presentation).

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 2: Mapping Scenicness

Project 2: Mapping Scenicness Scenic or not website

Project 2: Mapping Scenicness Results

Project 2: Mapping Scenicness Investigated: Gist image features. Graph Laplacian semi-supervised learning to exploit unlabeled images to improve the accuracy of the regressor. Regressor based on a composite visual-geographic location kernel which considers both the visual characteristics and the geographic locations of the images.
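One way to picture how a composite kernel and unlabeled images can work together is the sketch below. It is a simplified stand-in, not the method from the paper: the kernel is assumed to be a product of two Gaussian kernels (one on visual features, one on location), and the semi-supervised step is plain harmonic label propagation on the resulting graph, which minimises the graph Laplacian smoothness term with the labeled scores held fixed. All names and bandwidth values are hypothetical.

```python
import math


def composite_kernel(a, b, sigma_visual=1.0, sigma_geo_km=1.0):
    """Similarity of two images from visual features and location.

    a, b: dicts with 'feat' (list of floats) and 'xy_km' (x, y in km).
    The product-of-Gaussians form is an assumption for this sketch.
    """
    d_vis = sum((p - q) ** 2 for p, q in zip(a["feat"], b["feat"]))
    d_geo = sum((p - q) ** 2 for p, q in zip(a["xy_km"], b["xy_km"]))
    return (math.exp(-d_vis / (2 * sigma_visual ** 2))
            * math.exp(-d_geo / (2 * sigma_geo_km ** 2)))


def propagate_scores(images, known_scores, n_iter=200):
    """Spread scenicness scores from labeled to unlabeled images.

    known_scores: dict mapping image index -> score (must be non-empty).
    Each unlabeled score is repeatedly replaced by the kernel-weighted
    average of all other scores; labeled scores stay clamped.
    """
    n = len(images)
    weights = [[composite_kernel(images[i], images[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    start = sum(known_scores.values()) / len(known_scores)
    scores = [known_scores.get(i, start) for i in range(n)]
    for _ in range(n_iter):
        updated = list(scores)
        for i in range(n):
            if i in known_scores:
                continue
            total = sum(weights[i])
            if total > 0:
                updated[i] = sum(w * s for w, s in zip(weights[i], scores)) / total
        scores = updated
    return scores


if __name__ == "__main__":
    images = [
        {"feat": [0.0, 0.0], "xy_km": (0.0, 0.0)},
        {"feat": [5.0, 5.0], "xy_km": (10.0, 10.0)},
        {"feat": [0.1, 0.0], "xy_km": (0.1, 0.0)},  # unlabeled
    ]
    print(propagate_scores(images, {0: 2.0, 1: 8.0}))
```

An unlabeled photo that both looks like and lies near a labeled one inherits a score close to that neighbour's, which is the intuition behind combining the two kernels.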

Project 2: Mapping Scenicness L. Xie and S. Newsam, IM2MAP: Deriving maps from georeferenced community contributed photo collections, ACM International Conference on Multimedia: Workshop on Social Media, 2011.

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 3: Mapping Land Use

Project 3: Mapping Land Use Results

Project 3: Mapping Land Use Investigated: Convolutional neural networks for feature extraction. CNNs trained on the Places dataset of 7 million labeled images. Integrating indoor/outdoor classification. Semi-supervised learning through keyword search on Flickr to augment the training dataset.

Project 3: Mapping Land Use Y. Zhu and S. Newsam, Land use classification using convolutional neural networks applied to ground-level images, ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS), 2015. (Winner best poster award.)

Project 3: Mapping Land Use D. Leung and S. Newsam, Can off-the-shelf object detectors be used to extract geographic information from geo-referenced social multimedia? ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS): Workshop on Location Based Social Networks, 2012.

Project 3: Mapping Land Use Extending to fine-grain city-scale land use classification

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 4: Mapping Public Sentiment

Project 4: Mapping Public Sentiment Considered six emotions: anger, disgust, fear, joy, sadness, and surprise. Used convolutional neural networks to label individual images. Applied to 1.7M images of San Francisco over the ten-year period from 2006 to 2015. Performed spatial and spatio-temporal hotspot detection.
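The slides do not name the hotspot statistic. As an illustration only, the sketch below applies the Getis-Ord Gi* statistic, a commonly used choice for spatial hotspot detection, to a grid of per-cell emotion ratios (for example, the fraction of images in a cell labeled "joy"). The grid layout, the 3x3 binary neighbourhood, and the function name are assumptions.

```python
import math


def getis_ord_gi_star(grid):
    """Return a grid of Gi* z-scores for a 2-D list of cell values.

    Uses binary weights over the 3x3 window centred on each cell
    (the cell itself included). Large positive values suggest hot
    spots, large negative values cold spots.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    z = [[0.0] * cols for _ in range(rows)]
    if n < 2:
        return z
    mean = sum(values) / n
    spread = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    if spread == 0.0:
        return z  # constant grid: no hot or cold spots
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            w_sum = len(window)  # binary weights: sum(w) == sum(w^2)
            numerator = sum(window) - mean * w_sum
            denominator = spread * math.sqrt(
                (n * w_sum - w_sum ** 2) / (n - 1))
            z[r][c] = numerator / denominator if denominator > 0 else 0.0
    return z


if __name__ == "__main__":
    ratios = [[0.0] * 5 for _ in range(5)]
    for r in range(2):
        for c in range(2):
            ratios[r][c] = 1.0  # synthetic cluster, for illustration only
    for row in getis_ord_gi_star(ratios):
        print([round(v, 2) for v in row])
```

A spatio-temporal variant would extend the neighbourhood to adjacent time steps as well as adjacent cells; that extension is not shown here.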

Project 4: Mapping Public Sentiment Results: Disgust spatial hotspots (red=high, blue=low)

Project 4: Mapping Public Sentiment Results: Joy spatial hotspots (red=high, blue=low)

Project 4: Mapping Public Sentiment Results: Joy Spatio-temporal

Project 4: Mapping Public Sentiment Results: Joy Localized Temporal Trend AT&T Park, home of the SF Giants baseball team. Numbers at the top of the bars indicate the team's end-of-season rankings. Notice the correlation!

Project 4: Mapping Public Sentiment Results: Disgust Localized Temporal Trend Mission neighborhood in SF. Left is disgust ratio by year. Right is median house price by year. Notice the correlation!

Project 4: Mapping Public Sentiment Y. Zhu and S. Newsam, Spatio-temporal sentiment hotspot detection using geotagged photos, ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS), 2016. (Best fast forward presentation runner up.)

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 5: Mapping Pet Ownership

Project 5: Mapping Pet Ownership Convolutional neural network to detect cats or dogs in images. 1+ million Flickr images of San Francisco. Compared image-based detection with text-based detection using user-supplied tags.
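A comparison between the two detection routes could be tallied as sketched below. The tag vocabularies, the record layout, and the function names are hypothetical, and the image labels are assumed to come from a separate detector that is not shown.

```python
# Hypothetical tag vocabularies; the actual keywords are not given in the slides.
DOG_TAGS = {"dog", "dogs", "puppy"}
CAT_TAGS = {"cat", "cats", "kitten"}


def text_detect(tags):
    """Return the set of pets implied by a photo's user-supplied tags."""
    normalized = {tag.strip().lower() for tag in tags}
    found = set()
    if normalized & DOG_TAGS:
        found.add("dog")
    if normalized & CAT_TAGS:
        found.add("cat")
    return found


def compare_detections(photos):
    """Count agreement between text-based and image-based detection.

    photos: iterable of dicts with 'tags' (list of str) and
    'image_labels' (set of 'dog'/'cat' from an image detector).
    """
    summary = {pet: {"image_only": 0, "text_only": 0, "both": 0}
               for pet in ("dog", "cat")}
    for photo in photos:
        from_text = text_detect(photo["tags"])
        from_image = set(photo["image_labels"])
        for pet, tally in summary.items():
            in_text, in_image = pet in from_text, pet in from_image
            if in_text and in_image:
                tally["both"] += 1
            elif in_image:
                tally["image_only"] += 1
            elif in_text:
                tally["text_only"] += 1
    return summary


if __name__ == "__main__":
    photos = [
        {"tags": ["Dog", "park"], "image_labels": {"dog"}},
        {"tags": ["sunset"], "image_labels": {"dog"}},
        {"tags": ["kitten"], "image_labels": set()},
    ]
    print(compare_detections(photos))
```

The "image_only" count captures photos that show a pet but were never tagged as such, which is where an image-based detector can add detections that tags miss.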

Project 5: Mapping Pet Ownership Results Image-based resulted in twice as many detections as text-based. More dog detections. Similar spatial distributions for image- and text-based. Dogs detected where there are parks, cats detected in residential areas. Dog detections more concentrated.

Project 5: Mapping Pet Ownership Winner of undergraduate category of ACM Student Research Competition at SIGSPATIAL 2016.

Geographic Knowledge Discovery Using Ground-Level Images and Videos Project 6: Mapping Human Activity

Project 6: Mapping Human Activity Eight sports: baseball, basketball, football, golf, racquetball, soccer, swimming, and tennis. Also, parade and street fight. 265K georeferenced YouTube videos of San Francisco. Convolutional neural network with both spatial and temporal streams.
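The slides do not say how the spatial and temporal streams are combined. A common choice in two-stream networks is late fusion, averaging the class probabilities of the two streams, and the sketch below assumes that. The class list comes from the slide; the function names and the equal weighting are hypothetical.

```python
import math

CLASSES = ["baseball", "basketball", "football", "golf", "racquetball",
           "soccer", "swimming", "tennis", "parade", "street fight"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Late fusion: weighted average of the two streams' probabilities.

    Returns (predicted class name, fused probability list).
    The 0.5 default weight is an assumption for this sketch.
    """
    spatial = softmax(spatial_logits)
    temporal = softmax(temporal_logits)
    fused = [(1 - temporal_weight) * s + temporal_weight * t
             for s, t in zip(spatial, temporal)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best], fused


if __name__ == "__main__":
    spatial = [1.0] + [0.0] * 9          # appearance weakly suggests baseball
    temporal = [0.0] * 10
    temporal[CLASSES.index("golf")] = 5.0  # motion strongly suggests golf
    label, probs = fuse_streams(spatial, temporal)
    print(label, round(max(probs), 3))
```

Per-video predictions produced this way can then be mapped using each video's georeference.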

Project 6: Mapping Human Activity Results

Project 6: Mapping Human Activity Y. Zhu, S. Liu, and S. Newsam, Large-Scale Mapping of Human Activity using Geo-Tagged Videos, ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS), 2017. Please visit Yi's poster on Wednesday evening for more information.

Geographic Knowledge Discovery Using Ground-Level Images and Videos Opportunities: Map phenomena not observable through other means. Challenges: Complex, noisy data. Uneven spatial distribution of data. Obtaining labeled training data, particularly for deep learning-based methods (don't need the location of training data though). Lack of ground truth for evaluation.

Geographic Knowledge Discovery Using Ground-Level Images and Videos Future directions: Integrating other georeferenced social media such as Twitter, Facebook, etc. Continue to exploit advances in computer vision. Exploit spatial correlations, etc. Use geo-problems to motivate computer vision research.

Acknowledgements Ling Xie, Daniel Leung, Yi Zhu, Eduardo Hernandez, Aaron San Jose, Xueqing Deng. This work was funded in part by a DOE Early Career award, an NSF CAREER award (#IIS-1150115), and a PECASE award. We gratefully acknowledge NVIDIA for their donated GPU hardware.