Automatically generating keywords for georeferenced images

Similar documents
Comparing Flickr tags to a geomorphometric classification. Christian Gschwend and Ross S. Purves

Measuring topographic similarity of toponyms

GIS and Remote Sensing

Digitization in a Census

Citation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n.

Canadian Board of Examiners for Professional Surveyors Core Syllabus Item C 5: GEOSPATIAL INFORMATION SYSTEMS

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME:

Grant Agreement No. EIE/07/595/SI BEn

Spatializing time in a history text corpus

A General Framework for Conflation

ADDING METADATA TO MAPS AND STYLED LAYERS TO IMPROVE MAP EFFICIENCY

GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research

Introduction INTRODUCTION TO GIS GIS - GIS GIS 1/12/2015. New York Association of Professional Land Surveyors January 22, 2015

Spatial Data Science. Soumya K Ghosh

Spatial Information Retrieval

How to Construct Urban Three Dimensional GIS Model based on ArcView 3D Analysis

Land-Line Technical information leaflet

From Research Objects to Research Networks: Combining Spatial and Semantic Search

Introduction to GIS I

Improving Map Generalisation of Buildings by Introduction of Urban Context Rules

SPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM

Finding geodata that otherwise would have been forgotten GeoXchange a portal for free geodata

Why Is Cartographic Generalization So Hard?

The Combination of Geospatial Data with Statistical Data for SDG Indicators

An Ontology-based Framework for Modeling Movement on a Smart Campus

Remote Sensing and GIS Applications for Hilly Watersheds SUBASHISA DUTTA DEPARTMENT OF CIVIL ENGINEERING IIT GUWAHATI

ENGRG Introduction to GIS

Assessing pervasive user-generated content to describe tourist dynamics

Creating a Geodatabase for Tourism Research in the Sultanate of Oman

GIS Applications on Environmental Education in Taiwan

NR402 GIS Applications in Natural Resources

ENGRG Introduction to GIS

Spatial Decision Tree: A Novel Approach to Land-Cover Classification

Introduction to Geographic Information Science. Updates/News. Last Lecture 1/23/2017. Geography 4103 / Spatial Data Representations

GIS for ChEs Introduction to Geographic Information Systems

Urban-Rural spatial classification of Finland

Globally Estimating the Population Characteristics of Small Geographic Areas. Tom Fitzwater

Principles of IR. Hacettepe University Department of Information Management DOK 324: Principles of IR

Automated Assessment and Improvement of OpenStreetMap Data

INTEGRATION OF HIGH RESOLUTION QUICKBIRD IMAGES TO GOOGLEEARTH

Advanced Landcover Prediction and Habitat Assessment (ALPHA) 1.0 metadata

Beyond points: How to turn SMW into a complete Geographic Information System

GEOGRAPHIC INFORMATION SYSTEMS

Techniques for Science Teachers: Using GIS in Science Classrooms.

Southern African Land Cover ( )

SRJC Applied Technology 54A Introduction to GIS

ACCURACY ASSESSMENT OF ASTER GLOBAL DEM OVER TURKEY

Spanish national plan for land observation: new collaborative production system in Europe

A Basic Introduction to Geographic Information Systems (GIS) ~~~~~~~~~~

TRANSFORMATION THROUGH CLC WITH THE CONTINUOUS RESEARCH TECHNIQUES - GIS (OPEN CODE) AND RS (GEO-WEB SERVICES)

Digital Elevation Models (DEM) / DTM

GeoSUR SRTM 30-m / TPS

2.2 Geographic phenomena

Basics of GIS. by Basudeb Bhatta. Computer Aided Design Centre Department of Computer Science and Engineering Jadavpur University

DEVELOPMENT OF GPS PHOTOS DATABASE FOR LAND USE AND LAND COVER APPLICATIONS

REVIEW MAPWORK EXAM QUESTIONS 31 JULY 2014

Mapping Historical Information Using GIS

Boreal Fen probability - metadata

Identification of Very Shallow Groundwater Regions in the EU to Support Monitoring

Boreal Wetland probability - metadata

Challenges in Geocoding Socially-Generated Data

ENV208/ENV508 Applied GIS. Week 1: What is GIS?

Digital Elevation Models (DEM) / DTM

An Internet-based Agricultural Land Use Trends Visualization System (AgLuT)

INTRODUCTION TO GIS. Dr. Ori Gudes

CHAPTER 1 THE UNITED STATES 2001 NATIONAL LAND COVER DATABASE


VISUALIZATION URBAN SPATIAL GROWTH OF DESERT CITIES FROM SATELLITE IMAGERY: A PRELIMINARY STUDY

Advanced Algorithms for Geographic Information Systems CPSC 695

THE USE OF GEOMATICS IN CULTURAL HERITAGE AND ARCHAEOLOGY FOR VARIOUS PURPOSES

Digital Elevation Models. Using elevation data in raster format in a GIS

Deutsche Gesellschaft für Kartographie e.v.

Sustainable and Harmonised Development for Smart Cities The Role of Geospatial Reference Data. Peter Creuzer

ISO Land Use 1990, 2000, 2010 Data Product Specifications. Revision: A

Quality information for the Digital Agenda EuroGeographics contribution to the Digital Agenda for Europe

What is GIS? Introduction to data. Introduction to data modeling

DATA SOURCES AND INPUT IN GIS. By Prof. A. Balasubramanian Centre for Advanced Studies in Earth Science, University of Mysore, Mysore

Geo372 Vertiefung GIScience. GIS and the internet

Steve Pietersen Office Telephone No

Taxonomies of Building Objects towards Topographic and Thematic Geo-Ontologies

UC Berkeley International Conference on GIScience Short Paper Proceedings

A Case Study for Semantic Translation of the Water Framework Directive and a Topographic Database

The Architecture of the Georgia Basin Digital Library: Using geoscientific knowledge in sustainable development

Comparison in terms of accuracy of DEMs extracted from SRTM/X-SAR data and SRTM/C-SAR data: A Case Study of the Thi-Qar Governorate /AL-Shtra City

Regional GIS Presentation for Small and Large Jurisdictions. Michelle E. Fults GIS Manager January 8, 2009

GLL THE STUDY OF METHODS FOR CORRECTING GLOBAL DIGITAL TERRAIN MODELS USING REMOTE SENSING DATA. I. Kolb, M. Lucyshyn, M. Panek

Evaluation of land surface temperature (LST) patterns in the urban agglomeration of Krakow using different satellite data and GIS

Indexing Structures for Geographic Web Retrieval

IV Course Spring 14. Graduate Course. May 4th, Big Spatiotemporal Data Analytics & Visualization

Deriving place graphs from spatial databases

Site Suitability Analysis for Local Airport Using Geographic Information System

Data Creation and Editing

Web Portal to European Soil Database

Version 1.1 GIS Syllabus

Machine Learning to Automatically Detect Human Development from Satellite Imagery

Using spatial-temporal signatures to infer human activities from personal trajectories on location-enabled mobile devices

Chapter 5. GIS The Global Information System

How the science of cities can help European policy makers: new analysis and perspectives

Software. People. Data. Network. What is GIS? Procedures. Hardware. Chapter 1

USING HYPERSPECTRAL IMAGERY

Transcription:

Automatically generating keywords for georeferenced images Ross S. Purves 1, Alistair Edwardes 1, Xin Fan 2, Mark Hall 3 and Martin Tomko 1 1. Introduction 1 Department of Geography, University of Zurich, Switzerland ross.purves@geo.uzh.ch; ali.edwardes@gmail.com; martin.tomko@geo.uzh.ch 2 Department of Information Studies, University of Sheffield, UK x.fan@sheffield.ac.uk 3 Department of Computing Science, Cardiff University, UK M.M.Hall@cs.cardiff.ac.uk KEYWORDS: georeference, image, keyword, indexing, caption The possibility of automatically annotating information objects such as images with metadata related to their geography is increasingly relevant, as larger volumes of such information are captured with explicit georeferences (e.g. Flickr images). Within the Tripod project research has been carried out as to the extent to which we can automatically derive keywords and captions for images which are captured by cameras capable of automatically storing coordinates and azimuth information, that is to say not only the position from which a picture was taken, but also the direction in which the camera was pointing. Improved tagging of images can improve search, and particularly in commercial contexts, reduce annotation costs which are an important part of the work of image libraries (Iwasaki et al., 2008). In previous work we have described how the Pansofsky-Shatford facet matrix (Table 1) formalises ways in which we might consider carrying out this task (Purves et al., 2008). The matrix has three levels, termed the specific of, the generic of and about. Each level has four associated facets: who, what, where and when. In GIScience the where facet is of particular interest, with the where/ specific of element referring to how we name locations, in other words the toponyms that we assign to a particular location. The where/ about element relates to the qualities that a location might manifest for us, thus, for example, the degree of remoteness or the warmth conveyed by an image. The subject of this paper is though the where/ generic of element, that is to say generic concepts such as church, village and beach. Table 1. The Pansofsky-Shatford facet matrix (Shatford (1986), p. 49) Facets Specific Of Generic Of About Who? Individually named persons, animals, things Kinds of persons, animals, things Mythical beings, abstraction manifested or symbolised by objects or beings What? Individually Actions, Emotions, abstractions Where? When? named events Individually named geographic locations Linear time; dates or periods conditions Kind of place geographic or architectural Cyclical time; seasons, time of day manifested by actions Places symbolised, abstractions manifest by locale Emotions or abstraction symbolised by or manifest by In this paper we set out to describe the stages involved in automatically generating keywords for georeferenced images, and illustrate the methods applied through a small number of examples, before discussing further work which we will report at GISRUK.

2. Methods In order to assign keywords to images based on location, a number of calculations must be carried out. Within the Tripod project we first identify the viewshed of an image, based on the camera parameters, camera location, azimuth and terrain in rural areas. In urban areas the camera location is simply buffered using an empirically derived distance to form an image sector. Having generated a visible geometry, we then sample a variety of spatial datasets and explore the classes found within these datasets. Since different datasets have very different contents, we use a concept ontology (Edwardes et al., 2007) to map between dataset instances and concepts that are used by keywords. Lastly, the complete list of candidate concepts is ranked and filtered to give a final set of candidate keywords. 2.1 Identifying visible area The first stage in identifying the visible area for a particular camera is the extraction of device metadata from the EXIF header of the image. This metadata contains information about the camera settings (e.g. focal length) and is used to determine, for example, the angular width of the visible area. We first identify urban images using landcover data, and then assume that in such regions if a building is visible in the image, it is proximal and assign a buffer of 50m to the sector identified. We identify buildings in images using a simple content-based building detection algorithm. In rural areas, where terrain plays a much more important role in specifying the visible area, we calculate the viewshed for the angularly restricted sector identified from the camera metadata using standard methods and Shuttle Radar Topography Mission (SRTM) DEM data with a resolution of 90m (Figure 1). In previous work we explored the sensitivity of the buffer size and the distance over which viewsheds were calculated, particularly with respect to visibility of point-like objects and found that thresholds of around 1000m were sensible for objects with a width of ~25m (Tomko et al., 2009). Our approach assumes that urban viewsheds are limited by objects, rather than terrain, which is clearly an oversimplification in some cases. 2.2 Linking spatial data to concepts The underlying hypothesis in Tripod is that spatial data will describe many aspects of an image taken at some location, assuming that its focus is somehow geographic. A key challenge is linking multiple datasets to concepts in an extensible way which allows the integration of multiple datasources, and generates keywords which are not specific to individual datasets. The use of different datasets allows the assignment of different types of concepts at different scales. Thus, for example, we can integrate land cover data and topographic data from different National Mapping Agencies or OpenStreetMap. Keywords are assigned by mapping individual items in a dataset to concepts in an ontology, which was generated by exploring user-generated content such as Geograph and contains a range of relationships between concepts (Edwardes et al., 2007). Thus, multiple dataset items might map to the same concept and we can introduce new data at any time to the system. We identify candidate concepts by, in the case of topographic data, producing very high resolution raster representations (typically ~1-5m) where individual footprints are based on estimates of the realworld size of individual classes of objects. All dataset items are assigned unique values, and by intersecting visible areas with concept representations, a list of candidate concepts and their relative areas with respect to the visible area are generated (Figure 1).

Figure 1. Spatial data for a visible area corresponding to the second image shown in Figure 2 SwissTopo 1:200 000 and Corine data are shown here note the viewshed is restricted to a range of 1.5km and is calculated using SRTM 90m data 2.3 Ranking and filtering candidate concepts The simplest way to rank concepts would be simply to use their relative areas. However, this ignores several important aspects. Firstly, not all data are spatially contiguous, and thus landcover-derived concepts will automatically float to the top of any such ranking. Secondly, the salience of a particular concept in a scene is likely to be related to not only its spatial footprint, but also its overall rarity with respect to the scene, and thus its descriptive prominence (Tomko and Purves, 2009). Thus, a tree in the Sahara desert should be assigned more weight than a tree in central Switzerland. Finally, the web provides us with a potential means of assessing how commonly a particular concept is used in a region. By querying the web with toponyms assigned to the visible area through the process of reverse geocoding (e.g. Smart et al., 2009) and concepts, we can explore the web prominence of individual keywords. In the final ranking of concepts, we rank according to area, web prominence and descriptive prominence, filtering ubiquitous concepts which are not commonly used. By combining web and descriptive prominence, we reduce the importance of rare but uninteresting or unphotographed concepts. 3. Exemplar results and discussion Figure 2 illustrates results from the Tripod system for two exemplar images. Keywords were generated in the urban Edinburgh case using Corine landcover data and OpenStreetMap. In the Swiss rural case, keywords were generated using Corine landcover data and a SwissTopo 1:200000 dataset.

college, byway, village loch, meadows, moor, agriculture, shore, reservoir, forest Figure 2. Two images and top-ranked keywords the second image corresponds to Figure 1 For the first image, 2 out of the 3 keywords are appropriate (the image shows a school in Edinburgh) whilst the third is extracted from landcover data, where the possible concepts for discontinuous urban fabric are suburbs and village. In this case, village is clearly inappropriate. For the second image (of a lake in Switzerland) the set of keywords agree well with the image, apart from the somewhat incongruous use of loch. This is because loch s German meaning (hole) which causes it to be highly ranked with local toponyms by the web prominence algorithm. The complete system is designed to automatically generate candidate keywords for images which can be used in both indexing and search. Current work is evaluating the quality of these keywords for large collections, and will be reported on at GISRUK. 5. Acknowledgements This research reported in this paper is part of the project TRIPOD supported by the European Commission under contract 045335. We would also like to gratefully acknowledge contributors to Geograph British Isles, see http://www.geograph.org.uk/credits/2007-02-24, whose work is made available under the following Creative Commons Attribution-ShareAlike 2.5 Licence (http://creativecommons.org/licenses/by-sa/2.5/). Many thanks to the referees for their constructive comments which have improved this paper. References Edwardes, AS, Purves RS, Simone Bircher and, Christian Matyas. (2007). Deliverable 1.4: Concept ontology experimental report. Available at http://tripod.shef.ac.uk/outcomes/public_deliverables/tripod_d1.4.pdf Iwasaki, K, Kanbara, M, Yamazawa, K & Yokoya, N (2008). Construction of extended geographical database based on photo shooting history. In Proceedings of the 2008 International Conference on Content-Based Image and Video Retrieval, ACM, New York, NY, pp. 185-194. Purves, RS, Edwardes, AJ & Sanderson, M. (2008). Describing the Where improving image annotation and search through geography. First Intl. Workshop on Metadata Mining for Image Understanding (MMIU 2008). Smart P, Twaroch F, Tomko M and Jones, C (2009) Deliverable 6.5: Final Toponym Ontology Prototype. Available at http://tripod.shef.ac.uk/outcomes/public_deliverables/tripod_d6.5.pdf Shatford S (1986) Analyzing the subject of a picture: a theoretical approach. Catalogue and Classification Quarterly pp39 62. Tomko, M, Trautwein, F & Purves, RS (2009) Identification of Practically Visible Spatial Objects in Natural Environments. In Proceedings of AGILE 2009, Springer-Verlag, Vienna, Austria.

Tomko, M, & Purves, RS (2009) Venice, City of Canals: Characterizing Regions through Content Classification. Transactions in GIS, pp295-314. Biographies Ross Purves is a lecturer in Zurich. Mark Hall is a PhD student in Cardiff. Xin Fan and Martin Tomko are postdoctoral researchers in Sheffield and Zurich respectively, Alistair Edwardes has moved to new pastures in the Department of Communities and Local Government.