INTEGRATING AUTHORITATIVE AND VOLUNTEERED GEOGRAPHIC INFORMATION - AN ONTOLOGICAL APPROACH
Crowd Sourcing in National Mapping: Internship Funding Activity Workshop
Leuven (Belgium), 14th May 2013
Jimena Martínez Ramos [jmr125@mun.ca]
2 Table of contents
1. Background
2. Problem
3. Objective
4. Proposed approach
5. Semantics in OSM datasets (ongoing work)
6. Conclusions and future work
3 Background: The need to integrate Geographic Information from different sources
National Mapping Agencies (NMAs) are likely to find it difficult to justify the costs of traditional data maintenance mechanisms.
VGI projects (like OpenStreetMap) are growing and are seen as a good data source to be integrated with authoritative datasets.
There is a growing need to integrate different data sources.
Semantic interoperability is still an issue in the integration problem.
Levels of heterogeneity: System, Syntactic, Structural, Semantic (meaning of words).
(*) http://ggim.un.org/ (UN Report, 2012)
4 The Problem: Semantic Heterogeneity in Geographic Information
Words or symbols stand for things through ideas (Ogden and Richards, 1923).
Different conceptualizations lead to semantic heterogeneities.
[Figure: triangle relating a symbol ("freeway", "motorway", "trunk", "turnpike") to the reality it stands for (a road).]
5 The Problem: Semantic Heterogeneity in Geographic Information
Post-process: ONTOLOGIES! Matching symbols with the same meaning.
Pre-process: STANDARDS! Everybody thinking the same.
[Figure: the symbols "trunk", "turnpike" and "freeway" all pointing to the same reality (a road).]
6 The Problem: What are ontologies about?
An ontology is an "explicit specification of a conceptualization" (Gruber, 1993).
Ontologies are ways to conceptualize a domain.
Statements take the form Subject, Predicate, Object (Class, Properties, (Sub)class):
Transportation isSuperClassOf Road
Road isSubClassOf Transportation
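The subject, predicate, object pattern on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `Motorway` class and the `superclasses` helper are hypothetical additions, and a real ontology would normally be written in RDFS or OWL rather than as Python tuples.

```python
# Triples as (subject, predicate, object), following the slide's example.
triples = {
    ("Road", "isSubClassOf", "Transportation"),
    ("Transportation", "isSuperClassOf", "Road"),
    # Assumed extra class, added only to show that the relation chains.
    ("Motorway", "isSubClassOf", "Road"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following isSubClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "isSubClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


print(superclasses("Motorway", triples))
```

Because the class hierarchy is explicit, a program can conclude that a motorway is a kind of transportation feature even though no single triple says so directly.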
7 Objective: Ontology and Standard approach to semantic integration
Usual approach to geodata integration using ontologies: each dataset has its own ontology (Dataset 1, "freeway": Ontology 1; Dataset 2, "turnpike": Ontology 2; VGI (OSM), "motorway": Ontology 3).
Ontology and Standard approach: the datasets are matched to a common conceptualization (standard ontology) in which Freeway = Turnpike = motorway, giving interoperable datasets.
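8 Proposed Approach: The method according to the objectives
Each source is converted to RDF through its own R2RML mapping:
Official source 1 ("freeway"): R2RML mapping, RDF
Official source 1 ("turnpike"): R2RML mapping, RDF
OpenStreetMap ("trunk", "motorway"): R2RML mapping, RDF
The mappings target a standard domain ontology (common knowledge), so that the different terms refer to the same reality.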
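The idea of mapping each source to a common pivot can be sketched as follows. This is not R2RML syntax (R2RML is the W3C language for mapping relational data to RDF); it is a minimal Python emulation of the same idea, one mapping table per source. The namespaces, the source names, the row identifiers and the common class name `Motorway` are all hypothetical, chosen only for illustration.

```python
# Hypothetical namespaces; a real project would use its own IRIs.
DATA = "http://example.org/data/"
ONTO = "http://example.org/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# One mapping per source: local term -> class in the common domain ontology.
# Source names and the class name "Motorway" are assumptions for this sketch.
MAPPINGS = {
    "official_a": {"freeway": "Motorway"},
    "official_b": {"turnpike": "Motorway"},
    "osm": {"motorway": "Motorway"},
}


def to_ntriples(source, rows):
    """Turn (row_id, local_term) rows into N-Triples typed with the common class."""
    lines = []
    for row_id, term in rows:
        common_class = MAPPINGS[source].get(term)
        if common_class is None:
            continue  # unmapped terms are skipped rather than guessed
        subject = f"<{DATA}{source}/{row_id}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{ONTO}{common_class}> .")
    return lines


# Invented rows, one per source.
for source, rows in [("official_a", [(1, "freeway")]),
                     ("official_b", [(7, "turnpike")]),
                     ("osm", [(42, "motorway")])]:
    print("\n".join(to_ntriples(source, rows)))
```

Each source only needs a mapping to the pivot, so adding a new dataset means writing one mapping instead of one per existing dataset, and the mappings can be reused.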
9 Ongoing work: Semantic Heterogeneity in OSM datasets
More than one tag is used per real-world phenomenon (synonymy).
The number of tags per phenomenon increases with the scale, and their percentage of occurrence is significant.
[Figure: shares of <highway=bus_stop> and <public_transport=stop_position> in Quebec and in St. John's.]
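The percentage of occurrence of each tag for one phenomenon can be computed with a simple count. The sketch below uses invented sample data, not the Quebec or St. John's figures from the slide, and the `tag_shares` helper is a hypothetical name.

```python
from collections import Counter


def tag_shares(tags):
    """Return {tag: percentage of occurrences} for one phenomenon in one area."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: 100.0 * n / total for tag, n in counts.items()}


# Invented sample: tags found on bus stops in a hypothetical area.
sample = ["highway=bus_stop"] * 3 + ["public_transport=stop_position"]
shares = tag_shares(sample)

for tag, pct in shares.items():
    print(f"{tag}: {pct:.1f}%")
```

Running the same function on extracts of different extents would show how the number of synonymous tags and their shares change with spatial scale.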
10 Ongoing work: Semantic Heterogeneity in OSM datasets
More than one tag is used per real-world phenomenon (synonymy).
The number of tags per phenomenon evolves with time, and their percentage of occurrence is still significant.
[Figure: use of <highway=bus_stop> and <public_transport=stop_position> through time (2006, 2008), annotated "Decreasing level of agreement" and "Increasing agreement through time".]
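One way to follow the level of agreement through time is to track, for each year, the share of the most frequently used tag. This measure is an assumption made for this sketch, not necessarily the one used in the study, and the observations are invented.

```python
from collections import Counter, defaultdict


def agreement_by_year(observations):
    """Map each year to the share (0..1) of its most frequent tag."""
    per_year = defaultdict(Counter)
    for year, tag in observations:
        per_year[year][tag] += 1
    return {
        year: max(counts.values()) / sum(counts.values())
        for year, counts in sorted(per_year.items())
    }


# Invented observations (year, tag), not real OSM counts.
obs = [
    (2006, "highway=bus_stop"),
    (2006, "highway=bus_stop"),
    (2008, "highway=bus_stop"),
    (2008, "public_transport=stop_position"),
]
result = agreement_by_year(obs)
print(result)
```

A value of 1.0 means every contributor used the same tag that year; lower values indicate that synonymous tags coexist.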
11 Conclusions and future work
The proposed approach is based on a domain ontology, which allows:
matching datasets to a common pivot (R2RML allows flexible and direct mappings);
working without needing to know how to handle ontologies;
reusing the mappings.
Semantic heterogeneity (SH) in OSM datasets:
it shows in the number of tags and their percentage of occurrence per real-world phenomenon;
time and spatial scale are factors affecting SH in OSM datasets.
Future work:
developing the ontology further;
a user-friendly interface for making R2RML mappings;
a deeper study of the factors involved in SH in OSM datasets, with the aim of modelling it.
12 Thank you. Gracias. Questions?
Acknowledgements: AGILE/EuroSDR, NSERC, TU Delft, IGN Spain, Sinfogeo Ltd., Dr. Jean Brodeur, Drs. Marian de Vries, Marine and Geomatics Lab colleagues.
PNOA aerial images, IGN Spain. Icons by http://dryicons.com
Jimena Martinez [jmr125@mun.ca]