Semantic Annotation of Geographic Information

Geoinformatics. Semantic Annotation of Geographic Information. Inaugural dissertation for the degree of Doctor of Natural Sciences (Doktor der Naturwissenschaften), submitted to the Department of Geosciences of the Westfälische Wilhelms-Universität Münster by Eva Marie Klien from Essen, April 2008

Dean: Prof. Dr. Hans Kerp
First reviewer: Prof. Dr. Werner Kuhn
Second reviewer: Prof. Dr. Michael Gould
Date of oral examination: .....
Date of doctorate: .....

Abstract

We address the question of how to explicate the meaning of terms that carry geographic information (GI), i.e. we deal with semantic annotations of geographic features. Semantic annotation establishes a link between a feature type and a formal category description. To evaluate the validity of using a category for annotation, it is necessary to either know or determine that the features (that are instances of the feature type) represent members of that category. For example, the category FLOODPLAIN provides a formal characterization for particular floodplains. Feature instances (seen as information objects) that represent floodplains are then validly annotated with the category FLOODPLAIN. Manual annotation is error-prone if the person who annotates is not familiar either with the conceptualization captured in the category descriptions or with the content of the data source. Further, manual annotation often considers only one perspective on the information source, thus disregarding its potential applicability in different contexts. Existing methods to automatically support annotation are mostly based on text analysis, and their algorithms rely on statistics and heuristics. Their benefit in reliably assessing semantic interoperability is limited, since they do not take into account the underlying conceptualizations. In this thesis, we develop a method for evaluating the validity and extensibility of semantic annotations that have been generated either manually or with text-based methods. This is done by evaluating the plausibility of the represented entity's category membership. The proposed method relies on physical qualities and spatial relations for categorizing geographic entities. For example, members of the category FLOODPLAIN are characterized by being flat and adjacent to a river.
Thus it is possible to define rules for category membership that are restricted to the characteristic physical qualities and spatial relations of geographic entities. These rules are formalized as part of geospatial domain ontologies. Since geographic features have a geometric representation as well as a spatial location, it is possible to compute and analyze metric and topological relations of the represented geographic entities. Consequently, we can use the formal rules to evaluate category membership to assess the validity and extensibility of semantic annotations. For this, the formal rules are translated into spatial analysis procedures that can be executed on the feature instances with spatial operators provided by geographic information systems. The outcome of our research is threefold: (i) we define a conceptual model for semantic annotation in GI web service environments, (ii) we provide a formal ontological foundation for the geospatial domain, and (iii) we develop a method to automatically support the annotation process that evaluates the validity of existing annotations and suggests possible new ones. We have implemented a prototype extension to the geographic information system ArcGIS to provide a test environment for the proposed method. With this prototype, we have shown the applicability of category membership evaluation in a scenario for evaluating the validity of FLOODPLAIN annotations that were generated manually.

Contents

Chapter 1 Synopsis: Introduction; Conceptual Model for Semantic Annotation of Geographic Features; Formal Ontological Foundations for the Geospatial Domain; Method for Evaluating and Extending Semantic Annotations; Results and Conclusion; Roadmap Through the Thesis
Chapter 2 Ontology-Based Retrieval of Geographic Information: Introduction; Semantic Heterogeneity Problems in GI Discovery and Retrieval; Semantic Descriptions of Geographic Information; Ontology-Based GI Discovery; Ontology-Based GI Retrieval; Discussion and Related Work; Conclusions and Future Work
Chapter 3 Requirements for Geospatial Ontology Engineering: Introduction; Ontology Application Example; Requirement Specification for Geospatial Ontologies; Discussion and Agenda
Chapter 4 The Role of Spatial Relations in Automating the Semantic Annotation of Geodata: Introduction; Motivation: Ontology-Based Discovery and Retrieval of GI; Spatial Relations; Method for Automating Semantic Annotation; Related Work; Discussion and Future Work
Chapter 5 A Rule-based Strategy for the Semantic Annotation of Geodata: Introduction; General Framework for the Semantic Annotation of Geodata in Web Environments; Semantic Web Technology; Case Study for Semantic Annotation; Related Work; Conclusion and Future Work
Chapter 6 Baseline for Registering and Annotating Geodata in a Semantic Web Service Framework: Introduction; The Semantic Web Service Framework; Approach for Registering and Annotating Geodata; Walk-through for Registering and Annotating WFS in WSMO; Benefits; Conclusion and Future Work
Chapter 7 Category Membership Evaluation for Geographic Entities - Ontological Foundations and Implementation: Introduction; Background on Formal Ontology and DOLCE; Formal Ontological Foundation for the Geospatial Domain; Rules for Concept Membership Evaluation; Conclusion and Future Work
Acknowledgements
Curriculum Vitae

List of Research Papers

Lutz, M. and Klien, E. (2006). Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science (IJGIS) 20(3). Reprinted by permission of the publisher Taylor & Francis Ltd.
Klien, E. and Probst, F. (2005). Requirements for Geospatial Ontology Engineering. In: Toppen, F. and Painho, M. (eds.): Proceedings of the 8th Conference on Geographic Information Science (AGILE 2005), Estoril, Portugal, pp. Reprint with kind permission of ISEGI-UNL, granted 8 April.
Klien, E. and Lutz, M. (2005). The Role of Spatial Relations in Automating the Semantic Annotation of Geodata. In: Cohn, A. and Mark, D. (eds.): Proceedings of the Conference on Spatial Information Theory (COSIT'05), Ellicottville, NY, USA. Lecture Notes in Computer Science, Vol. 3693, pp. Reprint with kind permission of Springer Science and Business Media, granted 9 November.
Klien, E. (2007). A Rule-based Strategy for the Semantic Annotation of Geodata. Transactions in GIS, Special Issue on the Geospatial Semantic Web 11(3). Reprint with kind permission of Wiley-Blackwell Publishing Ltd., granted 7 November.
Klien, E., Fitzner, D. I. and Maué, P. (2007). Baseline for Registering and Annotating Geodata in a Semantic Web Service Framework. In: Wachowicz, M. and Bodum, L. (eds.): 10th Conference on Geographic Information Science (AGILE 2007), Aalborg, Denmark. Reprint by permission of the Editor M. Wachowicz, 10 April.
Klien, E., Probst, F. and Nientiedt, M. (2008). Category Membership Evaluation for Geographic Entities - Ontological Foundations and Implementation. Under review.

Abbreviations and Acronyms

AI: Artificial Intelligence
DL: Description Logics
DOLCE: Descriptive Ontology for Linguistic and Cognitive Engineering
FOL: First Order Logic
GI: Geographic Information
GIS: Geographic Information System
INSPIRE: Infrastructure for Spatial Information in Europe
ISO: International Organization for Standardization
OWL: Web Ontology Language
OGC: Open Geospatial Consortium
SDI: Spatial Data Infrastructure
XML: Extensible Markup Language
WFS: Web Feature Service
WSML: Web Service Modeling Language
WSMO: Web Service Modeling Ontology
SWING: Semantic Web Services Interoperability for Geospatial Decision Making
SWRL: Semantic Web Rule Language

Chapter 1 Synopsis

Abstract. The synopsis gives a general overview of our research. It presents the core achievements and discusses the overall results and contributions. With the synopsis, readers are introduced to the research lines for achieving partly automated support for the semantic annotation of geographic features and are pointed to the respective chapters, which are published or submitted as journal and conference articles, for further reading.

1.1 Introduction

Web service environments are increasingly used to access, integrate, and reuse geographic information (GI) from multiple sources. The vision of enabling seamless information exchange, and the web components of Spatial Data Infrastructures (SDI) as the technological framework for implementing that vision, have also gained momentum at the political level. Currently, several initiatives in Europe, such as INSPIRE (Infrastructure for Spatial Information in Europe), GMES (Global Monitoring for Environment and Security), and GEOSS (Global Earth Observation System of Systems), aim at strengthening the capacities for monitoring the environment and natural resources by making use of GI and GI technologies. These initiatives closely relate to the vision of a Single European Information Space postulated by the European Commission as one of the three research priorities for Europe's information society and media policies (EC 2005). Besides the political momentum, this vision has also been one of the major driving factors behind the increasing interest in semantic interoperability research within the field of geographic information science.

Distributed repositories and processing of GI have brought up the issue of what the contents mean to practitioners trying to assess and combine resources from multiple sources (Kuhn 2005). The potential of web service environments can only be fully exploited if the number of accessible GI sources reaches a critical mass, and if this information can be found and interpreted correctly by the users (Roman and Klien 2007). Searching, retrieving and integrating information is a communication process that involves an information requester and one or several information providers. In this communication process, requester and provider do not interact directly but through the information system, which acts as a broker handling the negotiation between them (Brodeur et al. 2005). Direct personal communication can resolve semantic heterogeneity problems arising from different conceptualizations, terminology, context, and missing information relatively easily. Currently available search algorithms in information systems, however, rely on matching the terms used in data schemas and metadata. They can only account for syntax, not for the underlying conceptualizations. To support the users of information systems in discovering and interpreting GI, as well as to enable seamless information integration across heterogeneous data sources, the meaning of the terms that carry geographic information needs to be made explicit in a formal and machine-processable way (Fonseca and Martin 2007; Sheth 1999).

1.1.1 Rationale

Many approaches in semantic interoperability research employ information system (IS) ontologies for the purpose of capturing the conceptualization of a specific user community (e.g. Arpinar et al. 2004; Halevy et al. 2003). An IS ontology is understood as a logical theory accounting for the intended meaning of a formal vocabulary (Guarino 1998). We follow Masolo (2003, p. 13) in seeing categories "as cognitive artifacts ultimately depending on human perception, cultural imprints and social conventions". Formal category descriptions help make already formed conceptualizations explicit. The formal category descriptions in ontologies can be used to explicate information. We call this explication process semantic annotation. With semantic annotation it becomes possible to associate multiple conceptualizations with the content of a data source and thus to make it findable and usable in many different contexts.

The availability of semantic annotations for GI will considerably widen the scope for assessing semantic interoperability in GI web service environments. Approaches for assessing semantic interoperability that go beyond text analysis and exploit the logic formulas from ontologies have been successfully applied. For example, in this thesis we present approaches for ontology-based discovery that employ subsumption reasoning (see Chapter 2) and query containment checking (see Chapter 6). Other approaches include the semantic similarity measurements developed by Janowicz (2006) and Schwering (2005) and the rule-based discovery proposed by Lutz and Kolas (2007).

Generating annotations with ontologies is difficult and time-consuming, and currently no commonly applied strategy exists for how to annotate GI sources in web service environments. We expect that the ontology-based approaches for assessing semantic interoperability will gain momentum only if the annotation process can be facilitated in a way that (i) makes it more cost-efficient and transparent for the users, and (ii) supports the generation of reproducible annotations for GI that facilitate data usage in different contexts. The overall goal of this thesis is to provide the means to increase the number of semantically annotated GI sources. For this, the user is supported in evaluating the validity of annotations and in extending annotations to other perspectives.
Our research is influenced by theories and methods of two evolving scientific fields: Geographic Information Science, particularly Geospatial Semantics, and the Semantic Web.

1.1.2 Scope

In this thesis, we address the semantic annotation of geographic data, more specifically vector data. The presented method is not restricted to vector data, but has not yet been applied to raster data. Furthermore, we restrict our investigations for a conceptual model of semantic annotation to the realm of GI web service environments, more specifically to the web components of those infrastructures that comply with OGC Web Service (OWS) specifications (OGC 2002) as published by the Open Geospatial Consortium (OGC). The presented method for supporting the annotation process is based on analyzing datasets and is thus not applicable to creating semantic annotations of service functionality. Approaches for semantic annotation of GI service functionality have been investigated e.g. in (Fitzner et al. 2008; Lutz 2006). In the following, we adopt the OGC terminology for geographic information, namely geographic feature for information objects that represent real world entities and that are associated with a location relative to the Earth (OGC 2002).

The scope of the ontologies employed for semantic annotation is restricted to the geospatial domain. We define the geospatial domain to include everything that humans experience and conceptualize in geographic (large-scale) space. For geographic space, we follow the definition from Egenhofer and Mark (1995) stating that geographic space is the large-scale space (in contrast to small-scale space) that can be explored only by navigating in it, and we conceptualize it from multiple views. Consequently, geospatial domain ontologies are understood as ontologies for entities that are located or happen in geographic space (see Section and Section 7.1). In this thesis, we concentrate on entities that are located in geographic space and leave investigations on entities that happen in geographic space for future work.

We employ the term semantic annotation for the process of associating the meaning captured in formal category descriptions of ontologies with geographic features. The resulting semantic annotations are formal and expressive descriptions that can be exploited for assessing semantic interoperability based on logic reasoning. This distinguishes semantic annotation from annotating web resources by tagging. A tag is a keyword that provides valuable information for users of web resources (Hammond et al. 2005). However, tags do not exhibit the expressivity of logic descriptions. They cannot formally account for ambiguities like synonyms and homonyms (Speller 2007), which is required for the more complex reasoning tasks in applications like semantic discovery, seamless information integration and automatic service composition. For a general overview on how we use terms, please consult the glossary at the end of this chapter.

Naming conventions. We use small caps for CATEGORIES; Concepts start with an upper-case letter and entities with a lower-case letter; feature types are set in single quotation marks.

1.1.3 Problems, Research Questions and Hypothesis

We approach the following problems and related research questions in order to contribute to the overall vision of semantically enabled GI web service environments.

A. Missing conceptual model for semantic annotation

The use of ontologies in information systems for the assessment of semantic interoperability between geospatial data sources is far from mature. While more and more geospatial domain ontologies are developed, we observe that only little experience and no well-defined strategy exist for associating the category descriptions in ontologies with the geographic features of a particular data source. We thus identify the need for a conceptual model and a strategy for annotation that allows producing consistent and reproducible annotations. For this, the following research questions need to be addressed:
What are the requirements for semantic annotation in GI web service environments?
How to establish the link between a feature and a formal description of its intended meaning?
What are the components involved in the annotation process and what are the relationships between them?

B. Missing formal ontological foundation for the geospatial domain

Semantic annotations of GI make the underlying conceptualization explicit. The formal descriptions are applied for assessing semantic interoperability in distributed environments. Ontologies that are used for annotation should account for different world views, thus widening the scope for usage of a data source. We call them domain ontologies. At the same time, we want distinct domain ontologies to be comparable. Moreover, they should exhibit a philosophically rigid structure to ensure sound and reproducible annotations.

Foundational ontologies capture the most generic categories in a rigorous way (Masolo et al. 2003). They structure the most generic notions needed for sound ontology engineering (Schneider 2003). If we want to compare the conceptualizations captured in different domain ontologies, we need to align them to this common ground. The alignment also helps to eliminate inconsistencies in existing ontologies. However, geospatial domain ontologies that exhibit this kind of alignment and formal structure and that could be employed for the annotation of GI are not available.

We assume that it is possible to define a formal ontological foundation for the geospatial domain by aligning the most generic categories of the geospatial domain to the upper-level notions of a foundational ontology. This structure provides the common denominator that supports ontology engineers in producing ontologically sound and comparable geospatial domain ontologies for different world views such as geomorphology, hydrology, ecological planning, and tourism. To develop such an ontological foundation for the geospatial domain, we have to approach the following questions:
Which of the currently available foundational ontologies may serve as common denominator for geospatial domain ontologies?
What are the most generic categories of the geospatial domain and how can they be aligned to a foundational ontology?
How to structure the generic categories for the geospatial domain to ensure rigidity through the branches of the ontology?

C. Missing support for evaluating the validity and extensibility of existing semantic annotations

We want to support the generation of ontologically sound and reproducible semantic annotations that make GI findable and usable in many different contexts. We have already specified the problems regarding the reproducibility and ontological soundness of annotations. We now approach the question of how we can provide partly automated support for the user.

Existing methods for automatic annotation support are mostly based on text analysis (Grcar et al. 2007a). They are primarily developed for dealing with textual web resources, and their algorithms rely on statistics and heuristics. Their benefit in reliably assessing semantic interoperability is limited, since they infer semantic annotations without taking into account the underlying conceptualizations.

From our viewpoint, the central step in creating semantic annotations is to establish a link between a feature type and a formal category description. To evaluate the validity of using a category for annotation, it is necessary to either know or determine that the features (that are instances of the feature type) represent members of that category. In most cases, the initial annotation for a data source will be provided by the data modeler and will reflect her interpretation of the represented entities. Manual annotation is especially error-prone if the person who annotates is not familiar either with the conceptualization captured in the domain ontology or with the content of the data source. Manual annotation also often disregards the fact that multiple perspectives on a data source can be beneficial to support the needs of different usages (Bayerl et al. 2004).

We identify the need for a method that supports the users in validating annotations that have been generated either manually or with text-based methods. An additional benefit will be achieved by providing the means to extend validated annotations to reflect different perspectives. For this endeavor, we have to deal with the following questions:
What are the characteristics of geographic entities that can be derived from analyzing the geographic features representing the entities under investigation?
How to formally describe the procedure of category membership evaluation?
What kind of ontology structure is needed to support the extension of annotations to other perspectives by suggesting additional category descriptions to serve as semantic annotations?

Hypothesis

To approach the problems and research questions formulated above, we pursue three major strands of research: (i) we define a conceptual model for semantic annotation in GI web service environments to ensure the reproducibility of the results, (ii) we provide a formal ontological foundation for the geospatial domain to ensure the ontological soundness of the process and its results (i.e. the semantic annotations), and (iii) we develop a method to support the evaluation of the validity of existing annotations and the extension of annotations to other perspectives. These three research strands are reflected in the general hypothesis that has guided our research:

The proposed conceptual model and ontological foundation provide the basis for the definition of reproducible and ontologically sound semantic annotations for geographic features. The proposed method for analyzing the plausibility of category membership provides support (i) for evaluating the validity of annotations and (ii) for extending the annotations to reflect other perspectives.

We have specifically tested and confirmed the applicability of category membership evaluation for assessing the validity of annotations that were generated manually. For this, we have implemented a prototypical test environment as an extension to the geographic information system ArcGIS that combines formal rules encoded in the Web Service Modeling Language (WSML) (de Bruijn 2005) and the spatial operators offered by ArcGIS.

The remainder of this chapter is structured as follows. Sections 1.2, 1.3, and 1.4 introduce our core research strands and point to the respective chapters in this thesis for further reading. Section 1.5 concludes by discussing the main results and contributions and gives an outlook on future work. Section 1.6 gives an overview of how the three research strands are reflected in the publications that make up the chapters of this thesis.
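The idea of category membership evaluation can be pictured with a small sketch. It is not the WSML and ArcGIS prototype described in this thesis; it is a plain Python illustration under several assumptions: geometries are reduced to axis-aligned bounding boxes, "flat" is approximated by a hypothetical mean slope threshold of 2 degrees, "adjacent to" is approximated by a box-to-box distance within a tolerance, and all class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A feature instance: an information object with a geometry and attributes."""
    feature_id: int
    bbox: tuple  # (xmin, ymin, xmax, ymax); a stand-in for a real geometry
    mean_slope_deg: float = 0.0  # hypothetical attribute derived from terrain data


def box_distance(a, b):
    """Shortest distance between two boxes (0.0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def is_flat(feature, max_slope_deg=2.0):
    """Physical quality check; the threshold is an assumption, not from the thesis."""
    return feature.mean_slope_deg <= max_slope_deg


def is_adjacent_to_any(feature, others, tolerance=0.0):
    """Spatial relation check; touching or overlapping boxes count as adjacent here."""
    return any(box_distance(feature.bbox, o.bbox) <= tolerance for o in others)


def plausible_floodplain(feature, rivers, max_slope_deg=2.0, tolerance=0.0):
    """Rule sketch: a FLOODPLAIN member is flat and adjacent to a river."""
    return is_flat(feature, max_slope_deg) and is_adjacent_to_any(
        feature, rivers, tolerance
    )


def share_of_plausible_members(features, rivers):
    """Fraction of features for which FLOODPLAIN membership is plausible."""
    if not features:
        return 0.0
    hits = sum(plausible_floodplain(f, rivers) for f in features)
    return hits / len(features)
```

A low share for a feature type that was manually annotated with FLOODPLAIN would suggest that the annotation deserves a second look; how such a share should be interpreted is left open here.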

1.2 Conceptual Model for Semantic Annotation of Geographic Features

Recent work has shown the benefit of using ontologies in GI infrastructures for describing and reasoning on geographic information (Bowers et al. 2004; Frank 2003; Janowicz 2006; Lemmens and Vries 2004; Lutz 2006). However, we observe that only little experience and no well-defined strategy for how to semantically annotate GI exist.

1.2.1 Requirements for Semantic Annotations

In Chapter 2, we describe a web service environment in which semantic annotations for geographic feature types are directly derived from the domain ontologies (see Figure 2.2 in Section 2.3.2) in the form of application ontologies (see Figure 2.4 in Section 2.3.3). The link between the application ontology and the feature type schema is provided by registration mappings (see Figure 2.5 in Section 2.3.4). This approach allows for complex query processing on the underlying logic and, at the same time, allows for automating GI retrieval. However, the approach has some drawbacks regarding the complexity and consistency of the resulting semantic annotations (Klien et al. 2007).

Inconsistencies in the ontology structure. The application ontologies are directly derived from the domain ontology, i.e. application categories are sub-categories of domain categories. By this, taxonomic relations are established where, from an ontological point of view, no is-a relation exists. In the example of Chapter 2, the application category CHMI_MEASUREMENT describes a feature type with point geometry and several attributes. It is defined as a sub-category of the domain category MEASUREMENT, which refers to actual measurements. In this structure, the instances of the application category CHMI_MEASUREMENT are also instances of the domain category MEASUREMENT. This treats both the actual water level measurements in some river and their representations (features that represent measurements) as being ontologically the same kind of entity. More examples illustrating the problem are given in Sections and . In this respect, most of the currently available ontologies would carelessly consider the representation of a pipe, as shown in René Magritte's famous painting "La trahison des images" (Figure 1.1), as actually "being a pipe".

Figure 1.1: René Magritte: The Treachery of Images (La trahison des images).

We argue that real benefit for semantic interoperability can be achieved only if the ontology employed for semantic annotation is internally consistent. Therefore, a general guideline should be followed by clearly differentiating between two questions:
What do the instances of geographic feature types instantiate?
What do the instances of geographic feature types represent?
We formulate the first requirement that semantic annotations should clearly differentiate between instantiation and representation, thus ensuring consistency in the formal descriptions.

Complexity of generating registration mappings. For ontology-based GI discovery it is sufficient to reason on the application ontologies. For ontology-based GI retrieval, however, more specific information on the feature type's structure is required. To describe the links between feature type structure and application ontology, the approach in Chapter 2 employs the notion of registration mappings introduced by Bowers and Ludäscher (2004) (see Figure 2.5 in Section 2.3.4). The links can be exploited to automatically formulate the filter for retrieving features from a Web Feature Service (OGC 2005). However, registration mappings quickly become complex, especially if a 1-to-1 mapping between an attribute in the schema and an ontology concept is not possible (Section 2.3.4). Altogether, the process of creating application ontologies together with registration mappings is complex and time-consuming, and no automatic support seems to be feasible. We thus formulate the second requirement that the model for semantic annotation should render the task as easy as possible and allow for the integration of methods for partly automating the annotation task.

1.2.2 How to Associate Features with Meaning

In Chapter 5, we introduce a conceptual model for semantic annotation. The proposed model is illustrated with an example in Figure 1.2. According to the OGC Reference Model (OGC 2002), a geographic feature is the starting point for modeling geographic information. A geographic feature is defined as an abstraction of a real world entity with a location relative to the Earth. We thus use the term geographic feature to denote features that represent geographic entities, i.e. entities that are located in geographic space. A feature type schema describes the feature instances of that feature type. For example, the schema of the feature type FT_floodedLowlands describes the structure of those features in the database that are instances of the feature type FT_floodedLowlands. Similarly, a category definition in the geospatial domain ontology provides a characterization of geographic entities that are members of that category. For example, the category FLOODPLAIN characterizes floodplain entities in the real world. Hence, feature types can be used to represent categories and features can be used to represent members of a category. In Figure 1.2, the feature type FT_floodedLowlands represents the category FLOODPLAIN, and all the feature instances of FT_floodedLowlands represent members of the category FLOODPLAIN (i.e. floodplain entities).
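The separation between instantiation and representation can be made concrete with a small data-structure sketch. It is only an illustration in Python, not the formalization used in this thesis: the class names and the registry are hypothetical, and the second category (WETLAND_HABITAT) is invented to show that one feature type may carry several annotations. The point of the sketch is that the annotation link is stored as a separate relation and is not expressed through subclassing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Category:
    """A category from a domain ontology; it characterizes real-world entities."""
    name: str


@dataclass(frozen=True)
class FeatureType:
    """A feature type schema; it describes information objects (features)."""
    name: str
    attributes: tuple


@dataclass
class AnnotationRegistry:
    """Keeps 'semantically annotates' links apart from any taxonomy."""
    links: dict = field(default_factory=dict)  # feature type name -> category names

    def annotate(self, feature_type, category):
        self.links.setdefault(feature_type.name, set()).add(category.name)

    def categories_for(self, feature_type):
        return sorted(self.links.get(feature_type.name, set()))


# Names taken from the example in the text; WETLAND_HABITAT is hypothetical.
flooded_lowlands = FeatureType(
    "FT_floodedLowlands", ("geometry", "id", "vegetationtype")
)
registry = AnnotationRegistry()
registry.annotate(flooded_lowlands, Category("FLOODPLAIN"))
registry.annotate(flooded_lowlands, Category("WETLAND_HABITAT"))
```

Because FeatureType and Category are unrelated classes, a feature can never be mistaken for a member of FLOODPLAIN; it can only be linked to that category through the registry.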

Figure 1.2: Conceptual model for semantic annotation of geographic features (identical to Figure 7.1 from Section 7.1). The figure shows the category Floodplain from the domain ontology (a GeographicObject that is adjacent to a River and has a SlopeQuality with the value FlatSlope), the feature type schema FT_floodedLowlands (with the elements geometry, id, and vegetationtype), the geographic features, and the geographic entities, connected by the relations semantically annotates, instantiate, represent, and categorizes.

Features derive their meaning from the concepts that are employed to categorize geographic entities. The purpose of features is to support communication and reasoning about the geographic entities that they represent. Good illustrations are maps that we use in our everyday communication to reason about entities located in geographic space. How do we know which entities are represented by the features displayed on the map? We derive the meaning from the (cartographic) legends whose implicit semantics have been learned before and from the natural language labels whose meaning is assumed to be shared by all map users. If the cartographic annotations provided by the legend are interpretable, we have no problem transferring the concepts behind the legend symbols as meanings to the representations on the map. However, in distributed information environments, this direct interpretation of symbols by humans is often not possible or at least hindered by ambiguity. We need formal and explicit descriptions to capture the conceptualizations of a user community, i.e. domain ontologies.
Further we need to associate the category descriptions in the ontology with the features in a database in a formal and machine-processable way that can be used for automatic reasoning. Semantic annotation establishes a link between a feature type and a formal category description. This link is identified based on the knowledge that the feature instances represent geographic entities that are members of that category (see Section 5.2.2). For example, the category FLOODPLAIN provides a formal characterization for floodplain entities that are located in geographic space. Features that represent floodplain entities are then semantically annotated with the category FLOODPLAIN. It is important that the link is not defined as taxonomic relation (i.e. sub-category), since the feature type schema and the geospatial domain ontology describe different kinds of entities (see Section and also Sections and 6.3.2). Note that in this

thesis, we only give this informal characterization of the semantically annotates relation; a specification of its formal semantics in the underlying logic is not provided.

Strategy for Generating Semantic Annotations

Based on the conceptual model described in Section 1.2.2, we can define a strategy for semantic annotation of geographic features that satisfies the second requirement specified in Section 1.2.1, namely simplicity of generation and partially automated support of the process. This strategy is introduced in Chapter 5 and a reference implementation in the Semantic Web Service Framework WSMO (Roman et al. 2005) is presented in Chapter 6. The proposed strategy consists of two steps that are illustrated in Figure 1.3. The first step is straightforward and consists of translating the feature type's application schema into the syntax of the ontology language. This step can be performed automatically, since it is a plain syntactic transformation (see Sections and 6.3.1). The transformation is necessary, since both the feature type description and the Domain Ontology (DO) have to be encoded in the same language. This is a prerequisite for formalizing the semantic annotation link in the next step, as well as for being able to perform automatic reasoning on both. The result of this transformation is a Feature Type Ontology (FTO) that provides the structural information of the feature types served by the specific Web Feature Services (WFS). At this stage, the FTO contains no information beyond the original feature type description; the transformation only changes the syntactic structure. For a concrete example of such a transformation, please refer to Figure 6.4 in Section

Figure 1.3: Strategy for Annotating Geographic Feature Types in Web Service Environments (modified Figure 5.1 from Section 5.2).

The second step is more complex and labeled as the semantic annotation task in Figure 1.3.
This is a manual step, where the person responsible for the annotation needs to (i) determine to which category the represented entities belong and then (ii) use the category description to semantically annotate the respective feature type. Since both descriptions, the DO and the FTO, are encoded in the same ontology language, it is possible to formalize the annotation link in axioms. Examples for axioms that specify the annotation link are given in Sections (Figure 6.7) and (statement 11). The example in Section further illustrates the potential of this

strategy for additionally formalizing application-specific information that was not explicit in the original feature type schema, e.g. the quantification of adjacency. Defining links between DO and FTO follows the same mechanism as assigning keywords or tags to web resources. However, in our case, the resulting annotation is much more expressive, since the formal category description provides a partial account of the underlying conceptualization. Therefore, it serves to explicate the meaning of the used terms instead of assuming their common understanding, as is the case with keywords or tags. Further, the formal descriptions can be used to infer relationships to other descriptions.

Discussion of Conceptual Model for Semantic Annotation

The development of the conceptual model is necessary to clarify the framework into which a method for automatic annotation support can be embedded. The proposed model meets the requirements we have formulated for semantic annotation in Section. It eliminates the modeling inconsistencies displayed in the annotation approach presented in Chapter 2. Furthermore, as shown in Section 6.3.1, the pre-processing step of syntax transformation can be easily automated, thus simplifying the subsequent semantic annotation considerably. Further, it is important that the proposed conceptual model does not affect the structure of the data source. Through semantic annotation we can associate multiple views with the features that were not explicit before. This enables the usage of the data source in a broader context without the need for reengineering of the data model.

1.3 Formal Ontological Foundations for the Geospatial Domain

To (automatically) evaluate whether an entity can be seen as a member of a category, we first need to establish a sound and well-structured ontological foundation for entities that are directly or indirectly located in geographic space. Many attempts for geospatial ontology engineering exist (e.g.
SWEET 1, GeoSciML 2, SWING 3, OS Ontologies 4) and the approaches vary greatly in intended usage, formality of representation and rigor of the underlying philosophical assumptions. From these different attempts to capture categories relevant for the geospatial domain, it becomes obvious that different conceptualizations of the same geospatial reality exist. Ontologies, in contrast to standards, do not impose a one and only view on the world. Instead, they aim to account for the many different views that exist and make them explicit in respective category descriptions. As stated in Section 7.1, domain ontologies that are used to semantically annotate geographic features have to meet the following requirements:

1 Semantic Web for Earth and Environmental Terminology (SWEET):
2 GeoSciML - Geological Sciences ML:
3 Semantic Web services INteroperability for Geospatial decision making (SWING):
4

The ontology structure needs to ensure comparability. We assume that semantic interoperability in web service environments can only be achieved if the different conceptualizations captured in the formal descriptions are comparable. To ensure the comparability between the distinct ontologies, they have to be aligned to the same common denominator (Masolo et al. 2003; Probst 2007).

The ontology structure should allow for multiple views in an ontologically consistent way. We want to apply the proposed method not only for evaluating the validity of annotations, but also for suggesting additional categories for extending the annotations to other perspectives. Being able to reflect multiple perspectives on data sources in the formal descriptions provides the basis for assessing semantic interoperability. This will be a major step to support a harmonized GI usage as, for example, postulated by the INSPIRE principles.

Modeling Approach

We take the perspective that comparability and ontological consistency are best achieved by aligning the domain ontologies to a foundational ontology (see Sections and 7.2). Foundational ontologies capture the most generic categories in an ontologically rigorous way (Masolo et al. 2003). In this thesis, we define a formal ontological foundation for the geospatial domain by aligning the most generic categories of the geospatial domain to the upper-level categories of the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) (Masolo et al. 2003). This structure provides the common denominator that will support the creation of ontologically sound and comparable geospatial domain ontologies for specific world views, such as geomorphology, hydrology, ecological planning, and tourism. This intermediate foundational structure closes the gap between the very generic categories of DOLCE and the very specialized categories described in the respective domain ontologies.
We expect that once a stable and sound structure has been established, it will serve to ensure the ontological soundness of the annotation process. This formal ontological foundation for the geospatial domain needs to meet the following requirements:

It must be based on clearly defined philosophical assumptions and be developed according to the principles of formal ontological analysis (Masolo et al. 2003; Welty and Guarino 2001).

It must be specified in a formal representation language, thus conforming to the definition of ontology understood as a logical theory accounting for the intended meaning of a formal vocabulary (Guarino 1998).

It must be aligned to a foundational ontology (Masolo et al. 2003; Probst 2007; Welty and Guarino 2001).

In Chapter 7, we present the results of a formal ontological analysis of entities that are directly or indirectly located in geographic space. The resulting ontology exhibits the generic categories relevant for the geospatial domain. The ontological analysis has largely followed the guidelines discussed in Chapter 3. This includes the alignment of the most generic geospatial categories to the foundational ontology DOLCE. We have chosen DOLCE for this undertaking because of its cognitive bias (Masolo et al. 2003) and the adequacy of its upper-level notions for deriving the different kinds of entities related to the realm of geography. We argue that the alignment improves structure and robustness of the ontology and, moreover, provides a sound philosophical

underpinning that helps to reveal the conceptualizations of the domain (evidence for the benefit of this strategy has been given in (Probst 2006)). Further, whenever feasible we have relied on agreed-upon terminologies from different sources, including GEMET 1, WordNet (Fellbaum 1998), and SDTS 2.

Figure 1.4: Most general categories of the ontology for entities that are directly or indirectly located in geographic space (depicted in shades of grey) aligned to the basic categories of DOLCE (depicted in white) (identical to Figure 7.3 in Section 7.3).

The resulting ontology structure supports ontology engineers in differentiating between subcategories, relations and roles to avoid traps like categorizing an entity into two different categories at the same time (Welty and Guarino 2001). The proposed taxonomy for the generic categories of the geospatial domain is displayed in Figure 1.4. Examples for illustrating the modeling decisions and their benefits for a coherent and robust ontology structure are given in Section. Additionally, the formalizations of the category descriptions in the Web Service Modeling Language (WSML) (de Bruijn 2005) are accessible online 3. In summary, the proposed structure exhibits the following characteristics (please refer to Sections 7.2 and 7.3 for further explanations):

A geographic object can play multiple roles. This acknowledges the fact that people might have multiple conceptualizations of the same real world entities. For example, a forest, which is a vegetation unit, might play the role of a resource as well as the role of a recreational area.

1 GEneral Multilingual Environmental Thesaurus,
2 Spatial Data Transfer Standard (SDTS),
3

The roles that a geographic object plays exhibit the spatial properties of their host as indirect physical qualities and spatial relations. For example, a forest is close to a town. If the forest plays the role of a recreational area, the recreational area is also close to the town. In our communication we can easily skip the differentiation whether we are referring to direct or indirect spatial properties. Since it seems cognitively plausible to say "the forest is close to the town" as well as "the recreational area is close to the town", the proposed ontology has to account for both statements without rendering the latter ontologically inconsistent.

Reference regions as introduced by (Probst 2007) for approximating a quality's value allow the specification of different quantifications. For example, the definition of the distance quality provided in Section makes it possible to specify a comparable notion of adjacent by providing the means to specify maximum and minimum values for the adjacent reference region (e.g. 0 m and 40 m). This reference region can differ depending on the user's interpretation of adjacency. This application-specific interpretation can now be formalized in semantic reference spaces for adjacency as part of the annotation of geographic features. That means that data providers can explicitly state the value range for adjacency in the annotation part of the Feature Type Ontology (see Statement (11) in Section 7.4.3). This allows other users not only to interpret the given information in the intended way, but also to map between different conceptualizations of what is, for example, allowed to be considered as floodplains characterized as being adjacent to rivers.

Discussion of Formal Ontological Foundations

We have developed the proposed ontology with the specific purpose of generating ontologically consistent semantic annotations of geographic features.
We do not claim that the proposed ontological foundation will be usable in any context. Other views on how to structure the most generic categories of the geospatial domain exist and provide valuable insights (Casati et al. 1998; Grenon and Smith 2004), yet they are based on a foundational ontology we consider too difficult to use in the context of annotating geographic features. We think that the proposed foundational structure best serves our requirements derived from the conceptual model for semantic annotation of geographic features. In ongoing research projects like SWING 1 and COMPASS 2, we will explore further whether it serves as a foundation on which more specific ontologies for the geospatial domain can be built, while ensuring their comparability and ontological consistency.

1.4 Method for Evaluating and Extending Semantic Annotations

To evaluate the validity of using a category for annotation, it is necessary to either know or determine that the features represent members of that category. The developed method needs to rely on evaluating the plausibility of the represented entity's category membership.

1 SWING project:
2 COMPASS project:

In Section 1.4.1, we will first concentrate on the roles of physical qualities and spatial relations for characterizing geographic entities and on the idea to exploit them for category membership evaluation. This idea of analyzing feature instances for evaluating category membership is first introduced in Chapter 4. In Section 1.4.2, we present rules for category membership evaluation that can be used to support the semantic annotation process. The idea to define formal rules as part of the ontology is first introduced in Chapter 5. Section defines the application scope for the proposed method. A prototype application for testing the applicability is presented in Section.

Employing Spatial Location, Geometry and Topology

We call entities that are located in geographic space geographic entities. Physical properties and spatial relations are central for experiencing and for conceptualizing entities in geographic space. Consequently, they are central for categorizing and communicating geographic entities (Egenhofer and Mark 1995; Papadias and Kavouras 1994; Smith and Mark 1998). For example, floodplains are characterized as being flat and adjacent to rivers. The geographic features that represent geographic entities are associated with a location relative to the Earth (OGC 2002). They have a geometric representation as well as a spatial location. This gives us the possibility to evaluate the plausibility of a categorization without the need to examine the geographic entity in the real world. Instead, we can check for its topological and metric characteristics through the geographic features in the database and infer whether the represented entities might be members of a category or not. For example, we can examine for each feature instance of the feature type FT_floodedLowlands whether it exhibits a flat slope and whether it is adjacent to a river.
If all feature instances fulfill these conditions, we can infer the plausibility of categorizing the represented geographic entities as members of FLOODPLAIN. This idea is discussed in Section 4.4 with a focus on the role of spatial relations. Further investigations are presented in Sections 5.4 and. In contrast, non-spatial features do not offer comparably strong characteristics that they share with the entities that they represent. For example, a picture of a coffee mug represents a real coffee mug. However, it is very hard to identify any non-spatial characteristics that are implicitly stored in the picture and that would be exploitable for determining the kind of entity that is represented.

Rules for Category Membership Evaluation

We propose to define rules for category membership evaluation that are restricted to the characteristic physical qualities and spatial relations. These rules define some (not all) of the conditions that a geographic entity has to fulfill in order to fall into the category. The rules are formalized as part of the geospatial domain ontology. An overview of the underlying ontology structure is given in Section 7.3. Further examples for rule definitions are given in Sections 4.4.3, and. The statements (1) and (2) below give formal rules for categorizing lowlands and floodplains. These rules are defined as implications and not as definitions, since the conditions on the right side are necessary but not sufficient to characterize the category on the left side. This is a difference and advancement compared to the formalizations provided in Chapters 4 and 5, where we have used logical equivalence as connective between the left side and the right side. From our

current perspective it is not correct to state logical equivalence between the statements of both sides, since, based on the spatial characteristics of its members, we can only give a partial characterization of a category.

A lowland is a terrestrial unit, which has a slope quality, which is approximated by a flat reference region (see Section 7.3.3):

$$\forall x, q, v\; \big(\mathrm{Lowland}(x) \rightarrow \mathrm{TerrestrialUnit}(x) \land \mathrm{hasquality}(x, q) \land \mathrm{SlopeQuality}(q) \land \mathrm{hasvalue}(q, v) \land \mathrm{FlatSlopeReferenceRegion}(v)\big) \tag{1}$$

A floodplain is a lowland and it is adjacent to a river:

$$\forall x, q, v\; \big(\mathrm{Floodplain}(x) \rightarrow \mathrm{Lowland}(x) \land \exists y\, [\mathrm{River}(y) \land \mathrm{adjacentto}(x, y)]\big) \tag{2}$$

Since geographic features have a geometric representation as well as a spatial location, it is possible to compute and analyze metric and topological relations of the represented geographic entities. Consequently, we can employ the formal rules in the process to evaluate the validity of semantic annotations. For this, the formal rules are translated into spatial analysis procedures that can be executed on the feature instances with spatial operators available in standard geographic information systems. More examples for rules as well as their formalization in different languages like the Semantic Web Rule Language (SWRL) (Horrocks et al. 2004) and the Web Service Modeling Language (WSML) (de Bruijn 2005) are provided in Section 5.4 and Section.

Scope of Applicability

The presented rules for category membership evaluation only capture a small subset of the member characteristics of a category. They are therefore applicable only for evaluating, not for detecting, a category membership. Thus, the analysis process needs a starting point in the form of an initial categorization, which could be performed or provided by, for example:

1. manual annotation by the users,
2. another method for annotation support like the term-matching techniques as presented in (Grcar and Klien 2007), or
3. reference datasets with geographic features that already have a validated semantic annotation and that cover the same spatial extent as the features to be annotated.

Once the initial annotation with one of the category terms from the domain ontology exists, the proposed method can support the user in the following two application scenarios.

Application Scenario 1: Evaluate validity against domain ontology

Annotations that have been generated with option 1 or 2 should first be validated with the proposed method, before additional extensions of the semantic annotations are evaluated. In Figure 1.5, John models the feature type FT_floodedLowlands to describe features that represent lowlands. Accordingly, he selects the category LOWLAND for annotation. At this stage the proposed method can be employed to support John in validating the annotation. For example, John interprets lowlands as terrestrial units of low altitude, while the category description characterizes lowlands to be terrestrial units with a flat slope (in the sense introduced in statement (1)). In

this case the underlying misconception can be identified and eliminated with the proposed method.

Figure 1.5: Multiple views on a set of different entities located in geographic space (modified Figure 7.11 from Section 7.4.2).

Application Scenario 2: Evaluate extensibility

Different people with different views on their surroundings will perceive different entities at the same location. The structure of the proposed ontology for the geographic domain supports these multiple views (Section 1.3, also Section 7.3.2). It allows categorizing different entities at the same location in geographic space in an ontologically consistent way. In Figure 1.5, Karen, Steve, Susan, and John observe the landscape from different viewpoints, which results in categorizing different entities. John perceives lowlands, Karen perceives floodplains, Steve perceives recreational areas, and Susan perceives habitats. The proposed domain ontology (Section 1.3) models the relations between the respective categories in the following ways: (i) FLOODPLAIN is a subcategory of LOWLAND, (ii) members of LOWLAND might play the role of RECREATIONALAREA, and (iii) members of LOWLAND might play the role of HABITAT. We now assume that the annotation for the feature type FT_floodedLowlands with the category LOWLAND has been validated. Based on this knowledge, the system evaluates category membership for all related categories, i.e. all categories whose members are directly or indirectly located at the same spatial locations as lowlands. This is potentially the case for all members of sub- or super-categories of LOWLAND, all roles that might be played by lowlands, and members of categories that are related by a part-of relation, e.g. a vegetation unit covers a terrestrial unit and will therefore share the same spatial extent.
We conclude that because humans have multiple views on the landscape, a feature might represent more than just one geographic entity (understood as everything that humans experience and conceptualize in geographic space). Annotations that reflect these multiple perspectives on a data source will be beneficial to support the needs of different usages.
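
The two application scenarios can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Feature` record, the attribute names, and the flat-slope bound of 2 percent are assumptions made for illustration, and only the adjacency range of 0 m to 40 m is taken from the example given in Section 1.3. The rules are treated as necessary conditions in the sense of statements (1) and (2), so a passing check indicates plausibility and not proof of category membership.

```python
from dataclasses import dataclass

# Hypothetical reference regions. The 0-40 m adjacency range follows the
# example in the text; the flat-slope bound is an illustrative assumption.
FLAT_SLOPE_MAX_PERCENT = 2.0
ADJACENT_RANGE_M = (0.0, 40.0)


@dataclass
class Feature:
    """A feature instance with pre-computed spatial properties (assumed)."""
    fid: int
    slope_percent: float
    distance_to_river_m: float


def meets_lowland_conditions(f: Feature) -> bool:
    # Necessary condition from statement (1): slope lies in the flat region.
    return f.slope_percent <= FLAT_SLOPE_MAX_PERCENT


def meets_floodplain_conditions(f: Feature) -> bool:
    # Necessary conditions from statement (2): lowland and adjacent to a river.
    low, high = ADJACENT_RANGE_M
    return meets_lowland_conditions(f) and low <= f.distance_to_river_m <= high


RULES = {
    "LOWLAND": meets_lowland_conditions,
    "FLOODPLAIN": meets_floodplain_conditions,
}


def evaluate(features, category):
    """Return a verdict and the ids of the features that conform to the rule."""
    rule = RULES[category]
    conforming = [f.fid for f in features if rule(f)]
    if features and len(conforming) == len(features):
        verdict = "plausible"      # all instances fulfil the conditions
    elif conforming:
        verdict = "partial"        # only a subset conforms
    else:
        verdict = "implausible"
    return verdict, conforming


features = [
    Feature(1, 0.5, 10.0),
    Feature(2, 1.0, 120.0),
    Feature(3, 1.5, 40.0),
]
print(evaluate(features, "LOWLAND"))     # scenario 1: validate annotation
print(evaluate(features, "FLOODPLAIN"))  # scenario 2: evaluate an extension
```

In this sketch a "partial" verdict corresponds to the case where only a subset of the features conforms to the rule of the related category, so the annotation cannot simply be extended to the whole feature type.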

Prototypical Test Environment

We have implemented a prototypical test environment for the proposed method. The tool has been implemented as an extension to the geographic information system ArcGIS. In Section 7.4.3, we present the components and functionalities of the tool and describe a walkthrough for the example of evaluating the validity of annotating features with the category FLOODPLAIN. The implementation is described in detail in (Nientiedt 2008). In the following we will briefly introduce the components and functionalities, the tested scenario, and its results. To illustrate the tool's functionalities, we return to the annotation scenario from the previous paragraph. John's annotation of the feature type FT_floodedLowlands with the category LOWLAND has been validated. The system now supports John in identifying further plausible extensions to the annotation by checking the rules for related categories. In this case we evaluate the plausibility of category membership for FLOODPLAIN. The workflow through the implemented application is illustrated in Figure 1.6 and consists of a manual pre-processing step outside the tool, three processing steps within the tool, and a final manual post-processing step, again outside the tool.
Figure 1.6: Components and functionalities of the prototype implementation (identical to Figure 7.12 in Section 7.4.3). Stages shown: pre-processing (manual translation of the floodplain rule from WSML to an intermediate XML representation, with spatial relations mapped to spatial operators based on predefined mappings), step 1 (XML import and graph generation), step 2 (user input: specification of input data sets, i.e. features to be annotated and features that serve as reference data, and quantification for parameters), step 3 (execution in the ArcGIS Category Membership Evaluation Tool and display of the results), and post-processing (validation of using the Floodplain category for annotation based on the evaluation results).

Pre-processing: WSML to XML. The geospatial domain ontology and the rule for floodplain category membership are encoded in the logic-based ontology language WSML (de Bruijn 2005). First, the formal WSML description is translated into an intermediate XML representation. This step is necessary since it is easier to import the rules into the tool by parsing an XML structure with well-defined elements than by parsing plain logic syntax. In this translation step, the physical qualities and spatial relations that are part of the rule descriptions in WSML are mapped to the respective spatial operators offered by the geoprocessing framework chosen for the implementation. In the example, the spatial relation adjacentto from the floodplain rule (statement (2) in Section 1.4.2) is translated into the ArcGIS spatial operator sequence buffer followed by intersect. Figure 4.2 in Section provides an illustration for the corresponding spatial analysis procedure. A table with more examples of mappings defined between the geospatial domain ontology and the spatial operators of ArcGIS is available in Section (Table 7.2). Since we only consider vector data in this prototypical implementation, we can restrict the mappings to those spatial operators that are executable on vector data.

Step 1: XML import. The XML description for the FLOODPLAIN rule that has been generated in the pre-processing step is imported by the tool. The extracted information is stored in a graph structure, which is able to manage the executable procedure. In the example, the graph consists of two nodes. The first node represents the buffer operation, and the second node the intersect operation.

Step 2: User input. At this point, the user is asked to specify the features that have a validated annotation with the category LOWLAND and that are to be examined for annotation with the category FLOODPLAIN. Next, to be able to calculate the relation R between two entities x and y (e.g. "x is adjacent to some river"), a reference dataset with the well-known geometry of y (e.g. all rivers) needs to be specified. In the example, a reference dataset is needed that has a validated annotation with the category RIVER. Furthermore, the user has the possibility to provide context-specific information, e.g. she can specify the value range for what she considers adjacentto.

Step 3: Execution in ArcGIS. Once all required information has been collected, the application starts the execution process by traversing and executing the operations represented by each graph node. The set of river features from the reference dataset is given to the root node of the graph. After the buffer operation has been executed on all features, the generated output is given to its child node, the intersect operation. The intersect operation is executed with the set of features to be annotated as a second input parameter, i.e. the buffer output is intersected with the lowland features.
The resulting dataset contains all lowland features that lie within a specific distance (the buffer distance) of river features. This process is abstracted and illustrated in Figure 1.7. A detailed walk through this analysis process for evaluating the FLOODPLAIN category membership is described in Section. Please note that compared to the initial idea presented in Chapter 4, we now assume that a validated semantic annotation of the feature type exists beforehand; here it is annotated with the category LOWLAND.

Figure 1.7: Overview of the spatial analysis procedure for evaluating the plausibility of a semantic annotation with the category FLOODPLAIN (modified Figure 4.3 from Section 4.4.2).
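
The two-node procedure (buffer followed by intersect) can be approximated in plain Python as follows. This is a hedged sketch and not the ArcGIS-based prototype: all function names are hypothetical, rivers are assumed to be polylines and lowland features simple polygons given as vertex lists, and the buffer is represented implicitly. It relies on the fact that a feature intersects the buffer of width d around a river exactly when its distance to the river is at most d.

```python
import math


def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def _orient(a, b, c):
    """Sign of the turn a -> b -> c (cross product)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segment_distance(a, b, c, d):
    """Distance between segments ab and cd (0 if they cross)."""
    if _orient(a, b, c) * _orient(a, b, d) < 0 and \
            _orient(c, d, a) * _orient(c, d, b) < 0:
        return 0.0
    return min(_point_segment_distance(a, c, d),
               _point_segment_distance(b, c, d),
               _point_segment_distance(c, a, b),
               _point_segment_distance(d, a, b))


def _contains(polygon, p):
    """Ray-casting point-in-polygon test for a simple polygon."""
    inside = False
    for i in range(len(polygon)):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % len(polygon)]
        if (y1 > p[1]) != (y2 > p[1]):
            if p[0] < x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def distance(polygon, line):
    """Distance between a polygon (including its interior) and a polyline."""
    if any(_contains(polygon, p) for p in line):
        return 0.0
    edges = [(polygon[i], polygon[(i + 1) % len(polygon)])
             for i in range(len(polygon))]
    segments = list(zip(line, line[1:]))
    return min(_segment_distance(a, b, c, d)
               for a, b in edges for c, d in segments)


def buffer_op(rivers, buffer_distance):
    # Root node: each buffer is kept implicitly as (geometry, distance).
    return [(river, buffer_distance) for river in rivers]


def intersect_op(buffers, features):
    # Child node: select the features that intersect any river buffer.
    return sorted(fid for fid, polygon in features.items()
                  if any(distance(polygon, line) <= d for line, d in buffers))


rivers = [[(0, 0), (100, 0)]]
lowlands = {
    1: [(10, 10), (30, 10), (30, 30), (10, 30)],       # 10 m from the river
    2: [(10, 100), (30, 100), (30, 120), (10, 120)],   # 100 m from the river
    3: [(40, -5), (60, -5), (60, 5), (40, 5)],         # crossed by the river
}
print(intersect_op(buffer_op(rivers, 40.0), lowlands))  # [1, 3]
```

The buffer distance of 40 m stands in for the user-specified value range for adjacentto from step 2; a real geoprocessing framework would construct explicit buffer geometries instead of using the distance test.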

Post-Processing: Validate Annotation. From the displayed results, the user can evaluate the plausibility of semantically annotating the features that represent lowlands additionally with the category FLOODPLAIN. Alternatively, if only a subset of features conforms to the floodplain rule, the users might choose to define a new feature type for this subset and to annotate this feature type with the categories LOWLAND and FLOODPLAIN. The specification of the link between FT_floodedLowlands and the category FLOODPLAIN is, again, a manual step. In the SWING project, a user interface is being developed that enables the users to specify this link and automatically generates the annotations in WSML from the user input (Grcar et al. 2007b). However, this is work in progress and the integration with the prototype tool for category membership evaluation is planned for the future.

Discussion of Method for Evaluating and Extending Semantic Annotations

The rules for category membership evaluation are formalized as part of the geospatial domain ontology. Rule specifications require a certain expressivity from the ontology language, which Description Logic (DL), for example, does not provide. However, it is possible to combine a popular ontology language like OWL-DL (which is based on DL) with rule languages. In Chapter 5, we use OWL-DL in combination with the Semantic Web Rule Language (SWRL) (see Section 5.3), which splits the formalization into two parts. The category descriptions in DL can be used for subsumption reasoning in a GI discovery task (for explanations see Section 2.4), while the SWRL rules define the category membership rules. Reasoning on the rules would require different algorithms than reasoning on DL. A more comprehensive approach is provided by WSML, which has been specified as a formal language for describing web services. WSML is a family of formal description languages that correspond to different logical language paradigms.
In the SWING project, the variant WSML-Flight has been chosen as ontology language since it best meets the requirements for annotation and discovery in terms of expressivity and reasoning capacity (Klien et al. 2007), which is illustrated with the application presented in Chapter 6. WSML-Flight can also be used for specifying the rules for category membership (see Section 7.4.1). With WSML-Flight it is thus possible to employ the same language for category and rule descriptions and subsequently exploit both in the same reasoning. The goal of the implemented prototype is to provide a test environment for the proposed method for evaluating and extending semantic annotations of GI. We have used the tool only within small-scale test scenarios, like the one described in this chapter. We have not yet tested its applicability in scenarios where more than two datasets are involved. So far, we have ignored any problems that might arise due to differences in, for example, spatial reference systems, resolutions, and the like between different datasets. The need for more complex analyses along with the need for quality and consistency checking before the datasets are imported into the analysis process will have an impact on the overall performance of the tool. Furthermore, alternative approaches for presenting the results to the user are possible. For example, an algorithm for calculating probability values for the category membership could be included. In that case, a strategy needs to be developed on how to associate probability values with the semantic annotation link, and how to handle them in the matchmaking algorithms used for assessing semantic interoperability. We expect that the more semantically annotated GI sources are available, the better

the proposed method will work, since it can rely on a greater set of reference datasets that have a validated semantic annotation.

1.5 Results and Conclusion

This thesis contributes to the field of GI semantic interoperability research through the following achievements. We first state each main result, followed by its contributions.

We define a conceptual model for the semantic annotation of geographic features in web service environments.

The model identifies the involved components and clarifies the relationships between them. We have implemented the proposed model in the Semantic Web Service Framework WSMO, using the Web Service Modeling Language (WSML) as formal representation language for the ontologies. However, the model is not restricted to WSMO and could be implemented with any representation language. Conforming to the proposed conceptual model helps to avoid modeling inconsistencies in the formal descriptions generated for annotation. Further, it ensures transparency for the person who generates or uses them. Moreover, the application of the proposed model is not restricted to the realm of GI web service environments, but could be transferred to other annotation scenarios of web resources.

We provide an informal characterization for the semantic annotation link.

Semantic annotations are established between category descriptions in ontologies and feature types provided by data sources. The link associates the meaning captured in the category description with the features that instantiate the feature type. This link can be realized based on the knowledge that the features represent members of that category. This clarification of what happens in the annotation process helps to ensure the reproducibility of its results. This is an important aspect for judging the reliability of the proposed annotation link and its applicability for assessing semantic interoperability.

We propose a formal ontological foundation for the geospatial domain.
This ontology is the result of an ontological analysis of entities that are directly or indirectly located in geographic space. By aligning the generic categories to the foundational ontology DOLCE, we have developed a formal and well-defined foundational structure for the geospatial domain. Only a formal and well-structured ontology that captures the conceptualization of how humans perceive geographic entities can ensure the ontological soundness of the annotation process and of its results. Moreover, it ensures comparability and allows for using the semantic annotations for assessing semantic interoperability. With the proposed ontology we provide a sound philosophical underpinning for category membership evaluation. Furthermore, it supports the users in formulating ontologically sound semantic annotations. Finally, we expect that the alignment of more specialized domain ontologies to the developed generic geospatial ontology will ensure comparability between them.

To evaluate the validity of a semantic annotation, we suggest evaluating the plausibility of the represented entity's category membership.

The approach to validate an annotation based on category membership evaluation helps to eliminate the uncertainty that remains from specifying the annotation manually or with text-based methods. For example, a user decides to annotate some feature type with the category LOWLAND, while the spatial analysis reveals that the

investigated features have a considerably steep slope. Hence, the user is informed by the system of this inconsistency between the entities that are represented by the features and the category description for lowlands in the ontology. That is, the entities that are represented by the features are not members of the category. Hence, the intended semantic annotation is not valid. This approach is unique compared to existing methods that promise to automate semantic annotation. Existing methods are mostly based on text analysis and their algorithms rely on statistics and heuristics, not taking into account the underlying conceptualizations.

We evaluate the plausibility of a represented entity's category membership with rules that are restricted to characteristic physical qualities and spatial relations.

The proposed method relies on the importance of physical qualities and spatial relations for categorizing geographic entities. Based on this assumption, the rule definitions for category membership evaluation are restricted to characteristic physical qualities and spatial relations. These rules are formalized as part of the geospatial domain ontology. We can employ the rules for evaluating the validity of existing semantic annotations by translating them into spatial analysis procedures that can be executed on the features with spatial operators provided in GIS. The proposal to combine qualitative descriptions of an ontology with quantitative methods provided by GIS operations is a novel contribution to the existing set of methods for automatic semantic annotation support.

The proposed method for analyzing the plausibility of category membership not only provides support for evaluating the validity of annotations but also for extending the annotation to other perspectives.

Additional candidate categories for extending the annotation are those categories whose members share directly or indirectly the same spatial location as the represented entity.
Hence, the system will evaluate the plausibility of category membership for subcategories or roles that might be played by the entity under investigation. For example, features that are annotated with the category LOWLAND can also be examined for representing entities categorized as members of FLOODPLAIN (FLOODPLAIN being a subcategory of LOWLAND), and they could also be examined for playing the role of a local recreational area (since LOCALRECREATIONALAREA is a role that a lowland can play). Within the prototypical test environment we have shown the applicability of the proposed method for evaluating the validity of annotations that have been generated manually. In the tested scenario, features that were annotated with the category LOWLAND have been evaluated for the validity of being annotated with the category FLOODPLAIN.

Expected Impacts

We expect that if the proposed method is integrated into a semantic annotation component for GI web services (like the one currently developed in the SWING project (Grcar et al. 2007b)), the number of semantically annotated information sources will increase. The availability of semantic annotations will considerably widen the scope for assessing semantic interoperability in GI web service environments. The annotations generated with the support of the proposed method reflect different perspectives on the same geographic scene. This will increase the quality of the search results, since more sources can be discovered that were originally produced in a different context. Further, the use of a well-defined geospatial domain ontology for annotation ensures that information provider and information requestor understand the information in the same way, thus allowing datasets to be used in multiple contexts for which they were not originally intended.

For users that are unfamiliar with ontologies, it is not trivial to come up with valid semantic annotations, let alone to identify additional categories for extending the annotation. Therefore, the proposed method will increase the value and acceptance of the semantic annotation task. Further advantages for the person in charge of annotating are the reduction of labor time and the possibility to assess the validity of annotations on the fly. In the long term, the thesis results can therefore be seen as a contribution to reducing the high barrier for using ontologies in information systems that currently impedes the successful implementation of semantically enabled service environments. In that respect we expect that it helps to bring formalized semantics in information systems into the mainstream of IT.

Future Work

From the research presented in this thesis, we can identify several research lines for continuing the work in the context of supporting the semantic annotation of GI.

Extending the geospatial domain ontology

We have concentrated on identifying and describing the most generic and commonly agreed upon categories of the geospatial domain. The resulting foundational structure can now be used to align more specialized domain ontologies. Extensions to this structure are currently under development as part of the ontology engineering efforts in the SWING project and the COMPASS project, including the domains of mineral resources, geology, hydrology, ecology, and others. It will be interesting to observe whether the ontologies developed in different research projects can benefit from this ontological foundation and whether they will eventually be comparable.
So far, we have restricted the domain ontology to capture only those entities that are directly or indirectly located in geographic space. Temporal aspects and processes are not considered. Yet, in a geospatial domain ontology, processes in the landscape play an important role e.g. landslides, hurricanes, and flooding events. For this, it will be necessary to extend the ontological analysis to entities that are not only directly or indirectly located, but also to those that happen in geographic space. Based on the formal characterizations of processes, it will be interesting to examine, whether the scope of the method for evaluating category membership can be extended to other types of data, e.g. data that represent processes. The authors of DOLCE emphasize the cognitive bias of their ontology in the sense that it aims at capturing the ontological categories underlying natural language and human common sense (Masolo et al. 2003, p.13). By aligning the generic categories of the geospatial domain to DOLCE, we assume that the resulting structure is cognitively biased. However, this assumption still needs empirical verification. In general, the proposed ontological foundation and the more specialized domain ontologies that are aligned to it need to be evaluated by humans for their usefulness. Strategies for evaluating the usefulness of an ontology are developed in the SWING project (Schade et al. 2008).

Extending the prototype implementation

The immediate follow-up activities will concentrate on extending the prototype application in such a way that it can be integrated into the semantic annotation component developed in the context of the SWING project. So far, we have evaluated the proposed method on a small scale within the prototype application in ArcGIS. More work needs to be done in the following directions:

The mapping between the spatial properties and the ArcGIS operations is currently hardcoded, but could be implemented dynamically. For this, we can specify semantic annotations for the geoprocessing functionalities of different providers. In that way we leave open the decision in which geoprocessing framework the spatial analyses will be executed. Based on the semantic annotations, the mappings remain flexible and might be performed automatically.

Only a small portion of the geoprocessing functionality has been implemented so far and extensions are possible in many directions. The next step has to be the integration of analysis functionalities for raster and remote sensing data. For this, we can build on work described in, e.g., (Silva et al. 2005). A diploma thesis at the Institute for Geoinformatics at the University of Muenster currently investigates these issues.

Another important aspect is the investigation to what extent the proposed method could be implemented with web services that provide the required geoprocessing functionality. For the annotation task in a GI web service environment, it would be beneficial to make use of available GI web services for executing the rules for category membership evaluation on the feature instances. However, problems are anticipated for several reasons. First, geoprocessing web services are still immature and only a few are available online. The OGC specification for a Web Processing Service (WPS) has only recently been approved (OGC 2007).
Second, depending on the amount of data that has to be processed and the complexity of the analysis procedure, the computational cost might be too high for the analysis to be performed over the Web (Friis-Christensen et al. 2007). However, work on geoprocessing web services is rapidly advancing (Díaz et al. 2008; Förster and Schäffer 2007). Further, projects like the German GDI-GRID (Spatial Data Infrastructure Grid) investigate the use of grid technology to support computationally costly geoprocessing over the Web. The migration of the functionalities implemented in our test environment to a web service based processing environment is currently being investigated in a diploma thesis at the Institute for Geoinformatics.

Widening the scope for automatic support

With the currently available technologies, the overall goal of a fully automated support for semantic annotation is not realizable. However, we see a great potential in the combination of different strategies that rely on exploiting different resources for suggesting, detecting, or evaluating the category membership of represented entities. Tools for string-based matching (e.g., (Noy and Musen 2000)) can be employed in a pre-processing step to compare the terms used for naming categories in the ontology and the terms used for naming feature types. In the SWING project, we will further investigate term-matching algorithms based on text mining as described in (Grcar and Klien 2007). Apart from text mining approaches, we also suggest investigating alternative techniques like hypothesis checking (i.e. using linguistic patterns such as "term1 is a term2" as a query to a search engine) and Google Distance (Cilibrasi and Vitanyi 2004). In general, algorithms for automating the annotation of web content are in the focus of the Semantic Web community (e.g. (Bouquet et al. 2006)). These should be further monitored and investigated for their applicability to GI web service environments. Another perspective can be integrated into the process by exploiting feedback gained from user interaction on the Web. This area is currently examined in another PhD thesis at the Institute for Geoinformatics (Maué 2007). We assume that a combination of all these methods will eventually lead to reasonable results in the annotation process with only little need for supervision.

1.6 Roadmap Through the Thesis

The remaining chapters are original publications in peer-reviewed journals and conference proceedings. In the following, we give a roadmap through this set of papers. The content of the published papers remains unchanged, while the layout has been adapted to the overall layout of this thesis. The papers are arranged according to their time of writing, thus reflecting the evolution of our research. While reading the executive summaries below, please compare with the picture in Figure 1.8, which illustrates how the different chapters relate to our three research strands.

CHAPTER 2: Lutz, M. and E. Klien (2006). Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science (IJGIS) 20(3):
CHAPTER 3: Klien, E. and Probst, F. (2005). Requirements for Geospatial Ontology Engineering. In: Toppen, F. and Painho, M. (eds.). Proceedings of the 8th Conference on Geographic Information Science (AGILE 2005), Estoril, Portugal, pp
CHAPTER 4: Klien, E. and Lutz, M. (2005). The Role of Spatial Relations in Automating the Semantic Annotation of Geodata. In: Cohn, A. and Mark, D (eds.). Proceedings of the Conference of Spatial Information Theory (COSIT'05), Ellicottville, NY, USA. Lecture Notes in Computer Science, Vol. 3693, pp
CHAPTER 5: Klien, E. (2007). A Rule-Based Strategy for the Semantic Annotation of Geodata. Transactions in GIS, Special Issue on the Geospatial Semantic Web 11(3):
CHAPTER 6: Klien, E., D. I. Fitzner and P. Maué (2007). Baseline for Registering and Annotating Geodata in a Semantic Web Service Framework. In: Wachowicz, M. and Bodum L. (eds.): 10th Conference on Geographic Information Science (AGILE 2007), Aalborg, Denmark.
CHAPTER 7: Klien, E., Probst, F. and Nientiedt, M (2008). Category Membership Evaluation for Geographic Entities - Ontological Foundations and Implementation. under review

In Chapter 2, we set the scene on how we can utilize the common vocabulary specified in domain ontologies to enhance discovery and retrieval of GI in SDIs. The paper describes an information infrastructure in which domain ontologies are used to semantically annotate both

service advertisements and service requests. The proposed approach for semantic annotation of geographic feature types relies on the use of application ontologies and registration mappings. With this work, we have shown that ontologies can be successfully employed in information infrastructures to improve discovery and retrieval. The availability of semantically annotated services is crucial, but generating these annotations is a laborious task. We therefore identified the need for automated support for the task of generating semantic annotations of GI.

For the semantic annotation of GI, we need geospatial domain ontologies as a common ground to which members of different communities can commit. In Chapter 3, we present a study on the requirements for geospatial ontology engineering to arrive at sound domain ontologies for different geographic information communities. The study provides the basis for all subsequent ontology engineering in the scope of this thesis.

In Chapter 4, we sketch out the idea of how to use the specific characteristics of geographic data to support the process of detecting whether an information source is annotated with the concepts captured in a domain ontology. Here, we introduce for the first time the idea for what we will later call category membership evaluation. The paper concentrates on investigating the role of spatial relations (that exist implicitly in any geographic dataset) for category membership evaluation.

The approach to semantic annotation from Chapter 2 revealed some drawbacks in terms of complexity and ontological inconsistency. In Chapter 5, we define a conceptual model and strategy for semantic annotations of geographic feature types in GI web service environments. Additionally, we further advance the method for category membership evaluation introduced in Chapter 4. Here we propose formal rules for category membership evaluation that are defined as part of domain ontologies.
Chapter 6 builds upon the foundations provided in Chapter 5 and gives a concrete example of how the conceptual model for semantic annotation can be implemented. For this, we use the semantic web service framework WSMO and the Web Service Modeling Language (WSML). However, no support for automating the annotation task is integrated at this stage. In Chapter 7, we clarify the ontological assumptions behind our approach and we perform an ontological analysis of entities that are directly or indirectly located in geographic space. By aligning the generic categories to the foundational ontology DOLCE, we develop a formal and well-defined foundational structure for the geospatial domain. More specific domain ontologies that are aligned to this structure will meet the requirements that we have formulated for domain ontologies used in annotation, i.e. (i) they will be comparable, and (ii) they will exhibit a structure that allows for multiple views in an ontologically consistent way. Further, we present a prototypical implementation of a test environment for the proposed method. With this prototype, we show the applicability of category membership evaluation in a scenario for evaluating the validity of FLOODPLAIN annotations that were generated manually.
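
The scenario summarized above (features annotated with LOWLAND that are checked against a rule for FLOODPLAIN) can be illustrated with a minimal Python sketch. It is not the ArcGIS prototype described in Chapter 7. The class `Feature`, the attributes `mean_slope_deg` and `distance_to_river_m`, and both thresholds are hypothetical stand-ins for the physical quality "flat" and the spatial relation "adjacent to a river"; in a real setting these values would come from spatial analysis operations executed on the feature geometries.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the synopsis does not state numeric values.
MAX_FLAT_SLOPE_DEG = 2.0
MAX_ADJACENT_DISTANCE_M = 50.0


@dataclass(frozen=True)
class Feature:
    """Information object representing a geographic entity (illustrative)."""
    feature_id: str
    mean_slope_deg: float       # stand-in for the physical quality "flat"
    distance_to_river_m: float  # stand-in for the relation "adjacent to a river"


def satisfies_floodplain_rule(feature: Feature) -> bool:
    """Rule sketch: a member of FLOODPLAIN is flat and adjacent to a river."""
    is_flat = feature.mean_slope_deg <= MAX_FLAT_SLOPE_DEG
    is_adjacent = feature.distance_to_river_m <= MAX_ADJACENT_DISTANCE_M
    return is_flat and is_adjacent


def evaluate_annotation(features: list[Feature]) -> dict[str, list[str]]:
    """Split features into those that conform to the rule and those that do not."""
    result: dict[str, list[str]] = {"conforming": [], "non_conforming": []}
    for feature in features:
        key = "conforming" if satisfies_floodplain_rule(feature) else "non_conforming"
        result[key].append(feature.feature_id)
    return result


# Features that are already annotated with LOWLAND (made-up example values).
lowlands = [
    Feature("lowland_1", mean_slope_deg=0.8, distance_to_river_m=10.0),
    Feature("lowland_2", mean_slope_deg=1.5, distance_to_river_m=900.0),
    Feature("lowland_3", mean_slope_deg=6.0, distance_to_river_m=5.0),
]
report = evaluate_annotation(lowlands)
print(report)
```

If only a subset of the features conforms, as in this made-up example, the user could define a new feature type for that subset and annotate it with both LOWLAND and FLOODPLAIN, as discussed for the post-processing step.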

Research strands: Conceptual Model for Semantic Annotation of Geographic Feature Types; Formal ontological foundations for the Geospatial Domain; Method for automatic support of the annotation process

CHAPTER I: Synopsis

CHAPTER II: Ontology-Based Retrieval of Geographic Information
    Proposes semantic annotations based on application ontologies and registration mappings
    Employs domain ontologies for discovery and retrieval of GI
    Identifies the need for automatic support in annotation task

CHAPTER III: Requirements for Geospatial Ontology Engineering
    Further rationale for the benefits of semantic annotations
    Discusses the requirements for and principles of Geospatial Ontology Engineering

CHAPTER IV: The Role of Spatial Relations...
    Introduces the idea of category membership evaluation by exploiting the geometry and topology of instance data
    Investigates the role of spatial relations for category membership evaluation

CHAPTER V: A rule-based strategy for the semantic annotation of geodata
    Defines a conceptual model and strategy for semantic annotation in a Semantic Geospatial Web framework
    Introduces formal rules for performing category membership evaluation as part of the annotation process

CHAPTER VI: Baseline for Registering...
    Provides an implementation of the model in the Semantic Web Service Framework WSMO

CHAPTER VII: Category Membership Evaluation for Geographic Entities - Ontological Foundations and Implementation
    Provides the formal ontological foundations for implementing and integrating the method for category membership evaluation in the semantic annotation process
    Performs a formal ontological analysis of entities that are located in geographic space
    Describes a prototype implementation for evaluating the proposed method

Figure 1.8: Illustration of how the papers compiled in this thesis relate to the three research strands.

Glossary

Research in Semantics and related disciplines is being carried out from manifold perspectives.
This has led to a) an inconsistent use of core terminology and b) the invention of specific terms to avoid confusion with the same terminology used in different contexts, leading to author-specific terminology. Even in the short period of this PhD research, my own terminology has slightly changed due to the need to demarcate my personal usage of terms from their usage in other contexts. A valuable insight gained from research in semantics is the acknowledgement that there is no such thing as

semantic truth for the interpretation of symbols used in communication. Consequently, our challenge is not so much to find and use the right terminology (although some core agreement should be followed), but to be careful and concise in explicating the meaning, understanding and usage of a term in a specific context. To avoid confusion, the glossary in Table 1.1 defines the most important terms as used in the synopsis.

Table 1.1: Glossary for clarifying the usage of terms in the synopsis

Term used in the Synopsis: Explanation

Semantic Annotations: Logic statements that establish a formalized link between a category description in an ontology and feature types in a data source.

Semantic Annotation: The process of explicating the meaning of terms that carry geographic information by establishing a link between a feature type and a formal category description. This link is identified based on the knowledge that the feature instances represent geographic entities that are members of that category.

Category: We follow the definition from (Masolo et al. 2003, p.13) that categories are cognitive artifacts ultimately depending on human perception, cultural imprints and social conventions.

Concept: We follow the definition from (Sloman et al. 1998, p.192) that concepts and categories are, to a large extent, flip sides of the same coin. Roughly speaking, a concept is the idea that characterizes a set, or category, of objects.

Geographic Entity: Geographic entities include everything that humans experience and conceptualize in geographic space. Geographic entities are those entities that are directly or indirectly located or happen in geographic space.

Geographic Feature: A geographic feature is defined as an abstraction of a real world entity with a location relative to the Earth (OGC 2002). Accordingly, we use the term geographic feature for information objects stored in a database that represent geographic entities (used in the context of OGC web services where geodata is served as features).

Geographic Space: Geographic space is a large-scale space that cannot be observed from a single viewpoint. A large-scale space can be explored only by navigating in it, and we conceptualize it from multiple views.

Geospatial Domain Ontology: Geospatial domain ontologies comprise those entities that are directly or indirectly located or happen in geographic space.

Feature Type Ontology: A feature type ontology encodes a feature type schema in an ontology representation language. A feature type ontology can be linked to domain ontologies through semantic annotation.

Foundational Ontology: We refer to the definition given by (Schneider 2003, p.1), stating that foundational ontologies are axiomatic theories of domain-independent upper-level notions such as object, attribute, event, parthood, dependence, and spatio-temporal connection. Foundational ontologies capture the most generic categories in an ontologically rigorous way (Masolo et al. 2003).
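
The relationship between the glossary terms feature type, category, and semantic annotation can be pictured as a small data model. The following Python sketch is only an illustration under stated assumptions: the class names, the `validated` flag, and the data source name `example_wfs` are hypothetical and are not taken from the WSML formalization used in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """Formal category description in a domain ontology (illustrative)."""
    name: str


@dataclass(frozen=True)
class FeatureType:
    """Feature type offered by a data source (illustrative)."""
    name: str
    data_source: str


@dataclass(frozen=True)
class SemanticAnnotation:
    """Link between a feature type and a category description."""
    feature_type: FeatureType
    category: Category
    validated: bool = False  # hypothetical flag, set after membership evaluation


def categories_of(feature_type, annotations):
    """Return the sorted names of all categories linked to a feature type."""
    return sorted(a.category.name for a in annotations if a.feature_type == feature_type)


# One feature type annotated from two perspectives.
flooded = FeatureType("FT_floodedLowlands", data_source="example_wfs")
annotations = [
    SemanticAnnotation(flooded, Category("LOWLAND"), validated=True),
    SemanticAnnotation(flooded, Category("FLOODPLAIN")),
]
print(categories_of(flooded, annotations))
```

Keeping the annotation as its own object, rather than as an attribute of the feature type, mirrors the idea that one feature type can be linked to several categories that reflect different perspectives.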

References

Arpinar, B., Sheth, A., Ramakrishnan, C., Usery, L., Azami, M. and Kwan, M.-P., Geospatial Ontology Development and Semantic Analytics. In: Wilson, J. P. and Fotheringham, A. S. (eds.), Handbook of Geographic Information Science, Blackwell Publishing. (in print, 2005, 21 pages).
Bayerl, P. S., Ungen, H. L., Gut, U. and Paul, K. I., Methodology for Reliable Schema Development and Evaluation of Manual Annotations. Proceedings of the Workshop on Knowledge Markup and Semantic Annotation at the Second International Conference on Knowledge Capture (K-CAP'03), Sanibel Island, Florida, USA.
Bouquet, P., Brunelli, R., Chanod, J.-P., Niederée, C. and Stoermer, H. (eds.), Mastering the Gap: From Information Extraction to Semantic Representation, Budva, Montenegro, CEUR Workshop Proceedings, Vol-187, online
Bowers, S., Lin, K. and Ludäscher, B., On Integrating Scientific Resources through Semantic Registration. Proceedings of 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), Santorini Island, Greece, pp
Bowers, S. and Ludäscher, B., An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In: Rahm, E (ed.), Proceedings of International Workshop on Data Integration in the Life Sciences (DILS'04), Leipzig, Germany, LNCS 2994, Springer, pp
Brodeur, J., Bédard, Y. and Moulin, B., A geosemantic proximity-based prototype for the interoperability of geospatial data. Computer, Environment and Urban Systems, 29 (6), pp
Casati, R., Smith, B. and Varzi, A., Ontological Tools for Geographic Representation. In: Guarino, N. (ed.), Proceedings of the International Conference on Formal Ontology in Information Systems (FOIS 98), Trento, Italy, pp
Cilibrasi, R. and Vitanyi, P., Automatic Meaning Discovery Using Google, online
de Bruijn, J. (ed.), The Web Service Modeling Language WSML, online
Díaz, L., Granell, C. and Gould, M., Case Study: Geospatial processing services for web-based hydrological applications. In: Sample J.T., Shaw, K., Tu, S. and Abdelguerfi, M (eds.), Geospatial Services and Applications for the Internet, Springer (in press, released July 2008).
EC, i2010 A European Information Society for growth and employment. Communication from the Commission to the Council, the European Parliament, the European Economic and Social Committee and the Committee of the Regions. Report COM(2005) 229 final. Commission of the European Communities.
Egenhofer, M. and Mark, D., Naive Geography. In: Frank, A. and Kuhn, W. (eds.), Proceedings of the Conference on Spatial Information Theory, Semmering, Austria, pp
Fellbaum, C., WordNet. An Electronic Lexical Database. MIT Press, Cambridge, MA.

Fitzner, D. I., Hoffmann, J. and Klien, E., Functional Description of Geoprocessing Services as Conjunctive Queries, under review.
Fonseca, F. and Martin, J., Learning The Differences Between Ontologies and Conceptual Schemas Through Ontology-Driven Information Systems. Journal of the Association for Information Systems, 8 (2), pp
Förster, T. and Schäffer, B., A client for distributed geo-processing on the web. In: Ware, J. M. and Taylor, G. E. (eds.), Proceedings of Web and wireless geographical information systems (W2GIS 2007), Cardiff, United Kingdom, pp
Frank, A., Ontology for Spatio-temporal Databases. In: Koubarakis, M. et al. (eds.), Spatiotemporal Databases: The Chorochronos Approach, Lecture Notes in Computer Science 2520, Springer, Berlin, pp
Friis-Christensen, A., Bernard, L., Lutz, M. and Ostländer, N., Designing Service Architectures for Distributed Geoprocessing - Challenges and Future Directions. Transactions in GIS, 11 (6), pp
Grcar, M. and Klien, E., Using Term-matching Algorithms for the Annotation of Geoservices. Proceedings of the Web Mining 2.0 Workshop, in conjunction with ECML-PKDD 2007, Warsaw, Poland. (in press)
Grcar, M., Klien, E., Fitzner, D. I., Maué, P., Mladenic, D. and Grobelnik, M., 2007a. D4.1 Representational language for Web-service annotation models. Deliverable of the SWING project, online
Grcar, M., Novak, B., Klien, E. and Hoffmann, J., 2007b. D4.2 Dealing with highly ambiguous cost-sensitive descriptions. Deliverable of the SWING project, online
Grenon, P. and Smith, B., SNAP and SPAN: Towards Dynamic Spatial Ontology. Spatial Cognition and Computation, 4 (1), pp
Guarino, N., Formal Ontology and Information Systems. In: Guarino, N. (ed.), Proceedings of the International Conference on Formal Ontology in Information Systems (FOIS 98), Trento, Italy, pp
Halevy, A. Y., Ives, Z. G., Mork, P. and Tatarinov, I., Piazza: data management infrastructure for semantic web applications. Proceedings of the 12th International World Wide Web Conference (WWW03), Budapest, Hungary, ACM, pp
Hammond, T., Hannay, T., Lund, B. and Scott, J., Social Bookmarking Tools (I). A General Review. D-Lib Magazine, 11(4), online
Horrocks, I., Patel-Schneider, P., Boley, H., Tabet, S., Grosof, B. and Dean, M., SWRL: A Semantic Web Rule Language. Combining OWL and RuleML. W3C Member Submission 21 May 2004, online
Janowicz, K., Sim-DL: Towards a Semantic Similarity Measurement Theory for the Description Logic ALCNR in Geographic Information Retrieval. In: R. Meersman, Z. Tari, P. Herrero et al. (eds.): SeBGIS 2006, OTM Workshops 2006, LNCS 4278, Springer, pp

Klien, E., Schade, S. and Hoffmann, J., D3.1 Ontologies in the SWING Application - Requirement Specification. Deliverable of the SWING project, online
Kuhn, W., Geospatial Semantics: Why, of What, and How? In: Spaccapietra, S and Zimányi E. (eds), Journal of Data Semantics III, LNCS 3534, Springer, pp
Lemmens, R. and Vries, M., Semantic Description of Location Based Services using an Extensible Location Ontology. Proceedings of Muenster GI-Days 2004, Muenster, Germany, pp
Lutz, M., Ontology-Based Descriptions for Semantic Discovery and Composition of Geoprocessing Services. GeoInformatica, forthcoming.
Lutz, M. and Kolas, D., Rule-based Discovery in Spatial Data Infrastructures. Transactions in GIS, Special Issue on the Geospatial Semantic Web, 11 (3), pp
Masolo, C., Borgo, S., Gangemi, A., Guarino, N. and Oltramari, A., Ontology Library. WonderWeb Deliverable D18, ISTC-CNR, online
Maué, P., Collaborative Metadata for Geographic Information. Proceedings of the Conference on Social Semantic Web (CSSW), Leipzig, Germany, CEUR Proceedings, online
Nientiedt, M., Category Membership Evaluation based on Spatial Analysis Procedures. Diploma thesis. Institute for Geoinformatics, University of Muenster, Muenster.
Noy, N. F. and Musen, M. A., PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. Proceedings of Seventeenth National Conference on Artificial Intelligence (AAAI-2000), Austin, USA, pp
OGC, OpenGIS Reference Model. OpenGIS Consortium. OGC , online
OGC, OpenGIS Web Feature Service Implementation Specification 1.1. OpenGIS Standard, Open Geospatial Consortium Inc, online /standards/wfs.
OGC, OpenGIS Web Processing Service. OpenGIS Standard, OGC r7, Open Geospatial Consortium Inc., online
Papadias, D. and Kavouras, M., Acquiring, Representing and Processing Spatial Relations. In: Kraak, J.-M. and Molenaar, M. (eds.), Proceedings of Sixth International Symposium on Spatial Data Handling, Edinburgh, Scotland, Taylor & Francis, pp
Probst, F., An Ontological Analysis of Observations and Measurements. In: Raubal, M., Miller, H.J., Frank, A.U. and Goodchild, M.F. (eds.), Proceedings of Fourth International Conference of Geographic Information Science (GIScience), Muenster, Germany, Springer, pp
Probst, F., Semantic Reference Systems for Observation and Measurement. Dissertation thesis, Institute for Geoinformatics, University of Muenster, Muenster.
Roman, D., Keller, U., Lausen, H., de Bruijn, J., Lara, R., Stollberg, M., Polleres, A., Feier, C., Bussler, C. and Fensel, D., Web Service Modeling Ontology. Applied Ontology, 1 (1), pp

39 Chapter 1 32 Roman, D. and Klien, E., SWING - A Semantic Framework for Geospatial Services. In: Scharl, A. and Tochtermann, K. (eds.), The Geospatial Web, How Geo-Browsers, Social Software and the Web 2.0 are Shaping the Network Society, Springer, pp Schade, S., Maué, P., Klien, E. and Fitzner, D. I., D3.2 Modeling Methodology. Deliverable of the SWING project, online Schneider, L., Designing Foundational Ontologies - The Object-Centered High-level Reference Ontology OCHRE as a Case Study. In: Song, I.-Y., Liddle, S.W., Ling, T.W. and Scheuermann, P. (eds.), Conceptual Modeling - ER 2003, 22nd International Conference on Conceptual Modeling, LNCS 2813, Springer, pp Schwering, A., Hybrid model for semantic similarity measurement. Proceedings of 4th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE05), Agia Napa, Cyprus, Springer, Sheth, A. P., Changing Focus on Interoperability in Information Systems: From System, Syntax, Structure to Semantics. In: Goodchild, M. F., Egenhofer, M., Fegeas, R. and Kottman, C. A. (eds.), Interoperating Geographic Information Systems, Kluwer, pp Silva, M., Camara, G., Souza, R., Valeriano, D. and Escada, I., Mining Patterns of Change in Remote Sensing Image Databases. Proceedings of Fifth IEEE International Conference on Data Mining, Houston, TX, USA, pp Sloman, S. A., Love, B. C. and Ahn, W.-K., Feature centrality and conceptual coherence. Cognitive Science, 22 (2), pp Smith, B. and Mark, D., Ontology and Geographic Kinds. In: Poiker, T. and Chrisman, N. eds.), Proceedings of 8th Int. Symposium on Spatial Data Handling (SDH'98), Vancouver, Canada, pp Speller, E., Collaborative tagging, folksonomies, distributed classification or ethnoclassification: a literature review. Library Student Journal, University of Buffalo, online Welty, C. and Guarino, N., Support for ontological analysis of taxonomic relationships. Data Knowledge Engineering, 39 (1), pp

Chapter 2

Ontology-Based Retrieval of Geographic Information

Lutz, M. and E. Klien (2006). Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science (IJGIS) 20(3):

Abstract. Discovering and accessing suitable geographic information (GI) in the open and distributed environments of current Spatial Data Infrastructures (SDIs) is a crucial task. Catalogues provide searchable repositories of information descriptions, but the mechanisms to support GI retrieval are still insufficient. Problems of semantic heterogeneity caused by the ambiguity of natural language can arise during keyword-based search in catalogues and when formulating a query to access the discovered data. In this paper we present an approach to ontology-based GI retrieval that contributes to solving existing problems of semantic heterogeneity and hides most of the complexity of the required procedure from the requester. A query language and graphical user interface allow a requester to intuitively formulate a query using a well-known domain vocabulary. From this query, an ontology concept is derived, which is then used to search a catalogue for a data source that provides all the information required to answer the requester's query. If a suitable data source is discovered, the relevant data is accessed through a standardised interface. The approach is implemented through several components that can be used as an extension to standard SDIs.

2.1 Introduction

Efficient retrieval of distributed geographic information (GI) is a key factor in planning and decision-making in a variety of domains. The specifications provided by the Open Geospatial Consortium (OGC) enable syntactic interoperability and cataloguing of GI. However, a number of problems caused by semantic heterogeneity still present challenges for GI retrieval in the open and distributed environments of Spatial Data Infrastructures (SDIs). One possible approach to overcome these problems is the explication of knowledge by means of ontologies (Wache et al. 2001), i.e. explicit formal specifications of shared conceptualizations (Gruber 1995, Studer et al. 1998).

In (Klien et al. 2004) we presented an approach and architecture for ontology-based discovery and retrieval of geographic information that overcomes some problems caused by semantic heterogeneity. It supports a requester in formulating queries for (1) finding one GI source in a catalogue service that provides all the information required to solve the requester's problem, and for (2) retrieving the discovered information through a Web Feature Service (WFS). However, the support for retrieval in this approach was limited. The requester was provided with an ontological description of the selected feature type and its properties. Based on this description, it was then up to the requester to interpret the meaning of the feature type's property names and to select the appropriate ones for formulating a WFS query.

In this paper we present an extension of the original approach that makes the overall task of GI retrieval more user-friendly by hiding the processes of discovering an appropriate WFS and of formulating the actual WFS query. Thus, the requester's only task is to formulate one query statement using terms from existing ontologies, from which both the catalogue query and the WFS query can be derived automatically.

The remainder of the paper is structured as follows.
Section 2.2 introduces a motivating example for our work and gives a detailed description of the types of semantic heterogeneity problems that may occur during the process of GI retrieval. The theoretical basis for the presented approach is described in Section 2.3, which introduces guidelines for ontology development, and Section 2.4, which introduces our method for ontology-based discovery of geographic information. In Section 2.5, the approach to ontology-based retrieval is presented, including a description of the workflow, the query languages, the user interface and an example walk-through illustrating a prototypical implementation. The work is compared to related work in Section 2.6; a conclusion and pointers to future work are given in Section 2.7.

2.2 Semantic Heterogeneity Problems in GI Discovery and Retrieval

Emerging SDIs facilitate the discovery and retrieval of distributed geographic information. However, some key problems caused by semantic heterogeneity remain to be solved. In this section, we present mechanisms and components for GI discovery and retrieval that already exist in current SDIs (Section 2.2.1). We then introduce a running example that we use throughout the paper (Section 2.2.2) and point out problems caused by semantic heterogeneity in this example (Section 2.2.3).

2.2.1 GI Discovery and Retrieval in Current Spatial Data Infrastructures

A main motivation for setting up SDIs is to make the work with geodata more efficient (McKee 2000, Nebert 2001) by addressing problems that occur with conventional GIS technology and geographic data sets. For the work presented here, the most crucial of these problems are that data sets exist in a plethora of different data formats, are stored in a variety of different systems, and that both are often insufficiently documented or not documented at all. There are two main standardisation efforts in the geospatial domain whose goal is to overcome these problems: the ISO Technical Committee (TC) 211, which develops the ISO 19100 series of standards, and the Open Geospatial Consortium (OGC).

GI Discovery in SDIs. In an SDI context, clients and data sources are usually arbitrarily distributed in large networks and unaware of each other. In such a scenario, missing or insufficient documentation makes it difficult or even impossible for users to discover data sets and to assess whether a given data set is useful for their tasks. Catalogues are used to solve these problems, which makes them a fundamental part of SDIs. They allow a client (service consumer) to find spatial resources (data and services) that fit the client's needs and are available on servers (service providers) unknown to the client. Service providers offer particular data access and geoprocessing (data manipulation) services. Both types of spatial resources are described by metadata. The catalogue itself consists of the metadata and the operations on these metadata. In general, each service provider has to register (publish) his offerings with a catalogue by means of metadata to make them accessible. A catalogue may also collect metadata from known service providers (pull).
In addition to these registration functions, a catalogue provides library functions (discovery, browsing, querying) for service consumers.

GI Retrieval in SDIs. The heterogeneity of formats and systems makes it difficult to pose queries and to use data sets in one's own system, and usually requires some form of conversion. In SDIs, this problem is addressed by standardising service interfaces and exchange formats. The standardisation of interfaces allows the classification of services into well-known service types that provide the behaviour specified by the interface. Thus, it becomes possible to connect arbitrary service instances as long as they are of a well-known service type. The specification of the Web Feature Service (WFS) (OGC 2002b) proposes interfaces for describing data manipulation operations on geographic features. Among other things, the WFS interface provides the ability to retrieve features based on spatial and non-spatial constraints (through the GetFeature operation) and to generate a schema description of the feature types provided by a WFS implementation (through the DescribeFeatureType operation).

The definition of the Geography Markup Language (GML) (OGC 2003) as an open, vendor-neutral XML encoding for the definition of geospatial application schemas and objects increases the ability of organisations to share geographic information. GML provides a variety of kinds of objects for describing geography, including features. A feature is an abstraction of a real world phenomenon; it is a geographic feature if it is associated with a location relative to the Earth. The number of properties a feature may have, together with their names and types, is determined by its type definition.

2.2.2 A Running Example

Throughout this paper we use a running example to point out semantic heterogeneity problems that can occur when using state-of-the-art GI retrieval mechanisms and to illustrate how these problems can be solved using our proposed approach. Our work, however, is not restricted to this example but is designed to be independent of a particular GI domain.

Susan is a hydrologist who is interested in water levels of the Elbe River. As an expert in the field she knows the existing control points in the river. She wants to know the measurement of the water level at a specific control point at a specified time. Since Susan does not know about an existing WFS offering this kind of information, she makes use of an OGC-compliant catalogue in order to find appropriate information for answering her question: What is the water level at control point X at time Y in the Elbe River? Note that in this paper, we always assume that a requester searches for only one source that provides all the required information.

There are several agencies that offer information about water levels in the Elbe River:

• The Federal Agency for Hydrology (Bundesanstalt für Gewässerkunde, BafG)
• The Electronic Information System for Waterways (Elektronisches Wasserstraßen-Informationssystem, ELWIS)
• The Czech Hydrometeorological Institute (CHMI)

While these agencies currently provide their data only as HTML pages, the information can easily be parsed and provided through standardized WFS interfaces. The access points to these services are provided at … Table 2.1 lists the names of the GML features returned by these WFS and their property names.
Table 2.1: Names of the GML features returned by three WFS and their properties ("-" marks a property that the WFS does not provide)

WFS | BafG | ELWIS | CHMI
Feature | Pegelmessung | WasserstandMessung | StavVody
Name of the control point | name | pegel | stanice
Water level in m (BafG) or cm (ELWIS and CHMI) | wasserstand_m | hoehe | stav
Date and time of the measurement | zeitpunkt | - | datum
Date of the measurement | - | datum | -
Time of the measurement | - | uhrzeit | -
Geometry as Point | gml:pointproperty | standort | gml:position
Name of the river | - | - | tok
Discharge in cubic meters per second | - | - | prutok

In order to obtain the required information, Susan has to perform two tasks: first, she has to find a WFS that provides suitable information to answer this question (GI discovery), and second, she has to formulate a WFS query filter for accessing the data she needs (GI retrieval).

2.2.3 Problems Caused by Semantic Heterogeneity

In current standards-based catalogues (e.g. (GDI-NRW 2002)) users can formulate queries using keywords and/or spatial filters. The metadata fields that can be included in the query depend on the metadata schema used (e.g. ISO 19115) and on the query functionality of the service that is used for accessing the metadata. Even though natural language processing techniques can increase the semantic relevance of search results with respect to the search request (e.g. (Richardson and Smeaton 1995)), keyword-based techniques are inherently restricted by the ambiguities of natural language. If different terminology is used by providers and requesters, keyword-based search can have low recall, i.e. not all relevant information sources are discovered. If terms are homonymous, or because the ability to express complex queries in keyword-based search is limited, precision can also be low, i.e. some of the discovered services are not relevant (Bernstein and Klein 2002).

For example, if Susan, who is interested in the measurement of the water level at a specific control point of the river Elbe, uses "water level" as a keyword, she may fail to find the existing WFS that offer this information (low recall), because their metadata descriptions use slightly different terminology, e.g. "depth" or "watermark" (Table 2.2). Furthermore, she might also discover GI that is annotated with this keyword but is not appropriate for answering her question (low precision), e.g. a service providing groundwater rather than surface water levels.
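The recall and precision effects described above can be made concrete with a small sketch. It assumes exact keyword matching, uses the keyword sets as we read them from Table 2.2, and adds one hypothetical catalogue entry (GroundwaterWFS, not part of the running example) to stand in for the groundwater service mentioned in the text. It is an illustration only, not the catalogue query mechanism of an actual SDI.

```python
# Keyword sets for the three WFS, as read from Table 2.2.
catalogue = {
    "BafG": {"water level", "measurement", "Elbe"},
    "ELWIS": {"control point", "tide scale", "river", "depth"},
    "CHMI": {"watermark", "measurement gauge", "Elbe"},
    # Hypothetical entry (an assumption, not in the source): a groundwater
    # service that is tagged with the same keyword.
    "GroundwaterWFS": {"water level", "groundwater"},
}

# Services that can actually answer Susan's question.
relevant = {"BafG", "ELWIS", "CHMI"}


def keyword_search(keyword, entries):
    """Return the names of all entries whose metadata contains the exact keyword."""
    return {name for name, keywords in entries.items() if keyword in keywords}


def precision_recall(retrieved, relevant_set):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = retrieved & relevant_set
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


retrieved = keyword_search("water level", catalogue)
precision, recall = precision_recall(retrieved, relevant)
print(sorted(retrieved), precision, recall)
```

Under these assumptions the query finds only one of the three relevant services (ELWIS and CHMI use other terms) and additionally returns the irrelevant groundwater entry, so both recall and precision suffer.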
Table 2.2: Keywords in the metadata used to describe the three WFS

WFS | Keywords in the Metadata
BafG | water level, measurement, Elbe
ELWIS | control point, tide scale, river, depth
CHMI | watermark, measurement gauge, Elbe

Another major difficulty can arise during the second task, when Susan wants to access GI via one of the discovered WFS. The DescribeFeatureType request (OGC 2002b) returns the application schema for the feature type, which is essential for formulating a query filter. Susan now runs into trouble if the property names are not intuitively interpretable. For example, she can only guess that the property names hoehe (ELWIS) and wasserstand_m (BafG) (see Table 2.1) both refer to the measurement of the water level in a river. Also, it is not obvious that the first measurements are given in centimeters while the second measurements are given in meters. In this scenario it might be sufficient to offer Susan a natural language description for each property. However, our work aims at automating the process of discovery and retrieval, and this makes a machine-interpretable description of the properties indispensable.
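The retrieval problem can be illustrated with a minimal sketch that builds a query filter for each of the three services. The feature and property names are those of Table 2.1; the SERVICES dictionary, the helper names station_filter and level_in_metres, and the control point name "X" are assumptions made for illustration. The filter uses the ogc:PropertyIsEqualTo element of the OGC Filter Encoding; it is not the query derivation mechanism presented later in this paper.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("ogc", OGC_NS)

# Feature type and property names from Table 2.1. The same information
# (station name, water level) hides behind different names and units.
SERVICES = {
    "BafG": {"feature": "Pegelmessung", "station": "name",
             "level": "wasserstand_m", "unit": "m"},
    "ELWIS": {"feature": "WasserstandMessung", "station": "pegel",
              "level": "hoehe", "unit": "cm"},
    "CHMI": {"feature": "StavVody", "station": "stanice",
             "level": "stav", "unit": "cm"},
}


def station_filter(service, station_name):
    """Build an ogc:Filter selecting features by control point name."""
    props = SERVICES[service]
    flt = ET.Element(f"{{{OGC_NS}}}Filter")
    eq = ET.SubElement(flt, f"{{{OGC_NS}}}PropertyIsEqualTo")
    ET.SubElement(eq, f"{{{OGC_NS}}}PropertyName").text = props["station"]
    ET.SubElement(eq, f"{{{OGC_NS}}}Literal").text = station_name
    return ET.tostring(flt, encoding="unicode")


def level_in_metres(service, raw_value):
    """Convert a returned water level to metres using the service's unit."""
    divisor = {"m": 1, "cm": 100}[SERVICES[service]["unit"]]
    return raw_value / divisor


for service in SERVICES:
    print(service, station_filter(service, "X"))
```

Without a machine-interpretable description, the contents of SERVICES have to be worked out by hand for every new service, which is exactly the step the ontological descriptions are meant to automate.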

2.3 Semantic Descriptions of Geographic Information

To overcome the problems described above, we propose to use ontological descriptions of information sources. An ontology is an explicit formal specification of a shared conceptualization (Gruber 1995, Studer et al. 1998), where a conceptualization can be defined as a way of thinking about some domain (Uschold 1998). By using ontologies to enrich the description of information sources, the semantics of their content become machine-interpretable, and users are enabled to pose concise and expressive queries. Furthermore, logical reasoning can be used to discover implicit relationships between search terms and information descriptions as well as to flexibly construct taxonomies for classifying information sources.

In this section, we first introduce the overall approach to building ontologies in Section 2.3.1. We then introduce guidelines for building domain and application ontologies (Sections 2.3.2 and 2.3.3). We conclude the section by introducing the link between the schema of a feature type and its application concept through registration mappings in Section 2.3.4.

2.3.1 Ontology Approach

The ontologies used for making the semantics of information sources explicit can be organized in different ways. In (Wache et al. 2001), a classification of different ontology architectures for data integration is introduced. In the following, we translate the different approaches into the domain of GI discovery.

In multiple ontology approaches, each information source or query is described by its own local (or application) ontology. In principle, each of these local ontologies can be a combination of several other ontologies. However, it cannot be assumed that several local ontologies share the same vocabulary. This lack of a common vocabulary makes it difficult to compare different application ontologies.
In contrast, single ontology approaches use one global ontology, which provides a shared vocabulary for specifying the semantics of all information sources and queries. Such approaches can be applied to problems where all semantic descriptions available in a catalogue have been created with a very similar view on a domain, which also has to be shared by all requesters.

Hybrid approaches also use a global shared vocabulary, which contains basic terms (the primitives) of a domain. These can be combined to describe the (more complex) semantics of each information source or query in separate application ontologies. In contrast to multiple ontology approaches, the concepts in these application ontologies remain comparable, because they are based on the primitives from the shared vocabulary.

In both the single ontology and the hybrid approach, it is assumed that the semantics of the primitives is understood (and this understanding is shared) by all requesters and providers in the domain. Therefore, the primitives require no further formal definitions. Nevertheless, it can sometimes be useful to represent a shared vocabulary as an ontology in order to impose a structure on it. Such an ontology is called a domain ontology.

In our work we adopt the hybrid ontology approach. This allows providers and requesters to flexibly build application ontologies. At the same time, the application ontologies remain comparable, which is a crucial prerequisite for matching queries to advertisements. We consider the shared vocabulary to consist of several domain ontologies, each of which describes concepts and relations in a particular domain of interest. Not all of these domain ontologies have to be on the same level of abstraction. Also, it is possible for ontologies on a more specific level to include concepts from ontologies on a more abstract level. In the running example, two domain ontologies describe the more abstract domain of measurements and the more specific domain of hydrology. The specialized application ontologies, which use concepts and relations from one or several domain ontologies, are used for annotating specific information sources such as the ELWIS, BafG and CHMI WFS (Figure 2.1).

Figure 2.1: The hybrid ontology approach, modified from (Wache et al. 2001). The shared vocabulary (the domain ontologies Measurements and Hydrology) provides basic concepts and relations for specifying the application ontologies ELWIS, BafG and CHMI, which are used for semantic annotations of the ELWIS, BafG and CHMI WFS.

The ontologies shown in this paper are expressed using a Description Logic (DL) (Baader and Nutt 2003) notation used in the RACER system (Haarslev and Möller 2004). DLs are a family of knowledge representation languages that are subsets of first-order logic (FOL) (for a mapping from DL to FOL, see e.g. (Sattler et al. 2003)). They provide the basis for the Web Ontology Language (OWL), the proposed standard language for the Semantic Web (Antoniou and Van Harmelen 2003). The basic syntactic building blocks of a DL are atomic concepts (unary predicates), atomic roles (binary predicates), and individuals (constants). The expressive power of DL languages is restricted to a small set of constructors for building complex concepts and roles. Implicit knowledge about concepts and individuals can be inferred automatically with the help of inference procedures (Baader and Nutt 2003).
A DL knowledge base consists of a TBox containing intensional knowledge (declarations that describe general properties of concepts) and an ABox containing extensional knowledge that is specific to the individuals of the domain. In our work, we only use TBox language features, namely

• concept definition: (define-concept C D),
• concept inclusion: (implies C D), and
• role definition: (define-primitive-role R :parent P :domain C :range D).

The domain of a role is a concept describing the set of all things from which this role can originate. This notion of the term should not be confused with the notion domain of interest (as in domain ontology). The range of a role is a concept describing the set of all things the role can lead to. Concepts can be defined using the following constructors:

    D → (and E F)                            (intersection)
      | (or E F)                             (union)
      | (all R C)                            (value restriction)
      | (some R C)                           (existential quantification)
      | (at-least | at-most | exactly n R)   (number restrictions)

One major advantage of (simple) DLs (like the one employed in this paper) over FOL is that their inference procedures are decidable (Sattler et al. 2003). Of the available inference procedures, the possibility to compute subsumption relationships between concepts is of special importance for our work. Popular DL reasoners include e.g. RACER (Haarslev and Möller 2004) and Pellet. For a more detailed introduction to DL languages, including different subsumption algorithms, see (Baader and Nutt 2003).

2.3.2 Creating Domain Ontologies

It has been suggested that relationships or roles play a central part in ontology engineering (Hart et al. 2004). Rather than building simple concept hierarchies, as is common practice in many existing approaches to semantic information discovery (e.g. (Paolucci et al. 2002, Stuckenschmidt et al. 2004, Vögele and Spittel 2004)), we suggest that roles be used wherever possible for defining concepts. Using this approach, ontology engineers can build richer ontologies, which contain not only taxonomic but also non-taxonomic relationships. Thus, a concept does not have to be given a fixed position in a static hierarchy. Rather, its position in the hierarchy can be dynamically inferred based on existing concept and role definitions using subsumption reasoning.
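The idea that a concept's place in the hierarchy is derived rather than fixed can be sketched for the simplest case, concept inclusions of the form (implies C D). The sketch below follows such inclusions transitively; the axioms are the ones given for the Measurements and Hydrology domains in Figure 2.2, while the function names are our own. It is a deliberately reduced illustration: a DL reasoner such as RACER also classifies defined concepts on the basis of their role restrictions, which this sketch does not attempt.

```python
# Concept inclusions (implies C D) from Figure 2.2, written as child -> parents.
# (implies WaterLevel (and Depth HydrologicalPhenomenon)) yields two parents.
INCLUSIONS = {
    "Depth": {"Phenomenon"},
    "Centimeter": {"Unit"},
    "HydrologicalPhenomenon": {"Phenomenon"},
    "WaterLevel": {"Depth", "HydrologicalPhenomenon"},
    "Discharge": {"HydrologicalPhenomenon"},
    "Lake": {"WaterBody"},
    "River": {"WaterBody"},
}


def superconcepts(concept, inclusions):
    """All concepts reachable from `concept` by following inclusions transitively."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in inclusions.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subsumed_by(sub, sup, inclusions):
    """True if `sup` subsumes `sub` according to the stated inclusions only."""
    return sub == sup or sup in superconcepts(sub, inclusions)


print(sorted(superconcepts("WaterLevel", INCLUSIONS)))
print(is_subsumed_by("Discharge", "Phenomenon", INCLUSIONS))
```

Nothing states directly that WaterLevel is a Phenomenon; the relationship follows from the two inclusions it participates in, which is the kind of implicit knowledge the text refers to.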
Based on these assumptions, we suggest a few guidelines that are meant to facilitate and (to some degree) standardise the development of domain ontologies. They will be illustrated using examples from the domain ontologies shown in Figure 2.2. For readers who are unfamiliar with the DL notation used, a schematic representation of the ontologies is given in Figure 2.3.

• In order to link roles and concepts, the ranges and, if possible, domains of roles should be defined. For example, the domain of the role observable is restricted to the concept Quantity and its range to the concept Phenomenon.
  o If a role's range has been defined, it is sufficient to state in a concept definition that the concept has this role. The range does not have to be specified again (unless it is to be restricted within the scope of that concept).
  o If a role's domain has been defined and the role is used in some concept definition, it can be inferred that the concept is a subconcept of the role's domain. For this reason, domains should not be specified if a role might also be used with other concepts. For example, the domain of the role location might include other concepts than Measurement (i.e. all concepts that have a location) and therefore is left undefined.

    MEASUREMENTS

    Concept Definitions
    (define-concept Measurement
      (and (at-least 1 quantityresult)
           (exactly 1 location)
           (exactly 1 timestamp)))
    (define-concept Quantity
      (and (exactly 1 observable)
           (exactly 1 value)
           (exactly 1 unitofmeasure)))
    (implies Depth Phenomenon)
    (implies Centimeter Unit)

    Ranges and Domains of Roles
    (define-primitive-role quantityresult :domain Measurement :range Quantity)
    (define-primitive-role location :range gml_point)
    (define-primitive-role timestamp :range (or xsd_date xsd_datetime))
    (define-primitive-role value :domain Quantity :range xsd_decimal)
    (define-primitive-role unitofmeasure :domain Quantity :range Unit)
    (define-primitive-role observable :domain Quantity :range Phenomenon)

    HYDROLOGY

    Concept Definitions
    (define-concept HydrologicalQuantity
      (and Quantity
           (all observable HydrologicalPhenomenon)
           (exactly 1 observedwaterbody)))
    (implies HydrologicalPhenomenon Phenomenon)
    (implies WaterLevel (and Depth HydrologicalPhenomenon))
    (implies Discharge HydrologicalPhenomenon)
    (implies Lake WaterBody)
    (implies River WaterBody)

    Ranges and Domains of Roles
    (define-primitive-role observedwaterbody :domain HydrologicalQuantity :range WaterBody)

Figure 2.2: Examples for role and concept definitions from the domains of Measurements and Hydrology

• As the relationships between concepts and roles have already been established by defining the ranges and domains of roles, concept definitions can be kept relatively simple.
  o Peripheral concepts can be derived as subconcepts from other concepts to form simple hierarchies, e.g. to state that a Depth is a kind of Phenomenon. However, this should not be done for terms central to the domain, such as Measurement or Quantity in the domain of measurements.
  o Central concepts should be defined using (a) value and (b) cardinality restrictions on existing roles. These restrictions must at least be sufficient to distinguish the concept from all other concepts in the domain. Ideally, they should also restrict possible interpretations of the defined concept to the intended interpretation.

    (a) Value restrictions are required to further constrain the range of a role for instances of the given concept. For example, all instances of a HydrologicalQuantity are only allowed instances of a HydrologicalPhenomenon (rather than any Phenomenon) as an observable. The value restrictions have to be consistent with the overall range definitions, i.e. HydrologicalPhenomenon has to be subsumed by Phenomenon.
    (b) Cardinality restrictions limit the number of occurrences of the restricted role in the given concept. For example, instances of Measurement must have at least one quantityresult and exactly one location and timestamp.

Figure 2.3: Schematic representation of the DL definitions given in Figure 2.2.

• Where possible, the ranges of roles should be mapped to XML schema datatypes (W3C 2001) such as string or decimal, or to simple GML geometry types (OGC 2003) such as point or polygon (Table 2.3). This ensures that value comparisons can be used and evaluated in the user's query statements. For example, the range of timestamp is defined as being either an xsd_date or an xsd_datetime. We assume that the semantics (and syntax) of these datatypes are well-known and agreed-upon. Therefore, for each XML schema datatype one equivalent concept is introduced (in specific XML datatypes and GML geometry types domain ontologies) without any further definitions.
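Two of the guidelines above lend themselves to a short sketch: the inference of a superconcept from a role's declared domain, and the checking of cardinality restrictions. The role domains and the restrictions on Measurement are taken from Figure 2.2; the data layout, the function names and the individuals q1, p1 and t1 are assumptions made for illustration. Checking a single description against restrictions in this way is far simpler than the TBox reasoning a DL system performs and is not meant to replace it.

```python
# Role domains as declared in Figure 2.2. The roles location and timestamp
# have no declared domain and are therefore absent.
ROLE_DOMAINS = {
    "quantityresult": "Measurement",
    "value": "Quantity",
    "unitofmeasure": "Quantity",
    "observable": "Quantity",
    "observedwaterbody": "HydrologicalQuantity",
}

# Cardinality restrictions from the definition of Measurement in Figure 2.2.
MEASUREMENT = [
    ("at-least", 1, "quantityresult"),
    ("exactly", 1, "location"),
    ("exactly", 1, "timestamp"),
]


def inferred_superconcepts(roles_used):
    """Concepts that follow from the declared domains of the roles used."""
    return {ROLE_DOMAINS[role] for role in roles_used if role in ROLE_DOMAINS}


def satisfies(fillers, restrictions):
    """Check the number of fillers per role against the number restrictions."""
    checks = {
        "at-least": lambda count, n: count >= n,
        "at-most": lambda count, n: count <= n,
        "exactly": lambda count, n: count == n,
    }
    return all(
        checks[kind](len(fillers.get(role, [])), n)
        for kind, n, role in restrictions
    )


# A concept that uses `value` and `location` is inferred to be a Quantity;
# `location` contributes nothing because its domain was left undefined.
print(inferred_superconcepts({"value", "location"}))

# Hypothetical description of one measurement (individual names are made up).
complete = {"quantityresult": ["q1"], "location": ["p1"], "timestamp": ["t1"]}
print(satisfies(complete, MEASUREMENT))
```

The first call also shows why the domain of location is left undefined: declaring it as Measurement would make every concept with a location a subconcept of Measurement.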

Table 2.3: XML schema datatypes and GML geometry types

Type group | Types
XML schema datatypes, primitive | string, boolean, decimal, float, double, duration, dateTime, time, date
XML schema datatypes, derived | integer, long, int, short, byte, nonNegativeInteger
GML geometry types | PointType, LineStringType, PolygonType, CurveType, ArcType, CircleType

2.3.3 Creating Application Ontologies

The focus of application ontologies is on describing a concept that represents a geographic feature type. This feature type concept is defined by referring to and further restricting existing concepts and roles from the domain ontology. In our example, one concept is defined for each of the feature types provided by the three WFS (Table 2.1). The creation of application ontologies should follow the same guidelines as those described above for domain ontologies. Additionally, the following guidelines apply:

• The purpose of the application ontology is to represent the feature type's semantics rather than to capture its application schema. Therefore, application concepts should be derived (as subconcepts) from existing domain concepts. This strategy ensures that not only explicit information (i.e. what is represented in the schema) but also implicit information (such as the kind of observable or the unit of measurement in our example) is included in the concept definition.
• When application concepts are derived from a domain superconcept, their definitions only have to include (a) axioms that further restrict the superconcept's definition or (b) additional roles.
    (a) The axioms can be (all-quantified) value restrictions, which further constrain the range of a role, or cardinality restrictions, which further constrain a role's cardinality.
    (b) Typically, application ontologies will contain few definitions of new roles. Additional roles should be introduced to express application-specific relationships that do not exist in the domain ontology.
These roles, however, cannot be used by requesters in their queries and therefore are not useful in GI discovery. Of course, it is always possible for a provider to include the new role(s) in the definition of a new domain ontology if he considers them to be relevant for more than just the application at hand.

• Subroles of existing roles from the domain ontology should be introduced if the domain role would otherwise occur several times with different ranges in a concept definition. If R1 is a super-role of R2, then for all pairs of individuals between which R2 holds, R1 must hold too (Haarslev and Möller 2004). This means that the range and/or the domain of sub-roles are more restricted than those of the super-roles. The introduction of subroles with different ranges adds a layer of granularity that makes it possible to distinguish each occurrence of a role in a concept definition. This is a requirement of the registration mapping approach (Section 2.3.4).

To illustrate these guidelines, let us consider the concept elwis_measurement, which represents the ELWIS feature type (Figure 2.4). It is defined as a specific kind of Measurement (the domain superconcept) with exactly 1 quantityresult (cardinality restriction) measuring a WaterLevel in a River in Centimeter (value restrictions). Its timestamp is given as an xsd_date (value restriction) and it includes the additional roles elwis_timeofday and name.

    Concept Definitions

    ELWIS
    (define-concept elwis_measurement
      (and Measurement
           (exactly 1 quantityresult)
           (all quantityresult
                (and (all unitofmeasure Centimeter)
                     (all observable WaterLevel)
                     (all observedwaterbody River)))
           (all timestamp xsd_date)
           (exactly 1 elwis_timeofday)
           (exactly 1 name)))

    BafG
    (define-concept bafg_measurement
      (and Measurement
           (exactly 1 quantityresult)
           (all quantityresult
                (and (all unitofmeasure Meter)
                     (all observable WaterLevel)
                     (all observedwaterbody River)))
           (all timestamp xsd_datetime)
           (exactly 1 name)))

    CHMI
    (define-concept chmi_measurement
      (and Measurement
           (exactly 1 chmi_qrwaterlevel)
           (exactly 1 chmi_qrdischarge)
           (all timestamp xsd_datetime)
           (exactly 1 name)))

    Role Definitions
    (define-primitive-role elwis_timeofday :range xsd_time)
    (define-primitive-role chmi_qrwaterlevel
      :parent quantityresult
      :range (and (all unitofmeasure Centimeter)
                  (all observable WaterLevel)
                  (all observedwaterbody (and River (exactly 1 name)))))
    (define-primitive-role chmi_qrdischarge
      :parent quantityresult
      :range (and (all unitofmeasure CubicMeter)
                  (all observable Discharge)
                  (all observedwaterbody (and River (exactly 1 name)))))

Figure 2.4: Examples for defining application concepts and roles.
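The super-role rule quoted above (whenever a sub-role holds between two individuals, its super-role holds too) can be sketched as a closure over role assertions. The :parent declarations are those of Figure 2.4; the individuals m1, q_level and q_discharge and the function name with_super_roles are hypothetical and serve only as an illustration. The paper itself uses TBox features only, so the assertions here are a teaching device rather than part of the approach.

```python
# Parent roles as declared with :parent in Figure 2.4.
ROLE_PARENTS = {
    "chmi_qrwaterlevel": "quantityresult",
    "chmi_qrdischarge": "quantityresult",
}


def with_super_roles(assertions, parents):
    """Add (subject, super-role, object) for every asserted sub-role triple."""
    closed = set(assertions)
    for subject, role, obj in assertions:
        parent = parents.get(role)
        while parent is not None:
            closed.add((subject, parent, obj))
            parent = parents.get(parent)
    return closed


# Hypothetical individuals: one CHMI measurement with its two quantities.
asserted = {
    ("m1", "chmi_qrwaterlevel", "q_level"),
    ("m1", "chmi_qrdischarge", "q_discharge"),
}
closed = with_super_roles(asserted, ROLE_PARENTS)
for triple in sorted(closed):
    print(triple)
```

Both quantities remain reachable through the domain role quantityresult, which a requester can use in a query, while the two subroles keep the water level and the discharge distinguishable.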
The definition of the concept representing the BAFG feature type is very similar. In contrast, the concept representing the CHMI feature type differs considerably from the other two. As the CHMI feature type contains two measurement values, one for the water level and one for the discharge, the corresponding concept definition would have to include two quantityresult roles with different ranges. In order to distinguish between both roles, two subroles, chmi_qrwaterlevel and chmi_qrdischarge, are introduced. Both roles are derived from the parent role quantityresult and have different range restrictions.

Registration Mappings

For ontology-based GI discovery it is sufficient to describe and reason about application concepts that represent feature types. For ontology-based GI retrieval, however, more specific information on the feature type's structure is required. To describe the relationships between feature type structure and application ontology, we have adopted the notion of registration mappings suggested in (Bowers and Ludäscher 2004). An example registration mapping for the CHMI feature type is shown in Figure 2.5.

    <?xml version="1.0" encoding="utf-8"?>
    <StavVody xmlns=" ( ) >
      <gml:position>
        <gml:point>
          <gml:coordinates> , </gml:coordinates>
        </gml:point>
      </gml:position>
      <tok>labe</tok>
      <stanice>usti N.L.</stanice>
      <stav>151</stav>
      <prutok>98.9</prutok>
      <datum> t07:00:00</datum>
    </StavVody>

    Structural Path                    Conceptual Path
    /StavVody                          chmi_measurement
    /StavVody/gml:position/gml:Point   chmi_measurement.location
    /StavVody/tok/text()               chmi_measurement.quantityresult.observedwaterbody.name
    /StavVody/stanice/text()           chmi_measurement.name
    /StavVody/stav                     chmi_measurement.chmi_qrwaterlevel
    /StavVody/stav/text()              chmi_measurement.chmi_qrwaterlevel.value
    /StavVody/prutok                   chmi_measurement.chmi_qrdischarge
    /StavVody/prutok/text()            chmi_measurement.chmi_qrdischarge.value
    /StavVody/datum/text()             chmi_measurement.timestamp

Figure 2.5: Sample GML file (top) and registration mapping (bottom) for the CHMI feature type.

The main idea of registration mappings is to have separate descriptions of the application concept C (called semantic type in (Bowers and Ludäscher 2004)) and of the structural details of the feature type it describes (called structural type). This has the following advantages:

- The application concept can be defined or updated after the service is deployed, without requiring changes to the structural type.
- The semantics of the feature type can be specified more accurately in application concepts if the specification does not try to mirror the feature type's structure. This is especially true for feature types with a flat structure that does not reflect the conceptual model of the domain well (e.g. the property tok in Figure 2.5, which represents the name of the river where the water level measurement was taken).

A registration mapping consists of a set of rules that define associations between a feature type's structural and semantic types. They can be used to derive data transformations or, in our case, to specify a query filter for a WFS query. The rules have the form q → p, where q is a query expression that selects instances of the structural type to register to a concept denoted by the contextual path p. The structural-type query is defined for XML using a subset of XPath (W3C 1999). A query q is expressed using the syntax shown in Figure 2.6, where n represents an element tag name and v represents a text value. The expression /text(), which selects the PCDATA content of an element, usually maps to a contextual path whose range is an XML schema datatype. Taking an example from the CHMI feature type (Table 2.1), /StavVody/stav/text() selects the content of the feature type's stav property (i.e. the result Susan is interested in). A contextual path denotes a concept, possibly within the context of other concepts. It takes the form C.r_1.r_2. ... .r_n for n ≥ 0, where r_1 to r_n are valid properties defined for the semantic type of P. Taking an example from the domain ontology shown in Figure 2.2, Measurement.quantityResult.unit is a contextual path, where the concept selected by the path is Unit within the context of a Measurement's result.
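How such rules register values to contextual paths can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the XML schema or implementation used in the approach: the sample document is reduced to the elements whose values are legible in Figure 2.5, XML namespaces are left out, and the helper names select_text and register are hypothetical. Only queries of the form /a/b/text() are handled.

```python
import xml.etree.ElementTree as ET

# Reduced sample document (assumption: namespaces and elided elements omitted).
SAMPLE = """<StavVody>
  <tok>labe</tok>
  <stanice>usti N.L.</stanice>
  <stav>151</stav>
  <prutok>98.9</prutok>
</StavVody>"""

# Rules q -> p: structural-type query and the contextual path it registers to.
RULES = [
    ("/StavVody/tok/text()", "chmi_measurement.quantityresult.observedwaterbody.name"),
    ("/StavVody/stanice/text()", "chmi_measurement.name"),
    ("/StavVody/stav/text()", "chmi_measurement.chmi_qrwaterlevel.value"),
    ("/StavVody/prutok/text()", "chmi_measurement.chmi_qrdischarge.value"),
]


def select_text(root, query):
    """Evaluate a query of the form /root/child/.../text() against `root`."""
    steps = query.strip("/").split("/")
    if steps[-1] != "text()":
        raise ValueError("this sketch only handles queries ending in /text()")
    if steps[0] != root.tag:
        return None
    node = root
    for tag in steps[1:-1]:
        node = node.find(tag)
        if node is None:
            return None
    return node.text


def register(xml_text, rules):
    """Return a dict that maps each contextual path to the selected value."""
    root = ET.fromstring(xml_text)
    return {path: select_text(root, query) for query, path in rules}


registered = register(SAMPLE, RULES)
```

Because stav and prutok are registered to different subroles, the two measurement values end up under distinct keys instead of colliding on a single quantityresult path.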
    q := /p
    p := n | p/p | p/text() | p[c]
    c := p | p=v

Figure 2.6: Syntax of structural-type queries as defined in (Bowers and Ludäscher 2004).

The registration mapping shown in Figure 2.5 illustrates the need to introduce new roles in the application ontology in order to prevent multiple occurrences of the same role in a registration mapping (cf. Section 2.3.3). The CHMI feature type contains measurement results for two observables (water level and discharge). Both results could be described using the domain role quantityresult. Then, however, both the stav and the prutok elements in the XML file would be mapped to the same conceptual path chmi_measurement.quantityresult. By introducing two subroles of quantityresult (chmi_qrwaterlevel and chmi_qrdischarge), the resulting ambiguity is avoided.

In some cases, the representation of the feature's geometry can also present a problem. In our example, we assume that Susan is familiar with simple GML geometry types such as the gml:point element used in the CHMI feature type. Therefore, we map the whole element to the contextual path chmi_measurement.location and leave the interpretation of that element to Susan. Specifying a more detailed mapping that also describes the coordinates of the point can be problematic in some cases, e.g. with the CHMI feature type. Here, both X and Y coordinates are provided as comma-separated values in a single coordinates element. To extract each coordinate value separately, more sophisticated XPath functions, e.g. substring-before() and substring-after(), are required.

In order to use registration mappings in our approach, a machine-interpretable representation is also required. We have defined a simple XML schema for this purpose. Thus, the registration mapping can be stored in an XML file and published on a web server. The semantic annotation of a feature type is then simply implemented as a reference to this XML file in its metadata document.

2.4 Ontology-Based GI Discovery

Our approach for ontology-based discovery of GI is based on semantic matchmaking between DL concepts representing geographic feature types (i.e. classes of geographic objects with common characteristics) on the one hand and the user's query on the other hand. Feature types are described by specific application concepts that are built using roles and concepts from a shared vocabulary, as described above. The user's query concept can either be a concept from an existing application ontology, or it can be defined based on terms from the shared vocabulary. A terminological reasoning engine, in our case RACER (Haarslev and Möller 2004), is used to find out which of the application concepts are equal to or subsumed by (i.e. are more specific than) the query concept. All concepts for which this is the case are considered to be a match for the query.
In this section, we show how ontology concepts can be used to enhance GI discovery in our motivating example (Section 2.4.1) and we introduce the architecture that supports ontology-based discovery (Section 2.4.2).

Ontology-based GI Discovery in the Running Example

In our example, a concept is defined for each of the feature types provided by a WFS (Figure 2.4). All concepts are derived from the domain concept Measurement and introduce name as an additional property. However, the concepts differ in the restrictions they place on the range of the water level's unitofmeasure (Meter for bafg_measurement, Centimeter for the other two concepts). Also, some provide further application-specific properties (e.g. chmi_rivername for chmi_measurement). Through their reference to the domain concept, all definitions also imply that the feature types provide at least one quantityresult and exactly one location and one timestamp.

To illustrate our approach, we show three query concepts representing possible queries defined by Susan. In all queries, Susan is interested in feature types that have WaterLevel as an observable and provide a location, a timestamp and a name (Figure 2.7). While in the first query Susan does not require anything else, in her second query she wants the unit of measure to be Centimeter. In her third query, in addition to the water level in centimeters, she wants the measurement to also contain the discharge in cubic meters.
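The matchmaking idea behind these three queries can be mimicked with a deliberately simplified stand-in. The Python sketch below is not DL subsumption reasoning and does not use RACER: it is an assumption-laden toy in which every feature type is reduced to a set of (observable, unit) pairs plus a set of provided properties, and a feature type matches a query if it provides everything the query asks for. The names Description, describe, matches and discover are hypothetical.

```python
from dataclasses import dataclass

ANY = None  # wildcard: the query accepts any unit of measure


@dataclass(frozen=True)
class Description:
    results: frozenset     # (observable, unit) pairs; unit may be ANY in a query
    properties: frozenset  # further provided (or required) properties


def describe(results, properties=("location", "timestamp", "name")):
    return Description(frozenset(results), frozenset(properties))


def matches(feature_type, query):
    """True if the feature type provides everything the query asks for."""
    def provided(required):
        observable, unit = required
        return any(o == observable and (unit is ANY or u == unit)
                   for o, u in feature_type.results)
    return (query.properties <= feature_type.properties
            and all(provided(r) for r in query.results))


# Toy descriptions of the three feature types from Figure 2.4.
FEATURE_TYPES = {
    "elwis_measurement": describe([("WaterLevel", "Centimeter")]),
    "bafg_measurement": describe([("WaterLevel", "Meter")]),
    "chmi_measurement": describe([("WaterLevel", "Centimeter"),
                                  ("Discharge", "CubicMeter")]),
}

# Toy versions of Susan's three queries.
QUERIES = {
    "Query_1": describe([("WaterLevel", ANY)]),
    "Query_2": describe([("WaterLevel", "Centimeter")]),
    "Query_3": describe([("WaterLevel", "Centimeter"), ("Discharge", "CubicMeter")],
                        properties=("timestamp", "name")),
}


def discover(query):
    """Return the names of all feature types that match the query, sorted."""
    return sorted(name for name, ft in FEATURE_TYPES.items() if matches(ft, query))
```

A real reasoner derives such matches from the concept definitions themselves. The toy only shows why a more demanding query yields a smaller result set.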

Query Concepts

Query_1:

    (define-concept Query_1
      (and (some quantityresult (all observable WaterLevel))
           (some location *top*)
           (some timestamp *top*)
           (some name *top*)))

Query_2:

    (define-concept Query_2
      (and (some quantityresult
             (and (all unitofmeasure Centimeter)
                  (all observable WaterLevel)))
           (some location *top*)
           (some timestamp *top*)
           (some name *top*)))

Query_3:

    (define-concept Query_3
      (and (some quantityresult
             (and (all unitofmeasure Centimeter)
                  (all observable WaterLevel)))
           (some quantityresult
             (and (all unitofmeasure CubicMeter)
                  (all observable Discharge)))
           (some timestamp *top*)
           (some name *top*)))

Figure 2.7: Examples for defining query concepts.

A classification of the application and query concepts in RACER (Figure 2.8) shows that all three feature type concepts are subsumed by the first query concept. Thus, in contrast to the keyword-based search, all services are correctly discovered. The second query concept only subsumes elwis_measurement and chmi_measurement, while the third query concept only subsumes chmi_measurement. Again, this is the desired result, as both feature types provide water level measurements in centimeters, but only chmi_measurement also provides discharge measurements. This illustrates that, compared to keyword queries, the ontology-based approach can increase recall and precision.

Figure 2.8: Subsumption hierarchy including three query concepts and the application concepts for the three feature types introduced earlier.

Architecture for GI Discovery

In order to support the advanced query capabilities described above, some new service interfaces and information items are needed in addition to the well-known components of current SDIs, such as catalogue services. First, we have to provide the application ontologies. For each application schema offered via a WFS there is one application ontology described with the shared vocabulary of the corresponding domain ontologies (as described in Section 2.3.3).
These ontologies provide the formal description of the application schema of a data source. The application ontology can be accessed through a reference in the ISO metadata documents for that data source (a detailed description of this reference can be found in Section 2.3.4). To provide access to the ontologies, two new components are defined.

The Ontology-based Reasoner is a central component responsible for storing, managing and reasoning on the ontologies in a domain. It provides two interfaces, one for accessing the shared vocabulary and application ontologies and one for reasoning about possible matches with simple and defined concept search (using RACER). A concept is considered a match if it is equal to or subsumed by the query concept.

The Cascading Catalogue Service extends the functionality of the conventional catalogue service by analysing and manipulating the filters of metadata queries that are enriched with DL query concepts. It provides access through the standard OGC Stateless Catalogue Service interface. If a catalogue query contains a DL query concept, RACER is accessed to compute a list of subconcepts from existing application ontologies. These concepts are added to the query, which can then be sent to any conventional standard catalogue service because the expanded query requires only a match based on string comparison.

Finally, a client supports the user in formulating catalogue queries that contain a DL query concept for the required feature type. In the current implementation, the user interface is based on so-called Query Templates, which contain allowed combinations of roles and concepts. On the one hand, these templates prevent inexperienced users from defining queries that do not make sense and reduce the number of terms that are presented to the user. On the other hand, they also seriously limit the expressiveness of possible user queries. One of the goals of the research presented in this paper was to increase the expressiveness of user queries while at the same time keeping the user interface easy to use. The information flow within this architecture is described in more detail in (Klien et al. 2004). A prototypical implementation is available online.

Ontology-Based GI Retrieval

After having set the stage in the previous sections, we now describe the ontology-based retrieval of geographic information in more detail. We first give an overview of the steps that are required for ontology-based GI retrieval. In Section 2.5.2, we introduce a simple syntax for an ontology-based GI query language.
This query language provides the basis for the implementation of a user interface that helps the requester to define a semantic query (Section 2.5.3). We then illustrate how DL query concepts are derived from this query. Finally, Section 2.6 describes how to build the query and filter for the selected WFS.

Ontology-Based GI Retrieval in the Running Example

Our goal is to make the overall task of GI retrieval more user-friendly by hiding the processes of discovering appropriate WFSs and of formulating the actual WFS query. Thus, Susan's only task should be to formulate a query for GI retrieval using terms from existing domain ontologies. This section gives an overview of how this is achieved by giving a detailed account of what happens behind the scenes. The individual steps are illustrated in two UML sequence diagrams in Figure 2.9 (for GI discovery) and Figure 2.10 (for GI retrieval).

Figure 2.9: UML sequence diagram illustrating steps 1-7 in the proposed approach to ontology-based retrieval of geographic information. Participants: Susan (Requester), Query Client, OBR, CS-W. Messages: 1. getvocabulary, 2. providequeryui, 3. submit query, 4. derivefeaturetypequeryconcepts, 5. getrelatedconcepts, 6. buildcsquery, 7. query.

Susan's entry point for the ontology-based retrieval of geographic information is a component which provides an intuitive query language and user interface (subsequently called query client). By intuitive we mean that Susan should be familiar with the elements of the language and the user interface. The research described in this paper is restricted to a simple query language (Section 2.5.2) and user interface (Section 2.5.3); the implementation of a more sophisticated user interface offering a well-designed working environment is left to future research.

The query client queries the ontology-based reasoner for terms from existing domain ontologies (step 1) and provides Susan with a user interface for formulating her query (step 2). After Susan has submitted her query (step 3), it is translated into one or several DL concepts (step 4). These concepts are used as query concepts for discovering WFSs that provide semantically appropriate feature types. This discovery process follows the approach to ontology-based discovery described above (steps 5 to 7).

Figure 2.10: UML sequence diagram illustrating steps 8-14 in the proposed approach to ontology-based retrieval of geographic information. Participants: Susan (Requester), Query Client, WFS. Messages: 8. displayresults, 9. choose WFS, 10. derivepropertynames, 11. buildwfsquery, 12. GetFeature, 13. derivepropertynames, 14. displayresults.

If the ontology-based search yields no or more than one result, Susan gets notified (step 8). In the former case, she modifies her query; in the latter case, she selects one of the discovered feature types (step 9). In order to access the chosen source through its WFS interface, a GetFeature query including spatial and/or non-spatial constraints has to be constructed using the property names of the feature type's application schema. These names can be obtained from the registration mapping of the selected feature type (steps 10 and 11). Finally, the WFS query is executed (step 12) and its results are translated into terms from domain ontologies. Again, this translation can be obtained from the registration mapping (steps 13 and 14).

Query Language

Requesters cannot be expected to formulate complex DL query concepts such as those presented in Section 2.3. Rather, they should be provided with an intuitive query language as well as a graphical user interface. In this section, we propose a simple syntax for semantic queries, which closely resembles an SQL select statement. This syntax provides the basis for the user interface, which is described in the following section.

With the proposed language, users should be able to select properties of specific feature types, possibly using one or several constraints. Properties correspond to roles in the shared vocabulary, while feature types correspond to concepts. The connection between roles and concepts is expressed using type variables and the "." connector. Constraints are expressed using a where clause and can be combined via conjunction (logical and) or disjunction (logical or). A constraint can either be a type restriction or a comparison with a value specified by the requester. Value constraints can only be defined for roles whose range is an XSD datatype or a GML geometry type. In addition to common string and number comparators (such as >= or startswith), spatial comparators such as withinboundingbox, intersects or within-distance-of can be used. The comparators supported in WFS queries are given in (OGC 2001), their semantics is defined in (ISO TC ). An example query statement for finding water level measurements for the Elbe river provided in centimeters for a given date ( ) and location is given in Figure 2.11.

    SELECT x.quantityresult.value
    FROM Measurement x
    WHERE (x.quantityresult.observable hastype WaterLevel)
      AND (x.quantityresult.unitofmeasure hastype Centimeter)
      AND (x.quantityresult.observedwaterbody.name = Elbe )
      AND (x.datestamp = )
      AND (x.location iswithinboundingbox (11,52,13,54))

Figure 2.11: Example for a semantic query statement. The keywords of the proposed syntax are shown in capitals, the comparators in italics.

User Interface

As a first step towards intuitive and user-friendly semantic GI retrieval, a desktop client application providing a simple graphical user interface (Figure 2.12) has been prototypically implemented. Transferring the implementation to a web platform will be part of our future work. The design of the user interface is directly derived from the model of the query language.
It provides users with a number of concepts corresponding to feature types that are defined in the domain ontology. After selecting one of these concepts, users can build a select statement by choosing one or several roles that are associated with this concept and that represent the feature type's properties (Figure 2.13).
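The query model that such an interface manipulates can be sketched as a small data structure. The following Python sketch is an assumption about how the model might look, not the prototype's actual classes: the names Constraint and SemanticQuery are hypothetical, and the sketch supports only a single logical connector for all constraints rather than arbitrary nesting of AND and OR.

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    path: str        # role path relative to the type variable
    comparator: str  # "hastype" for type constraints, otherwise a value comparator
    operand: str     # concept name or literal, already formatted for display


@dataclass
class SemanticQuery:
    concept: str                                  # concept named after FROM
    select: list                                  # role paths named after SELECT
    constraints: list = field(default_factory=list)
    connector: str = "AND"                        # connector between all constraints
    variable: str = "x"                           # type variable

    def render(self):
        """Render the query in the SELECT / FROM / WHERE syntax of the language."""
        v = self.variable
        text = "SELECT " + ", ".join(f"{v}.{p}" for p in self.select)
        text += f" FROM {self.concept} {v}"
        if self.constraints:
            parts = [f"({v}.{c.path} {c.comparator} {c.operand})"
                     for c in self.constraints]
            text += " WHERE " + f" {self.connector} ".join(parts)
        return text
```

Re-rendering the statement from this model after every user action is one way to keep the displayed query in sync with the selected roles and constraints.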

Figure 2.12: Simple user interface for defining semantic queries.

Now the constraints can be defined. Again, the user has to select one or several roles that are to be used for these constraints. For each of the selected roles, a constraint input panel and a constraint object are constructed. The types of the input panel and the constraint object depend on the range of the selected role: a value constraint (input panel) if the range is an XML datatype, a type constraint (input panel) otherwise. Also, the logical connector (AND or OR) can be chosen. In the lower part of the screen, the constructed query and the derived DL query concept are displayed. Both are dynamically adapted as the user chooses properties and defines constraints. The detailed procedure for deriving the query concept from the query is described in the following section.

Figure 2.13: Dialog for selecting roles that represent properties that are to be selected or used in constraints.

Deriving DL Query Concepts for GI Discovery

In order to discover WFSs that provide suitable feature types for answering the query, the query statement has to be translated into one or several DL query concepts (step 4 in the overall approach described in Section 2.5.1). These concepts are used for the ontology-based discovery as described above. Step-by-step instructions for this translation are given below using the example query shown in Figure 2.11. In the illustrating figure (Figure 2.14), the changes from the previous step are shown in bold.

- Select statement. The specification of the query concept is based on the concept specified after the FROM keyword, i.e. Measurement in our example (Figure 2.14).
- Where clause. The where clause is represented as a disjunction or conjunction of the DL equivalents of its constraints (depending on the logical connectors defined in the query statement). The where clause is then added as a conjunction to the query concept (Figure 2.14).
- Type constraints. Type constraints are expressed through (all-quantified) value restrictions in DL (Figure 2.14). If a property is represented by several roles (e.g. x.quantityresult.unitofmeasure), only the range of the last role (unitofmeasure) can be restricted. The other roles are represented using existential quantification. For example, the expression (some quantityresult (all unitofmeasure Centimeter)) specifies a concept that has at least one quantityresult whose unitofmeasure is restricted to Centimeter.
- Value constraints. We assume that the ranges of the roles provided in the shared vocabulary have already been defined. Therefore, value constraints are simply represented using existential quantification on the specified roles (Figure 2.14). All roles whose domain contains the concept that represents the selected feature type (timestamp and location in our example) do not have to be explicitly included in the query concept.
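The translation rules above can be sketched as a short Python function. This is a minimal sketch under stated assumptions, not the prototype's code: it handles conjunctions only, it groups restrictions that share a role prefix under one existential restriction and wraps them in an explicit (and ...), and the names derive_concept and implied_roles are hypothetical. Which roles count as implied by the domain concept is passed in by the caller.

```python
def _render(node):
    """Render a nested dict of restrictions as a list of DL expressions."""
    parts = []
    for (quantifier, role), child in node.items():
        if isinstance(child, dict):
            inner = _render(child)
            filler = inner[0] if len(inner) == 1 else "(and " + " ".join(inner) + ")"
        else:
            filler = child
        parts.append(f"({quantifier} {role} {filler})")
    return parts


def derive_concept(concept, constraints, implied_roles=()):
    """Derive a DL query concept from (role_path, kind, filler) constraints.

    kind is "type" (filler is a concept name) or "value" (filler is ignored).
    """
    tree = {}
    for path, kind, filler in constraints:
        roles = path.split(".")
        # Value constraints on roles implied by the domain concept are skipped.
        if kind == "value" and len(roles) == 1 and roles[0] in implied_roles:
            continue
        node = tree
        for role in roles[:-1]:  # all but the last role: existential quantification
            key = ("some", role)
            if not isinstance(node.get(key), dict):
                node[key] = {}
            node = node[key]
        if kind == "type":       # last role of a type constraint: value restriction
            node[("all", roles[-1])] = filler
        else:                    # value constraint: existential restriction on *top*
            node.setdefault(("some", roles[-1]), "*top*")
    parts = _render(tree)
    if not parts:
        return concept
    return "(and " + " ".join([concept] + parts) + ")"


EXAMPLE = [
    ("quantityresult.observable", "type", "WaterLevel"),
    ("quantityresult.unitofmeasure", "type", "Centimeter"),
    ("quantityresult.observedwaterbody.name", "value", None),
    ("datestamp", "value", None),
    ("location", "value", None),
]
# Assumption: datestamp is treated like timestamp, i.e. as implied by Measurement.
query_concept = derive_concept("Measurement", EXAMPLE,
                               implied_roles=("timestamp", "datestamp", "location"))
```

The literal values of the value constraints do not enter the query concept at all. They are only needed later, when the filter for the selected WFS is built.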
(1) User Query:

    SELECT x.quantityresult.value
    FROM Measurement x

DL Query Concept:

    (define-concept query Measurement)

(2) User Query:

    SELECT x.quantityresult.value
    FROM Measurement x
    WHERE (x.quantityresult.observable hastype WaterLevel)
      AND (x.quantityresult.unitofmeasure hastype Centimeter)

DL Query Concept:

    (define-concept query
      (and Measurement
        (some quantityresult
          (and (all observable WaterLevel)
               (all unitofmeasure Centimeter)))))

(3) User Query:

    SELECT x.quantityresult.value
    FROM Measurement x
    WHERE (x.quantityresult.observable hastype WaterLevel)
      AND (x.quantityresult.unitofmeasure hastype Centimeter)
      AND (x.quantityresult.observedwaterbody.name = Elbe )
      AND (x.datestamp = )
      AND (x.location iswithinboundingbox (11,52,13,54))

DL Query Concept:

    (define-concept query
      (and Measurement
        (some quantityresult
          (and (all observable WaterLevel)
               (all unitofmeasure Centimeter)
               (some observedwaterbody (some name *top*))))))

Figure 2.14: Mapping user queries to DL query concepts.

Deriving a WFS Query Filter for GI Retrieval

After having discovered a WFS providing an appropriate feature type, the actual WFS query filter has to be built (step 11 in the overall approach described in Section 2.5.1). For this, the property names used in the chosen WFS that are equivalent to the domain ontology terms used in the query statement have to be derived. Also, the structure of the WFS's feature type has to be known. All the required information can be accessed from the feature type's registration mapping. Again, we refer to the CHMI feature type for illustration (see the registration mapping in Figure 2.5). For this feature type, the example query shown in Figure 2.11 can be translated into the WFS GetFeature request (OGC 2002b) and filter expression (OGC 2001) shown in Figure 2.15.

    <GetFeature service="wfs" version="1.0.0" outputformat="gml2" ( ) >
      <Query typename="stavvody">
        <PropertyName>stav</PropertyName>
        <Filter>
          <And>
            <PropertyIsEqualTo>
              <PropertyName>StavVody/tok</PropertyName>
              <Literal>Elbe</Literal>
            </PropertyIsEqualTo>
            <PropertyIsEqualTo>
              <PropertyName>StavVody/datum</PropertyName>
              <Literal> </Literal>
            </PropertyIsEqualTo>
            <Within>
              <PropertyName>gml:position</PropertyName>
              <gml:box srsname=" >
                <gml:coordinates>11.0, ,54.0</gml:coordinates>
              </gml:box>
            </Within>
          </And>
        </Filter>
      </Query>
    </GetFeature>

Figure 2.15: A WFS query that requests the property stav from a feature type called StavVody. The filter expression constrains the query to features whose tok property equals Elbe, whose datum property equals the given date and whose position property is within the specified bounding box.
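Deriving the property names from the registration mapping and assembling the request can be sketched with Python's standard xml.etree.ElementTree module. This is a simplified sketch under stated assumptions: XML namespaces and the spatial Within constraint are left out, only equality constraints are supported, the attribute spelling (service="WFS", typeName) follows WFS 1.0.0 conventions as we understand them, and the names REGISTRATION, property_name and build_get_feature are hypothetical.

```python
import xml.etree.ElementTree as ET

# Contextual path -> structural path, read off the registration mapping in Figure 2.5.
REGISTRATION = {
    "chmi_measurement.chmi_qrwaterlevel.value": "/StavVody/stav/text()",
    "chmi_measurement.quantityresult.observedwaterbody.name": "/StavVody/tok/text()",
    "chmi_measurement.timestamp": "/StavVody/datum/text()",
}


def property_name(contextual_path):
    """Turn the registered structural path into a property name such as StavVody/tok."""
    steps = [s for s in REGISTRATION[contextual_path].split("/")
             if s and s != "text()"]
    return "/".join(steps)


def build_get_feature(type_name, selected, equals):
    """Build a GetFeature element that selects one property.

    `equals` is a list of (contextual_path, literal) equality constraints,
    which are joined by And if there is more than one.
    """
    request = ET.Element("GetFeature", service="WFS", version="1.0.0")
    query = ET.SubElement(request, "Query", typeName=type_name)
    # The selected property is named relative to the feature type element.
    ET.SubElement(query, "PropertyName").text = property_name(selected).split("/")[-1]
    flt = ET.SubElement(query, "Filter")
    parent = ET.SubElement(flt, "And") if len(equals) > 1 else flt
    for path, literal in equals:
        comparison = ET.SubElement(parent, "PropertyIsEqualTo")
        ET.SubElement(comparison, "PropertyName").text = property_name(path)
        ET.SubElement(comparison, "Literal").text = literal
    return request


request = build_get_feature(
    "StavVody",
    "chmi_measurement.chmi_qrwaterlevel.value",
    [("chmi_measurement.quantityresult.observedwaterbody.name", "Elbe")],
)
request_xml = ET.tostring(request, encoding="unicode")
```

Keeping the mapping as data means that the same function can serve another feature type by swapping the dictionary, which is the point of separating structural and semantic descriptions.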

Discussion and Related Work

The approach presented in this paper is related to previous work in the fields of geographic information science, information discovery and retrieval, data integration and artificial intelligence. The work is motivated by problems that occur when dealing with distributed and open infrastructures for geographic information (rather than monolithic GIS). In these environments, interoperability between different GIS is prevented by semantic heterogeneity, a problem first introduced in (Bishr 1998). It has been acknowledged that research is required that addresses this heterogeneity in order to enable interoperability (Sheth 1999, Sondheim et al. 1999, Egenhofer 2002). A first step towards overcoming semantic heterogeneity has been the proposal of Integrated Geographic Information Systems (IGIS). In (Hinton 1996), IGIS are defined as systems that integrate diverse GIS technologies or reflect a particular point of view of a community. The idea of IGIS is advanced in (Fonseca et al. 2002a, Fonseca et al. 2002b) by introducing ontologies as means for supporting representations of incomplete information, multiple representations of geographical space, and different levels of detail. In SDIs, where geographic information is usually highly distributed and heterogeneous, solving heterogeneity problems becomes a prerequisite. One focus of the research presented here is to transfer the ontology approach for dealing with semantic heterogeneity to the SDI domain and to demonstrate how it can be integrated into existing standards-based architectures.

Work in the field of information discovery and retrieval is manifold. There is widespread agreement among researchers in this field that declarative content descriptions and query capabilities are necessary (Mena et al. 1998, Czerwinski et al. 1999, Guarino et al. 1999, Heflin and Hendler 2000).
The vision of most research in this domain is that users should be able to express what they want, and the system will find the relevant sources and obtain the answer (Levy et al. 1996). As this might involve combining data from multiple sources, information discovery and retrieval is closely related to data integration, whose goal is to provide a uniform interface (through a global schema) to a multitude of data sources (each with a local schema). In data integration terminology (Levy 2000), our approach can be considered a Local As View approach. This means that the contents of a data source are described as a query over the mediated schema, which in our case is substituted by the ontology (see (Mädche et al. 2001, Guha et al. 2003) for other examples where ontologies are used in search and retrieval mechanisms). Usually, a query through the mediated schema in this approach requires complex query rewriting methods. However, this is not necessary in our case because we assume that all the required information is provided by one source and therefore a combination of sources is not necessary. Dropping this assumption in order to enable more powerful query answering could be a possible future extension to our approach.

The idea of primarily using roles for building ontologies and for identifying taxonomic relationships between query and application concepts is closely related to feature-based and geometric approaches to ontology integration (see (Goldstone and Son 2005) for an overview and (Raubal 2004, Rodríguez and Egenhofer 2004) for applications in the geospatial domain). As these approaches usually compute a numeric similarity value between two concepts, they can express gradual differences between them. Conversely, our approach considers only subsumption relationships, which allow no gradual differentiation. This is important for scenarios where the discovered data has to have certain properties (e.g. because these are required for further processing). If, in contrast, the properties are only used to describe the characteristics of a certain feature type, feature-based or geometric similarity measures could also be used in a retrieval scenario.

Context can be considered a key element for enabling semantic interoperability (Brézillon 1999). In the discovery and retrieval setting we are dealing with, providers and requesters can explicitly and formally describe their context (i.e. their application or query) in terms of the roles offered in the shared vocabulary. However, the role of context should be further elaborated in future research.

The issue of providing an ontology-based query interface that enables uniform access to heterogeneous data sources and supports the user in formulating a precise query has been addressed by the SEWASIE (Semantic Webs and Agents in Integrated Economies) project (Dongilli et al. 2004). They employ the same ontology and matchmaking approach for information retrieval. Moreover, the SEWASIE query interface enables an iterative refinement process of the query and employs natural language as query representation. While this certainly represents a user-friendly approach, it additionally requires that the ontology engineer provides verbalizations for each ontology term. In contrast, we propose an intuitive but still formal query language. And whereas the SEWASIE query interface is developed for the needs of the Semantic Web in general, we are focused on geospatial information infrastructures.

There are also a number of projects that have addressed problems caused by semantic heterogeneities in the geospatial domain. In the GiMoDig (Geospatial info-mobility service by real-time data-integration and generalization) project (Lehto et al.
2004) the problem of semantic heterogeneities in different local GML application schemas is addressed by transforming the data streams from local GML schemas to a global schema. In contrast to the GiMoDig approach, we assume only one source (see above) and do not use static mapping definitions between local and global schemas. In the BUSTER (Bremen University Semantic Translator for Enhanced Retrieval) project (Vögele et al. 2003), DL descriptions have been used to describe and query classifications (Visser and Stuckenschmidt 2002) and data content (Hübner et al. 2004, Vögele and Spittel 2004). However, these approaches use simple ontologies, and queries only have limited expressivity. A similar strategy for supporting the discovery and integration of datasets and services is used in the Science Environment for Ecological Knowledge (SEEK) project (Pennington et al. 2004). We have benefited from the work conducted in SEEK by adapting the method of registration mapping (Bowers and Ludäscher 2004) for our purposes. The key difference of our approach lies in combining both tasks, information discovery and retrieval, and in hiding the complexity from the user.

Last but not least, the question of scalability must be addressed if the presented approach is to work in SDIs containing a multitude of data sources. With an increasing number of sources, the computational complexity of subsumption reasoning, which we apply for matching queries to advertisements, will increase as well. While, so far, we have only tested the approach with a small number of sources, sufficient evidence is given in (Li and Horrocks 2003) that the approach scales up to larger applications. This requires, however, that the TBox classification (i.e. the computation of all subsumption relationships within the ontology) is done offline before the matchmaking process starts and that the classified TBox is then used to reason about requests.

2.7 Conclusions and Future Work

We have presented an approach for ontology-based retrieval of geographic information that can contribute to solving existing problems of semantic heterogeneity and hides most of the complexity of the required procedure from the requester. The approach has been implemented in several components that can be used as extensions to standard SDIs. Also, a query language and graphical user interface have been devised and prototypically implemented that allow a requester to intuitively formulate a query using a well-known domain vocabulary. There are several issues that require further research:

- Extension of the tested scenario. The tested scenario comprises requests and application schemas with a relatively simple structure. Also, the effects of the scale of a data source have not been taken into account. Future tests of the architecture will therefore include more complex request possibilities (like support for spatial comparators and nested queries) and data sources at different scales. Also, the effectiveness of the approach will be tested in a more generic setting with complex application schemas and examples from other domains.
- Support for spatial comparators. More spatial comparators should be supported in the query language and user interface, e.g. disjoint, touches, intersects, contains and withindistanceof. Also, a map window should be provided, in which the requester can specify bounding boxes and/or select features for spatial filters.
- Implementation of more complex queries. So far, we have assumed that only one dataset is being searched that provides all information needed by the requester.
In future work, in case a single dataset cannot be found, the discovery problem will be extended to selecting a combination of existing data sources that can fulfil the request. For example, the query shown in Figure 2.16 looks for the result of a measurement whose location is within a certain distance of the location of a city whose name is Dresden. Answering such a query requires two information sources and consequently also two DL query concepts: one for a measurement and one for a city containing name and location properties. Unfortunately, the WFS specification does not support nested queries. Thus, two WFS queries are necessary, and the outer query (for a measurement result) can only be formulated once the inner query (for a city location) has been executed and returned the desired result. Also, this kind of query only works if the inner query returns exactly one result.

    SELECT x.quantityresult.value
    FROM Measurement x
    WHERE (x.location is-within (50 km) of
           (SELECT y.location FROM City y WHERE (y.name = Dresden)) )

Figure 2.16: A nested query.
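
The two-step evaluation described above (inner query first, outer query only once the inner one has returned exactly one result) can be sketched as follows. This is a minimal illustration, not the prototype's implementation: the classes `City` and `Measurement`, the functions `inner_query`, `outer_query` and `nested_query`, the in-memory lists standing in for the two WFS sources, and the sample coordinates and values are all assumptions made for the example. The great-circle distance is one possible reading of the `is-within` comparator.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass(frozen=True)
class City:  # hypothetical stand-in for features of the "City" source
    name: str
    location: Point


@dataclass(frozen=True)
class Measurement:  # hypothetical stand-in for features of the "Measurement" source
    location: Point
    value: float


def distance_km(a: Point, b: Point) -> float:
    """Great-circle (haversine) distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def inner_query(cities: List[City], name: str) -> Point:
    """Inner query: the location of the city with the given name.

    Raises ValueError unless exactly one city matches, mirroring the
    restriction that the inner query must return exactly one result.
    """
    matches = [c.location for c in cities if c.name == name]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one city named {name!r}, got {len(matches)}")
    return matches[0]


def outer_query(measurements: List[Measurement], centre: Point, radius_km: float) -> List[float]:
    """Outer query: values of all measurements within radius_km of centre."""
    return [m.value for m in measurements
            if distance_km(m.location, centre) <= radius_km]


def nested_query(cities, measurements, name, radius_km):
    """Run the inner query first, then formulate the outer query from its result."""
    centre = inner_query(cities, name)
    return outer_query(measurements, centre, radius_km)


# Illustrative sample data (assumed values, not taken from the thesis).
CITIES = [City("Dresden", (51.05, 13.74)), City("Leipzig", (51.34, 12.37))]
MEASUREMENTS = [Measurement((50.96, 13.94), 2.31), Measurement((51.34, 12.37), 1.75)]

if __name__ == "__main__":
    print(nested_query(CITIES, MEASUREMENTS, "Dresden", 50.0))
```

Raising an error when the inner query is ambiguous keeps the limitation visible to the caller instead of silently picking one of several cities.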

Built-in conversion functions. The usability of the approach would be increased if it were able to perform simple conversions during both GI discovery and retrieval. For example, if the system knew a conversion between meters and centimeters, a query for measurements in centimeters would also return a feature type providing measurements in meters. Such simple computations could be performed to convert between units of measure (e.g. cm to m), to combine or split data values (e.g. date + time to datetime), or to convert between place names and coordinates for spatial filters (using Gazetteers and Geocoders (OGC 2002a)). The main research questions in this context are how conversion functions can be taken into account during discovery and retrieval and whether or how the representation of domain and application ontologies has to be adapted for this purpose. As the resolution and precision of data can vary widely, they have to be taken into account when considering which conversion functions are actually permissible for a given data set.

User-friendly generation of application ontologies. While our approach hides much of the complexity of the ontology-based GI retrieval from the requester, the data provider still has to create and register rather complex application ontologies. We are aware that this is one of the crucial bottlenecks for our approach to be accepted and used in future SDIs. Future work will therefore address how the process of creating formal descriptions of the geodata could be automated. First ideas on how this can be achieved using spatial analyses of geographic datasets are presented in (Klien and Lutz 2005).

Extension of the architecture. The presented architecture is component-based, i.e. it is extendable in various directions. So far, the Enhanced Cascading Catalogue Service and the Reasoner component are tightly coupled in the architecture.
However, the standardized interfaces allow the architecture to be extended with multiple and exchangeable components. It is also planned to extend the architecture with modules for spatial and temporal reasoning.

Acknowledgements

The work presented in this paper has been supported by the German Federal Ministry for Education and Research as part of the GEOTECHNOLOGIEN program (grant number 03F0369A). It can be referenced as publication no. GEOTECH-184. We are grateful to Udo Einspanier, Martin Raubal, Florian Probst, and Werner Kuhn for their input at various stages of this paper.

References

ANTONIOU, G. and VAN HARMELEN, F., 2003, Web Ontology Language: OWL. In Handbook on Ontologies, S. Staab and R. Studer (Eds.), pp Springer).
BAADER, F. and NUTT, W., 2003, Basic Description Logics. In The Description Logic Handbook. Theory, Implementation and Applications, F. Baader, D. Calvanese, D. McGuinness, D. Nardi and P. Patel-Schneider (Eds.), pp (Cambridge: Cambridge University Press).

BERNSTEIN, A. and KLEIN, M., 2002, Towards High-Precision Service Retrieval. In The Semantic Web - First International Semantic Web Conference (ISWC 2002), June 9-12, 2002, Sardinia, Italy, pp
BISHR, Y., 1998, Overcoming the Semantic and Other Barriers to GIS Interoperability. International Journal of Geographical Information Science, 12, pp
BOWERS, S. and LUDÄSCHER, B., 2004, An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In International Workshop on Data Integration in the Life Sciences (DILS'04), March 25-26, 2004, Leipzig, Germany.
BRÉZILLON, P., 1999, Context in Problem Solving: A Survey. The Knowledge Engineering Review, 14, pp
CZERWINSKI, S., ZHAO, B. Y. and HODES, T., 1999, An architecture for a secure service discovery service. In Fifth ACM/IEEE International Conference on Mobile Computing and Networking, Seattle, Washington, USA, pp
DONGILLI, P., FRANCONI, E. and TESSARIS, S., 2004, Semantics driven support for query formulation. In International Workshop on Description Logics, Whistler, BC, Canada.
EGENHOFER, M., 2002, Toward the Semantic Geospatial Web. In The 10th ACM International Symposium on Advances in Geographic Information Systems (ACM-GIS), McLean, VA.
FONSECA, F., EGENHOFER, M., DAVIS, C. and CÂMARA, G., 2002a, Semantic Granularity in Ontology-Driven Geographic Information Systems. Annals of Mathematics and Artificial Intelligence, 36, pp
FONSECA, F. T., EGENHOFER, M. J., AGOURIS, P. and CAMARA, G., 2002b, Using Ontologies for Integrated Geographic Information Systems. Transactions in GIS, 6, pp
GDI-NRW, 2002, Catalog Services für GeoDaten und GeoServices, Version 1.0 (Geodateninfrastruktur NRW).
GOLDSTONE, R. L. and SON, J., 2005, Similarity. In Cambridge Handbook of Thinking and Reasoning, K. Holyoak and R. Morrison (Eds.), pp (Cambridge: Cambridge University Press).
GRUBER, T. R., 1995, Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies, 43, pp
GUARINO, N., MASOLO, C. and VETERE, G., 1999, OntoSeek: Content-Based Access to the Web. IEEE Intelligent Systems, 14, pp
GUHA, R., MCCOOL, R. and MILLER, E., 2003, Semantic Search. In 12th International Conference on World Wide Web, (ACM Press), pp
HAARSLEV, V. and MÖLLER, R., 2004, RACER User's Guide and Reference Manual. Version Available online at: 19.pdf (accessed July 29, 2005).
HART, G., TEMPLE, S. and MIZEN, H., 2004, Tales of the Riverbank, First Thoughts in the Development of a Topographic Ontology. In 7th Conference on Geographic Information Science (AGILE 2004), Heraklion, Greece (Crete University Press), pp
HEFLIN, J. and HENDLER, J., 2000, Searching the Web with SHOE. In Papers from the AAAI Workshop, Menlo Park, CA (AAAI Press), pp

HINTON, J., 1996, GIS and Remote Sensing Integration for Environmental Applications. International Journal of Geographical Information Science, 10, pp
HÜBNER, S., SPITTEL, R., VISSER, U. and VÖGELE, T., 2004, Ontology-Based Search for Interactive Digital Maps. IEEE Intelligent Systems, 19, pp
ISO TC 211, 2002, ISO Spatial Schema (International Standardization Organization).
KLIEN, E., EINSPANIER, U., LUTZ, M. and HÜBNER, S., 2004, An Architecture for Ontology-Based Discovery and Retrieval of Geographic Information. In 7th Conference on Geographic Information Science (AGILE 2004), Heraklion, Greece (Crete University Press), pp
KLIEN, E. and LUTZ, M., 2005, The Role of Spatial Relations in Automating the Semantic Annotation of Geodata. In Conference on Spatial Information Theory (COSIT 2005), September 14-18, 2005, Ellicottville, NY, USA.
LEHTO, L., SARJAKOSKI, T., HVAS, A., HOLLANDER, P., RUOTSALAINEN, R. and ILLERT, A., 2004, A Prototype Cross-Border GML Data Service. In 7th Conference on Geographic Information Science (AGILE), Heraklion, Greece, pp
LEVY, A. Y., 2000, Logic-Based Techniques in Data Integration. In Logic Based Artificial Intelligence, J. Minker (Eds.), pp (Dordrecht, NL: Kluwer).
LEVY, A. Y., RAJARAMAN, A. and ORDILLE, J., 1996, Querying heterogeneous information sources using source descriptions. In 22nd VLDB Conference, Bombay, India, pp
LI, L. and HORROCKS, I., 2003, A Software Framework for Matchmaking Based on Semantic Web Technology. In The Twelfth International World Wide Web Conference, May 2003, Budapest, Hungary (ACM Press, New York, NY, USA), pp
MÄDCHE, A., STAAB, S., STOJANOVIC, N., STUDER, R. and SURE, Y., 2001, SEAL - A Framework for Developing SEmantic portals. In 18th British National Conference on Databases, Oxford, UK (Springer Verlag), pp
MCKEE, L., 2000, Who wants a GDI? In Geospatial Data Infrastructure - Concepts, cases, and good practice, R. Groot and J. McLaughlin (Eds.), pp (New York: Oxford University Press).
MENA, F., KASHYAP, V., ILLARRAMENDI, A. and SHETH, A., 1998, Domain Specific Ontologies for Semantic Information Brokering on the Global Information Infrastructure. In First International Conference on Formal Ontologies in Information Systems, Trento, Italy.
NEBERT, D., 2001, Developing Spatial Data Infrastructures: The SDI Cookbook, Version 1.1, pp. Global Spatial Data Infrastructure, Technical Committee).
OGC, 2001, Filter Encoding Implementation Specification, Version (OpenGIS Consortium).
OGC, 2002a, Gazetteer Service Profile of the Web Feature Service Implementation Specification (OGC Discussion Paper) (OpenGIS Consortium).
OGC, 2002b, Web Feature Service Implementation Specification, Version (Open GIS Consortium).
OGC, 2003, Geography Markup Language (GML) Implementation Specification, Version 3.0 (Open GIS Consortium).

PAOLUCCI, M., KAWAMURA, T., PAYNE, T. R. and SYCARA, K., 2002, Semantic Matching of Web Service Capabilities. In 1st International Semantic Web Conference (ISWC2002), Sardinia, Italy (Springer), pp
PENNINGTON, D., MICHENER, W. K., BERKLEY, C., HIGGINS, D., JONES, M. B., SCHILDHAUER, M., BOWERS, S., LUDÄSCHER, B. and RAJASEKAR, A., 2004, Building SEEK: The Science Environment for Ecological Knowledge (SEEK): A Distributed, Ontology-Driven Environment for Ecological Modeling and Analysis (Abstract). In The Third Conference of Geographic Information Science (GIScience 2004), October 20-23, 2004, Adelphi, MD, USA (Regents of the University of California).
RAUBAL, M., 2004, Formalizing Conceptual Spaces. In Formal Ontology in Information Systems, Proceedings of the Third International Conference (FOIS 2004). Frontiers in Artificial Intelligence and Applications 114, A. Varzi and V. L. (Eds.), pp (Amsterdam, NL: IOS Press).
RICHARDSON, R. and SMEATON, A. F., 1995, Using WordNet in a Knowledge-based Approach to Information Retrieval (Technical Report CA-0395) (Dublin, Ireland: Dublin City University).
RODRÍGUEZ, A. and EGENHOFER, M., 2004, Comparing geospatial entity classes: an asymmetric and context-dependent similarity measure. International Journal of Geographical Information Science, 18, pp
SATTLER, U., CALVANESE, D. and MOLITOR, R., 2003, Relationships with other Formalisms. In The Description Logic Handbook. Theory, Implementation and Applications, F. Baader, D. Calvanese, D. McGuinness, D. Nardi and P. Patel-Schneider (Eds.), pp (Cambridge: Cambridge University Press).
SHETH, A. P., 1999, Changing Focus on Interoperability in Information Systems: From System, Syntax, Structure to Semantics. In Interoperating Geographic Information Systems, M. F. Goodchild, M. Egenhofer, R. Fegeas and C. A. Kottman (Eds.), pp (Dordrecht, NL: Kluwer).
SONDHEIM, M., GARDELS, K. and BUEHLER, K. (Eds.), 1999, GIS Interoperability, pp. (New York: John Wiley & Sons).
STUCKENSCHMIDT, H., VAN HARMELEN, F., DE WAARD, A., SCERRI, T., BHOGAL, R., VAN BRUEL, J., CROWLESMITH, I., FLUIT, C., KAMPMAN, A., BROEKSTRA, J. and VAN MULLIGEN, E., 2004, Exploring Large Document Repositories with RDF Technology: The DOPE Project. IEEE Intelligent Systems, 19, pp
STUDER, R., BENJAMINS, V. R. and FENSEL, D., 1998, Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering, 25, pp
USCHOLD, M., 1998, Knowledge level modelling: concepts and terminology. The Knowledge Engineering Review, 13, pp
VISSER, U. and STUCKENSCHMIDT, H., 2002, Interoperability in GIS - Enabling Technologies. In 5th AGILE Conference on Geographic Information Science, Palma de Mallorca, Spain, pp
VÖGELE, T., HÜBNER, S. and SCHUSTER, G., 2003, BUSTER - An Information Broker for the Semantic Web. KI - Künstliche Intelligenz, 03, pp

VÖGELE, T. and SPITTEL, R., 2004, Enhancing Spatial Data Infrastructures with Semantic Web Technologies. In 7th Conference on Geographic Information Science (AGILE 2004), Heraklion, Greece.
W3C, 1999, XML Path Language (XPath), Version 1.0 (World Wide Web Consortium).
W3C, 2001, XML Schema Part 2: Datatypes. W3C Recommendation (World Wide Web Consortium).
WACHE, H., VÖGELE, T., VISSER, U., STUCKENSCHMIDT, H., SCHUSTER, G., NEUMANN, H. and HÜBNER, S., 2001, Ontology-Based Integration of Information - A Survey of Existing Approaches. In IJCAI-01 Workshop: Ontologies and Information Sharing, Seattle, WA, pp

Chapter 3

Requirements for Geospatial Ontology Engineering

Klien, E. and Probst, F. (2005). Requirements for Geospatial Ontology Engineering. In: Toppen, F. and Painho, M. (eds.). Proceedings of the 8th Conference on Geographic Information Science (AGILE 2005), Estoril, Portugal, pp

Abstract. Ontologies have been acknowledged to be the core methodology for capturing and sharing semantics of geospatial information (GI). Ontologies, specifically domain-specific ontologies, are at the heart of most semantic approaches to interoperability. In this paper, we want to make a strong case for the importance of domain ontologies in the context of geospatial web service environments. We present an ontology application example and derive from it a requirement specification for geospatial ontologies and the ontology architecture they are embedded in. We claim that the lack of a supportive environment for ontology engineering and maintenance slows the efficient use of ontologies in the GI community. Taking into account the requirements, we identify a research action line which will help to establish such an environment.

3.1 Introduction

Geospatial information is the key to effective planning and decision-making in a variety of application domains. It also plays an important role as integrative factor across applications. Ontologies have been acknowledged to be the core methodology for capturing and sharing semantics of geospatial information. Ontologies, specifically domain-specific ontologies, are at the heart of most semantic approaches to interoperability. Domain ontologies help to manage semantics of terms used in application schemas and they may enable semantic matchmaking. This is crucial for realising semantic interoperability between different information communities (IC). For the description of geospatial web services we need geospatial domain ontologies as common ground to which members of different communities can commit. In this paper we want to make a strong case for the importance of domain ontologies in the context of geospatial web service environments.

Work on geospatial ontologies is conducted in several research groups and projects (see footnote 1). The workshop on Geo-Ontologies 2002 organized by Ordnance Survey (Harding, 2002), the workshop on Action-Oriented Approaches in Geographic Information Science in Maine (ACTOR, 2002), and the workshop on Fundamental Issues in Spatial and Geographic Ontologies held at the COSIT 2003 (COSIT, 2003) showed the variety of approaches and perspectives on ontologies for geographic information in the GI science community.
Current research is focused on modelling geospatial ontologies and adequate representation of space and time (Arpinar et al., 2004; Frank, 2003; Grenon & Smith, 2004; Tomai & Kavouras, 2004), theories of vagueness, uncertainty and granularity (Bennett & Cristani, 2004), ontologies for discovery and retrieval of GI (Hiramatsu & Reitsma, 2004; Klien, Lutz, Einspanier, & Hübner, 2004; Lemmens & Vries, 2004; Lutz & Klien, 2005), ontologies for mediation and transformation (Bowers & Ludäscher, 2004; Fonseca, Egenhofer, Agouris, & Camara, 2002), and ontology grounding (Kuhn, 2003, 2005).

The remainder of the paper is structured as follows: we first introduce a GI web service discovery example to illustrate how ontologies are used for inferring the compatibility of offers and requests. Based on this reasoning task, we then specify the requirements a geospatial ontology should meet. Finally, we formulate an agenda listing problems and research questions which have to be tackled in order to fulfil the specified requirements.

Footnote 1: MUSIL, OntoGeo, OntoSpace, SEEK, SWEET.

3.2 Ontology Application Example

We apply ontologies in order to realise ontology-based discovery in geospatial web service environments. The matchmaking, which underlies the ontology-based discovery, is a reasoning process with the goal of deciding which of the available information sources match the request. Reasoning is the fundamental procedure enabling matchmaking (Sycara, Klusch, Widoff, & Lu, 1999). The main task of the matchmaking process is to resolve semantic heterogeneities between the request and the offer (Klien et al., 2004). This reasoning perspective emphasizes the need for approaches that go beyond the mere construction of ontologies and involve their use in discovering, evaluating, and combining geospatial information (Kuhn, 2005). Semantic matchmaking mechanisms will (a) lead to enhanced usability of heterogeneous and distributed GI sources and (b) facilitate the task of automatic service composition.

In order to illustrate the matchmaking process which underlies the ontology-based discovery, we introduce an example in Figure 3.1. The domain ontology contains the basic terms of a certain domain (in our case hydrology). It is assumed that all actors within a domain share a common understanding of the concepts provided on the domain level (Wache et al., 2001). These concepts are combined and extended in the application ontologies in order to describe the information sources. In our example the information source is a Web Feature Service (WFS) that provides features representing water level measurements.

The user in our example searches for water level measurements. He formulates his request for a water level at time x at control point y on the basis of the concepts of a geospatial ontology. Note that the application ontologies describing the available information sources are also described using the concepts provided in the same geospatial domain ontology. As a consequence, the user's query becomes machine-comparable to all application ontology concepts in this catalogue. By subsumption reasoning, a terminological reasoner can automatically infer whether application concepts are equivalent to, or sub-concepts of, the query concept. As shown in (Klien et al., 2004), the integration of the matchmaking capability into Spatial Data Infrastructures overcomes some of the semantic heterogeneity problems in service discovery and thus leads to increased recall and precision.
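
The matchmaking idea described above can be illustrated with a small structural subsumption check. This is only a sketch under strong assumptions: it handles conjunctions of named concepts and simple role restrictions, whereas a real terminological reasoner for a DL-based language does far more. The taxonomy, the role names and the two catalogue entries (`gauge_wfs`, `discharge_wfs`) are hypothetical and chosen to echo the hydrology example.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set

# Hypothetical extract of a hydrology domain taxonomy: child -> parent.
TAXONOMY: Dict[str, str] = {
    "WaterLevel": "HydrologicalQuantity",
    "Discharge": "HydrologicalQuantity",
    "River": "WaterBody",
    "Lake": "WaterBody",
}


def ancestors(name: str) -> Set[str]:
    """Return the concept itself plus all of its taxonomic ancestors."""
    seen = {name}
    while name in TAXONOMY:
        name = TAXONOMY[name]
        seen.add(name)
    return seen


@dataclass
class Concept:
    """A conjunction of named domain concepts and role restrictions."""
    atoms: FrozenSet[str]
    roles: Dict[str, "Concept"] = field(default_factory=dict)


def subsumes(general: Concept, specific: Concept) -> bool:
    """True if every constraint stated by `general` is also met by `specific`."""
    for atom in general.atoms:
        if not any(atom in ancestors(a) for a in specific.atoms):
            return False
    for role, filler in general.roles.items():
        if role not in specific.roles or not subsumes(filler, specific.roles[role]):
            return False
    return True


def discover(query: Concept, catalogue: Dict[str, Concept]) -> List[str]:
    """Names of all advertised application concepts subsumed by the query."""
    return sorted(name for name, c in catalogue.items() if subsumes(query, c))


# Hypothetical application concepts registered by two service providers.
CATALOGUE = {
    "gauge_wfs": Concept(
        frozenset({"Measurement"}),
        {"observable": Concept(frozenset({"WaterLevel"})),
         "unitofmeasure": Concept(frozenset({"Centimeter"}))},
    ),
    "discharge_wfs": Concept(
        frozenset({"Measurement"}),
        {"observable": Concept(frozenset({"Discharge"}))},
    ),
}

# The requester asks for measurements that observe a water level.
WATER_LEVEL_QUERY = Concept(
    frozenset({"Measurement"}),
    {"observable": Concept(frozenset({"WaterLevel"}))},
)

if __name__ == "__main__":
    print(discover(WATER_LEVEL_QUERY, CATALOGUE))
```

Because both the query and the advertisements are built from the same domain vocabulary, the comparison needs no source-specific mapping; the extra `unitofmeasure` restriction on the advertisement merely makes it more specific than the query.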
Figure 3.1: Example to illustrate semantic matchmaking for ontology-based service discovery.

3.3 Requirement Specification for Geospatial Ontologies

What needs to be semantically defined in order to support the matchmaking approach introduced in our example? The decisions on what and how things are represented in an ontology are design decisions (Gruber, 1993). In the following, we identify the core requirements geospatial ontologies should meet in order to be employed successfully.

3.3.1 Separation of Real World Phenomena and Data Representation

According to the OGC Reference Model, a geographic feature is the starting point for modelling geographic information. It defines a feature as an abstraction of a real world phenomenon and a geographic feature as a feature associated with a location relative to the Earth (OGC, 2003). Analogously, in geospatial ontologies we model conceptualisations of real world phenomena that can be located relative to the Earth. We use the term geospatial concept to refer to these conceptualizations. It is important to note that data representation features (like point, line, and polygon) that are needed to abstract the real world phenomena are not part of geospatial ontologies, since they deal with the implementation structure of data and not with the semantics of a term referring to a real world phenomenon (Figure 3.2).

Figure 3.2: The distinction between three types of concepts leads to geospatial domain ontologies which are not biased by implementation needs.

For example, a town is often represented as a point feature in geospatial applications. But in the first place, the real world town has no ontological relation to the representational structure of a point. The domain of geospatial concepts should thus be strictly separated from the domain of data representations. If towns are modelled in an application by representing them as points, then this relation between town and its geometrical representation will be part of the application ontology. This view is also reflected in Figure 3.3, where the domain concepts and representation concepts are distinguished by their colourings.
The requirement of keeping geospatial ontologies independent from the implementation view is also a strong argument for introducing a layered ontology architecture as shown in Figure 3.4.

3.3.2 Geospatial Sub-Domains

In the definition above, a distinction is made between concepts for real world geospatial phenomena and concepts for representing them. Defining the scope of the latter ontologies is relatively simple as they are based on existing models for implementing geographic information, e.g. the specifications of the Open Geospatial Consortium (Lemmens & Vries, 2004; Probst et al., 2004).

In contrast, defining the extent of a geospatial ontology is much more difficult since ontologies on the domain level claim to comprise the basic concepts of a common conceptualisation. Great care must be taken to define the concepts and relations on an appropriate level of expressiveness. The terms have to be general enough to allow the annotation of all information sources, but specific enough to make meaningful definitions possible (Schuster & Stuckenschmidt, 2001). In consequence, geospatial ontologies need to be defined within a certain context and for a well-known user community, i.e. we have to come up with adequate and manageable subsets of the geospatial domain.

Moreover, to serve as source for building application ontologies, the domain ontology needs to meet the requirement of high stability. That is, after an iterative development phase the ontologies should reach a status comparable to a standard. Frequent changes in the domain ontologies would discourage service providers from referencing them in their application ontologies.

3.3.3 Internal Ontology Structure

The structure of efficiently applicable geospatial ontologies has to meet the requirements of the semantic matchmaking approach in the example. Taxonomic reasoning is useful but not sufficient. Equally or more important are non-taxonomic relationships, e.g. that a quantity has a unit of measure. Consequently, we need ontologies that describe not only simple taxonomic relationships but also provide suitable axioms to express other relationships between concepts and to constrain their intended interpretation (Guarino, 1998).
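
To make the notion of a non-taxonomic relationship with a constraint concrete, the following sketch records a few role definitions with cardinalities and checks a description against them. It is an illustration only: the `Role` class, the `ROLES` table and the `violations` function are hypothetical names, and the cardinalities are one plausible reading of the measurement example (a quantity has exactly one unit of measure), not a definitive encoding of the domain ontology.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Role:
    """A non-taxonomic relationship from a concept to a filler concept."""
    name: str
    filler: str
    min_count: int
    max_count: Optional[int]  # None means "no upper bound"


# Assumed role definitions, keyed by the concept they belong to.
ROLES: Dict[str, List[Role]] = {
    "Measurement": [
        Role("location", "gml_point", 1, 1),
        Role("quantityresult", "Quantity", 1, None),
    ],
    "Quantity": [
        Role("unitofmeasure", "Unit", 1, 1),
    ],
}


def violations(concept: str, description: Dict[str, List[str]]) -> List[str]:
    """Names of roles whose cardinality constraint the description breaks."""
    broken = []
    for role in ROLES.get(concept, []):
        count = len(description.get(role.name, []))
        too_few = count < role.min_count
        too_many = role.max_count is not None and count > role.max_count
        if too_few or too_many:
            broken.append(role.name)
    return broken


if __name__ == "__main__":
    # A quantity described without any unit of measure breaks its constraint.
    print(violations("Quantity", {}))
```

A pure taxonomy could only say that a centimetre is a kind of unit; the role definition additionally states how quantities and units relate, which is the kind of axiom that constrains the intended interpretation.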
Figure 3.3: Schematic representation of extracts from the domain ontologies of Hydrology and Measurement (Lutz & Klien, 2005). Legend: concept; GML geometry type; XML datatype; taxonomic relationship; non-taxonomic relationship; cardinality constraint.

Non-taxonomic relationships play a central part in ontology engineering and should be used wherever possible for defining concepts (Hart, Temple, & Mizen, 2004; Lutz & Klien, 2005;

Tomai & Kavouras, 2004). This strategy leads to domain ontologies which contain not only taxonomic but also non-taxonomic relationships. Figure 3.3 depicts extracts from domain ontologies for Measurements and Hydrology. In these ontologies, taxonomic as well as non-taxonomic relations are defined. Thus, a concept does not have to be given a fixed position in a static hierarchy. Rather, its position in the hierarchy can be dynamically inferred based on existing concept and role definitions using subsumption reasoning. This is fundamental for enabling the ontology-based search for unknown information sources. Some guidelines for the formalisation of domain ontologies are proposed in (Lutz & Klien, 2005).

3.3.4 Representation Language

The selection of the ontology representation language should be based on the inference mechanisms needed by the application that uses the ontology. For achieving semantic interoperability in web service environments, crucial requirements regarding the representation language are the availability of a reasoning engine, the ability to scale up with the requirements of web applications, and the expressiveness to meet the ontology engineering criteria. Currently, only with Description Logic (DL)-based languages can the inference engine (reasoner) infer concept taxonomies at run time (Gómez-Pérez, Férnandez-López, & Corcho, 2003), which is needed for semantic matchmaking. These requirements are partly met by the Web Ontology Language (OWL) in its DL or Lite version (W3C, 2004b). OWL-S is a semantic markup language for Web Services and defines an OWL ontology of services (W3C, 2004a). It enables users and software agents to automate the process of discovering, invoking, composing, and monitoring Web resources that offer particular services and have particular properties.
More comprehensive service modelling efforts like the Web Service Modelling Language (de Bruijn, 2005) are under way.

3.3.5 Usability

Users on the application level are usually not involved in the development process of ontologies on the domain level. They have to face the task of exploring and understanding the ontology in order to decide whether they can commit to it. The application of ontologies will only become widely accepted if methods and tools are provided that support the creation and usage of ontologies. Such support is given by possibilities to visualize, browse and query the internal structure of ontologies and by support for implementing a multi-level ontology structure. Whether these possibilities exist depends on the representation language. Currently, the open source tool Protégé has a rapidly growing user community. It offers a number of functionalities like visual ontology navigation, consistency checking and importing/exporting different representation languages.

3.3.6 Knowledge Sources

In order to achieve the highest acceptance possible in a user community, it is crucial to base the ontology development on agreed-upon knowledge sources.

Standards are sources for the concepts used to describe the models of representing and implementing geospatial information. This has been shown for ISO/OGC standards in (Lemmens & Vries, 2004; Probst et al., 2004). Geospatial ontologies should be built on agreed-upon terminologies and domain expert knowledge whenever possible. Most natural sciences have well-defined terminologies (e.g. geology, hydrology, and meteorology) but not in formalised and machine-interpretable formats. That means the expenses for creating geospatial ontologies are quite high, as they have to be built from scratch. This is surely one of the reasons why only a few geospatial domain ontologies exist in this area.

3.3.7 Ontology Architecture

The approach of a three-layered ontology architecture (Figure 3.4) provides a solid foundation for ontology engineering which single- or two-layered architectures lack. The heart of this architecture is the geospatial ontology on the domain level. As indicated before, it is crucial to develop ontologies on the domain level with the right granularity and with a high level of stability. The domain level contains basic terms of a domain which are combined and extended in the application ontologies in order to describe more complex semantics. With respect to our hydrology example, the concepts water level, water body and discharge are formalized on this level. It is stated that every water body has a water level and a discharge and that these qualities can be observed and measured. This general description provides an entry point for the semantic search. How measuring and representing is done for a specific water level measurement service is then formalized on the application level. Once the domain level is settled, application ontologies can be added or removed without the need for modifications on the domain level, which makes the application level highly flexible. Their commitment to the same domain level makes the application ontologies comparable.
Also, ad hoc concepts (like query concepts) that are built on the basis of the domain ontologies become comparable to all application concepts. The task of constructing an application ontology lies in the responsibility of the provider of the information source, whereas the construction of (geospatial) domain ontologies is a joint effort of domain experts. In our hydrology example, the fact that the water level is measured in centimetres and not in feet is stated on the application level. Additionally, all other peculiarities of this specific water level measurement service are described on the application level. This could include legal information on how to use the provided information and data representation issues. The concepts used to describe non-geospatial aspects are taken from other types of domain ontologies such as measurement ontologies or data representation ontologies (Figure 3.4).

The third layer of the architecture represents the ontological backbone. So far, we have talked little about the philosophical aspects of geospatial concepts. The introduction of an upper level (e.g. DOLCE (Gangemi, Guarino, Masolo, Oltramari, & Schneider, 2002)) could help in achieving not only logical consistency (which is provided by the reasoner) but also ontological consistency. Putting the domain ontologies on the foundation of upper level ontologies could enhance the quality of the domain and application ontologies. This can be shown with the concept measurement (Figure 3.3). On the domain level, there are various possibilities for the semantics of measurement. A domain ontology can subsume measurement under the concept magnitude and this

in turn under the DOLCE upper-level concept quality, indicating that the measurement has an attributive character with respect to the thing being measured. Alternatively, it could be modelled as the process that has a quantity as its result. In this case, it would be subsumed under the DOLCE upper-level concept process. Both conceptualizations might account for the semantics of measurement. However, before a service provider uses the concept measurement in an application ontology, he would have to make sure which conceptualization is behind it. Otherwise the problem of implicit semantics would just have been shifted from the application level to the domain level while only partly solving it. This is currently done in the approach of shared vocabularies.

Figure 3.4: The three-layered ontology architecture.

3.4 Discussion and Agenda

It has been shown that the application of ontology-based matchmaking technology may enhance GI service discovery with respect to precision and recall (Bernstein & Klein, 2002; Klien et al., 2004). Also, research is being done on how to apply similar techniques for GI retrieval (Lutz & Klien, 2005) and transformation (Bowers & Ludäscher, 2004). Gómez Pérez (2001) stated that the number of ontologies developed is not large and that their practical use in final and real applications is small. This is still true, at least for practical use in applications in the geospatial domain. We believe that part of the problem lies in the lack of a supporting environment for ontology engineering and maintenance. Taking into account the requirement specification for geospatial ontologies given above, we propose to concentrate research on the following points. Regarding the content of the domain ontologies, the separation of concepts for data representation from geospatial concepts is a crucial requirement for consistent, implementation-independent ontologies. Therefore we need to:

- Investigate to which extent data representation and real-world geospatial concepts can be separated and maintained in different domain ontologies (e.g. a point is not a geospatial entity).

Regarding the internal structure of the domain ontologies, we need to:

- Investigate domain ontologies centred on non-taxonomic relations and their potential in contrast to the widespread concept-centred ontologies. Which requirements regarding the expressivity of the representation language need to be considered, and do they collide with the requirements regarding reasoning and inference?

Regarding the challenge of revealing the ontological structure of the domain ontology concepts in a philosophical sense, alignment of the domain ontologies with the upper-level ontology is needed. This involves:

- Collaboration between the engineers of formal ontology (referring to a philosophical research field in the sense defined in (Guarino, 1998)) and engineers of geospatial domain ontologies.
- Investigations on the type of upper-level ontology. Which existing upper-level ontology is most suitable for providing ontological consistency to geospatial domain concepts? This investigation is part of the SeReS (Semantic Reference Systems) project. In this project the potential of theories of cognitive semantics to serve as upper level is examined.

Currently the domain ontology engineers can be considered the only ones who can really commit to their ontologies. Users have a hard time understanding the formal statements and fully grasping their meaning. Research on methods and tools is needed in order to:

- Support application ontology engineering, e.g. by automating the process of creating application ontologies.
- Support query formulation in an intuitive way, e.g. by hiding the logical statements from the user.
- Provide means for visualising, browsing and exploring domain ontologies in an intuitive way.
- Provide adapted evaluation methods as proposed by (Gómez Pérez, 2001) to support users in deciding on the quality of available domain ontologies as well as in evaluating the quality of their own application ontologies.

The approach of using ontologies for matchmaking during discovery, retrieval and evaluation of GI is essential for achieving semantic interoperability in web service infrastructures. The goal is to make this approach more widely accepted in the GI community. This goal would certainly be supported by specifying in more detail a supportive ontology environment that accounts for changing semantics by providing flexibility and at the same time serves as a semantic reference system by providing stability for the semantic annotations of geospatial applications.

Acknowledgements

Many thanks to our colleagues from MUSIL for valuable input to this work. The work presented in this paper has been supported by the German Federal Ministry for Education and Research as part of the GEOTECHNOLOGIEN program (grant number 03F0369A) and by the German Research Foundation (DFG) through the Semantic Reference Systems Project (grant KU 1368/4-1).

Bibliography

ACTOR. (2002). Workshop on Action-Oriented Approaches in Geographic Information Science.
Arpinar, B., Sheth, A., Ramakrishnan, C., Usery, L., Azami, M., & Kwan, M.-P. (2004). Geospatial Ontology Development and Semantic Analytics. In J. P. Wilson & A. S. Fotheringham (Eds.), Handbook of Geographic Information Science. Blackwell Publishing.
Bennett, B., & Cristani, M. (Eds.). (2004). Spatial Cognition and Computation: special issue on spatial vagueness, uncertainty and granularity. Spatial Cognition and Computation, Springer.
Bernstein, A., & Klein, M. (2002, June 9-12). Towards High-Precision Service Retrieval. Paper presented at The Semantic Web - First International Semantic Web Conference (ISWC 2002), Sardinia, Italy.
Bowers, S., & Ludäscher, B. (2004, March 25-26). An Ontology-Driven Framework for Data Transformation in Scientific Workflows. Paper presented at the International Workshop on Data Integration in the Life Sciences (DILS'04), Leipzig, Germany.
COSIT. (2003). Workshop on Fundamental Issues in Spatial and Geographic Ontologies.
de Bruijn, J. (Ed.). (2005). The Web Service Modeling Language WSML.
Fonseca, F. T., Egenhofer, M. J., Agouris, P., & Camara, G. (2002). Using Ontologies for Integrated Geographic Information Systems. Transactions in GIS, 6(3).
Frank, A. (2003). Ontology for spatio-temporal Databases. In M. Koubarakis et al. (Eds.), Spatiotemporal Databases: The Chorochronos Approach (Vol. 2520, pp. 9-77). Berlin: Springer.
Gangemi, A., Guarino, N., Masolo, C., Oltramari, A., & Schneider, L. (2002). Sweetening Ontologies with DOLCE. Paper presented at the International Conference on Knowledge Engineering and Knowledge Management. AAAI, Madrid, Spain.
Gómez Pérez, A. (2001). Evaluation of ontologies. International Journal of Intelligent Systems, 16(3).
Gómez-Pérez, A., Fernández-López, M., & Corcho, O. (2003). Ontological Engineering.
Grenon, P., & Smith, B. (2004). SNAP and SPAN: Towards Dynamic Spatial Ontology. Spatial Cognition and Computation, 4(1).

Gruber, T. (1993). A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2).
Guarino, N. (1998). Formal Ontology and Information Systems. In N. Guarino (Ed.), Formal Ontology in Information Systems (pp. 3-15). Trento, Italy: IOS Press.
Harding, J. (Ed.). (2002). Geo-ontology Concepts and Issues. Ilkley, UK: Ordnance Survey.
Hart, G., Temple, S., & Mizen, H. (2004). Tales of the Riverbank, First Thoughts in the Development of a Topographic Ontology. Paper presented at the 7th Conference on Geographic Information Science (AGILE 2004), Heraklion, Greece.
Hiramatsu, K., & Reitsma, F. (2004). GeoReferencing the Semantic Web: ontology based markup of geographically referenced information. Paper presented at the Joint EuroSDR/EuroGeographics workshop on Ontologies and Schema Translation Services, Paris, France.
Klien, E., Lutz, M., Einspanier, U., & Hübner, S. (2004). An Architecture for Ontology-Based Discovery and Retrieval of Geographic Information. Paper presented at the 7th Conference on Geographic Information Science (AGILE 2004), Heraklion, Greece.
Kuhn, W. (2003). Semantic Reference Systems. International Journal of Geographical Information Science, 17(5).
Kuhn, W. (2005). Geospatial Semantics: Why, of What, and How? Journal of Data Semantics, accepted for publication.
Lemmens, R., & Vries, M. (2004). Semantic Description of Location Based Services using an Extensible Location Ontology. Paper presented at the Muenster GI-Days, Muenster, Germany.
Lutz, M., & Klien, E. (2005). Ontology-Based Retrieval of Geographic Information, under review.
OGC. (2003). Reference Model (No. OpenGIS Discussion Paper OGC): OpenGIS Consortium.
Probst, F., Gibotti, F., Pazos, A., Esbri, M. A., Benigno, M., Gutiérrez, M., et al. (2004). Connecting ISO and OGC Models to the Semantic Web. Paper presented at the Third International Conference on Geographic Information Science, Adelphi, MD, USA.
Schuster, G., & Stuckenschmidt, H. (2001). Building shared ontologies for terminology integration. Paper presented at the KI-01 Workshop on Ontologies, Vienna, Austria.
Sycara, K., Klusch, M., Widoff, S., & Lu, J. (1999). Dynamic Service Matchmaking Among Agents in Open Information Environments. SIGMOD Record, 28(1).
Tomai, E., & Kavouras, M. (2004). From "Onto-GeoNoesis" to "Onto-Genesis": The Design of Geographic Ontologies. Geoinformatica, 8(3).
W3C. (2004a). OWL-S: Semantic Markup for Web Services (W3C Member Submission 22 November 2004).
W3C. (2004b). W3C: OWL Web Ontology Language Overview (W3C Recommendation 10 February 2004).
Wache, H., Vögele, T., Visser, U., Stuckenschmidt, H., Schuster, G., Neumann, H., et al. (2001). Ontology-Based Integration of Information - A Survey of Existing Approaches. Paper presented at the IJCAI-01 Workshop: Ontologies and Information Sharing, Seattle, WA.

Chapter 4

The Role of Spatial Relations in Automating the Semantic Annotation of Geodata

Klien, E. and Lutz, M. (2005). The Role of Spatial Relations in Automating the Semantic Annotation of Geodata. In: Cohn, A. and Mark, D. (eds.). Proceedings of the Conference on Spatial Information Theory (COSIT'05), Ellicottville, NY, USA. Lecture Notes in Computer Science, Vol. 3693.

Abstract. How can the usability of distributed and heterogeneous geographic data sets be enhanced? Semantic interoperability is a prerequisite for effectively finding and accessing relevant data in different application contexts. By using geospatial domain ontologies and semantic annotations of geodata based on these ontologies, semantic interoperability can be achieved. However, since no automated methods for the semantic annotation of geodata exist, this remains a laborious task, which data providers are neither willing nor able to perform. In this paper we propose a method for automating the annotation process based on spatial relations. At the domain level, spatial relations play an important role in defining and identifying geospatial concepts. At the data level, spatial relations may be expressed through spatial processing methods, as we can calculate relations like topology, direction or distance between two spatial entities. We show how this potential can be exploited for automating the semantic annotation of geodata. The approach is illustrated by introducing a case study for annotating data containing representations of floodplains.

4.1 Introduction

Distributed and heterogeneous geographic data sets have great potential for applications ranging from environmental planning to emergency management or e-commerce. However, even though syntactical standards for Spatial Data Infrastructures (SDIs) already enable the retrieval and multiple exploitation of geodata (cf. [1]), many problems still impede efficient use. Being able to assess semantic interoperability is a precondition for effectively finding and accessing relevant data in different application contexts. One of the shortcomings of current SDIs is the missing support for this assessment. Ontologies, which capture consensual knowledge and formalize this knowledge in a machine-interpretable way [2], are an important means for achieving semantic interoperability. In SDIs, ontologies can be employed for making the semantics of the information content of geospatial web services explicit. In [3, 4] we have shown how ontologies can be used to realise semantic matchmaking during service discovery and retrieval. The backbone of our approach is an infrastructure of geospatial domain ontologies and semantic annotations of the geodata. Domain ontologies represent the basic concepts and relations to which all members of an information community commit. They provide the foundation on which the geodata is semantically annotated. The common commitment ensures semantic interoperability [5]. So far, no automated method for the semantic annotation of geodata exists. Manual annotation is difficult, time-consuming, and expensive, and data providers who are not ontology engineering specialists will be neither willing nor able to perform it. We propose a method for automating the annotation process that relies on the specific characteristics of geographic information.
Spatial relations between entities are characteristic of geographic information and they are often as important as the entities themselves [6]. In geospatial domain ontologies, taxonomic and non-taxonomic relations are used to define concepts of the physical world and to differentiate between them. At this level, spatial relations play an important role in defining and identifying spatial concepts, but when reasoning about these concepts, spatial relations are not treated differently from other non-taxonomic relations. At the data level, though, the spatial relations may be expressed through spatial processing methods, as we can calculate e.g. the topology, direction or distance between two spatial entities. In this paper we illustrate how this potential can be exploited for the semi-automatic semantic annotation of geodata. We focus on spatial relations here. In our future work, the approach will be extended to include spatial attributes and non-spatial characteristics. The approach is illustrated by introducing an example ("annotating data containing floodplains") and consists of the following steps:

- Extract all concept definitions from the geospatial domain ontology that contain spatial relations.
- Translate spatial relations (e.g. adjacent to) into a corresponding spatial analysis method, which is implemented as a sequence of GIS operations.
- Apply these spatial analyses to the geodataset to be annotated.
- Identify sets of spatial entities that share a characteristic set of relations and can then be referenced to the corresponding geospatial concept.

The remainder of the paper is structured as follows. We first introduce a motivating example to clarify in what context we refer to semantic annotation and why we think a method for automating the process is needed (Section 4.2). In Section 4.3, spatial relations are discussed with respect to their role in the semantic annotation process. In Section 4.4, we illustrate the general idea by conducting a walk-through of the annotation process and continue with explaining the proposed method in detail. Section 4.5 provides an overview of related work. Finally, we discuss the approach and identify some future work (Section 4.6).

4.2 Motivation: Ontology-Based Discovery and Retrieval of GI

Ontologies can be applied for making the semantics of the information content of geospatial web services explicit in order to enhance geographic information (GI) discovery and retrieval in geospatial web service environments [4]. In the following we introduce an example to illustrate our approach for solving semantic heterogeneity problems in SDIs and to describe the semantic matchmaking mechanism which underlies the ontology-based discovery and retrieval.

4.2.1 Example Floodplain

Floodplains are crucial elements for the task of flood management. They serve as a natural water retention area after a river has broken its banks during a flooding event. If sufficient area along the river banks has the function of a floodplain, some of the river's water load will naturally be absorbed and the flooding event will be less critical for populated areas that lie further downstream. Floodplains can be looked at from several different perspectives: "To define a floodplain depends somewhat on the goals in mind.
As a topographic category it is quite flat and lies adjacent to a stream; geomorphologically, it is a landform composed primarily of unconsolidated depositional material derived from sediments being transported by the related stream; hydrologically, it is best defined as a landform subject to periodic flooding by a parent stream. A combination of these [characteristics] perhaps comprises the essential criteria for defining the floodplain" [7]. We will use this definition for formalizing the concept of a floodplain in our geospatial domain ontology. It is important to note that the relation "lies adjacent to" is interpreted in the sense of "near or close to but not necessarily touching" (WordNet 2.0).

4.2.2 Semantic Heterogeneity Problems

In order to avoid future flooding disasters, the planning department of a city council has decided to identify potential areas in the district that may be re-designated as floodplains. The task of John, the planner in charge, is a) to find data that contains the relevant information (discovery) and b) to access this data and retrieve the information (retrieval).

In current standards-based catalogues, users can formulate queries using keywords and/or spatial filters. The metadata fields that can be included in the query depend on the metadata schema used (e.g. ISO 19115) and on the query functionality of the service that is used for accessing the metadata. Even though natural language processing techniques (e.g. [8]) can increase the semantic relevance of search results with respect to the search request, keyword-based techniques are inherently restricted by the ambiguities of natural language. As a result, keyword-based search can have low recall if different terminology is used, and/or low precision if terms are homonymous or because of the limited possibilities to express complex queries [9]. For example, if John uses floodplains as a keyword, he may fail to find existing Web Feature Services (WFS) that offer information on floodplains, because their metadata description uses a different terminology. Furthermore, he might also discover data sources that are annotated with this keyword but are not appropriate for his purposes, e.g. a service providing areas that are officially designated and protected as having the function of a floodplain according to national legislation. Another obstacle often encountered is missing metadata entries. In that case a successful search will not be possible at all. Once John has discovered a dataset and wants to access it via its WFS interface, he faces yet another major difficulty. The DescribeFeatureType request [10] returns the application schema for the feature type, which is essential for formulating a query filter. John now runs into trouble if the property names are not intuitively interpretable or if the feature type floodplain is not explicitly stored in the schema. In our example, it might be sufficient to offer John a natural language description for each property.
However, our work aims at automating the process of discovery and retrieval, and this makes a machine-interpretable description of the properties indispensable.

4.2.3 Semantic Matchmaking

Figure 4.1 illustrates the matchmaking which underlies ontology-based discovery and retrieval. The geospatial ontology contains the basic terms of a domain (e.g. geomorphology). It is assumed that all actors within a domain share a common understanding of the concepts and relations provided at the domain level [5]. The information sources, i.e. the geodata, are annotated based on the concepts and relations provided in the geospatial ontologies. In our example the information source is a geodataset that contains polygons with land use attributes. John, the user of geospatial web services, is looking for information sources that will answer his question. His query for lowlands adjacent to a river that are subject to flooding is formulated based on a geospatial ontology. The semantic annotations of the available geodata are created in the same way as John's query and stored in a catalogue. Thus, John's query concept becomes machine-comparable to all geodata descriptions in this catalogue. As shown in [3], the integration of the matchmaking capability into SDIs overcomes some of the semantic heterogeneity problems in service discovery and thus leads to increased recall and precision. However, in order for this approach to become widely accepted in the GI community, it is essential to provide methods and tools that support the user in creating semantic descriptions. So far, no automated method for the semantic annotation of geodata exists. It remains a laborious task, and data providers who are not ontology engineering specialists will be neither willing nor able to perform it.
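The comparison of a query concept with the annotations stored in a catalogue can be sketched as a subsumption test. The following Python fragment is only a minimal illustration under strong assumptions and is not the matchmaking implementation described in [3]: concepts are restricted to conjunctions of named concepts and existential restrictions, and all concept, relation and dataset names as well as the helper functions `is_a` and `subsumes` are hypothetical.

```python
# Toy taxonomy of named concepts (child -> direct parents); all names are hypothetical.
PARENTS = {
    "River": {"Waterbody"},
    "Lake": {"Waterbody"},
    "Lowland": {"Landunit"},
}


def is_a(child, ancestor):
    """True if `child` equals `ancestor` or is a (transitive) sub-concept of it."""
    if child == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS.get(child, ()))


def subsumes(general, specific):
    """Structural subsumption test for concepts written as a pair
    (set of named concepts, set of (relation, filler) restrictions)."""
    g_names, g_restrictions = general
    s_names, s_restrictions = specific
    # Every named concept required by `general` must be covered by `specific`.
    names_ok = all(any(is_a(s, g) for s in s_names) for g in g_names)
    # Every restriction of `general` must be matched by a more specific one.
    restrictions_ok = all(
        any(rel_s == rel_g and is_a(fill_s, fill_g) for rel_s, fill_s in s_restrictions)
        for rel_g, fill_g in g_restrictions
    )
    return names_ok and restrictions_ok


# Query concept: land units adjacent to some water body.
query = ({"Landunit"}, {("adjacentTo", "Waterbody")})

# Semantic annotations stored in a hypothetical catalogue.
catalogue = {
    "dataset_A": ({"Lowland"}, {("adjacentTo", "River")}),
    "dataset_B": ({"Lowland"}, set()),
    "dataset_C": ({"Waterbody"}, set()),
}

matches = sorted(name for name, concept in catalogue.items() if subsumes(query, concept))
print(matches)
```

Only annotations that are at least as specific as the query in every respect are returned, which is why a dataset annotated merely as lowland, without the spatial restriction, is not considered a match in this sketch.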

Figure 4.1: Ontology-based discovery of geospatial data.

4.3 Spatial Relations

In the geospatial domain, relations among spatial entities are often as important as the entities themselves [6]. For example, for a farmer it is crucial to know that a planned plantation is on lowland adjacent to a river. This implies that the plantation will probably be covered by rising water once in a while, and the farmer is well advised to choose plants that can cope with these conditions. This makes the representation and processing of spatial relations crucial in geographical applications. Spatial relations have been classified in various ways. We refer to a classification provided in [11], where spatial relations are classified according to their characteristic behavior in space. Topological relations refer to properties like connectivity, adjacency and intersection among geospatial entities. They stay invariant under consistent topological transformations, such as rotation, translation, and scaling. Direction relations deal with order in space (e.g. north, east, south, and west). They are based on the existence of a vector space and are therefore subject to change under rotation, while they are invariant under translation and scaling of the reference frame. The third major type of spatial relation is the distance relation. Distance relations refer to the geographical distances among geospatial objects (e.g. A is close to B, X is very far from Y). They reflect the concept of a metric and thus change under scaling but stay invariant under translation and rotation. Inherent to all these relations is the vagueness and imprecision of natural language expressions. Moreover, the terminology and semantics of the relations vary across application domains [12]. Consider for example the spatial relation adjacent taken from our floodplain example. According to WordNet 2.0, adjacent has three slightly different senses:

1. nearest in space or position; immediately adjoining without intervening space;
2. having a common boundary or edge; touching;
3. near or close to but not necessarily touching.

For the interpretation of a floodplain as defined in Section 4.2.1, the first two senses are not applicable. Floodplains do not have to touch a river directly as long as the intervening space does not prevent the rising water from flooding the area.
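The invariance properties that underlie the classification of topological, direction and distance relations can be checked numerically. The sketch below is an illustration only: spatial entities are reduced to discs, and the class `Disc` as well as the functions `intersects`, `north_of` and the transformation helpers are assumptions made for this example rather than part of any GIS library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Disc:
    """A circular region standing in for a spatial entity (illustrative only)."""
    x: float
    y: float
    r: float


def centre_distance(a, b):
    """Distance relation: metric distance between the two centres."""
    return math.hypot(a.x - b.x, a.y - b.y)


def intersects(a, b):
    """Topological relation: the two discs share at least one point."""
    return centre_distance(a, b) <= a.r + b.r


def north_of(a, b):
    """Direction relation with respect to a fixed reference frame."""
    return a.y > b.y


def translate(d, dx, dy):
    return Disc(d.x + dx, d.y + dy, d.r)


def rotate(d, angle):
    """Rotate about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return Disc(d.x * c - d.y * s, d.x * s + d.y * c, d.r)


def scale(d, k):
    """Uniform scaling about the origin (the radius is scaled as well)."""
    return Disc(d.x * k, d.y * k, d.r * k)


def relations(a, b):
    return {
        "intersects": intersects(a, b),
        "north_of": north_of(a, b),
        "distance": round(centre_distance(a, b), 6),
    }


a, b = Disc(0.0, 3.0, 2.0), Disc(0.0, 0.0, 2.0)
base = relations(a, b)
moved = relations(translate(a, 5, 5), translate(b, 5, 5))
turned = relations(rotate(a, math.pi), rotate(b, math.pi))
scaled = relations(scale(a, 2), scale(b, 2))

for label, result in [("base", base), ("moved", moved), ("turned", turned), ("scaled", scaled)]:
    print(label, result)
```

In this setting translation leaves all three relations unchanged, a rotation by 180 degrees flips only the direction relation, and scaling changes only the distance.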

Attempts to capture the semantics of spatial relations have been undertaken from both the cognitive [13-15] and the mathematical viewpoint [16-19]. Still, there have only been few attempts to link the formal models of spatial relations developed for GIS with people's intuitive understanding of spatial relations as expressed in natural language [11]. Nevertheless, well-defined primitive operators like the Egenhofer operators for topological relations [20] may be used as the backbone for the definition of terms used in GIS and spatial query languages (cf. [21]). Spatial relations are usually not explicitly stored together with geographic objects but have to be inferred from the objects' geometry [12]. By extracting them, hidden information in geospatial data becomes explicit. Depending on the application domain, some spatial relations may be more significant than others for identifying relevant implicit information. For the enterprise of using spatial relations for identifying concept characteristics in datasets, it will eventually be necessary to decide on a core set of relations for the geospatial domain of discourse. In the next section we describe how we want to exploit the potential of spatial relations in order to identify characteristic concept information for the semi-automated annotation of geodata.

4.4 Method for Automating Semantic Annotation

In this section we present a method for automating the semantic annotation of geographic datasets within a specific application domain. We first introduce the general idea of using spatial analysis methods that are associated with spatial relations in ontologies to derive annotations for datasets (Section 4.4.1). We then use the case study of annotating data containing floodplains to illustrate the different steps that eventually lead to the semi-automated creation of semantic annotations (Section 4.4.2).
In the remaining subsections each of the building blocks of the suggested methodology is presented in detail. These building blocks are: a geospatial ontology that defines spatial concepts based on their spatial relations and attributes (Section 4.4.3); a method for associating the characteristic spatial relations in this ontology with spatial analysis methods (Section 4.4.4); and a reference dataset that is needed to calculate relationships between its well-known reference entities and the unknown ones in the dataset to be annotated (Section 4.4.5). In this paper we concentrate on the role of spatial relations to introduce the fundamental idea of our approach. We are aware that, in order to arrive at reasonable results, the approach will have to be extended to include other (non-spatial) relations and (spatial and non-spatial) attributes, e.g. geometry, shape or extent, in the analysis. This will be part of our future work.

4.4.1 Using Spatial Analyses for Creating Annotations

When reasoning about concept definitions that are (partly) based on spatial relations (e.g. to infer subsumption relationships between them), the spatial relations are not treated differently from other non-taxonomic relations. They simply represent implicit domain knowledge about what it means to be an instance of that concept. This is illustrated in Table 4.1, where the non-taxonomic relations a) adjacentTo and b) owner produce the same behaviour for subsumption reasoning.

Table 4.1: Two DL inferences, using spatial (a) and non-spatial relations (b).

(a) $\mathit{Boathouse} \equiv \mathit{House} \sqcap \exists \mathit{adjacentTo}.\mathit{Waterbody}$
    $\mathit{RiverBoathouse} \equiv \mathit{House} \sqcap \exists \mathit{adjacentTo}.\mathit{River}$
    $\mathit{River} \sqsubseteq \mathit{Waterbody}$
    Inferred: $\mathit{RiverBoathouse} \sqsubseteq \mathit{Boathouse}$

(b) $\mathit{Palace} \equiv \mathit{House} \sqcap \exists \mathit{owner}.\mathit{Nobleman}$
    $\mathit{RoyalPalace} \equiv \mathit{House} \sqcap \exists \mathit{owner}.\mathit{King}$
    $\mathit{King} \sqsubseteq \mathit{Nobleman}$
    Inferred: $\mathit{RoyalPalace} \sqsubseteq \mathit{Palace}$

When dealing with concrete datasets, however, this implicit knowledge can be compared with the characteristics inferred from the objects' geometry, and the results of this comparison can be used for annotation. This requires that each type of spatial relation that has been identified on the domain level is associated with a spatial analysis method (see Figure 4.2 for an example). This method provides a formal definition of the semantics of the spatial relation. Note that this definition is particular to the chosen domain, because the interpretation of the relation can differ significantly depending on the viewpoint (e.g. adjacent in Section 4.3). A more detailed description of how spatial relations are associated with spatial analysis methods is given in Section 4.4.4.

Figure 4.2: Using a spatial analysis method associated with the spatial relation adjacent to (a river) to annotate the dataset shown in (a): In (b) a buffer is generated, in (c) the features intersecting the buffer are selected as being adjacent to the river.

4.4.2 Walk Through for the Floodplain Example

Jane is working at a company that produces thematic datasets for all kinds of geographic issues. The company owns a large database of geographic information and wants to make this commercially available to more customers via a geospatial web services environment. The semantic annotation for a specific domain view proceeds as follows. Before the automated process starts, Jane has to select the domain of discourse. In our example, Jane wants to annotate her data for the geomorphology domain.
The annotation procedure then consists of the following steps (Figure 4.3):

1. All concept definitions that contain spatial relations are identified in the geomorphology ontology. From each of these concept definitions, the characteristic spatial relations are extracted. This extraction (and the subsequent analysis) is controlled, in the sense that the system will analyse the dataset by looking explicitly for the concepts defined in the ontology (rather than performing an uncontrolled search for arbitrary patterns in the dataset). The process can be depicted as a decision tree, e.g. the system identifies a land unit L as a floodplain if L fulfils the following criteria:
   - L is adjacent to a river
   - L is flat
   - L is at most 2 m higher than the adjacent river
2. For each spatial relation, the corresponding spatial analysis method will be extracted. For example, adjacent is implemented as a sequence of GIS operations (Section 4.4.4).
3. The GIS operations are applied to the geodataset to be annotated (AnnoDS). In order to be able to calculate the relation R between two entities x and y (e.g. "x is adjacent to some river"), a reference dataset (RefDS) with the well-known geometry of y (e.g. all rivers) is required.
4. The spatial entities that meet the characteristic spatial relations of floodplains are stored as the result of this analysis step.
5. Steps 2-4 are repeated for other characteristics that define the analysed concept. In our example, this means that the flatness and difference in altitude compared to the adjacent river also have to be tested.
6. The final result set is created by intersecting the result sets of all analysis steps. If this result set contains a significant number of entities (which is greater than a certain user-defined threshold value), the geodata set will be annotated with floodplains in the description.
7. The result of the matchmaking process is finally presented to Jane for verification. She is also asked for further information if necessary. The ontological description is then automatically created.

Figure 4.3: Procedure for (semi-)automated annotation of geodata.
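Steps 3 to 6 of this procedure can be sketched in a few lines of Python. The fragment is a simplified illustration under stated assumptions, not the implementation of the method: the function `annotate`, the attribute names, the sample land units and the numeric limits for adjacency (100 m) and flatness (2 % slope) are hypothetical, and the spatial analyses are assumed to have been computed beforehand. Only the 2 m altitude criterion is taken from the text.

```python
def annotate(units, criteria, min_count):
    """Evaluate each criterion on all units, intersect the result sets and
    compare the size of the final set with a user-defined threshold."""
    # Steps 3-5: one result set of unit ids per characteristic criterion.
    result_sets = [
        {unit["id"] for unit in units if test(unit)}
        for test in criteria.values()
    ]
    # Step 6: only units that satisfy every criterion remain.
    final = set.intersection(*result_sets) if result_sets else set()
    return sorted(final), len(final) > min_count


# Hypothetical, pre-computed analysis results for five land units.
RIVER_ALTITUDE_M = 50.0
units = [
    {"id": "A", "dist_to_river_m": 50.0, "slope_pct": 1.0, "altitude_m": 51.0},
    {"id": "B", "dist_to_river_m": 80.0, "slope_pct": 0.5, "altitude_m": 51.5},
    {"id": "C", "dist_to_river_m": 900.0, "slope_pct": 1.0, "altitude_m": 51.0},
    {"id": "D", "dist_to_river_m": 30.0, "slope_pct": 8.0, "altitude_m": 51.0},
    {"id": "E", "dist_to_river_m": 60.0, "slope_pct": 1.0, "altitude_m": 58.0},
]

# Criteria of the decision tree; the first two limits are assumed values.
floodplain_criteria = {
    "adjacent to a river": lambda u: u["dist_to_river_m"] <= 100.0,
    "flat": lambda u: u["slope_pct"] <= 2.0,
    "at most 2 m higher than the river": lambda u: u["altitude_m"] - RIVER_ALTITUDE_M <= 2.0,
}

candidates, suggest_annotation = annotate(units, floodplain_criteria, min_count=1)
print(candidates, suggest_annotation)
```

In line with step 7, the returned flag is only a suggestion; the candidate set would still be presented to the data provider for verification.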

4.4.3 Defining Geospatial Concepts Based on Characteristic Spatial Relations

What is special about geospatial concepts on the domain level? They describe geographic entities, i.e. entities that are associated with a geometry and a location relative to the earth. In consequence, geospatial concepts stand in highly complex relationships to the underlying physical reality, and these relations may serve to define concepts and distinguish between sub-concepts on the domain level. For example, floodplain might be seen as a sub-concept of meadow. One of the characteristics that distinguish a floodplain from a meadow lies in the spatial relation of lying adjacent to a river. Such characteristic relations have to be identified by a domain expert during the ontology modelling process. For extracting the spatial characteristics of floodplains, we adopt the definition of a floodplain introduced in Section 4.2.1. In the following, we illustrate how these can be used for formally defining the floodplain concept (1).

$$\forall x \, \bigl( \mathit{Floodplain}(x) \leftrightarrow \mathit{Landunit}(x) \land \mathit{HasSlope}(x, \mathit{Flat}) \land \exists y \, [\, \mathit{River}(y) \land \mathit{Adjacent}(x, y) \land \mathit{lessThan}(\mathit{difference}(\mathit{Altitude}(x), \mathit{Altitude}(y)), 2) \,] \bigr) \qquad (1)$$

Formula (1) states that all floodplains are landunits that are flat and adjacent to a river, and whose altitude does not differ by more than 2 meters from that of the adjacent river (and that all such landunits are floodplains). For the presented method it is important that all relations and attributes used in this definition can be inferred from the dataset using some kind of spatial analysis method. Therefore, only the defining spatial features of floodplains but not their non-spatial characteristics (e.g. being subject to periodic flooding, being composed of sediments) are taken into account at this stage. However, the non-spatial characteristics are an important part of the concept definitions as well. They will be subject to different kinds of analyses in future work in order to enhance the annotation results.
Some representational difficulties arise from the definition given in (1). The floodplain's lowness can only be described with respect to the adjacent river. Such concept interdependencies are often crucial for describing the semantics of a concept. For example, we distinguish between different spatial entities due to physical processes that lead to observable distinctions in the landscape (we can observe that some areas adjacent to a river are flooded and some are not, depending on their altitude compared to the river). For our approach, this means that the representation language chosen must be expressive enough for describing these kinds of concept interdependencies.

Formal concept definitions like (1) constitute the ontological knowledge at the domain level. Each relation could now be implemented with an analysis method that may be applied to the geodata. Within the scope of this paper, and to illustrate the idea, we concentrate on the spatial relation adjacent. Nevertheless, an exhaustive classification algorithm for the concept floodplain would require the implementation of all relations used in its definition.

Associating Spatial Analysis Methods with Spatial Relations

The association of relations in the ontology with spatial analysis methods can be done in different ways. An analysis method can be associated as a black box containing the spatial relation it implements. This has the advantage that the description of the method, and thus the association with the spatial relation in the ontology, is simple. However, this also means that the implementation of the spatial relation is not transparent to a service provider like Jane. It can be assumed that there are always a number of different possible implementations for one spatial relation, especially if one takes into account fuzzier relations like side by side instead of only precisely defined ones like meet [20]. We therefore reject the possibility of representing the implementation as a black box in favour of representing the analysis methods based on primitive operations defined in the ISO series of standards and the specifications of the Open Geospatial Consortium (OGC). This strategy provides flexibility for adjusting the semantics of spatial relations in different application domains by changing the underlying implementation (e.g. adjacent could also be implemented using other primitives). Moreover, the implementation, and thus the semantic interpretation of a spatial relation, remains transparent to the user.

Table 4.2 shows an informal representation of an algorithm containing the spatial analysis steps involved in implementing adjacent, which is based on the following specifications and standards:

The Web Feature Service (WFS) Implementation Specification [10] defines the GetFeature operation for selecting features of a particular feature type.

ISO Rules for Application Schema [22] states that the geometric characteristics of a feature are described by one or more spatial attributes whose values are given by a geometric object (GM_Object) or a topological object (TP_Object). We introduce the geometry attribute to refer to the geometric representation of a feature.

ISO Spatial Schema [21] defines the notions of GM_Object and TP_Object and a number of operations that can be applied to them.
In our example, we use the buffer operation, which returns a buffer polygon, and the intersects operation, which returns a boolean value to indicate whether two geometries intersect.

Table 4.2: Example for representing the implementation of a spatial analysis method (RefDS represents the reference dataset, and AnnoDS the dataset to be annotated)

Algorithm for implementing adjacent to some X           Reference Standards and Specifications
------------------------------------------------------  --------------------------------------
select all features from RefDS where featuretype = X    Web Feature Service
create empty set A
for each selected feature f
    A.add(f.geometry.buffer(d : Distance))              ISO 19107, ISO
create empty set B
for each feature g in A
    for each feature h in AnnoDS
        if (g.geometry.intersects(h.geometry))          ISO 19107, ISO
            B.add(h)
return B
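The steps of Table 4.2 can be sketched in runnable form as follows. This is a simplified illustration under stated assumptions, not the implementation of this work: axis-aligned rectangles (the hypothetical class `Box`) stand in for GM_Object geometries, a list comprehension replaces the WFS GetFeature request, and the names `Feature` and `adjacent_to` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle used as a stand-in for a GM_Object geometry."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def buffer(self, distance: float) -> "Box":
        # Simplified buffer: grow the rectangle by `distance` on every side.
        return Box(self.min_x - distance, self.min_y - distance,
                   self.max_x + distance, self.max_y + distance)

    def intersects(self, other: "Box") -> bool:
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


@dataclass(frozen=True)
class Feature:
    feature_id: str
    feature_type: str
    geometry: Box


def adjacent_to(ref_ds, anno_ds, feature_type, distance):
    """Return the features of anno_ds that lie within `distance` of some
    reference feature of the given type (sets A and B of Table 4.2)."""
    selected = [f for f in ref_ds if f.feature_type == feature_type]
    buffers = [f.geometry.buffer(distance) for f in selected]       # set A
    return [h for h in anno_ds
            if any(b.intersects(h.geometry) for b in buffers)]      # set B


ref_ds = [Feature("r1", "River", Box(0, 0, 100, 5))]
anno_ds = [Feature("a1", "Unknown", Box(10, 8, 30, 20)),
           Feature("a2", "Unknown", Box(10, 60, 30, 80))]
print([f.feature_id for f in adjacent_to(ref_ds, anno_ds, "River", 5.0)])
```

The distance parameter determines what counts as adjacent; changing it, or replacing the two primitives, changes the semantics of the relation without touching the concept definition.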

Annotating a Reference Dataset

We have argued that spatial relations are especially well suited for extracting implicit information from geodatasets. However, if a concept definition is based on a spatial relation to another geographic feature of a particular type, it is necessary to first identify these related features. For example, for calculating a relation like adjacent to a river, the river features in the dataset (or in a different dataset covering the same spatial extent) have to be known. This can be achieved by providing reference datasets that have already been annotated. That is, the river features in the reference dataset would already be associated with the river concept of the domain ontology. One interesting question in this context is to what extent the provision of reference datasets could be substituted by a recursive process (e.g. for calculating the floodplain's characteristic spatial relation is adjacent to river, the system would first have to identify rivers in a dataset by calculating their spatial characteristics, and so on). We propose to introduce a reference dataset for each geospatial domain ontology. The role of a reference dataset in our example can be fulfilled by the national topographic map, e.g. ATKIS (Amtliches Topographisch-Kartographisches Informationssystem) in Germany. Other reference datasets needed may include a Digital Terrain Model (DTM) for calculating, e.g., the slope and altitude of unknown spatial entities.

4.5 Related Work

In this section, we relate the presented approach for semi-automatic annotation of geodata to existing work in the area of spatial data mining and to other approaches for automatic annotation.

Spatial Data Mining

Spatial data mining is the process of discovering interesting and previously unknown, but potentially useful patterns from spatial databases [23-25].
We incorporate a similar strategy for automatically extracting relevant information from a geospatial database. However, instead of mining the dataset for potentially interesting patterns, we define the spatial constraints a priori in the geospatial concepts of our domain ontology. These spatial constraints are then implemented in a supervised analysis process that aims to identify a specific concept (and not some previously unknown, potentially useful pattern). Evidently, our approach requires far less complex techniques than those applied in spatial data mining.

Automatic Annotation

With the emergence of the Semantic Web, the creation of semantic metadata by annotating documents has become a major concern in the community [26]. Several approaches are concerned with automating the process of semantically annotating information for the Semantic Web [27, 28].

Some research in this area has focused on the idea that spatial characteristics play a central role for effectively supporting information retrieval and annotation. In [29], Manov et al. demonstrate how their annotation platform can be extended by using spatial knowledge in conjunction with information extraction. In their approach, integrated gazetteers (like the Alexandria Digital Library Gazetteer) provide the additional spatial knowledge. At the MINDSWAP group, Hiramatsu and Reitsma [30] have worked on a geographic ontology to circulate geographically referenced information on the Semantic Web. Their idea is to associate georeferenced data (instead of using some gazetteer) with any other non-spatial information related to the geographic feature, i.e. they want to make use of the inherent characteristic spatial relations in order to add semantics to hypertext-coded information. Similar work is done in the SPIRIT project, where knowledge stored in geographical data is made usable in an internet search engine [31]. While these approaches aim at annotating hypertext or enabling spatially enhanced internet search, our work provides a method for the semantic annotation of geodata in order to enable its ontology-based discovery and retrieval through web services.

Also related to our work is the automatic extraction of classical metadata (like ISO 19115) from geographic data [32]. We believe that the method presented in this paper will be useful not only for semantic annotation but also for populating missing entries in the standard metadata documents for geographic information.

4.6 Discussion and Future Work

We propose a method for automating the semantic annotation of geodata based on spatial relations and suggest applying spatial analysis methods in order to extract information on spatial relations useful for annotation.
Compared to knowledge extraction techniques like string-based attribute analysis, the calculation of spatial relations remains independent of the textual description of geographic features and their properties. This has the advantage that semantic heterogeneity problems inherent in the processing of natural language descriptions are avoided. In this paper we have concentrated on the role of spatial relations. Among the different types of spatial relations, topological relations seem to be especially useful as they stay invariant under transformations. This is a valuable characteristic since we have to deal with heterogeneous data sources in a variety of formats, scales, and projections.

A crucial issue not yet decided on is the choice of representation language for the ontological knowledge. As has been illustrated in Section 4.4.3, the representation language must be expressive enough for describing concept interdependencies, e.g. a floodplain's lowness can only be described with respect to the adjacent river. In description logic (DL), which has been used in our previous work on ontology-based GI discovery and retrieval [3, 4], it is impossible to describe classes whose instances are related to another anonymous individual via different property paths [33]. Thus, current DL-based ontology languages like OWL [34] are not applicable. First-order logic (FOL) provides the expressivity needed in our approach. However, while reasoning in DL is decidable and therefore guaranteed to terminate, proving entailment in FOL is only semi-decidable [35]. Therefore, the final decision on a representation language will have to take into account the tradeoff between the expressivity of the languages and the complexity of their reasoning problems.

We have described above our strategy for defining a spatial analysis method that implements a single spatial relation by combining well-defined primitive operators. The question of how to explicitly associate a spatial analysis method with a spatial relation remains. A possible approach is outlined in [36], which suggests integrating the invocation of executable programs into static ontological knowledge. Likewise, the strategy for generating and formalising the analysis algorithm for the entire concept definition remains an open issue. For this task a workflow description is needed that is applicable for representing a decision tree as outlined earlier. For this, we will examine current workflow description languages like BPEL (Business Process Execution Language) and PSL (Process Specification Language).

Apart from the benefits for automating the process of annotating geodata, our approach might also contribute to enhancing retrieval capabilities in geospatial web service environments. For example, a user wants to retrieve all motorbike roads, with motorbike road being a concept in the domain ontology. The geometric characteristic of a motorbike road, i.e. its high twisting grade, is associated with a spatial analysis method. Thus, the system can apply this spatial analysis method for the on-the-fly retrieval of motorbike roads from a street network. There is no need for finding a dataset that explicitly stores motorbike roads.

Another application area for which the approach might be beneficial is the automation of metadata population for standard ISO metadata. In the ISO case, fields that might be filled with information extracted by the semantic annotation process are the following: descriptivekeywords, topiccategory, geographicbox, geographicdescription [37].

In our future work we will extend our approach by providing implementation strategies for spatial attributes, like geometry, extent, and shape.
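The motorbike road example can be sketched as follows. The text does not define how the twisting grade is measured; as an assumption for illustration, the sketch uses sinuosity (path length divided by the straight-line distance between the end points), and the threshold of 1.3 as well as the function names are hypothetical.

```python
import math


def sinuosity(points):
    """Path length divided by the straight-line distance between end points."""
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    return math.inf if chord == 0 else path / chord


def motorbike_roads(roads, min_sinuosity=1.3):
    """Select road names whose geometry is twisty enough (assumed threshold)."""
    return [name for name, pts in roads.items()
            if sinuosity(pts) >= min_sinuosity]


street_network = {
    "straight road": [(0, 0), (5, 0), (10, 0)],
    "winding road": [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)],
}
print(motorbike_roads(street_network))
```

The selection is computed from the geometries at query time, so the street network itself does not need an attribute that marks motorbike roads.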
Spatial attributes probably have an equally high potential for identifying characteristic information in geodata as spatial relations. For example, the analysis of the straightness of a watercourse might identify an entity as a channel rather than a river. However, the applicability of such an analysis depends strongly on the resolution of the geodata. If all watercourse geometries are generalized to straight lines, the straightness attribute is of no value for information extraction. These dependencies on representation, resolution, and scale of the data have to be taken into account.

At this stage, we assume that the analysis of spatial characteristics will be the core methodology for identifying characteristic concept information in geodata. However, in many cases the analysis of spatial characteristics will probably not suffice for describing the semantics at the conceptual level (or will not be applicable at all). Consequently, besides taking spatial attributes into account, we will also refine the presented approach by combining spatial analyses with the analysis of non-spatial attributes. We expect this combination to eventually lead to reasonable results in the annotation process.

The vagueness of the specification of spatial characteristics is also a crucial issue. Consider the formalisation of the floodplain concept we provided in Section 4.4.3: How many meters away from the water body still counts as adjacent? What degree of flatness still counts as flat? How is the difference of altitude between floodplain and adjacent river determined? In our future work, we will have to consider the vagueness of spatial relations when specifying the associated analysis methods.

The performance of the spatial query techniques has to be evaluated and, if necessary, optimized. Methods for spatial query optimisation have been discussed in [12, 38].
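One conceivable way of handling the vagueness of a relation like adjacent is to replace the crisp distance threshold by a graded degree of membership. The sketch below is only an illustration of this idea and not a method proposed in this work; the function name and both distance parameters are hypothetical.

```python
def adjacency_degree(distance_m, full_until_m=10.0, zero_from_m=50.0):
    """Degree (0..1) to which a feature counts as adjacent, by distance.

    Up to `full_until_m` the feature is fully adjacent; from `zero_from_m`
    onwards it is not adjacent; in between the degree falls linearly.
    Both parameters are assumed values for illustration.
    """
    if distance_m <= full_until_m:
        return 1.0
    if distance_m >= zero_from_m:
        return 0.0
    return (zero_from_m - distance_m) / (zero_from_m - full_until_m)


for d in (5.0, 30.0, 80.0):
    print(d, adjacency_degree(d))
```

A graded result of this kind could be reported to the data provider instead of a yes/no decision, leaving the final judgement to the person who annotates.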

We are aware that a fully automated process is out of scope. Therefore, we plan to develop a user interface that guides the data provider through the annotation process.

Acknowledgements

We would like to thank Werner Kuhn and Florian Probst for their valuable input at various stages of this work. Our thanks also go to the anonymous referees for providing valuable comments that helped to improve the content of the paper. The work presented in this paper has been supported by the German Federal Ministry for Education and Research as part of the GEOTECHNOLOGIEN program (grant number 03F0369A) and can be referenced as publication no. GEOTECH-142.

References

1. OGC: OpenGIS Reference Model. Open GIS Consortium (2003)
2. Studer, R., Benjamins, V.R., Fensel, D.: Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25(1-2) (1998)
3. Klien, E., Lutz, M., Einspanier, U., Hübner, S.: An Architecture for Ontology-Based Discovery and Retrieval of Geographic Information. Presented at 7th Conference on Geographic Information Science (AGILE 2004). Heraklion, Greece. (2004)
4. Lutz, M., Klien, E.: Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science (IJGIS), forthcoming
5. Wache, H., Vögele, T., Visser, U., Stuckenschmidt, H., Schuster, G., Neumann, H., Hübner, S.: Ontology-Based Integration of Information - A Survey of Existing Approaches. Presented at IJCAI-01 Workshop: Ontologies and Information Sharing. Seattle, WA. (2001)
6. Papadias, D., Kavouras, M.: Acquiring, Representing and Processing Spatial Relations. Presented at Sixth International Symposium on Spatial Data Handling. Edinburgh, Scotland. (1994)
7. Schmudde, T.H.: Floodplains. In: Fairbridge, R.W. (ed.): The Encyclopedia of Geomorphology. New York. (1968)
8. Richardson, R., Smeaton, A.F.: Using WordNet in a Knowledge-based Approach to Information Retrieval (Technical Report CA-0395). Dublin City University: Dublin, Ireland (1995)
9. Bernstein, A., Klein, M.: Towards High-Precision Service Retrieval. Presented at The Semantic Web - First International Semantic Web Conference (ISWC 2002). Sardinia, Italy. (2002)
10. OGC: Web Feature Service Implementation Specification. Open GIS Consortium (2002)
11. Shariff, A., Egenhofer, M., Mark, D.: Natural-Language Spatial Relations between Linear and Areal Objects: the Topology and Metric of English-Language Terms. International Journal of Geographical Information Science. 12(3) (1998)
12. Clementini, E., Sharma, J., Egenhofer, M.: Modeling Topological Spatial Relations: Strategies for Query Processing. Computers and Graphics. 18(6) (1994)
13. Herskovitz, A.: Language and Spatial Cognition. In: Joshi, A. (ed.): Studies in Natural Language Processing. Cambridge University Press: Cambridge. (1986)
14. Mark, D., Svorou, S., Zubin, D.: Spatial terms and spatial concepts: geographic, cognitive, and linguistic perspectives. Presented at International Geographic Information Systems (IGIS) Symposium. Arlington, VA, USA. (1987)
15. Talmy, L.: How Language Structures Space. In: Pick, H., Acredols, L. (eds.): Spatial Orientation. Theory, Research and Application. Plenum: New York. (1983)
16. Egenhofer, M., Herring, J.R.: A Mathematical Framework for the Definition of Topological Relationships. Presented at the 4th International Symposium on Spatial Data Handling. Zurich, Switzerland. (1990)
17. Frank, A.U.: Qualitative Spatial Reasoning about Distances and Directions in Geographic Space. Journal of Visual Languages and Computing. 3 (1992)
18. Cohn, A.G.: A Hierarchical Representation of Qualitative Shape Based on Connection and Convexity. In: Frank, A.U., Kuhn, W. (eds.): Spatial Information Theory - A Theoretical Basis for GIS. Springer, Berlin-Heidelberg-New York. (1995)
19. Papadias, D., Sellis, T.: The Semantics of Relations in 2D Space Using Representative Points: Spatial Indexes. In: Frank, A.U., Campari, I. (eds.): Spatial Information Theory - Theoretical Basis for GIS. Springer Verlag, Heidelberg-Berlin. (1993)
20. Egenhofer, M.: Reasoning about Binary Topological Relations. Presented at Advances in Spatial Databases, 2nd International Symposium. Zurich. (1991)
21. ISO: ISO Spatial Schema. ISO TC 211 (2002)
22. ISO/TC-211: Text for DIS Geographic information - Rules for application schema Vs Draft Version. International Organization for Standardization. (2001)
23. Shekhar, S., Zhang, P., Huang, Y., Vatsavai, R.: Trends in Spatial Data Mining. In: Kargupta, H., et al. (eds.): Data Mining: Next Generation Challenges and Future Directions. AAAI Press. (2004)
24. Koperski, K., Han, J., Adhikary, J.: Mining Knowledge in Geographical Data. Communications of the ACM. 26(1) (1998)
25. Roddick, J., Lees, B.G.: Paradigms for spatial and spatio-temporal data mining. In: Miller, H., Han, J. (eds.): Geographic Data Mining and Knowledge Discovery (2001)
26. Handschuh, S., Staab, S. (eds): Annotation for the Semantic Web. Frontiers in Artificial Intelligence and Applications. Vol. 96. IOS Press: Amsterdam, The Netherlands. (2003)
27. Handschuh, S., Staab, S., Ciravegna, F.: S-CREAM -- Semi Automatic Creation of Metadata. Presented at the Semantic Authoring, Annotation and Markup Workshop, 15th European Conference on Artificial Intelligence (ECAI02). Lyon, France. (2002)
28. Dingli, A., Ciravegna, F., Wilks, Y.: Automatic Semantic Annotation Using Unsupervised Information Extraction and Integration. Presented at the Knowledge Markup and Semantic Annotation Workshop at the Second International Conference on Knowledge Capture (K-CAP 2003). Sanibel, Florida, USA. (2003)
29. Manov, D., Kiryakov, A., Popov, B., Bontcheva, K., Maynard, D., Cunningham, H.: Experiments with geographic knowledge for information extraction. Presented at the NAACL-HLT 2003, Workshop on the Analysis of Geographic References. Edmonton, Alberta, Canada. (2003)
30. Hiramatsu, K., Reitsma, F.: GeoReferencing the Semantic Web: ontology based markup of geographically referenced information. Presented at the Joint EuroSDR/EuroGeographics workshop on Ontologies and Schema Translation Services. Paris, France. (2004)
31. Heinzle, F., Sester, M.: Derivation of Implicit Information from Spatial Data Sets with Data Mining. Presented at the XXth Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS). Istanbul, Turkey. (2004)
32. Manso, M.A., Nogueras-Iso, J., Bernabe, M.A., Zarazaga-Soria, F.J.: Automatic Metadata Extraction from Geographic Information. Presented at the 7th Conference on Geographic Information Science (AGILE 2004). Heraklion, Greece. (2004)
33. Grosof, B., Horrocks, I., Volz, R., Decker, S.: Description Logic Programs: Combining Logic Programs with Description Logic. Presented at 12th Intl. Conf. on the World Wide Web (WWW-2003). Budapest, Hungary. (2003)
34. W3C: OWL Web Ontology Language Overview. (2004)
35. Russell, S., Norvig, P.: Artificial Intelligence - A Modern Approach. (2002)
36. Borchert, R.: How can a knowledge base run executables on the frame level? Presented at the International Protege Workshop. Manchester, England. (2003)
37. ISO/TC-211: ISO 19115:2003. Geographic information - Metadata. International Organization for Standardization. (2003)
38. Papadias, D., Theodorodis, Y.: Spatial relations, minimum bounding rectangles, and spatial data structures. International Journal of Geographical Information Systems. 11(2) (1997)

Chapter 5

A Rule-based Strategy for the Semantic Annotation of Geodata

Klien, E. (2007). A Rule-based Strategy for the Semantic Annotation of Geodata. Transactions in GIS, Special Issue on the Geospatial Semantic Web 11(3).

Abstract. The ability to represent geospatial semantics is of great importance when building geospatial applications for the Web. This ability will enhance discovery, retrieval and translation of geographic information as well as the reuse of geographic information in different contexts. The problem of generating semantic annotations has been recognized as one of the most serious obstacles for realizing the Geospatial Semantic Web vision. We present a rule-based strategy for the semantic annotation of geodata that combines Semantic Web and Geospatial Web Services technology. In our approach, rules are employed to partially automate the annotation process. Rules define conditions for identifying geospatial concepts. Based on these rules, spatial analysis procedures are implemented that allow for inferring whether a feature in a dataset represents an instance of a geospatial concept. This automated evaluation of features in the dataset generates valuable information for the creation and refinement of semantic annotations on the concept level. The approach is illustrated by a case study on annotating data sources containing representations of lowlands. The presented strategy lays the foundations for the specification of a semantic annotation tool for geospatial web services that supports data providers in annotating their sources according to multiple domain views.

Introduction

Geographic information is increasingly made available through Spatial Data Infrastructures (SDIs). In comparison to monolithic expert GIS, two implications arise: 1) the number of users with access to geographic information increases, as does the variation in the users' experiences, viewpoints and information needs, and 2) geographical datasets that were once produced for a specific purpose and used only within the same organisation are now accessible to a broad and heterogeneous user community.

Interoperability in SDIs is supported by the Open Geospatial Consortium (OGC) with a series of syntactic interface specifications, establishing protocols for components that exchange geospatial information (Kuhn 2005). However, challenges remain in supporting the crucial tasks of discovery and retrieval of information sources that meet the user's needs. Metadata standards for the description of geodata exist, as do catalogue services to search them. But these do not account for the fact that the conceptualisations governing the different implementations have been constrained in different ways (Burrough and Frank 1995), causing semantic heterogeneity during discovery of information sources and retrieval of information (Lutz and Klien 2006). Currently, the available information about geodata consists of application schemas and natural language entries in metadata documents. The schemas do not provide explicit semantics of their data, and the metadata entries are not machine-readable. What is lacking is a formal and explicit representation of the semantics to achieve semantic interoperability (Kuhn 2005).

While the standardization efforts of the OGC concentrate on syntactic interoperability, the semantic web initiative has brought the semantic issues of information processing into perspective (Berners-Lee et al. 2001).
It seems promising to adopt the developments around the semantic web, like the Web Ontology Language (OWL) and the Semantic Web Rule Language (SWRL), to approach semantic interoperability in geospatial web applications. Visions, architectures, and applications of this cross-fertilization of geospatial and semantic web technology are culminating in the notion of a Geospatial Semantic Web (Arpinar et al. 2004; Egenhofer 2002; Fonseca and Sheth 2002; Kolas et al. 2005).

In this work, semantic annotation is understood as making explicit the relationship between a data schema and a domain ontology by defining mappings from elements of the schema to elements in the ontology. In Klien and Lutz (2005), we have presented a method for automating this process by concentrating on the role of spatial relations for extracting implicit instance level information. In this paper, we address the specific case of annotating geodata in a web environment. We show how the requirements of a web environment can be met by exploring the use of semantic web technologies for the ontology layer. Also, the method is extended by taking into account not only spatial relations but also spatial properties for the detection of implicit instance level information.

The presented approach relies on the use of rules for specifying the spatial conditions that must be fulfilled to have something classified as a certain concept. The novelty of the approach is that the rules are not checked on the concept level, but on the instance level: a characteristic spatial configuration may be expressed through spatial processing methods, as we can calculate e.g. the topology, direction or distance between two spatial entities. This automated evaluation of features in the dataset generates valuable information for the creation and refinement of semantic annotations on the concept level.

The remainder of the paper is structured as follows. Section 5.2 introduces the general setting for the semantic annotation of geodata in a web environment. This includes details on geospatial domain ontologies (Section 5.2.1) and on the strategy of rule-based semantic annotation (Section 5.2.2). Section 5.3 provides some background on the technology involved to implement the strategy. Section 5.4 illustrates the rule-based strategy with the example of annotating representations of lowlands. In Section 5.5, related work is discussed, and Section 5.6 closes with a conclusion and an outlook on future work.

5.2 General Framework for the Semantic Annotation of Geodata in Web Environments

Emergent SDIs enable access to geodata for a broad user community. In such open and heterogeneous environments, semantic interoperability is crucial for searching data sources and evaluating their content (Halevy et al. 2003; Sheth 1999). Ontologies provide formal definitions for concepts in a domain and the terms to denote them. Ontologies are increasingly used to uniformly access information. They can be applied for making the semantics of geospatial data sources explicit and enable automated semantic matchmaking (Kuhn 2005; Rodríguez and Egenhofer 2003). As shown in Lutz and Klien (2006), the integration of the matchmaking capability into SDIs overcomes some of the semantic heterogeneity problems in discovery and retrieval. The applicability of ontologies for the semantic transformation of information sources has been shown in Bowers and Ludäscher (2004) and Hübner et al. (2005).
However, in order for ontologies to become widely accepted in the geospatial community, it is essential to provide methods and tools that support information providers in generating formal semantic annotations. In our approach, the term semantic annotation denotes the mapping from elements in a feature type schema to elements in a domain ontology. The goal of semantic annotation is to make the meaning of the schema elements explicit. The presented strategy is a concrete suggestion for implementing parts of the idea of semantic reference systems as introduced by Kuhn (2003).

Figure 5.1 illustrates the general framework for the semantic annotation of geodata in web environments. Subject to annotation are spatial information objects that represent real world entities. These spatial information objects are encoded as features using the Geography Markup Language (OGC 2003a) and served via Web Feature Services (OGC 2002). The only explicit information available about the features is their application schema, which can be retrieved via the standardized DescribeFeatureType request. The lower half of Figure 5.1 depicts this initial situation, to which the semantic representation layer is to be added.

In the following, we will concentrate on the two things needed to provide such a semantic representation layer for geospatial web services, namely: 1) geospatial domain ontologies that capture a specific world view in axioms and rules (Section 5.2.1) and 2) the strategy for generating semantic annotations for the feature types that are served by a Web Feature Service (Section 5.2.2).

Figure 5.1: General framework for the semantic annotation of geodata in open web environments

Geospatial Domain Ontology

Different user groups have different abstractions and descriptions of the real world, depending on their application field. The specific conceptualisations of these Geographic Information Communities (GIC) determine how specific information units are modelled or described (OGC 2003b). Information system ontologies can be built to reflect the conceptualisations of a specific GIC, and thus specify a particular application context and resolution of representation. According to Guarino (1998), an ontology refers to an engineering artefact, constituted by a specific vocabulary used to describe a certain reality, plus a set of explicit assumptions regarding the intended meaning of the vocabulary words. The closer this theory corresponds to the human concepts about a domain, the more useful an ontology will be (Kuhn 2005).

In talking about meaning, philosophers have drawn a distinction between the intension of a term (its intrinsic meaning or associated concept) and its extension (the complete set of objects that a concept refers to) (Sowa 2000). Theoretically, a semantic basis could be intensional or extensional. But for most cases, an extensional definition is not possible. For example, an extensional definition of the type ROAD might be a catalogue of all the roads in the world, which would be highly impractical. An intensional definition specifies the properties or criteria for recognizing roads without regard to their possible existence. In our approach, the geospatial domain ontologies are solely concerned with intension. For example, part of the intension of a motorbike road is (something like) a road with a high twisting grade, which can be formalised as an axiom in an ontology.
This characterization of the concept MOTORBIKE ROAD specifies the properties by which its extension (i.e., the set of road entities that have a high twisting grade) can be identified. It is impossible to capture a concept's complete intension with the axioms in an ontology; we can only give a partial account of it. Two concepts might have the same extension, but this does not imply that they are identical: their intensions can be quite different. For example, road entities can be seen as connections between two locations in a navigation task, but also as man-made barriers that intersect biotopes in an environmental impact assessment.

In the geospatial domain, spatial relations (Papadias and Kavouras 1994) and spatial properties play an important role in characterizing geospatial entities. For example (spatial relations and properties are in italics), Floodplains are adjacent to Rivers, Beaches touch Seas, Channels are straight Waterways, and Barriers intersect Biotopes. Consequently, concept representations that aim to capture the semantics of geospatial entities will, to a large extent, rely on formalising spatial characteristics. As we assume that the intension of a concept determines its extension, we can use the intensional definitions to compute whether or not an object belongs to the concept by analysing whether or not it has the characteristic spatial relations and spatial properties. This lays the foundations for the semantic annotation strategy that is introduced in the next section.

5.2.2 The Semantic Annotation Task

The proposed strategy for semantic annotation consists of two steps. The first step is straightforward and consists of translating the feature type's application schema into the syntax of the ontology language. This step is necessary because both the feature type description and the Domain Ontology have to be encoded with the same language constructs in order to be mapped in the next step. Note that the resulting FeatureType Ontology does not contain any additional information about the meaning of the concepts; the translation solely changes the syntactic structure, but not the vocabulary. The transformation from the application schema to, for example, the Web Ontology Language (OWL) can be implemented with Extensible Stylesheet Language Transformations (XSLT). The second step is more complex and is labelled as the semantic annotation task in Figure 5.1. This is the crucial step in the annotation process, where semantics come into play.

Background on Ontology Mapping

Many methods and tools for ontology mapping have been developed by different communities for various applications and with different assumptions. A comprehensive overview and discussion of ontology mapping can be found in Kalfoglou and Schorlemmer (2003).
The discussed approaches range from manual mappings (where the registrant would need excellent knowledge of both the data source and the domain ontology) to promises of fully automated methods (which work only for very similar ontologies). Kalfoglou and Schorlemmer (2003) define ontology mapping as a morphism, which consists of a collection of functions assigning the symbols used in one vocabulary to the symbols of the other. An ontology can be defined as a pair $O = (S, A)$, where $S$ is the ontological signature (describing the vocabulary) and $A$ is a set of ontological axioms (restricting the intended interpretation of the vocabulary). A total ontology mapping can thus be defined as follows: a total ontology mapping from $O_1 = (S_1, A_1)$ to $O_2 = (S_2, A_2)$ is a morphism $f : S_1 \to S_2$ of ontological signatures such that $A_2 \models f(A_1)$, i.e., all interpretations that satisfy $O_2$'s axioms also satisfy $O_1$'s translated axioms.

In other words, the goal of the mapping is to determine which symbol in one ontology can be replaced by a symbol in another ontology while keeping the underlying conceptualization identical. A weaker notion is the partial ontology mapping: there is a partial ontology mapping from $O_1 = (S_1, A_1)$ to $O_2 = (S_2, A_2)$ if there exists a sub-ontology $O_1' = (S_1', A_1')$ (with $S_1' \subseteq S_1$ and $A_1' \subseteq A_1$) such that there is a total mapping from $O_1'$ to $O_2$.

Kalfoglou and Schorlemmer (2003) state that almost none of the works encountered used the intended semantics of the concepts to be mapped. This is not surprising, as these semantics are often not captured in the underlying formalism, and a human expert is needed to give their precise meaning. In this respect, the initial conditions in our case differ from most other mapping scenarios. The two ontologies that are to be mapped differ significantly in their expressiveness. We assume that the Domain Ontology $O_{do}$ captures the intended semantics of concepts in axioms [$O_{do} = (S_{do}, A_{do})$]. The FeatureType Ontology $O_{fto}$ is a representation of the feature type's application schema and lacks an explicit interpretation for its data [$O_{fto} = (S_{fto}, ?)$]. The goal of the mapping process thus is to generate such an explicit interpretation by defining unidirectional mappings to corresponding elements in the Domain Ontology. For example, the application concept lago from a Spanish dataset (app:lago) can be mapped to the concept LAKE in a domain ontology for Water Management (wama:lake) [$\text{app:lago} \rightarrow \text{wama:lake}$].

But how can we detect that a feature type can be annotated with a concept of the Domain Ontology in the first place? As the concepts in the FeatureType Ontology lack an explicit interpretation, external knowledge is needed to enable the detection of conceptual similarities between the two ontologies.
However, the FeatureType Ontology provides an implicit interpretation, as it is defined via its extension (i.e., the spatial information objects in the database). The information objects model real-world entities, which in turn are conceptualised in the Domain Ontology. In terms of the lake example, we would have to infer whether the objects labelled Lago in the Spanish dataset represent the same kind of entities that are conceptualised as LAKE in the domain ontology for Water Management. In the following, we present an approach that uses the extensional knowledge in the database to infer concept membership.

Strategy for Automatic Support during the Mapping

Section 5.2.1 showed the applicability of intensional concept definitions for inferring whether or not an object belongs to a concept. Accordingly, the definitions in the Domain Ontology that capture the semantics of geospatial concepts can be used to execute an information extraction process on the quasi-instances of the FeatureType Ontology. However, this cannot be done directly, as the instances are not part of the ontology, but are stored externally as features in a database. Thus, each spatial predicate (i.e., properties and relations that describe spatial characteristics) in the Domain Ontology is implemented with a spatial analysis algorithm. The extraction process is controlled in the sense that the system analyses the dataset by searching explicitly for the concepts defined in the Domain Ontology (rather than performing an uncontrolled search for arbitrary patterns in the dataset). By executing the algorithms on the spatial information objects, it is possible to compute whether or not an object belongs to a concept by analysing whether or not it has the characteristic spatial relations and properties. Compared to the string-based matching of terms that denote schema elements with terms that denote concept representations (as done by most other mapping techniques), our strategy allows for inferring the conceptual overlap. Moreover, the result of the information extraction process can be used to add information to the concept description that was only implicit in the dataset. A walk-through of the annotation process is described in Section 5.4.

5.3 Semantic Web Technology

In Berners-Lee et al. (2001), Tim Berners-Lee presented his vision of a web of meaningful contents and services, which can be interpreted by computer programs. This semantic web can also be seen as a vast source of information, which can be modelled with the purpose of sharing and reusing knowledge. It provides the necessary infrastructure for publishing and resolving ontological descriptions of terms and concepts. In addition, it provides the necessary techniques for reasoning about these concepts, as well as resolving and mapping between ontologies, thus enabling semantic interoperability of web services through the identification (and mapping) of semantically similar concepts (Cabral et al. 2004).

In the following, the semantic web technologies that are used in our work are introduced. We have to take into account the two basic requirements of the application. First, the selection of the ontology representation language should be based on the inference mechanisms needed, i.e., it should support subsumption reasoning for automatic matchmaking to enable semantic query processing in SDIs (Lutz and Klien 2006). This need for decidable reasoning procedures can be satisfied by the description logic subset of the Web Ontology Language (OWL-DL).
Second, our strategy for partly automating the semantic annotation process requires rules for expressing more complex relationships, especially by using variables. This can be solved by using the Semantic Web Rule Language (SWRL) to add rule axioms to the set of OWL-DL axioms. Thus, the formalisation is divided in two parts. The OWL-DL encoded ontologies are used for subsumption reasoning on the conceptual level, while the SWRL rules define the conditions for classifying features in a database according to the domain ontology concepts. Web Ontology Language (OWL). The OWL Web Ontology Language is the W3C standard for representing ontologies on the Web. It is designed for use by applications that need to process the content of information instead of just presenting information to humans. OWL facilitates greater machine interpretability of web content than that supported by XML, RDF, and RDF Schema by providing additional vocabulary along with a formal semantics. OWL has three increasingly-expressive sublanguages: OWL Lite, OWL DL, and OWL Full (McGuinness and Van Harmelen 2004). OWL-DL is a syntactic variant of the SHOIN(D) description logic (DL), offering a high level of expressivity while still being decidable (Motik et al. 2004). The DL subset of logic reduces the computational complexity such that generalisation and subsumption hierarchies of concepts can be automatically classified (Brachman et al. 1991). We have selected OWL-DL as representation language because OWL Lite and RDF Schema are less expressive, while the more expressive OWL Full is no longer decidable.

Semantic Web Rule Language (SWRL). The Semantic Web Rule Language (SWRL) (Horrocks et al. 2004) is a potential candidate to become the standard rule language of the semantic web. It provides the ability to write Horn-like rules expressed in terms of OWL concepts. In contrast to pure OWL, SWRL provides mechanisms to represent variables. A SWRL rule axiom is made up of a head and a body, both consisting of zero or more atoms. Rule atoms can be of the form C(x), P(x, y), or builtIn(r, x, ...), where C is an OWL concept, P is an OWL property, r is a built-in relation, and x and y are either variables, OWL individuals or OWL data values, as appropriate. A SWRL rule is treated as a logical implication between its body and head: whenever the conditions in the rule body hold, the conditions in the rule head must also hold. Examples of rules in the human readable syntax of SWRL are depicted in Table 5.1.

5.4 Case Study for Semantic Annotation

In the following, we illustrate the rule-based strategy with an example of annotating representations of lowlands. This walk-through is in line with the approach we presented in Klien and Lutz (2005) for annotating representations of floodplains. Here, the focus is on the requirements of the web environment, i.e., the ontology representation language along with the rule specification. In the case study, a Web Feature Service that offers various feature types representing, among others, plateaus, mountainous areas, lakes, rivers, and lowlands is to be registered within a geospatial application that offers to annotate data sources according to its Landscape Classification Ontology (footnote 2). In the example walk-through, the geographic entities of interest are lowlands.

5.4.1 Formalization of the Concept Lowland in the Domain Ontology

In physical geography, lowland is any broad expanse of land with a general low level.
The term lowland can be applied to the landward portion of the upward slope from oceanic depths to continental highlands, to a region of depression in the interior of a mountainous region, to a plain of denudation, or to any region in contrast to a highland. Generally speaking, lowland is a large area of relatively low relief. Such spatial characteristics have to be identified by a domain expert during the ontology engineering process. In this case study, we will not deal with vague concept descriptions. In the case of the lowland definition, for example, the notion "relatively low" is not expressible in the logic of the representation language. To provide a simple and illustrative example for the annotation process, we proceed by defining lowland to be a region characterised by being flat.

$$\forall x\,(\mathit{Lowland}(x) \leftrightarrow \mathit{Region}(x) \wedge \mathit{HasSlope}(x, \mathit{Flat})) \qquad (1)$$

2 This domain ontology has been implemented for testing purposes only. It can be accessed at

Formula (1) states that all lowlands are regions and that they are flat, and that all such regions are lowlands. The capture of the intended LOWLAND (footnote 3) semantics represented in OWL is depicted in Figure 5.2, which presents an extract of the Landscape Classification Ontology. The ontology has been built with the ontology editor Protégé. The extract is displayed in OWL abstract syntax (Patel-Schneider et al. 2004), which can be automatically generated from OWL ontologies with a plug-in for Protégé.

Class(Lowland complete restriction(hasSlope someValuesFrom(Flat)))
SubClassOf(Lowland Region)
SubClassOf(Region GeographicEntity)
ObjectProperty(adjacentTo inverseOf(adjacentTo) Symmetric domain(GeographicEntity) range(GeographicEntity))
ObjectProperty(hasAltitude domain(GeographicEntity) range(AltitudeValue))
ObjectProperty(hasSlope domain(GeographicEntity) range(SlopeValue))

Figure 5.2: Representation of concept LOWLAND in OWL (extract from the ontology on Landscape Classification).

This domain ontology specifies a taxonomy of land types. GEOGRAPHICENTITY is a generic concept that has properties like altitude and slope, and it can be adjacent to other GEOGRAPHICENTITIES. LOWLAND, which is defined (via REGION) as a subclass of GEOGRAPHICENTITY, inherits these properties. However, to be able to identify something as being an instance of the concept LOWLAND, we need to consider only those properties and relations that are characteristic of the concept. Thus, in the rule formulation, we do not want to state all we know about a concept, but only those things that are needed to distinguish it from others. In the example, we assume that having a flat slope is sufficient to identify a REGION as LOWLAND (as stated in Formula (1)). In Table 5.1 the corresponding rule specification is given in the human readable syntax of the Semantic Web Rule Language (SWRL).
The rule rulelowland specifies the conditions that must be fulfilled to have a feature in the database classified as LOWLAND. Additionally, Table 5.1 displays the rule specification for FLOODPLAIN, which makes use of the SWRL built-ins swrlb:subtract and swrlb:lessThan to define that, to be classified as a floodplain, the lowland should not be more than 4 meters higher than the adjacent river. The floodplain rule is included to show the complexity of relations that can be specified in SWRL compared to OWL-DL. We will get back to the rule rulefloodplain in the example for refining semantic annotations in the last paragraph of Section 5.4.3.

3 In the following, we will print the concept names of the FeatureType Ontology in italics and the domain concept names in SMALL CAPS.
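The two rules described above can be sketched as plain Python predicates. This is an illustrative stand-in, not a SWRL engine: the `Feature` class, its field names, and the sample features are assumptions, and slope and altitude are taken as already computed values. The 4 meter limit is the one stated in the text, and the strict comparison mirrors swrlb:lessThan.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    kind: str          # e.g. "Region" or "River"
    slope: str         # e.g. "Flat" or "Steep" (assumed to be precomputed)
    altitude: float    # in meters (assumed to be precomputed)
    adjacent_to: list = field(default_factory=list)


def is_lowland(feature: Feature) -> bool:
    """Region(?x) and hasSlope(?x, Flat) implies Lowland(?x)."""
    return feature.kind == "Region" and feature.slope == "Flat"


def is_floodplain(feature: Feature, max_diff: float = 4.0) -> bool:
    """A lowland adjacent to a river that lies less than max_diff meters above it."""
    if not is_lowland(feature):
        return False
    return any(
        neighbour.kind == "River" and (feature.altitude - neighbour.altitude) < max_diff
        for neighbour in feature.adjacent_to
    )


# Hypothetical sample data.
river = Feature("river", "River", "Flat", 50.0)
meadow = Feature("meadow", "Region", "Flat", 52.0, [river])
terrace = Feature("terrace", "Region", "Flat", 60.0, [river])
ridge = Feature("ridge", "Region", "Steep", 300.0, [river])

for f in (meadow, terrace, ridge):
    print(f.name, is_lowland(f), is_floodplain(f))
```

In this sketch the terrace is a lowland but not a floodplain, which shows how the second rule refines the first by adding a relation to another feature and an arithmetic condition.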

Table 5.1: Rules for defining the concepts LOWLAND and FLOODPLAIN in the human readable syntax of SWRL.

Name              Expression
rulelowland       Region(?x) ∧ hasSlope(?x, Flat) → Lowland(?x)
rulefloodplain    Lowland(?x) ∧ River(?river) ∧ adjacentTo(?x, ?river) ∧ hasAltitude(?x, ?xalt) ∧ hasAltitude(?river, ?riveralt) ∧ swrlb:subtract(?diffalt, ?xalt, ?riveralt) ∧ swrlb:lessThan(?diffalt, 4) → Floodplain(?x)

5.4.2 Transforming the Application Schema into OWL Syntax

In the first step, the application schema of the feature type (encoded in XML) is automatically translated into a corresponding concept definition in OWL syntax (cf. Section 5.2.2). According to its application schema, a lowland entity is modeled as a polygon with the assigned attributes ID and usagetype. The translation into OWL results in a concept definition called Lowland that has the properties geometry, ID, and usagetype. Figure 5.3 depicts both representations. The OWL encoded concept definition represents only one possible encoding of information objects that represent lowlands. In this simple schema, the properties do not capture the characteristics of the real-world entity, but only those of the information object. Properties like geometry and ID are obviously useless for the automatic detection of semantic similarities between a feature type and a concept in the domain ontology.
<xsd:complexType name="lowland">
  <xsd:complexContent>
    <xsd:extension base="gml:AbstractFeatureType">
      <xsd:sequence>
        <xsd:element name="geometry" type="gml:SurfaceType" nillable="false"/>
        <xsd:element name="id" type="xsd:integer" nillable="true"/>
        <xsd:element name="usagetype" type="xsd:string" nillable="true"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
a)

Ontology(
  ObjectProperty(geometry domain(Lowland) range(geom:CompositeSurface))
  DatatypeProperty(id domain(Lowland) range(<
  DatatypeProperty(usagetype domain(Lowland) range(<
  Class(Lowland complete restriction(geometry cardinality(1)) restriction(id cardinality(1)) restriction(usagetype cardinality(1)))
  SubClassOf(Lowland gml:AbstractFeatureType))
b)

Figure 5.3: Feature type lowland in a) XML encoded application schema, b) OWL encoded concept definition (displayed in OWL abstract syntax).
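The purely syntactic nature of this translation can be sketched in Python. The text names XSLT as one way to implement the step; the snippet below swaps in Python's standard `xml.etree.ElementTree` purely for illustration. The function name `schema_to_owl`, the rule that `xsd:`-typed elements become datatype properties, and the choice to copy type names verbatim into the range are simplifying assumptions, so the output differs in detail from Figure 5.3 b).

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Application schema of the lowland feature type, wrapped in a schema element
# so that the namespace prefixes are declared.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:gml="http://www.opengis.net/gml">
  <xsd:complexType name="lowland">
    <xsd:complexContent>
      <xsd:extension base="gml:AbstractFeatureType">
        <xsd:sequence>
          <xsd:element name="geometry" type="gml:SurfaceType" nillable="false"/>
          <xsd:element name="id" type="xsd:integer" nillable="true"/>
          <xsd:element name="usagetype" type="xsd:string" nillable="true"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
"""


def schema_to_owl(schema_xml: str) -> list:
    """Translate each complexType into OWL abstract syntax statements (sketch)."""
    root = ET.fromstring(schema_xml.strip())
    lines = []
    for ctype in root.iter(f"{{{XSD_NS}}}complexType"):
        concept = ctype.get("name").capitalize()
        restrictions = []
        for element in ctype.iter(f"{{{XSD_NS}}}element"):
            prop, xsd_type = element.get("name"), element.get("type")
            # Assumption: simple XSD types map to datatype properties, all others to object properties.
            kind = "DatatypeProperty" if xsd_type.startswith("xsd:") else "ObjectProperty"
            lines.append(f"{kind}({prop} domain({concept}) range({xsd_type}))")
            restrictions.append(f"restriction({prop} cardinality(1))")
        lines.append(f"Class({concept} complete {' '.join(restrictions)})")
        extension = ctype.find(f".//{{{XSD_NS}}}extension")
        if extension is not None:
            lines.append(f"SubClassOf({concept} {extension.get('base')})")
    return lines


owl = schema_to_owl(SCHEMA)
print("\n".join(owl))
```

Nothing in the output says what a lowland is; the statements only restate the schema's attribute names and types, which is exactly why the second, semantic step is needed.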

5.4.3 Semantic Annotation Process

In the following paragraphs we show how existing mapping tools and the proposed method of extracting implicit instance-level information are combined to overcome the lack of explicit information in the FeatureType Ontology. First, the Lowland concept is mapped to a concept in the Domain Ontology on the basis of string matching. This mapping is then conceptually evaluated by executing the spatial analysis algorithm on the instance data. Subsequently, further analysis steps can be executed to refine the established mappings.

Detecting String-based Similarity

The initial conditions for ontology mapping in the presented approach differ from those in most other mapping scenarios. Existing ontology mapping tools that provide complex mapping support have specific requirements towards ontologies, which are not fulfilled in this case (cf. Section 5.2.2). However, the string-based matching which is provided by some tools, e.g. PROMPT (Noy and Musen 2000), can be used as a mechanism to start the mapping process. PROMPT is a plug-in tool for the Protégé Ontology Editor that uses string-based matching to create an ordered list of the most similar terms denoting concepts in two ontologies. Based on their identical names, the merging function of PROMPT suggests mapping the Lowland concept of the FeatureType Ontology to the LOWLAND concept of the Domain Ontology (Figure 5.4).

Figure 5.4: Merging function of PROMPT in Protégé: in the example a merge between the two lowland concepts is suggested based on their identical names.

However, a string-based match does not necessarily imply semantic similarity. For example, the term lowland can be used in several different senses. Besides the definition provided in Section 5.4.1, lowlands can classify river habitats in ecology studies. Lowland habitats are warm, slow-flowing rivers found in relatively flat lowland areas, with water that is frequently coloured by sediment and organic matter.
Hence, string-based matching might be useful as a kick off mechanism, but obviously a second automatism is desirable for verifying whether both concepts describe the same real-world entity (or a representation of that entity respectively). Since the sources for information extraction are spatial information objects stored in a spatial database, we can apply spatial analysis methods in the extraction process.
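The kind of string-based ranking used as a kick-off mechanism can be sketched with Python's standard `difflib`. This is not PROMPT's algorithm, which is not described here; the function name `rank_candidates` and the list of domain concept names are assumptions for illustration.

```python
from difflib import SequenceMatcher


def rank_candidates(term: str, candidates: list) -> list:
    """Return (candidate, similarity) pairs, most similar first.

    Similarity is a case-insensitive character-sequence ratio between 0 and 1.
    """
    def score(candidate: str) -> float:
        return SequenceMatcher(None, term.lower(), candidate.lower()).ratio()

    return sorted(((c, score(c)) for c in candidates), key=lambda pair: pair[1], reverse=True)


# Hypothetical concept names of a domain ontology.
ranking = rank_candidates("Lowland", ["River", "Lake", "LOWLAND", "Floodplain", "Plateau"])
for concept, similarity in ranking:
    print(f"{concept}: {similarity:.2f}")
```

A perfect score only says that the two names are spelled alike. Whether the features actually represent lowlands in the sense of the domain ontology still has to be checked against the instance data.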

Evaluating Conceptual Similarity by Analysing Instance Data

In Section 5.4.1, we specified the rule that a feature must satisfy to be classified as LOWLAND. At this stage, this rule is checked against the instances in the database. For this, the spatial predicates of the ontology are associated with spatial analysis methods. Depending on the complexity of the rule to be checked, the spatial analyses are processed one after the other. Depending on the spatial predicates that are to be computed, it might be necessary to intersect various spatial datasets with the same spatial extent in order to derive as much information for a specific area as possible. In the case study, all lowland features are checked for being flat. The calculation of a slope value requires the availability of a Digital Elevation Model (DEM). The inferred information about the slope then has to be intersected with the features of the dataset to be annotated. In this way, each feature can be assigned a slope value, and the system can thus evaluate whether these features fulfil the condition of being flat. The threshold value for being flat might differ depending on the application context and should remain adjustable by the domain expert.

The association of predicates in the ontology with spatial analysis methods can be done in different ways. An analysis method can be associated as a black box containing the spatial predicate it implements. This has the advantage that the description of the method, and thus the association with the spatial predicate in the ontology, is simple. However, this does not meet the requirement of keeping the parameters in the implementation adjustable.
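A flatness check with an adjustable threshold can be sketched as follows. This is a simplified illustration, not the implementation used in the case study: the DEM is assumed to be a small rectangular grid of elevations (at least 3 by 3 cells) already clipped to one feature, slope is estimated with central differences, and the function names and the default threshold of 2 percent are hypothetical.

```python
import math


def mean_slope_percent(dem: list, cell_size: float) -> float:
    """Mean slope (in percent) over the interior cells of an elevation grid."""
    slopes = []
    for r in range(1, len(dem) - 1):
        for c in range(1, len(dem[0]) - 1):
            # Central differences in both grid directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            slopes.append(100 * math.hypot(dz_dx, dz_dy))
    return sum(slopes) / len(slopes)


def is_flat(dem: list, cell_size: float, max_slope_percent: float = 2.0) -> bool:
    """Spatial predicate 'flat' with a threshold the domain expert can adjust."""
    return mean_slope_percent(dem, cell_size) <= max_slope_percent


# Hypothetical elevation grids with 10 m cells.
plain = [[10.0] * 4 for _ in range(4)]
hillside = [[col * 5.0 for col in range(4)] for _ in range(4)]

print(is_flat(plain, cell_size=10.0))
print(is_flat(hillside, cell_size=10.0))
print(is_flat(hillside, cell_size=10.0, max_slope_percent=60.0))
```

Exposing the threshold as a parameter is the point of the sketch: the same predicate can be tightened or relaxed per application context without touching the rule in the ontology.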
Also, it can be assumed that there are always a number of different implementations for a certain spatial predicate, especially if one takes into account fuzzier relations like adjacent instead of only precisely defined ones like meet (Egenhofer 1991). We therefore reject the possibility of representing the implementation as a black box in favour of representing the analysis methods based on primitive operations defined in the ISO series of standards and specifications of the Open Geospatial Consortium (OGC) (Klien and Lutz 2005). This strategy provides flexibility for adjusting the semantics of spatial predicates in different application domains by changing the underlying implementation. Moreover, this implementation, and thus the semantic interpretation of a spatial predicate, remains transparent to the user.

Refining Semantic Annotations

Based on the results of the spatial analysis, the system is able to deduce that the instances of the feature type lowland represent a subset of the real-world entities which are intensionally captured in the LOWLAND concept. The feature type lowland can thus be annotated with the explicit semantics of the concept LOWLAND. Furthermore, this strategy has the potential to infer additional implicit information. For example, Formula (2) states that all floodplains are lowlands that lie adjacent to a river and whose altitudes do not differ by more than 4 meters from the adjacent river (and all such lowlands are floodplains).

$$\forall x\,(\mathit{Floodplain}(x) \leftrightarrow \mathit{Lowland}(x) \wedge \exists y\,[\mathit{River}(y) \wedge \mathit{Adjacent}(x, y) \wedge \mathit{lessThan}(\mathit{difference}(\mathit{Altitude}(x), \mathit{Altitude}(y)), 4)]) \qquad (2)$$

This definition is expressed in SWRL syntax in the rulefloodplain of Table 5.1. If some of the lowland features satisfy this rule, it can be added to the semantic annotation for the feature type lowland. That is, if a lowland feature lies adjacent to a river and the difference in altitude between the lowland and the river does not exceed a certain threshold value (4 meters in the example), it is likely to be a floodplain.

5.5 Related Work

The semantic annotation (or registration) of heterogeneous and distributed information sources is a prerequisite for enabling semantic interoperability in open information infrastructures. In contrast to the various semantic annotation methods developed for the semantic web that deal with arbitrary web content, we are concerned with the annotation of complex spatial information objects. Several approaches exist that consider the specific problem of annotating data sources with ontologies. OBSERVER (Ontology Based System Enhanced with Relationships for Vocabulary Heterogeneity) is a system that uses multiple pre-existing ontologies to access heterogeneous data repositories (Mena et al. 1998). The semantic annotation of data sources is achieved by mapping ontology concepts to data structures in the underlying repository. The resulting mapping information provides a view on the data repository as a set of entity types and attributes, independently of the concrete organization of the data. Bowers and Ludäscher (2003) propose a generic framework to support semantic registration of scientific datasets. Data providers can choose concepts within an ontology that best describe the dataset. Based on this selection, a semantic registration tool semi-automatically generates a conceptual schema that is mapped onto the dataset by specifying the connection between the information within the dataset and the structures of the schema. The approach presented by Fonseca et al. (2003) is even more straightforward.
They suggest directly using ontologies for the generation of geographic conceptual schemas (i.e., computer models describing information stored in databases). Thus, this approach already captures the semantics of geographic information during the design phase by using ontology concepts in the conceptual schema definitions. None of these approaches takes instance data into account during the mapping process. In contrast, Duckham and Worboys (2005) show the usefulness of instance-level information in their approach for automating information fusion. They present a method to infer shared schema-level structure from instance-level information by identifying shared structure within the instance-level examples (e.g. lexical, geometric, or topological structure).

5.6 Conclusion and Future Work

Semantic annotation defines mappings between the application schemas of geodata and domain ontologies. We introduce a strategy for partially automating this process by:

1. transforming the feature type's application schema into the OWL syntax, and
2. applying spatial analysis methods during the mapping process for exploiting extensional knowledge at the instance level.

This approach is exclusively applicable to spatial datasets and so far has not been incorporated into methods for ontology mapping. We have shown the applicability of semantic web technology for the ontology layer. In combination, OWL-DL and SWRL provide sufficient support to formally define expressive instance queries. The rule specifications provide the basis for implementing spatial analysis algorithms. The novelty of the proposed strategy is that the rules are not checked on OWL instances, but on the spatial information objects stored in a database. Compared to knowledge extraction techniques like string-based attribute analysis, the calculation of spatial characteristics remains independent of the textual description of geographic objects and their attributes. This has the advantage that semantic heterogeneity problems inherent in the processing of natural language are avoided.

The presented strategy lays the foundations for the specification of a semantic annotation tool for geospatial web services that supports data providers in annotating their sources according to multiple domain views. It is thus a concrete suggestion for how parts of a semantic reference system as introduced by Kuhn (2003) can be implemented. According to Kuhn's approach, a semantic reference system consists of ontologies that specify concepts as well as mappings between them, embedded in a formalism that supports the computation of these mappings. Once explicit and formal mappings in such a semantic reference system are established, mechanisms for translating between different application contexts become available (e.g., reclassification by subsumption reasoning in DL formalisms).
With a semantic reference system at hand, a requester will be able to determine whether a service provides useful semantics to answer a question, i.e., assessment and exploitation of the semantic value of an information source become possible.

The quality of the semantic annotations generated by the system depends considerably on the quality of the domain ontologies. Expressive and concise domain ontologies are vital for the success of the presented strategy. Aligning domain ontologies to a foundational ontology can be seen as a mechanism to enforce certain quality standards in the ontology engineering process. The alignment process and its benefits are shown in Probst (2006).

Future work will concentrate on implementing the presented strategy as part of a semantic web service environment. For this, further semantic web technologies will be taken into account, namely the Web Service Modeling Ontology (WSMO) approach (Roman et al. 2005). In this context, we also plan to extend the approach by annotating the OGC web service's capabilities (returned by the standardized GetCapabilities request). The resulting annotation document would not only include explicit information on the geodata a WFS serves, but also on its filter encoding, its spatial extent and other descriptive metadata available from the capabilities document. Following this vision, classical metadata documents will eventually become superfluous. We are aware that a fully automated process is out of reach. We plan to refine the approach by combining spatial analyses with the analysis of non-spatial attributes. We assume that this combination will eventually lead to reasonable results in the annotation process with only little need for supervision.

Acknowledgements

This work is supported by the European Commission under the SWING project (FP ). I would like to thank Michael Lutz, Werner Kuhn, Florian Probst and the members of the MUSIL group for their valuable input. Special thanks go to the anonymous reviewers for their constructive comments on this paper.

References

Arpinar B, Sheth A, Ramakrishnan C, Usery L, Azami M, and Kwan M-P 2004 Geospatial Ontology Development and Semantic Analytics. In Wilson J P, and Fotheringham A S (eds) Handbook of Geographic Information Science. Blackwell Publishing
Berners-Lee T, Hendler J, and Lassila O 2001 The Semantic Web. Scientific American, 284,
Bowers S, and Ludäscher B 2003 Towards a Generic Framework for Semantic Registration of Scientific Data. In Proceedings of the Semantic Web Technologies for Searching and Retrieving Scientific Data (SCISW). Sanibel Island, Florida, USA
Bowers S, and Ludäscher B 2004 An Ontology-Driven Framework for Data Transformation in Scientific Workflows. In Proceedings of the International Workshop on Data Integration in the Life Sciences (DILS'04). Leipzig, Germany
Brachman R, McGuinness D, Patel-Schneider P, Resnik L, and Borgida A 1991 Living with Classic: when and how to use a KL-ONE-like language. In Sowa J (ed) Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Mateo, CA, Morgan Kaufmann Publishers:
Burrough P, and Frank A 1995 Concepts and paradigms in spatial information: are current geographical information systems truly generic? International Journal of Geographical Information Systems 9(2):
Cabral L, Domingue J, Motta E, Payne T R, and Hakimpour F 2004 Approaches to Semantic Web Services: An Overview and Comparisons. In Proceedings of the 1st European Semantic Web Symposium (ESWS2004). Heraklion, Crete, Greece
Duckham M, and Worboys M 2005 An algebraic approach to automated information fusion. International Journal of Geographical Information Science 19(5):
Egenhofer M 2002 Toward the Semantic Geospatial Web. In Proceedings of the 10th ACM International Symposium on Advances in Geographic Information Systems (ACM-GIS). McLean, VA
Egenhofer M J 1991 Reasoning about Binary Topological Relations. In Proceedings of the Advances in Spatial Databases, 2nd International Symposium. Zurich, Springer:
Fonseca F, Davis C, and Camara G 2003 Bridging Ontologies and Conceptual Schemas in Geographic Information Integration. GeoInformatica 7(4):
Fonseca F, and Sheth A 2002 The Geospatial Semantic Web. UCGIS White Paper. org/priorities/research/2002researchagenda.htm

Guarino N 1998 Formal Ontology and Information Systems. In Proceedings of Formal Ontology in Information Systems. Trento, Italy, IOS Press: 3-15
Halevy A Y, Ives Z G, Mork P, and Tatarinov I 2003 Piazza: data management infrastructure for semantic web applications. In Proceedings of the 12th Intl. World Wide Web Conference (WWW03). Budapest, Hungary, ACM
Horrocks I, Patel-Schneider P, Boley H, Tabet S, Grosof B, and Dean M 2004 SWRL: A Semantic Web Rule Language. Combining OWL and RuleML. W3C Member Submission
Hübner S, Witte J, Klien E, and Christ I 2005 Semantic Translation of Sensor Data. In Proceedings of the Münsteraner GI-Tage. Münster, Germany, Institut für Geoinformatik
Kalfoglou Y, and Schorlemmer M 2003 Ontology mapping: the state of the art. The Knowledge Engineering Review 18(1): 1-31
Klien E, and Lutz M 2005 The Role of Spatial Relations in Automating the Semantic Annotation of Geodata. In Proceedings of the Conference on Spatial Information Theory (COSIT'05). Ellicottville, NY, USA, Springer
Kolas D, Hebeler J, and Dean M 2005 Geospatial Semantic Web: Architecture of Ontologies. In Proceedings of the GeoSpatial Semantics. Mexico City, Mexico, Springer
Kuhn W 2003 Semantic Reference Systems. International Journal of Geographical Information Science 17(5)
Kuhn W 2005 Geospatial Semantics: Why, of What, and How? Journal of Data Semantics III: 1-24
Lutz M, and Klien E 2006 Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science 20(3)
McGuinness D, and Van Harmelen F 2004 OWL Web Ontology Language Overview. W3C Recommendation
Mena F, Kashyap V, Illarramendi A, and Sheth A 1998 Domain Specific Ontologies for Semantic Information Brokering on the Global Information Infrastructure. In Proceedings of the First International Conference on Formal Ontologies in Information Systems. Trento, Italy
Motik B, Sattler U, and Studer R 2004 Query Answering for OWL-DL with Rules. In Proceedings of the 3rd Int. Semantic Web Conference (ISWC 2004). Hiroshima, Japan
Noy N F, and Musen M A 2000 Algorithm and Tool for Automated Ontology Merging and Alignment. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI-2000). Austin, USA
OGC 2002 Web Feature Service Implementation Specification. Open Geospatial Consortium
OGC 2003a Geography Markup Language (GML) Implementation Specification, Version 3.0. Open Geospatial Consortium
OGC 2003b OGC Reference Model. Open Geospatial Consortium. files/?artifact_id=3836

Papadias D, and Kavouras M 1994 Acquiring, Representing and Processing Spatial Relations. In Proceedings of the Sixth International Symposium on Spatial Data Handling. Edinburgh, Scotland, Taylor and Francis
Patel-Schneider P, Hayes P, and Horrocks I 2004 OWL Web Ontology Language Semantics and Abstract Syntax. W3C Recommendation
Probst F 2006 An Ontological Analysis of Observations and Measurements. In Proceedings of the Fourth International Conference of Geographic Information Science (GIScience). Muenster, Germany, Springer
Rodríguez A, and Egenhofer M 2003 Determining Semantic Similarity among Entity Classes from Different Ontologies. IEEE Transactions on Knowledge and Data Engineering 15(2)
Roman D, Keller U, Lausen H, de Bruijn J, Lara R, Stollberg M, et al. 2005 Web Service Modeling Ontology. Applied Ontology 1(1)
Sheth A P 1999 Changing Focus on Interoperability in Information Systems: From System, Syntax, Structure to Semantics. In Goodchild M F, Egenhofer M, Fegeas R, and Kottman C (eds) Interoperating Geographic Information Systems. Kluwer: 5-30
Sowa J F 2000 Knowledge Representation. Logical, Philosophical, and Computational Foundations. Brooks Cole Publishing Co., Pacific Grove, CA

Chapter 6
Baseline for Registering and Annotating Geodata in a Semantic Web Service Framework

Klien, E., D. I. Fitzner and P. Maué (2007). Baseline for Registering and Annotating Geodata in a Semantic Web Service Framework. 10th Conference on Geographic Information Science (AGILE 2007), Aalborg, Denmark.

Abstract. Spatial Data Infrastructures (SDIs) provide access to geographic information from multiple sources and support its reuse and integration. This potential can only be fully exploited if a) the geographic information registered in the SDI reaches a critical mass, and b) this information is findable. Finding what you need is one of the bottlenecks that keep open environments from becoming successful, and current users of SDIs are still faced with inefficient and frustrating search experiences. To improve the situation, new ways of advertising resources as well as searching for information are needed. In this paper, we present a method for registering geodata in a semantic web service framework. Once the geodata is registered, sophisticated query processing becomes available. The implementation is illustrated with a use case of registering an OGC Web Feature Service within the semantic web service framework WSMO.

6.1 Introduction

Spatial Data Infrastructures (SDIs) provide access to geographic information from multiple sources and support its reuse and integration. However, this potential can only be fully exploited if a) the geographic information registered in the SDI reaches a critical mass, and b) this information is findable. Finding what you need is one of the crucial factors for open environments to become successful. However, current users of SDIs still face inefficient and frustrating search experiences. To improve the situation, we have to establish new ways of both advertising and finding information resources in SDIs.

An SDI is characterized by great variety both in its users and in its information sources. Searching for information is a communication process between the information requester and the information provider, except that they do not interact directly but through the query tool, which acts as a broker negotiating between them. Problems of different conceptualizations, terminology and missing information are easily resolved in personal communication, while a keyword-based search algorithm remains largely ignorant of these aspects. Search functionalities that account for semantic heterogeneities could improve the situation.

A promising development in this direction is the combination of Semantic Web technologies with the standardization achievements of the geospatial community. Introducing formal descriptions of information sources written in logic opens possibilities for more sophisticated query processing, e.g. matchmaking based on logic reasoning. The usefulness of integrating such ontology-based discovery into information systems to overcome semantic heterogeneities has been shown (Lutz and Klien, 2006). However, the approach relies on the availability of formal descriptions of the information sources written in a logic calculus.

In this paper, we present a strategy for registering geodata in a semantic web service (SWS) framework, which produces such formal descriptions. The strategy involves the automatic transformation of service descriptions and data schemas into a formal representation language that can be processed within the SWS framework. Based on this foundation, more sophisticated structures can be added to the service descriptions by generating semantic annotations (understood as mappings from elements of the data schema to elements in a domain ontology). This will enable querying not only on syntax, but also on meaning.

We first introduce the basic elements of the SWS framework and the formal language used in the presented approach, to the extent needed for understanding the examples. The subsequent section analyzes the infrastructure needed to enable automatic transformations from service descriptions and data schemas into the components of the SWS framework. We also illustrate the need for semantic annotation to ensure that we do not only use the correct formalisms but also capture the semantics of the registered data sets. A walk-through based on a specific example then shows how to register an OGC Web Feature Service (WFS) in the SWS framework and illustrates the implementation. The benefits of having formal descriptions available are shown with a simple discovery scenario. The last section concludes and gives some thoughts on future work.

6.2 The Semantic Web Service Framework

Semantic web services are meant to enrich web services with machine-processable semantics. There are several developments in the area of SWS, which are documented in (Cabral et al., 2004). In the following, we introduce WSMO (Roman et al., 2005), which has been developed as a conceptual model for semantically describing all relevant aspects of web services in order to facilitate the automation of discovering, combining and invoking services over the Web. The WSMX execution environment is a reference implementation of WSMO. The Web Service Modeling Language (WSML) is the internal language of WSMX.

The Web Service Modeling Ontology. WSMO identifies Ontologies, WebServices, Goals and Mediators as the four top-level elements for defining SWS (Roman et al., 2005). In our work on registration and semantic annotation of geodata, we deal with the WSMO elements ONTOLOGIES and WEBSERVICES. In addition, we need GOALS in the discovery scenario.

ONTOLOGIES provide the terminology used by other WSMO elements to describe the relevant aspects of the domains of discourse. Ontologies capture and formalize the meaning of the described components. Moreover, the formal definitions are machine-processable and thus allow sophisticated information processing based on logic reasoning.

WEBSERVICES represent computational entities able to provide access to services that, in turn, provide some value in a domain; a WEBSERVICE comprises the capabilities, interfaces and internal working of a service. All these aspects are specified using the terminology defined in the ONTOLOGIES. Capabilities characterize the web service's state before and after an execution by specifying pre- and postconditions.

GOALS describe aspects related to user desires with respect to the requested functionality; again, ONTOLOGIES can be used to define the domain terminology needed to describe the relevant aspects of GOALS.

The Web Service Modeling Language. WSML has been designed for writing down descriptions of WEBSERVICES, GOALS, ONTOLOGIES, and (to some extent) MEDIATORS (de Bruijn, 2005). WSML is a family of formal description languages used for the precise specification of the elements in the WSMO framework. The different variants of WSML (WSML-Core, WSML-Flight, WSML-Rule, WSML-DL, and WSML-Full) correspond to different logical language paradigms, namely Description Logic, Logic Programming and First-Order Logic. All of them are specified in terms of a general WSML syntax, but each might impose different restrictions on certain syntactic elements of the language. The general WSML syntax mainly consists of two parts: the conceptual syntax and the logical expression syntax. The conceptual syntax is a frame-like syntax with constructs like concepts, attributes, relations and instances. The logical expression syntax is used for further refinement of concept or relation definitions in the conceptual syntax. We show examples in WSML syntax in the section "Walk-through for Registering and Annotating WFS in WSMO". We use WSML-Flight because this variant has proven to best serve our application's requirements regarding expressivity of language and reasoning capabilities.

6.3 Approach for Registering and Annotating Geodata

In OGC-compliant SDIs, geodata are served via WFS. In a standard Catalogue, users register WFS by providing metadata (e.g. ISO 19115) on the service and the data it serves. We suggest a new way of registering WFS, which automates the registration process and generates formalized service descriptions usable for further information processing in a SWS framework.

The goal of the registration process is to generate a WSMO WEBSERVICE for a specific WFS that integrates information on the service functionality, on the geodata encoding, and on the semantics of the data that is served. The process requires two steps. First, we execute an automatic transformation of the WFS descriptions into WSMO WEBSERVICE and WSMO ONTOLOGY constructs. The result of the automatic transformation cannot contain more information than the original documents; it is merely a translation into a different syntactic representation. So, how can we ensure that we do not only use the correct syntax, but also capture the semantics of the registered data sets? To satisfy this requirement, the second step of the registration process involves the semantic annotation of the feature types served by the WFS by mapping elements of the feature type schema to concepts in domain ontologies.

6.3.1 First Step: Approach for Automatic Transformation

Figure 6.1 depicts the items involved in the process of automatically transforming WFS descriptions into WSMO WEBSERVICE and ONTOLOGY constructs. Subject to annotation are spatial information objects that model real-world entities. These spatial information objects are represented as features in the Geography Markup Language (GML) and served via WFS. Information on the service is accessible via its standardized operations GetCapabilities (returning a description of the service's capabilities) and DescribeFeatureType (returning the application schemas of the feature types served by the WFS). Both service descriptions (WFS capabilities and feature type schema) are parsed with the help of a third-party library (we are using the LGPL library geotools for this task). During the transformation, rules are applied to reference the parsed information to the corresponding concepts in OGC domain ontologies. For this purpose, two OGC ontologies encoded in WSML are available in our environment: the WFS ONTOLOGY captures the service implementation rules for WFS as specified in the OGC WFS Implementation Specification (OGC, 2002), and the GML ONTOLOGY captures the encoding rules for features as specified in the OGC GML Encoding Specification (OGC, 2003).

The WEBSERVICE and ONTOLOGY constructs that formally describe a specific WFS in WSMO are generated on the fly. This automatic translation into WSML consists of two parts:

a) Translation of feature type schema: For every feature type listed in the capabilities document, the algorithm creates the corresponding WSML representation. Those elements of the schemas that point to GML encodings (e.g. the geometry attribute) are referenced to corresponding concepts in the GML ONTOLOGY. The result of this translation step is a FEATURE TYPE ONTOLOGY (FTO) providing structural information on all feature types served by the specific WFS.

b) Translation of service capabilities: WFS are services that provide standardized operations for data access. The basic functionality (i.e. retrieving features) is the same for all WFS implementations. It is thus possible to use the generic WFS ONTOLOGY to define the WFS capabilities. The translation process further refines the description with specific information from the WFS to be registered. For example, with the help of the postcondition it is possible to constrain the output of the GetFeature operation to the features the service is actually serving. By referencing the features in the postcondition to the corresponding concepts in the FTO, it is possible to access more detailed information on the feature type schema. The result of this translation step is a specific WFS WEBSERVICE.

At this stage, the automatic transformation has produced a specific WFS WEBSERVICE and an FTO written in WSML, which do not contain more information than the original WFS descriptions. The crucial part is now to capture and explicate the meaning of the retrievable data.

Figure 6.1: Items involved in registering a WFS in the WSMO framework

6.3.2 Second Step: Approach for Semantic Annotation

Figure 6.2 depicts a schematic representation of a domain ontology (DO) on quarries (sites where mineral resources are produced or mined). This domain ontology has been developed in cooperation with the Bureau de recherches géologiques et minières (BRGM) as part of the work carried out in the SWING project. DOs are developed to capture the conceptualization of a specific view on the world and formalize it in concept definitions. It is assumed that all members of the community will interpret the terminology used in their domain ontology in the same way and that, at the same time, people from outside the community are able to explore the intended meaning with the help of the concept definitions.

Figure 6.2: Schematic representation of an extract from the Quarry Ontology.

Quarry is the central concept of the QUARRY ONTOLOGY. It is defined as a subconcept of IndustrialSite, which means that Quarry inherits all relationships that have already been defined for IndustrialSite. Some of the relationships require further restrictions: the range of haslocation points to QuarryLocation, which is a subconcept of Location, and the Production produces not just any Product, but a QuarryProduct. Again, these concepts are further defined by adding or constraining their non-taxonomic relationships.

In order to generate semantic annotations for the concepts defined in the FTO, the data provider has to define mappings between concepts in the FTO and concepts in the DO. The tricky part in this endeavor is that FTO and DO do not capture the same kind of extensions: the FTO reflects the data schema for features, i.e. spatial information objects, whereas concepts in the DO capture the meaning of real-world entities (as conceptualized by a specific user community). The mappings between FTO and DO cannot be defined via taxonomic is-a relationships because the two ontologies represent different kinds of entities. We have thus decided to introduce a non-taxonomic annotate relation into our environment, which explicitly expresses the fact that instances of one concept convey the same meaning as, but are not necessarily of the same kind as, the instances of the concept they are annotated with.

In order to identify semantic annotations, we have to answer the following question: does a spatial information object convey the same meaning as a concept of the domain ontology? This question can only be answered if we can detect that the spatial information object represents a geographic entity, which in turn would be classified as an instance of a domain concept (Klien, 2007). Metadata (e.g. ISO 19115) and feature type schemas often do not reveal sufficient explicit information, so existing automatic mapping techniques do not support this task (see Klien (2007) for a discussion of existing mapping techniques). For example, string-based matching of terms is not applicable (fto#exploitationsponctualsproduction will not match domain#quarry, see the example in the next section). In addition, the structure of the feature type and the structure of the domain concept are not comparable. Currently, only a data provider who is familiar with the spatial information objects can infer this kind of instance relationship, and the mappings from FTO to DO have to be performed manually.

[Figure 6.3 relates two sides: the FeatureType Ontology (feature type schema; Web Feature Service: feature types; spatial information objects) and the Domain Ontology (axiomatized concept definitions that capture a specific view on the world; real-world entities). The two sides are linked by "Semantic Annotation" at the ontology level and by "representation" at the instance level.]

Figure 6.3: Setting for the semantic annotation of geodata (from Klien (2007)).

6.4 Walk-through for Registering and Annotating WFS in WSMO

BRGM advertises a WFS that provides data on quarries together with related information like beds or basins in France. The QuarryWFS serves six different feature types, namely exploitationsboundaries, exploitationsponctuals, exploitationsponctualsproduction, beds, sites and basins. The term "exploitation" is used as a synonym for "quarry"; the service thus offers three feature types with information about quarries. To keep the example simple, we will only consider the feature type exploitationsponctualsproduction and two of its attributes, namely msgeometry (the point geometry of the quarries) and allowedproduction (the maximum production of the quarry in tons per year).

6.4.1 First Step: Automatic Generation of QuarryWFS Descriptions

a) Translation of feature type schema: Invoking the DescribeFeatureType operation on the QuarryWFS with a specific feature type id returns the feature type schema. Figure 6.4 shows how the feature type exploitationsponctualsproduction (a) can be translated into an FTO written in WSML (b), including references to concepts of the GML ONTOLOGY like gml#feature and gml#geometrypropertytype. Because both syntaxes are standardized, the translation is unambiguous and therefore straightforward.
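The schema translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the project (which parses the documents with geotools): the function name `feature_type_to_wsml`, the `TYPE_MAP` table and the shortened two-attribute schema are hypothetical, and the convention that the complex type is named after the feature type plus the suffix "type" is taken from the example schema only. WSML keywords are written in their standard camel-case form.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical mapping from schema types to WSML types; it covers only
# the two cases that occur in the example feature type.
TYPE_MAP = {
    "string": "_string",
    "gml:GeometryPropertyType": "gml#geometrypropertytype",
}


def feature_type_to_wsml(xsd_text: str, feature_type: str) -> str:
    """Translate the complex type of one feature type into a WSML concept."""
    root = ET.fromstring(xsd_text)
    # Assumption: the complex type is named "<feature type>type".
    complex_type = next(
        ct for ct in root.iter(XS + "complexType")
        if ct.get("name") == feature_type + "type"
    )
    lines = [f"concept {feature_type} subConceptOf gml#feature"]
    for element in complex_type.iter(XS + "element"):
        wsml_type = TYPE_MAP[element.get("type")]
        lines.append(f"  {element.get('name')} impliesType (1 1) {wsml_type}")
    return "\n".join(lines)


# Shortened version of the schema returned by DescribeFeatureType.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:gml="http://www.opengis.net/gml">
  <complexType name="exploitationsponctualsproductiontype">
    <complexContent>
      <extension base="gml:AbstractFeatureType">
        <sequence>
          <element name="msgeometry" type="gml:GeometryPropertyType"/>
          <element name="allowedproduction" type="string"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
</schema>"""

print(feature_type_to_wsml(SCHEMA, "exploitationsponctualsproduction"))
```

Because the mapping is a plain lookup per schema element, the sketch adds no information beyond what the schema already contains, which is exactly why a second, semantic annotation step is needed.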

a)
    <element name="exploitationsponctualsproduction"
             type="qua:exploitationsponctualsproductiontype"
             substitutionGroup="gml:_Feature" />
    <complexType name="exploitationsponctualsproductiontype">
      <complexContent>
        <extension base="gml:AbstractFeatureType">
          <sequence>
            <element name="msgeometry" type="gml:GeometryPropertyType" />
            <element name="exploitationname" type="string" />
            <element name="communities" type="string" />
            <element name="substance" type="string" />
            <element name="year" type="string" />
            <element name="allowedproduction" type="string" />
            <element name="sitename" type="string" />
            <element name="sitetype" type="string" />
          </sequence>
        </extension>
      </complexContent>
    </complexType>

b)
    concept exploitationsponctualsproduction subConceptOf gml#feature
      msgeometry impliesType (1 1) gml#geometrypropertytype
      exploitationname impliesType (1 1) _string
      communities impliesType (1 1) _string
      substance impliesType (1 1) _string
      year impliesType (1 1) _string
      allowedproduction impliesType (1 1) _string
      sitename impliesType (1 1) _string
      sitetype impliesType (1 1) _string

Figure 6.4: Transforming the feature type schema (a) into the FTO written in WSML (b).

b) Translation of service capabilities: Invoking the operation GetCapabilities on the QuarryWFS returns the capabilities document, which among other metadata provides a specification of the service's operations. The automatic translation generates a specific WSMO WEBSERVICE for the QuarryWFS. In this example, we keep the specification simple and concentrate on the postcondition of the GetFeature operation, which is restricted to features of type exploitationsponctualsproduction. This restriction is specified by reference to the concept fto#exploitationsponctualsproduction as defined in the FTO (Figure 6.5).

    wsmlVariant _"
    namespace { _" fto _" }
    webService brgmwfs
      capability brgmwfs_capability
        postcondition getfeature_postcondition
          nonFunctionalProperties
            dc#description hasValue "The Service returns features of the types that are further specified in the BRGM FeatureType Ontology"
          endNonFunctionalProperties
          definedBy
            ?features memberOf fto#exploitationsponctualsproduction.

Figure 6.5: WSMO WEBSERVICE for QuarryWFS

6.4.2 Second Step: Semantic Annotation of Feature Types

The feature type exploitationsponctualsproduction denotes point objects that model quarries (real-world geographic entities). In turn, real-world quarries are described in the concept definition of domain#quarry, which means that theoretically they would be classified as instances of the domain concept domain#quarry. The feature type's attributes refer either to the information object (e.g. msgeometry) or to the geographic entity (e.g. allowedproduction). Those attributes that describe the information object have been referenced to the GML ONTOLOGY in the previous automatic translation step. Attributes referring to the geographic entity have to be semantically annotated with concepts from the QUARRY ONTOLOGY. In our example, allowedproduction is mapped to the domain concept domain#productionrate (Figure 6.6).

[Figure 6.6 shows the extract from the Quarry Ontology (Location, IndustrialSite, Production, Quantity, Quarry, QuarryLocation, QuarryProduct, ProductionRate, with the relations haslocation, hasindustrialactivity, produces and hasproductionrate and range constraints on haslocation and produces) next to the FTO concept exploitationsponctualsproduction from Figure 6.4 (b).]

Figure 6.6: Two elements of the feature type schema mapped to domain concepts.

Once the data provider has defined the mappings, they are formalized in WSML axioms and added to the FTO. Figure 6.7 depicts the formalized semantic annotations for the feature type exploitationsponctualsproduction and its attribute allowedproduction. Recall that FTO and DO are two different sides of the framework that describe different kinds of entities, so it would be inappropriate to define taxonomic relationships between their instances. We therefore realize the semantic annotation bridge by a binary predicate (annotate) connecting concept names. The axiom defineannotation lists all annotate relations that hold between elements of the FTO and concepts in the DO.

    axiom defineannotation
      definedBy
        annotate(exploitationsponctualsproduction, domain#quarry).
        annotate(allowedproduction, domain#productionrate).

Figure 6.7: Formalized semantic annotations.

6.5 Benefits

To illustrate the benefits of having formal service and data descriptions available in WSMO, we introduce a simple discovery scenario. A user is looking for information on the production rate of quarries in a specific area of France. The search interface of the SWING application allows the user to formulate this question by utilizing the vocabulary specified in the domain ontologies (QUARRY ONTOLOGY, GML ONTOLOGY, WFS ONTOLOGY). A simple example is the following question: "Find me all services that provide Quarry features that contain information on production rates." This query is translated into WSML (see footnote 1) as depicted in Figure 6.8. The postcondition of the goal's capability states that the user is searching for services that provide features which are semantically annotated with the domain concept domain#quarry. In addition, those features should have an attribute that is semantically annotated with the domain concept domain#productionrate.

The user request is represented in WSML as a conjunctive query. The answer to a query is, as in databases, the set of all tuples of instances that satisfy the query. In a conjunctive query, all constraints are connected by conjunction (logical AND). Since we do not deal with concrete instances (data items) in the discovery scenario, the user request is represented as a so-called meta-query, which is in principle the same as a query in databases or logic programs, but which queries schema information (in our case concept definitions) rather than data.

In this simple discovery scenario, a matchmaking component uses the goal specification (Figure 6.8) to query the concept definition (fto#exploitationsponctualsproduction) offered by the service brgmwfs (see Figure 6.5). The advertised concept delivers the requested functionality: it is a gml#feature, it is semantically annotated with domain#quarry, and it has an attribute that is annotated with domain#productionrate. The user query therefore matches the service.
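The matchmaking step can be illustrated with a small Python sketch that evaluates the three conjuncts of the goal against schema-level facts. It is a simplified stand-in for a WSML-Flight reasoner, not the matchmaking component of the SWING application. The facts are transcribed from the feature type ontology and the annotation axiom of the example; the function names are hypothetical, the fto# prefix on the concept name is added here for readability, and treating the subconcept relation as reflexive and transitive is an assumption of this sketch.

```python
# Schema-level facts from the example: one FTO concept, its attributes,
# and the two annotate relations defined by the data provider.
SUBCONCEPT_OF = {("fto#exploitationsponctualsproduction", "gml#feature")}
ATTRIBUTES = {
    "fto#exploitationsponctualsproduction": ["msgeometry", "allowedproduction"],
}
ANNOTATE = {
    ("fto#exploitationsponctualsproduction", "domain#quarry"),
    ("allowedproduction", "domain#productionrate"),
}


def is_subconcept(sub: str, sup: str) -> bool:
    """Reflexive, transitive subconcept test (assumes an acyclic hierarchy)."""
    if sub == sup:
        return True
    return any(child == sub and is_subconcept(parent, sup)
               for child, parent in SUBCONCEPT_OF)


def annotated_with(element: str, domain_concept: str) -> bool:
    """True if the element is annotated with the concept or one of its subconcepts."""
    return any(source == element and is_subconcept(target, domain_concept)
               for source, target in ANNOTATE)


def matches_goal(concept: str) -> bool:
    """Conjunctive meta-query: every constraint of the goal must hold."""
    return (is_subconcept(concept, "gml#feature")
            and annotated_with(concept, "domain#quarry")
            and any(annotated_with(attribute, "domain#productionrate")
                    for attribute in ATTRIBUTES.get(concept, [])))


print(matches_goal("fto#exploitationsponctualsproduction"))
```

Note that the query inspects only concept definitions and annotations; no feature instances are retrieved, which is what distinguishes the meta-query from an ordinary data query.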
    namespace { _" gml _" domain _" }
    goal usergoal
      capability goalcapability
        postcondition
          nonFunctionalProperties
            dc#description hasValue "returns features that are semantically annotated with the domain concept Quarry and that have an attribute that is semantically annotated with the domain concept ProductionRate"
          endNonFunctionalProperties
          definedBy
            ?x memberOf ?c and
            ?c subConceptOf gml#feature and
            annotate(?c, ?y) and
            ?y subConceptOf domain#quarry and
            (?c[?r impliesType ?t]) and
            annotate(?r, ?rc) and
            ?rc subConceptOf domain#productionrate.

Figure 6.8: User GOAL written in WSML.

Footnote 1: The automatic translation of a user query into WSML has not been implemented yet.

6.6 Conclusion and Future Work

We have presented a strategy for registering and annotating geodata in a SWS framework. The availability of formal service and data descriptions enables sophisticated information processing within the SWS framework. In this paper, we have shown the benefits for discovery, as logic-based reasoning allows query processing not only on syntax, but also on meaning.

So far, it is only possible to define these semantic annotations manually. Automating the process with existing ontology mapping techniques like text and schema comparison is not reliable. Future work will concentrate on methods that automatically infer a feature type's meaning by exploring implicit information hidden in text documents or instance data. One proposal in this direction is a method for detecting class membership by analyzing geometry and topology in geospatial datasets (Klien, 2007).

In the case of WFS, the focus is on the content level, i.e. on what data can be retrieved. Combining SWS and OGC services with the goal of composing geospatial services, however, needs further work on formalizing processing semantics (Lutz, 2006). The question of how to annotate geoprocessing services will be addressed at a later stage of the SWING project.

References

Cabral L., Domingue J., Motta E., Payne T.R., and Hakimpour F. 2004 Approaches to Semantic Web Services: An Overview and Comparisons. In Proceedings of the 1st European Semantic Web Symposium (ESWS2004). Heraklion, Crete, Greece
de Bruijn J. (Ed.) 2005 The Web Service Modeling Language WSML.
Klien E. 2007 A rule-based strategy for the semantic annotation of geodata. Transactions in GIS, Special Issue on the Geospatial Semantic Web, forthcoming
Lutz M. 2006 Ontology-Based Descriptions for Semantic Discovery and Composition of Geoprocessing Services. GeoInformatica, forthcoming
Lutz M., and Klien E. 2006 Ontology-Based Retrieval of Geographic Information. International Journal of Geographical Information Science (IJGIS) 20(3)
OGC 2002 Web Feature Service Implementation Specification. Open Geospatial Consortium
OGC 2003 Geography Markup Language (GML) Implementation Specification, Version 3.0. Open Geospatial Consortium
Roman D., Keller U., Lausen H., de Bruijn J., Lara R., Stollberg M., et al. 2005 Web Service Modeling Ontology. Applied Ontology 1(1)

Chapter 7
Category Membership Evaluation for Geographic Entities - Ontological Foundations and Implementation

Klien, E., Probst, F. and Nientiedt, M. (2008). Category Membership Detection for Geographic Entities - Ontological Foundations and Implementation. Under review.

Abstract. Semantic annotation explicates the meaning of geographic features by establishing a link between a feature type and a formal category description in an ontology. To apply semantic annotations effectively for assessing semantic interoperability in distributed environments, the different conceptualizations captured in the formal descriptions have to be comparable. Further, semantic annotations should reflect the multiple views that humans have on their surroundings. Ontologies that are used for annotation should therefore (i) be comparable, and (ii) exhibit a structure that allows for multiple views in an ontologically consistent way. We assume that comparability and ontological consistency are best achieved by aligning the domain views to a foundational ontology. In this paper we develop a formal ontological foundation for the geospatial domain by aligning the most generic categories of the domain to the upper-level notions of the foundational ontology DOLCE. This intermediate foundational structure closes the gap between the very generic categories of DOLCE and the more specific categories described in domain ontologies such as geomorphology. The benefit of using the proposed geospatial domain ontology for semantic annotation is that it results in comparable annotations that reflect multiple perspectives on the data source in an ontologically consistent way.

We further show how the proposed ontology can be employed in a method for evaluating the validity of annotations. The method relies on evaluating the plausibility of an entity's category membership based on its physical qualities and spatial relations. Rules for category membership evaluation (which are formalized as part of the ontology) are translated into spatial analysis procedures that can be executed on the feature instances with spatial operators provided in geographic information systems. The result either confirms or rejects the validity of an annotation. We present an implementation for testing the applicability of these rules. For this, we have implemented a prototype as an extension to the geographic information system ArcGIS that combines formal rules encoded in the Web Service Modeling Language (WSML) with the spatial operators offered by ArcGIS.

128 Category Membership Evaluation for Geographic Entities

Introduction

Geographic information is increasingly made available through web services. In such open and heterogeneous environments, information discovery and information exchange require that the meanings of terms in (meta-)data descriptions are well defined and preserved (Rugg et al. 1997). Geographic features are information objects that represent geographic entities, i.e. entities that are located in geographic space. Features derive their meaning from the concepts that humans employ to categorize the represented geographic entities. We understand semantic annotation as the process of explicating the meaning of features by establishing a link between a feature type and a formal category description. This link is identified based on the annotating person's knowledge that the features represent members of that category. It is formally established by defining mappings from elements of the feature type schema (that describes the structure of the feature instances of that feature type) to category descriptions in the ontology (that provides a characterization of the geographic entities that are members of that category) (Klien 2007). Figure 7.1 illustrates the proposed conceptual model to semantically annotate features. For example, features that are created for representing flooded areas can be semantically annotated with the category description of FLOODPLAIN stored in an ontology. The feature type schema reflects the data modeler's vocabulary and characterization of how to represent entities in information systems that he calls flooded lowlands. To evaluate the validity of using the selected category for annotation, it is necessary either to know or to identify that the features represent members of the category FLOODPLAIN. The feature type schema that describes flooded lowlands does not convey useful information for answering this question.
In addition, a string-based matching of the term that denotes the feature type (i.e. "FT_floodedLowlands") with the term that denotes the category definition (i.e. FLOODPLAIN) will not deliver reliable results, since only syntax and not the underlying conceptualizations are compared. How can we support users in evaluating the validity of an annotation that has been generated manually or with methods based on term matching (and with respect to a certain conceptualization)? In previous publications (Klien 2007; Klien and Lutz 2005), we suggest evaluating the validity of semantic annotations by evaluating the plausibility of the represented entity's category membership. The proposed method relies on the importance of physical qualities and spatial relations for categorizing geographic entities. For example, members of the category FLOODPLAIN are characterized (among others) by being flat and being adjacent to a river. Thus it is possible to define rules for category membership that are restricted to the characteristic geometric and topological properties. These rules are formalized as a part of the geospatial domain ontology. Since geographic features have a geometric representation as well as a spatial location, it is possible to compute and analyze geometric and topological relations of the represented geographic entities. Consequently, we can use the formal rules for category membership evaluation in the process of evaluating the validity of semantic annotations. For this, the formal rules are translated into spatial analysis procedures that can be executed on the feature instances with spatial operators provided in geographic information systems. For example, if the analysis shows that all features of type "FT_floodedLowlands" are flat and adjacent to rivers, we can confirm the plausibility of using the category FLOODPLAIN for semantic annotation.
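The FLOODPLAIN example above can be sketched as a small check over feature instances. This is only an illustration under stated assumptions and not the prototype described in this chapter: the feature layout (a dictionary with "mean_slope_deg" and "vertices"), the two thresholds, and all function names are hypothetical, and adjacency is approximated by the distance from polygon vertices to the river polyline rather than by a true GIS topology operator.

```python
import math

# Hypothetical thresholds: the text gives no numeric values for "flat" or "adjacent".
MAX_FLAT_SLOPE_DEG = 2.0
MAX_ADJACENCY_DIST = 50.0  # in map units


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(vertices, polyline):
    """Vertex-based approximation: smallest distance from any polygon
    vertex to any segment of the polyline."""
    segments = list(zip(polyline, polyline[1:]))
    return min(point_segment_distance(v, a, b) for v in vertices for a, b in segments)


def is_plausible_floodplain(feature, rivers):
    """Membership rule: the feature is flat and adjacent to at least one river."""
    flat = feature["mean_slope_deg"] <= MAX_FLAT_SLOPE_DEG
    adjacent = any(
        distance_to_polyline(feature["vertices"], river) <= MAX_ADJACENCY_DIST
        for river in rivers
    )
    return flat and adjacent


def annotation_is_plausible(features, rivers):
    """The annotation is confirmed only if every feature satisfies the rule."""
    return all(is_plausible_floodplain(f, rivers) for f in features)
```

Requiring the rule to hold for all features mirrors the wording of the example; a tool might instead report the share of features that pass.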

Figure 7.1: Conceptual model for semantic annotation of geographic features.

The motivation of this paper is to clarify the ontological assumptions behind our approach and to propose a formal ontological foundation for the geospatial domain. Our ontology engineering is guided by the following scope, purpose and expected impacts.

Scope. Geographic entities include everything that humans experience and conceptualize in geographic (large-scale) space. We follow the definition for geographic space provided in (Mark and Frank 1996). Geographic space is the large-scale space that can be explored only by navigating in it, and we conceptualize it from multiple views, which are put together (mentally) like a jigsaw puzzle (Egenhofer and Mark 1995). These large-scale spaces are the ones that geographers typically study and therefore Mark and Frank (1996) propose to call it geographic space. In contrast, small-scale space refers to subsets of space that are visible from a single point and in which objects are manipulable. Missing information about these objects can be gathered by examining the object, e.g. moving around it and touching it (Egenhofer and Mark 1995, p.7).

Purpose. The purpose of the proposed geospatial domain ontology is to support annotating geographic features in order to improve data discovery and data exchange across heterogeneous data sources. Therefore, the presented ontological analysis has three goals.

1. We want to ensure the linkage with the way people perceive and conceptualize geographic space and the ways they communicate about it.

2. The developed structure should ensure comparability. We assume that semantic interoperability can only be achieved if the different conceptualizations captured in the formal descriptions are comparable.

3. The developed structure should allow for multiple views on geographic entities in an ontologically consistent way.
We want to apply the proposed method not only for evaluating the validity of annotations, but also for suggesting additional categories for extending the annotations to other perspectives. Different people with different views on their surroundings will perceive different kinds of entities at the same spatial location. For example, while the data modeler in the example of Figure 7.1 wants to communicate about flooded areas, other people might have lowlands, recreational areas or habitats in mind. Being able to reflect multiple perspectives in the formal descriptions will widen the scope for semantic interoperability assessment and make GI findable and usable in different contexts.

Approach and expected benefit. Comparability and ontological consistency are best achieved by aligning the domain ontologies used for semantic annotation to a foundational ontology as common denominator. Foundational ontologies capture the most generic categories in an ontologically rigorous way and provide the structure needed for sound ontology engineering (Schneider 2003). In this paper, we develop a formal ontological foundation for the geospatial domain by aligning the most generic categories of the domain to the top-level notions of the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE). This intermediate foundational structure closes the gap between the very generic categories of DOLCE and the more specific categories described in domain ontologies such as geomorphology, hydrology, and ecology. To our knowledge, a geospatial domain ontology aligned to the foundational ontology DOLCE does not yet exist. We do not claim that the proposed ontological foundation will be usable in any context. Other views on how to structure the most generic categories of the geospatial domain exist and provide valuable insights (Casati et al. 1998; Grenon and Smith 2004), yet are based on a foundational ontology we consider too difficult to use in the context of annotating geographic features.
We expect that once a stable and sound structure has been established, it will support ontology engineers in producing consistent and comparable domain ontologies. The expected benefit of using the proposed ontological foundation for semantic annotation is that it results in comparable annotations that reflect multiple perspectives on the data source in an ontologically consistent and intuitive way. The developed geospatial domain ontology provides a first basic setting for testing the applicability of category membership rules for evaluating the validity of annotations. For this, we have implemented a prototype as an extension to the geographic information system ArcGIS that combines formal rules encoded in the Web Service Modeling Language (WSML) (de Bruijn 2005) and the spatial operators offered by ArcGIS.

Used terminology. We restrict our investigations to the realm of GI web service environments, more specifically to the web components of those infrastructures that adhere to OGC Web Service (OWS) specifications (OGC 2002) as published by the Open Geospatial Consortium (OGC). According to the OGC Reference Model (OGC 2002), a geographic feature is the starting point for modeling geographic information. A geographic feature is defined as an abstraction of a real world entity with a location relative to the Earth (OGC 2002). We thus use the term geographic feature for information objects that represent geographic entities, i.e. entities that are directly or indirectly located in geographic space. A feature type schema describes the feature instances of that feature type. Similarly, a category definition in the geospatial domain ontology provides a characterization of geographic entities that are members of that category.

Naming conventions. We use small caps for CATEGORIES; Concept Names start with an upper case letter and entities with a lower case letter. Feature type and feature are set in quotation marks, the former starting with upper case, and the latter with lower case. When explicitly referring to terms, these are set between slashes, e.g. /word/.

The remainder of the paper is structured as follows. In Section 7.2 we clarify scope and purpose of the proposed ontology, and we define our ontological assumptions. Section 7.3 presents the results of the ontological analysis of entities directly or indirectly located in geographic space. This includes the alignment of the most generic categories of the geospatial domain to the foundational ontology DOLCE and examples for illustrating the modeling principles. In Section 7.4, we present the strategy of using rules for category membership evaluation for evaluating the validity of annotations. This section includes examples for rules, the scope of their applicability in the annotation process, and a first prototypical implementation of a tool for testing. Section 7.5 concludes and provides an outlook on future work.

7.2 Background on Formal Ontology and DOLCE

According to Guarino (1998a, p.5), an information system (IS) ontology is a logical theory accounting for the intended meaning of a formal vocabulary, i.e. its ontological commitment to a particular conceptualization of the world. The intended models of a logical language using such a vocabulary are constrained by its ontological commitment. An ontology indirectly reflects this commitment (and the underlying conceptualization) by approximating these intended models. The closer this theory corresponds to the human concepts about a domain, the more useful an ontology will be (Kuhn 2005).
The practical purpose of ontologies in information systems is to enable the seamless exchange of information by utilizing advanced computer support like logical reasoners for the assessment of semantic interoperability. Formal ontology can be seen as a philosophical research field concerned with laying the foundations for sound IS ontology engineering. It can also be seen as the theory of distinctions within the entities of the domain of discourse (physical objects, events, regions of space, amounts of matter) and the categories we use to talk about them (concepts, properties, qualities, states, relations, roles, parts) (Guarino 1997; Guarino 1998b). Those IS ontologies that are developed according to the principles of formal ontology and that capture the most generic categories are called foundational ontologies. Foundational ontologies are axiomatic theories of domain-independent top-level notions such as object, attribute, event, parthood, dependence, and spatio-temporal connection (Schneider 2003, p.1). Foundational ontologies structure the top-level notions needed for developing domain ontologies. This section on foundational ontologies is in large parts based on (Gangemi et al. 2002; Masolo et al. 2003; Probst 2007). We have decided to use the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) as foundational ontology to develop a comprehensive and extensible ontology for geographic entities. We have chosen DOLCE for this undertaking because of its cognitive bias and the adequateness of its top-level notions for deriving the different kinds of entities related to the realm of geography. In the following, we introduce the generic categories of DOLCE (displayed in Figure 7.2), to which the generic categories of the ontology introduced in Section 7.3 are aligned.

Figure 7.2: The basic categories in DOLCE (Masolo et al. 2003).

Endurants are wholly present at any time they exist. Their parts move with them in time. Perdurants, on the contrary, extend in time by accumulating different temporal parts. At any time that they are present, they are only partially present in the sense that some of their proper parts are not present anymore or are not yet present (Gangemi et al. 2002). Examples of endurants are bridges, lakes, forests, or national parks that participate in perdurants such as floods, forest fires, or park openings. Members of the sub-category PHYSICALOBJECT are further investigated in Section

Qualities, Quality Regions, and Reference Regions. Qualities are the basic entities we can perceive or measure, for example the volume of a certain lake, the color of a certain rose, or the length of a certain street (Masolo et al. 2003). Every entity comes with certain qualities, which exist as long as the entity exists. DOLCE distinguishes between physical, temporal and abstract qualities. Physical endurants can only have physical qualities. DOLCE defines a strict distinction between a quality, e.g., the depth of a lake, and its magnitude. A magnitude can be approximated with a measurement process, e.g. 100t. The magnitude of a quality is understood as a quality region. All quality regions of the same kind constitute a quality space. For example, the mass quality of some water body has a certain absolute magnitude. This magnitude or quality region has a specific place within a quality space for mass. The absolute magnitude of an entity's quality can never be determined exactly. What we do by observing a quality is to approximate its location in the quality space.
Ontologically there is an exact magnitude for each quality while in practical applications we can at best identify a larger region, in which we assume the exact or atomic quality region to be located. In (Probst 2007), we have introduced the notion of semantic reference regions. Semantic reference regions map on quality regions for the purpose of approximating these. Further, a reference region can be associated with a sign that can be used to denote the magnitude of some quality. The sign /100t/ might be used to approximate the absolute magnitude of some mass quality. Hence the reference region that is labeled

/100t/ can be seen as the value of some quality while its absolute magnitude is assumed to be somewhere within the magnitude range between /99t/ and /101t/ (depending on the way in which the reference regions are used in the measurement and rounding procedure). Although it would be ontologically more precise to state that a quality is located at a quality region, which in turn is approximated by a reference region, which is associated with a sign, we prefer to take a short cut in this article. In the rest of the paper, we associate a quality directly with a reference region via the hasomresult relation. In this sense, we only refer to the practically relevant approximated magnitude of some quality, which is, after all, the communicable entity we are interested in. Mappings between regions from ordinal-scaled reference spaces to ratio-scaled reference regions are introduced in Section

Roles. A standard example for a role is the distinction between the categories STUDENT and PERSON. Being a person is a rigid property of some entity while being a student is an anti-rigid property. An anti-rigid property is contingent or non-essential for all its instances (Guarino and Welty 2002). For example, a person can play the role of being a student, yet for being a person it is not essential to be a student. The person keeps its identity, that is, the person can be identified to be the same person, independent of playing the role of being a student or not. In general, we follow (Masolo et al. 2004) in characterizing roles as socially constructed entities that in consequence are existentially dependent on cognitive agents. From an ontological point of view, a role is a weak characterization of some entity since the entity can stop playing this role without losing its identity. Seen from the perspective of the characterized entity, an entity can play several roles at the same time but can also stop playing roles without losing its identity.
Introducing roles in an ontology is a difficult endeavor. At a certain point, many words employed in natural languages appear to denote categories that are subcategories of ROLE, giving them an unexpected weight in the taxonomy of the ontology. We follow the DOLCE categorization in seeing a role as a non-physical endurant, and further as a socially constructed object. The category ROLE is not yet fully specified and requires further research. However, for the sake of a taxonomy that adheres (as far as possible) to the basic meta-properties of rigidity, identity and unity introduced in (Welty and Guarino 2001), we apply roles in the following ontological analysis to characterize socially constructed entities in the geographic domain.

To bring the above into a geospatial context: Entities of the geographic domain are conceptualized either as perdurants or endurants. Endurants and perdurants are characterized through their qualities. Endurants and perdurants have unique qualities, which are individual entities too. Only physical endurants have spatial location qualities, whereas perdurants or roles have indirect spatial location qualities via the endurants participating in them or playing them respectively.

7.3 Formal Ontological Foundation for the Geospatial Domain

In their discussion about an ontology of landforms, Smith and Mark (2003) focus on primary theory as introduced by Horton (1982), referring to the conceptualizations that humans share about the world. The primary theories consist of basic (commonsense) physics, basic (rational) psychology, and other families of those basic theoretical beliefs which all people need in order to perceive and act in ordinary everyday situations (Forguson 1989). Each primary theory is a

theory about what actually exists in reality or more precisely in some part of reality that is relevant to human perception and action. For example, while primary theory acknowledges the existence of mountains, it has no explanation of how mountains are demarcated (or how they fail to be demarcated) from their surrounding foothills and from the Earth beneath them. Smith and Mark (2003) hypothesize that the primary theory for geographic space is organized around categories for major landforms such as mountain, hill, valley, island, and the like, and for associated water bodies and watercourses such as lake and river. Primary theory seeks to make objects out of the variations in elevation of the earth's surface that are salient at a certain level of granularity. Figure 7.3 introduces the generic categories for characterizing entities located in geographic space aligned to the basic categories of DOLCE. In the following, we will describe the generic categories GEOGRAPHICOBJECT (7.3.1), GEOGRAPHICOBJECTROLE (7.3.2), and PHYSICALQUALITY (7.3.3).

Figure 7.3: Most general categories of the ontology for entities that are directly or indirectly located in geographic space aligned to the foundational ontology of DOLCE.

For the formal description of categories, we use the notation of the Web Service Modeling Language (WSML) (de Bruijn 2005). The general WSML syntax mainly consists of two parts: the conceptual syntax and the logical expression syntax. The conceptual syntax is a frame-like syntax with constructs like concepts, attributes, relations and instances. The logical expression syntax is used for further refinement of concept or relation definitions in the conceptual syntax.

The statements given below should be mostly self-explanatory since we use a human-readable syntax of WSML. Table 7.1 gives explanations for the keywords used in the statements. We follow Sloman et al. in stating that [c]oncepts and categories are, to a large extent, flip sides of the same coin. Roughly speaking, a concept is the idea that characterizes a set, or category, of objects (1998, p. 192). Hence when we talk about category membership evaluation we put the focus on the readily established "categories of things" which emerge from cognitive agents perceiving and cognizing physical and social reality. We are well aware of the many different uses the notions of concept and category have acquired, and we stick here to the classical approach of seeing ontologies as providing category systems, although WSML uses the notion concept where we use category. The following logic statements are only extracts from the proposed geospatial domain ontology. The ontology can be accessed online.

Table 7.1: Explanation of WSML keywords used in the statements of this paper. More details on WSML syntax are available in (de Bruijn 2005)

WSML keyword: Explanation

concept: Concepts constitute the basic elements of the agreed terminology for some problem domain. They represent classes of objects of a real or abstract world that share specific characteristics.

subconceptof: Declares a specific relationship between the extensions of a concept and its superconcept: Every instance of the subconcept is considered to be an instance of the superconcept as well.

oftype: C[A oftype T] defines a constraint on the possible values that instances of class C may take for property A to values of type T.

relation: Relations are used in order to model interdependencies between several concepts (respectively instances of these concepts). The arity of relations is not limited.

memberof: O memberof T is true iff element O is an instance of type T, that means the element denoted by O is a member of the extension of type T.

hasvalue: O[A hasvalue V] is true iff the element denoted by O takes value V under property A.

Variables: Variable names start with an initial question mark, "?". Variables may occur in place of concepts, attributes, instances, relation arguments or attribute values.

Geographic Object

Geographic objects are not merely located in space, but are tied intrinsically to space in a manner that implies that they inherit from space many of its structural properties (mereological, topological, geometrical) (Smith and Mark 1998). Following this view, we define the generic category GEOGRAPHICOBJECT to comprise those entities that are immovable, that are located in geographic space and that derive their characteristics from physical reality. Entities have intrinsic and extrinsic properties, which refer to the qualities which inhere in them on the one hand (intrinsic), and to the relations that they have to other entities on the other hand (extrinsic) (Welty and Guarino 2001). All geographic objects have the intrinsic property of

being located in geographic space. In addition, all geographic objects have the extrinsic property of entertaining spatial relations to other geographic objects, e.g. being in a distance to, being within, intersecting, and the like (1). More details on qualities and relations for geographic objects follow in Section

    concept GeographicObject subconceptof dolce#physicalobject
        hasquality oftype SpatialLocationQ
    relation spatialrelation(oftype GeographicObject, oftype GeographicObject)    (1)

GEOGRAPHICOBJECT is aligned to the DOLCE category PHYSICALOBJECT, making any geographic object an entity that exhibits physical qualities and that participates in some perdurant. In other words, a geographic object is present with all its parts at a certain point in time (cp. Section 7.2). The qualities and relations used for describing category membership must be rigid properties, i.e. the entity ceases to exist if it loses the property (Welty and Guarino 2001). For example, a member of TERRESTRIALUNIT (e.g. a mountain, valley, or floodplain) cannot lose its qualities like having a slope, aspect, or landform without losing its identity. Note that losing a quality is different from changing the magnitude ("value") of such a quality. For example, the slope of a terrestrial unit might change due to a landslide. However, the terrestrial unit still has a slope quality; only its magnitude has changed after the landslide. In contrast, if a property is anti-rigid, it indicates that the classified entity might play a role. The properties of being a hazard area, being a recreational area or being a protected area are all anti-rigid, since the terrestrial unit might lose these properties completely while still being a terrestrial unit. We call the category of roles that are played by geographic objects GEOGRAPHICOBJECTROLE.
More details on the category GEOGRAPHICOBJECTROLE follow in Section

The rather generic category GEOGRAPHICOBJECT is further divided into two subcategories that separate natural units from manmade structures. Since the branch of natural units is more interesting for our approach at this stage, we leave aside the explanations for the other branches and continue with a short description of the categories TERRESTRIALUNIT and VEGETATIONUNIT.

TERRESTRIALUNIT

Terrestrial units are parts of the earth's surface that are only perceived as individual entities due to our object-centered view of the world. We assume that humans categorize terrestrial units based on their most prevalent qualities related to the earth's surface like the variations of elevation, slope, shape or the relations that they have with other geographic objects. All terrestrial units exhibit intrinsic physical qualities like elevation, slope, aspect, and landform. More details on physical qualities follow in Section

Footnote to statement (1): The prefix dolce# specifies the namespace of the category PhysicalObject that is defined in the DOLCE ontology. Categories from the Geospatial Domain Ontology will later be referenced with domain# and categories from the Feature Type Ontology with fto#.

VEGETATIONUNIT

A vegetation unit is a geographic object. VEGETATIONUNIT and TERRESTRIALUNIT are categories on the same taxonomic level. A terrestrial unit is associated with a vegetation unit with the topological relation iscoveredby (statement (2)). This modeling decision is based on the conceptualization that a terrestrial unit does not have a vegetation cover as an intrinsic quality like area, slope, or aspect. Instead, terrestrial unit and vegetation cover are two distinct although strongly related geographic objects. The vegetation cover might functionally depend on the terrestrial unit, which results in the intuitive assumption that a vegetation unit is a part of a terrestrial unit. However, we do not assume a mereological relation between vegetation unit and terrestrial unit. We conceptualize a vegetation unit as being one-sidedly dependent on being associated with a terrestrial unit, i.e. a vegetation unit can only exist if it covers a terrestrial unit (axiom definition in statement (2)).

    relation iscoveredby(oftype TerrestrialUnit, oftype VegetationUnit)
    axiom defvegunit definedby
        ?x memberof VegetationUnit implies
        exists ?y memberof TerrestrialUnit and iscoveredby(?y, ?x).    (2)

Roles Played by Geographic Objects

People use multiple conceptualizations of geographic space (Egenhofer and Mark 1995). That is, one and the same entity might be assigned to more than one category. For example, our intuition might tell us that a forest is a recreation area. In that case, a forest would be related via an "is-a" relation to both VEGETATIONUNIT and RECREATIONALAREA. However, since an entity should be classified only via its rigid properties (Welty and Guarino 2001), it cannot be a member of two categories at the same time. In other words, we do not allow for multiple "is-a" relations (multiple inheritance) in our ontology.
To avoid this problematic classification, we must carefully distinguish between entities that are members of GEOGRAPHICOBJECT and entities that are roles that such geographic objects might play. As mentioned before, this differentiation can be performed based on exploring the rigidity of a property. For example, the rigid property of a floodplain is being adjacent to a river. If the floodplain loses this quality, it will also lose its identity, i.e. it will cease to exist. On the other hand, the property of being a recreational area is anti-rigid. That means a floodplain can stop being a recreational area without losing its identity. The problem of avoiding multiple inheritance, even though it appears natural in our intuitive interaction with entities in geographic space, can be solved by allowing a geographic object to play multiple roles at the same time. For example, a particular floodplain is a member of the category FLOODPLAIN that might play the role of a recreational area and the role of a habitat at the same time. Once we start thinking about entities that we identify as belonging to the geographic domain (e.g. national parks, or hazard areas), we will discover that most of these are actually roles that are played by other entities that are identified (via their rigid properties) to be members of GEOGRAPHICOBJECT. With this structure, we are able to resolve the dilemma described by Egenhofer and Mark (1995) that people use multiple conceptualizations of geographic space, without giving up sound ontological modeling.
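The distinction between one rigid category and several anti-rigid roles can be illustrated with a small data structure. This is a minimal sketch and not part of the proposed ontology or prototype: the class, its fields, and the role labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GeographicObject:
    """An entity classified by exactly one rigid category.

    Roles are kept in a separate set, so adding or dropping a role
    never changes the category (and thus the identity) of the object.
    """
    name: str
    category: str                      # single "is-a" classification, e.g. "Floodplain"
    roles: set = field(default_factory=set)

    def play(self, role):
        """Start playing an anti-rigid role such as 'RecreationalArea'."""
        self.roles.add(role)

    def stop_playing(self, role):
        """Stop playing a role; the category is left untouched."""
        self.roles.discard(role)
```

For instance, a floodplain object can play the roles RecreationalArea and Habitat at the same time and later drop one of them, while its category stays Floodplain throughout.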

Members of GEOGRAPHICOBJECTROLE do not have the intrinsic quality of being located in geographic space, since they are socially constructed entities, and in that sense non-physical entities. However, roles are indirectly located in geographic space through the geographic objects that play them. In other words, the role inherits the spatial location quality of the geographic object and thus has an indirect spatial location quality:

    axiom indirectspatialq definedby
        ?x memberof GeographicObjectRole and
        ?y memberof GeographicObject and
        isroleplayedby(?x, ?y) and
        ?y[hasspatiallocationq hasvalue ?z]
        implies ?x[hasindirectspatiallocationq hasvalue ?z].    (3)

What we have stated in axiom (3) for the spatial location quality holds for any physical quality or spatial relation. Roles indirectly have the physical qualities and spatial relations of the entities that play them. The identification of roles is not trivial. Anti-rigidity is not the only criterion. As we have stated before in Section 7.2, we follow (Masolo et al. 2004) in characterizing roles as socially constructed entities that in consequence are existentially dependent on cognitive agents. Roles are entities that are characterized via the functionality they offer or the purpose they serve for an individual or for the community as such. For example: A hillside with northern aspect above 2000m and a slope greater than 43 degrees plays the role of a hazard area in terms of avalanches only if one attempts to cross it, or to build a settlement beneath it. In a different context, the same hillside might play roles like nature reserve, afforestation area, or others. A geographic object, such as a forest, can play the role of a local recreational area because it is used for recreational activities like hiking, jogging, or biking. However, the forest, seen as a particular existing entity, does not cease to exist if it loses the role.
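Axiom (3) can be read as a simple derivation rule over known facts. The following sketch assumes, for illustration only, that the facts are held in two plain dictionaries; it is not how a WSML reasoner evaluates the axiom, and the function and parameter names are hypothetical.

```python
def derive_indirect_locations(role_played_by, spatial_location):
    """Apply the pattern of axiom (3) to a set of facts.

    role_played_by:   maps a role identifier to the geographic object
                      that plays it (isroleplayedby).
    spatial_location: maps a geographic object to its spatial location
                      value (hasspatiallocationq).

    Returns a mapping from each role to its indirect spatial location
    (hasindirectspatiallocationq). Roles whose player has no known
    location are skipped, because the rule premise does not hold.
    """
    return {
        role: spatial_location[obj]
        for role, obj in role_played_by.items()
        if obj in spatial_location
    }
```

The same pattern would apply to any other physical quality or spatial relation that a role obtains indirectly from the object that plays it.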
To summarize, as soon as an anti-rigid and socially constructed property is used to classify an entity, the introduction of a new subcategory of GEOGRAPHICOBJECTROLE is indicated.

Spatial Relations and Physical Qualities

Geographic objects are entities located in geographic space that exhibit physical qualities and spatial relations. We generally adopt, but modify, the structure for modeling qualities and their magnitudes proposed in (Probst 2007).

Physical Qualities

In DOLCE, a physical endurant can only have physical qualities, such as mass, temperature, or volume. Since we model GEOGRAPHICOBJECT as a sub-category of PHYSICALENDURANT, a geographic object can likewise only have physical qualities. In addition to the DOLCE axiomatization, we introduce in statement (4) that the physical qualities of geographic objects are approximated (observed and measured) by reference regions. A reference region is part of the reference space for that quality. A reference space can be constituted by one to n dimensions, reflecting the complexity of the quality whose magnitude is made communicable via the reference space. As explained in Section 7.2, we simplify the complex modeling structure for qualities presented in (Probst 2007) and introduce a shortcut that omits the notions of quality space and reference space and allows us to directly associate a physical quality with observation and measurement results, which in turn are associated with communicable signs. This shortcut is introduced with the attribute hasomresult, which directly relates the physical quality to a reference region. Figure 7.4 illustrates this modeling structure for an example set of physical qualities (which is not meant as an exhaustive list).

    axiom definephysicalq definedby
        ?x[hasquality hasvalue ?physicalq] memberof GeographicObject and
        ?physicalq[hasomresult hasvalue ?rregion] memberof PhysicalQuality and
        ?rregion memberof ReferenceRegion.    (4)

Figure 7.4: Focus on the category QUALITY.

In the following we present two examples in which physical qualities play an important role for categorizing entities. Landform quality and vegetation form quality are both located in complex reference spaces made up of several dimensions, structuring the magnitudes of the qualities that compose the landform or vegetation form quality. In the scope of this work, we do not show how the reference regions for landform quality and vegetation form quality are formalized; we only give an informal characterization of how the reference regions might be constituted.

LandformQuality. In most cases, the prevalent property for terrestrial units is the landform quality. This stems from our object-centered perception of the landscape, which has led to a classification of typical reference regions for landform qualities like mountain, valley, and plain. We apply these reference regions to typical elevation patterns in the landscape and categorize the observed entity as, e.g., a member of the category MOUNTAIN if it satisfies our notion of a mountain landform. The reference region is part of a reference space that is made up of several dimensions, such as shape, slope, or elevation. The sign (or term) used to name the reference region also determines the name of the terrestrial unit. This allows us, for example, to identify a particular terrestrial unit as a mountain as opposed to a valley or a hill. This implies that the sub-categories of TERRESTRIAL UNIT are distinguished via the quantification of their qualities.
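The idea that a named reference region determines the category of a terrestrial unit can be sketched in Python. The text deliberately leaves the reference regions for landform quality unformalized, so everything below is an assumption made for illustration: the two dimensions (slope and local relief), the numeric bounds, and the function names are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReferenceRegion:
    """A named box in a two-dimensional reference space (illustrative bounds)."""
    sign: str                       # the communicable term naming the region
    slope_deg: Tuple[float, float]  # half-open interval [min, max)
    relief_m: Tuple[float, float]   # half-open interval [min, max)


# Hypothetical regions; real ones would need further dimensions such as shape.
LANDFORM_REGIONS: List[ReferenceRegion] = [
    ReferenceRegion("plain", (0.0, 2.0), (0.0, 30.0)),
    ReferenceRegion("hill", (2.0, 90.0), (30.0, 300.0)),
    ReferenceRegion("mountain", (2.0, 90.0), (300.0, 9000.0)),
]


def om_result(slope_deg: float, relief_m: float) -> Optional[ReferenceRegion]:
    """Analogue of the hasomresult shortcut: map a measurement to a region."""
    for region in LANDFORM_REGIONS:
        lo_s, hi_s = region.slope_deg
        lo_r, hi_r = region.relief_m
        if lo_s <= slope_deg < hi_s and lo_r <= relief_m < hi_r:
            return region
    return None  # the measurement falls outside every known region


def categorize(slope_deg: float, relief_m: float) -> Optional[str]:
    """Name the terrestrial unit after the sign of its reference region."""
    region = om_result(slope_deg, relief_m)
    return region.sign.upper() if region else None


print(categorize(25.0, 800.0))
print(categorize(0.5, 5.0))
```

A measurement that matches no region yields no category, which reflects the fact that category membership here depends entirely on the quantification of the observed qualities.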

Key Words: geospatial ontologies, formal concept analysis, semantic integration, multi-scale, multi-context.

Key Words: geospatial ontologies, formal concept analysis, semantic integration, multi-scale, multi-context. Marinos Kavouras & Margarita Kokla Department of Rural and Surveying Engineering National Technical University of Athens 9, H. Polytechniou Str., 157 80 Zografos Campus, Athens - Greece Tel: 30+1+772-2731/2637,

More information

Semantic Evolution of Geospatial Web Services: Use Cases and Experiments in the Geospatial Semantic Web

Semantic Evolution of Geospatial Web Services: Use Cases and Experiments in the Geospatial Semantic Web Semantic Evolution of Geospatial Web Services: Use Cases and Experiments in the Geospatial Semantic Web Joshua Lieberman, Todd Pehle, Mike Dean Traverse Technologies, Inc., Northrop Grumman Information

More information

Mappings For Cognitive Semantic Interoperability

Mappings For Cognitive Semantic Interoperability Mappings For Cognitive Semantic Interoperability Martin Raubal Institute for Geoinformatics University of Münster, Germany raubal@uni-muenster.de SUMMARY Semantic interoperability for geographic information

More information

A General Framework for Conflation

A General Framework for Conflation A General Framework for Conflation Benjamin Adams, Linna Li, Martin Raubal, Michael F. Goodchild University of California, Santa Barbara, CA, USA Email: badams@cs.ucsb.edu, linna@geog.ucsb.edu, raubal@geog.ucsb.edu,

More information

A Case Study for Semantic Translation of the Water Framework Directive and a Topographic Database

A Case Study for Semantic Translation of the Water Framework Directive and a Topographic Database A Case Study for Semantic Translation of the Water Framework Directive and a Topographic Database Angela Schwering * + Glen Hart + + Ordnance Survey of Great Britain Southampton, U.K. * Institute for Geoinformatics,

More information

Modeling and Managing the Semantics of Geospatial Data and Services

Modeling and Managing the Semantics of Geospatial Data and Services Modeling and Managing the Semantics of Geospatial Data and Services Werner Kuhn, Martin Raubal, Michael Lutz Institute for Geoinformatics University of Münster (Germany) {kuhn, raubal, m.lutz}@uni-muenster.de

More information

Taxonomies of Building Objects towards Topographic and Thematic Geo-Ontologies

Taxonomies of Building Objects towards Topographic and Thematic Geo-Ontologies Taxonomies of Building Objects towards Topographic and Thematic Geo-Ontologies Melih Basaraner Division of Cartography, Department of Geomatic Engineering, Yildiz Technical University (YTU), Istanbul Turkey

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Netherlands

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Netherlands Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE Netherlands Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the infrastructure...

More information

Use of the ISO Quality standards at the NMCAs Results from questionnaires taken in 2004 and 2011

Use of the ISO Quality standards at the NMCAs Results from questionnaires taken in 2004 and 2011 Use of the ISO 19100 Quality standards at the NMCAs Results from questionnaires taken in 2004 and 2011 Eurogeographics Quality Knowledge Exchange Network Reference: History Version Author Date Comments

More information

From Research Objects to Research Networks: Combining Spatial and Semantic Search

From Research Objects to Research Networks: Combining Spatial and Semantic Search From Research Objects to Research Networks: Combining Spatial and Semantic Search Sara Lafia 1 and Lisa Staehli 2 1 Department of Geography, UCSB, Santa Barbara, CA, USA 2 Institute of Cartography and

More information

Open Geospatial Consortium 35 Main Street, Suite 5 Wayland, MA Telephone: Facsimile:

Open Geospatial Consortium 35 Main Street, Suite 5 Wayland, MA Telephone: Facsimile: Open Geospatial Consortium 35 Main Street, Suite 5 Wayland, MA 01778 Telephone: +1-508-655-5858 Facsimile: +1-508-655-2237 Editor: Carl Reed Telephone: +1-970-402-0284 creed@opengeospatial.org The OGC

More information

EXPECTATIONS OF TURKISH ENVIRONMENTAL SECTOR FROM INSPIRE

EXPECTATIONS OF TURKISH ENVIRONMENTAL SECTOR FROM INSPIRE EXPECTATIONS OF TURKISH ENVIRONMENTAL SECTOR FROM INSPIRE June, 2010 Ahmet ÇİVİ Tuncay DEMİR INSPIRE in the Eyes of MoEF Handling of Geodata by MoEF Benefits Expected TEIEN First Steps for INSPIRE Final

More information

Designing and Evaluating Generic Ontologies

Designing and Evaluating Generic Ontologies Designing and Evaluating Generic Ontologies Michael Grüninger Department of Industrial Engineering University of Toronto gruninger@ie.utoronto.ca August 28, 2007 1 Introduction One of the many uses of

More information

A SEMANTIC SCHEMA FOR GEONAMES. Vincenzo Maltese and Feroz Farazi

A SEMANTIC SCHEMA FOR GEONAMES. Vincenzo Maltese and Feroz Farazi DISI - Via Sommarive 14-38123 Povo - Trento (Italy) http://www.disi.unitn.it A SEMANTIC SCHEMA FOR GEONAMES Vincenzo Maltese and Feroz Farazi January 2013 Technical Report # DISI-13-004 A semantic schema

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Malta

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Malta Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE Malta Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the infrastructure...

More information

Paper presented at the 9th AGILE Conference on Geographic Information Science, Visegrád, Hungary,

Paper presented at the 9th AGILE Conference on Geographic Information Science, Visegrád, Hungary, 220 A Framework for Intensional and Extensional Integration of Geographic Ontologies Eleni Tomai 1 and Poulicos Prastacos 2 1 Research Assistant, 2 Research Director - Institute of Applied and Computational

More information

Part 1: Fundamentals

Part 1: Fundamentals Provläsningsexemplar / Preview INTERNATIONAL STANDARD ISO 19101-1 First edition 2014-11-15 Geographic information Reference model Part 1: Fundamentals Information géographique Modèle de référence Partie

More information

Technical Specifications. Form of the standard

Technical Specifications. Form of the standard Used by popular acceptance Voluntary Implementation Mandatory Legally enforced Technical Specifications Conventions Guidelines Form of the standard Restrictive Information System Structures Contents Values

More information

GEO-INFORMATION (LAKE DATA) SERVICE BASED ON ONTOLOGY

GEO-INFORMATION (LAKE DATA) SERVICE BASED ON ONTOLOGY GEO-INFORMATION (LAKE DATA) SERVICE BASED ON ONTOLOGY Long-hua He* and Junjie Li Nanjing Institute of Geography & Limnology, Chinese Academy of Science, Nanjing 210008, China * Email: lhhe@niglas.ac.cn

More information

The Global Statistical Geospatial Framework and the Global Fundamental Geospatial Themes

The Global Statistical Geospatial Framework and the Global Fundamental Geospatial Themes The Global Statistical Geospatial Framework and the Global Fundamental Geospatial Themes Sub-regional workshop on integration of administrative data, big data and geospatial information for the compilation

More information

WEB-BASED SPATIAL DECISION SUPPORT: TECHNICAL FOUNDATIONS AND APPLICATIONS

WEB-BASED SPATIAL DECISION SUPPORT: TECHNICAL FOUNDATIONS AND APPLICATIONS WEB-BASED SPATIAL DECISION SUPPORT: TECHNICAL FOUNDATIONS AND APPLICATIONS Claus Rinner University of Muenster, Germany Piotr Jankowski San Diego State University, USA Keywords: geographic information

More information

Roadmap to interoperability of geoinformation

Roadmap to interoperability of geoinformation Roadmap to interoperability of geoinformation and services in Europe Paul Smits, Alessandro Annoni European Commission Joint Research Centre Institute for Environment and Sustainability paul.smits@jrc.it

More information

Reducing Consumer Uncertainty

Reducing Consumer Uncertainty Spatial Analytics Reducing Consumer Uncertainty Eliciting User and Producer Views on Geospatial Data Quality Introduction Cooperative Research Centre for Spatial Information (CRCSI) in Australia Communicate

More information

Charter for the. Information Transfer and Services Architecture Focus Group

Charter for the. Information Transfer and Services Architecture Focus Group for the Information Transfer and Services Architecture Focus Group 1. PURPOSE 1.1. The purpose of this charter is to establish the Information Transfer and Services Architecture Focus Group (ITSAFG) as

More information

Interoperable Services for Web-Based Spatial Decision Support

Interoperable Services for Web-Based Spatial Decision Support Interoperable Services for Web-Based Spatial Decision Support Nicole Ostländer Institute for Geoinformatics, University of Münster Münster, Germany ostland@uni-muenster.de SUMMARY A growing number of spatial

More information

Geospatial Semantics. Yingjie Hu. Geospatial Semantics

Geospatial Semantics. Yingjie Hu. Geospatial Semantics Outline What is geospatial? Why do we need it? Existing researches. Conclusions. What is geospatial? Semantics The meaning of expressions Syntax How you express the meaning E.g. I love GIS What is geospatial?

More information

Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective

Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective Geographic Analysis of Linguistically Encoded Movement Patterns A Contextualized Perspective Alexander Klippel 1, Alan MacEachren 1, Prasenjit Mitra 2, Ian Turton 1, Xiao Zhang 2, Anuj Jaiswal 2, Kean

More information

Spatial Data Infrastructure Concepts and Components. Douglas Nebert U.S. Federal Geographic Data Committee Secretariat

Spatial Data Infrastructure Concepts and Components. Douglas Nebert U.S. Federal Geographic Data Committee Secretariat Spatial Data Infrastructure Concepts and Components Douglas Nebert U.S. Federal Geographic Data Committee Secretariat August 2009 What is a Spatial Data Infrastructure (SDI)? The SDI provides a basis for

More information

AS/NZS ISO :2015

AS/NZS ISO :2015 Australian/New Zealand Standard Geographic information Reference model Part 1: Fundamentals Superseding AS/NZS ISO 19101:2003 AS/NZS ISO 19101.1:2015 (ISO 19101-1:2014, IDT) AS/NZS ISO 19101.1:2015 This

More information

ESBN. Working Group on INSPIRE

ESBN. Working Group on INSPIRE ESBN Working Group on INSPIRE by Marc Van Liedekerke, Endre Dobos and Paul Smits behalf of the WG members WG participants Marc Van Liedekerke Panos Panagos Borut Vrščaj Ivana Kovacikova Erik Obersteiner

More information

Visualizing Uncertainty In Environmental Work-flows And Sensor Streams

Visualizing Uncertainty In Environmental Work-flows And Sensor Streams Visualizing Uncertainty In Environmental Work-flows And Sensor Streams Karthikeyan Bollu Ganesh and Patrick Maué Institute for Geoinformatics (IFGI), University of Muenster, D-48151 Muenster, Germany {karthikeyan,pajoma}@uni-muenster.de

More information

Maps as Research Tools Within a Virtual Research Environment.

Maps as Research Tools Within a Virtual Research Environment. Maps as Research Tools Within a Virtual Research Environment. Sebastian Specht, Christian Hanewinkel Leibniz-Institute for Regional Geography Leipzig, Germany Abstract. The Tambora.org Virtual Research

More information

Economic and Social Council 2 July 2015

Economic and Social Council 2 July 2015 ADVANCE UNEDITED VERSION UNITED NATIONS E/C.20/2015/11/Add.1 Economic and Social Council 2 July 2015 Committee of Experts on Global Geospatial Information Management Fifth session New York, 5-7 August

More information

INSPIRE - A Legal framework for environmental and land administration data in Europe

INSPIRE - A Legal framework for environmental and land administration data in Europe INSPIRE - A Legal framework for environmental and land administration data in Europe Dr. Markus Seifert Bavarian Administration for Surveying and Cadastre Head of the SDI Office Bavaria Delegate of Germany

More information

Training on national land cover classification systems. Toward the integration of forest and other land use mapping activities.

Training on national land cover classification systems. Toward the integration of forest and other land use mapping activities. Training on national land cover classification systems Toward the integration of forest and other land use mapping activities. Guiana Shield 9 to 13 March 2015, Paramaribo, Suriname Background Sustainable

More information

CARTOGRAPHIC WEB SERVICES AND CARTOGRAPHIC RULES A NEW APPROACH FOR WEB CARTOGRAPHY

CARTOGRAPHIC WEB SERVICES AND CARTOGRAPHIC RULES A NEW APPROACH FOR WEB CARTOGRAPHY CARTOGRAPHIC WEB SERVICES AND CARTOGRAPHIC RULES A NEW APPROACH FOR WEB CARTOGRAPHY 1. Introduction Ionut Iosifescu, Marco Hugentobler, Lorenz Hurni ETH Zurich, Institute of Cartography Wolfgang-Pauli-Str.

More information

The Process of Spatial Data Harmonization in Italy. Geom. Paola Ronzino

The Process of Spatial Data Harmonization in Italy. Geom. Paola Ronzino The Process of Spatial Data Harmonization in Italy Geom. Paola Ronzino ISSUES Geospatial Information in Europe: lack of data harmonization the lack of data duplication of data CHALLENGES Challenge of INSPIRE:

More information

Ministry of Health and Long-Term Care Geographic Information System (GIS) Strategy An Overview of the Strategy Implementation Plan November 2009

Ministry of Health and Long-Term Care Geographic Information System (GIS) Strategy An Overview of the Strategy Implementation Plan November 2009 Ministry of Health and Long-Term Care Geographic Information System (GIS) Strategy An Overview of the Strategy Implementation Plan November 2009 John Hill, Health Analytics Branch Health System Information

More information

Geografisk information Referensmodell. Geographic information Reference model

Geografisk information Referensmodell. Geographic information Reference model SVENSK STANDARD SS-ISO 19101 Fastställd 2002-08-09 Utgåva 1 Geografisk information Referensmodell Geographic information Reference model ICS 35.240.70 Språk: engelska Tryckt i september 2002 Copyright

More information

REGIONAL SDI DEVELOPMENT

REGIONAL SDI DEVELOPMENT REGIONAL SDI DEVELOPMENT Abbas Rajabifard 1 and Ian P. Williamson 2 1 Deputy Director and Senior Research Fellow Email: abbas.r@unimelb.edu.au 2 Director, Professor of Surveying and Land Information, Email:

More information

A Practical Example of Semantic Interoperability of Large-Scale Topographic Databases Using Semantic Web Technologies

A Practical Example of Semantic Interoperability of Large-Scale Topographic Databases Using Semantic Web Technologies Paper presented at the 9th AGILE Conference on Geographic Information Science, Visegrád, Hungary, 2006 35 A Practical Example of Semantic Interoperability of Large-Scale Topographic Databases Using Semantic

More information

Analys av GIS-användning inom offentlig sektor i Norden

Analys av GIS-användning inom offentlig sektor i Norden Analys av GIS-användning inom offentlig sektor i Norden Henning Sten Hansen, Line Hvingel and Lise Schrøder Aalborg Universitet Jönköping, Sweden 30 March 2011 Overview The Danish and Swedish approaches

More information

Basic Dublin Core Semantics

Basic Dublin Core Semantics Basic Dublin Core Semantics DC 2006 Tutorial 1, 3 October 2006 Marty Kurth Head of Metadata Services Cornell University Library Getting started Let s introduce ourselves Let s discuss our expectations

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE France

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE France Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE France Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the infrastructure...

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Finland

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Finland Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE Finland Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the infrastructure...

More information

The production and use of a hydrographic flow-direction network of surface waters. Rickard HALLENGREN, Håkan OLSSON and Erik SISELL, Sweden

The production and use of a hydrographic flow-direction network of surface waters. Rickard HALLENGREN, Håkan OLSSON and Erik SISELL, Sweden The production and use of a hydrographic flow-direction network of surface waters Rickard HALLENGREN, Håkan OLSSON and Erik SISELL, Sweden Key words: hydrographic, flow-direction network, surface waters

More information

Using C-OWL for the Alignment and Merging of Medical Ontologies

Using C-OWL for the Alignment and Merging of Medical Ontologies Using C-OWL for the Alignment and Merging of Medical Ontologies Heiner Stuckenschmidt 1, Frank van Harmelen 1 Paolo Bouquet 2,3, Fausto Giunchiglia 2,3, Luciano Serafini 3 1 Vrije Universiteit Amsterdam

More information

ArchaeoKM: Managing Archaeological data through Archaeological Knowledge

ArchaeoKM: Managing Archaeological data through Archaeological Knowledge Computer Applications and Quantitative Methods in Archeology - CAA 2010 Fco. Javier Melero & Pedro Cano (Editors) ArchaeoKM: Managing Archaeological data through Archaeological Knowledge A. Karmacharya

More information

Using OGC standards to improve the common

Using OGC standards to improve the common Using OGC standards to improve the common operational picture Abstract A "Common Operational Picture", or a, is a single identical display of relevant operational information shared by many users. The

More information

Intelligent GIS: Automatic generation of qualitative spatial information

Intelligent GIS: Automatic generation of qualitative spatial information Intelligent GIS: Automatic generation of qualitative spatial information Jimmy A. Lee 1 and Jane Brennan 1 1 University of Technology, Sydney, FIT, P.O. Box 123, Broadway NSW 2007, Australia janeb@it.uts.edu.au

More information

Semantic Granularity in Ontology-Driven Geographic Information Systems

Semantic Granularity in Ontology-Driven Geographic Information Systems Semantic Granularity in Ontology-Driven Geographic Information Systems Frederico Fonseca a Max Egenhofer b Clodoveu Davis c Gilberto Câmara d a School of Information Sciences and Technology Pennsylvania

More information

INSPIREd solutions for Air Quality problems Alexander Kotsev

INSPIREd solutions for Air Quality problems Alexander Kotsev INSPIREd solutions for Air Quality problems Alexander Kotsev www.jrc.ec.europa.eu Serving society Stimulating innovation Supporting legislation The European data puzzle The European data puzzle 24 official

More information

The purpose of this report is to recommend a Geographic Information System (GIS) Strategy for the Town of Richmond Hill.

The purpose of this report is to recommend a Geographic Information System (GIS) Strategy for the Town of Richmond Hill. Staff Report for Committee of the Whole Meeting Department: Division: Subject: Office of the Chief Administrative Officer Strategic Initiatives SRCAO.18.12 GIS Strategy Purpose: The purpose of this report

More information

Exploring Visualization of Geospatial Ontologies Using Cesium

Exploring Visualization of Geospatial Ontologies Using Cesium Exploring Visualization of Geospatial Ontologies Using Cesium Abhishek V. Potnis, Surya S. Durbha Centre of Studies in Resources Engineering, Indian Institue of Technology Bombay, India abhishekvpotnis@iitb.ac.in,

More information

Paper UC1351. Conference: User Conference Date: 08/10/2006 Time: 8:30am-9:45am Room: Room 23-B (SDCC)

Paper UC1351. Conference: User Conference Date: 08/10/2006 Time: 8:30am-9:45am Room: Room 23-B (SDCC) Conference: User Conference Date: 08/10/2006 Time: 8:30am-9:45am Room: Room 23-B (SDCC) Title of Paper: Increasing the Use of GIS in the Federal Government Author Name: Miss Abstract This presentation

More information

Open spatial data infrastructure

Open spatial data infrastructure Open spatial data infrastructure a backbone for digital government Thorben Hansen Geomatikkdagene 2018 Stavanger 13.-15. mars Spatial Data Infrastructure definition the technology, policies, standards,

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Czech Republic

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Czech Republic Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE Czech Republic Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the

More information

Developing 3D Geoportal for Wilayah Persekutuan Iskandar

Developing 3D Geoportal for Wilayah Persekutuan Iskandar Developing 3D Geoportal for Wilayah Persekutuan Iskandar Dionnald Beh BoonHeng and Alias Abdul Rahman Department of Geoinformatics, Faculty of Geoinformation Engineering and Sciences, Universiti Teknologi

More information

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Croatia

Status of implementation of the INSPIRE Directive 2016 Country Fiches. COUNTRY FICHE Croatia Status of implementation of the INSPIRE Directive 2016 Country Fiches COUNTRY FICHE Croatia Introduction... 1 1. State of Play... 2 1.1 Coordination... 2 1.2 Functioning and coordination of the infrastructure...

More information

Finding geodata that otherwise would have been forgotten GeoXchange a portal for free geodata

Finding geodata that otherwise would have been forgotten GeoXchange a portal for free geodata Finding geodata that otherwise would have been forgotten GeoXchange a portal for free geodata Sven Tschirner and Alexander Zipf University of Applied Sciences FH Mainz Department of Geoinformatics and

More information

Axiomatized Relationships Between Ontologies. Carmen Chui

Axiomatized Relationships Between Ontologies. Carmen Chui Axiomatized Relationships Between Ontologies by Carmen Chui A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Graduate Department of Mechanical & Industrial

More information

Toward A Foundational Hydro Ontology For Water Data Interoperability

Toward A Foundational Hydro Ontology For Water Data Interoperability City University of New York (CUNY) CUNY Academic Works International Conference on Hydroinformatics 8-1-2014 Toward A Foundational Hydro Ontology For Water Data Interoperability Boyan Brodaric Torsten

More information

Cognitive modeling with conceptual spaces

Cognitive modeling with conceptual spaces Cognitive modeling with conceptual spaces Martin Raubal Department of Geography, University of California at Santa Barbara 5713 Ellison Hall, Santa Barbara, CA 93106, U.S.A. raubal@geog.ucsb.edu Abstract.

More information

Global Geospatial Information Management Country Report Finland. Submitted by Director General Jarmo Ratia, National Land Survey

Global Geospatial Information Management Country Report Finland. Submitted by Director General Jarmo Ratia, National Land Survey Global Geospatial Information Management Country Report Finland Submitted by Director General Jarmo Ratia, National Land Survey Global Geospatial Information Management Country Report Finland Background

More information

Think Local - Act Global a Nordic Perspective

Think Local - Act Global a Nordic Perspective Think Local - Act Global a Nordic Perspective OGC Nordic Forum Jari Reini 20-21.5.2014 OGC Nordic Forum? OGC Nordic Forum addresses OGC outreach and education needs of government, academic, research and

More information

INSPIRing Geospatial Framework For Local Administrations

INSPIRing Geospatial Framework For Local Administrations This project is financed by the European Union and the Republic of Turkey Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey National Programme for Turkey 2010 Instrument for Pre-Accession

More information

Citation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n.

Citation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n. University of Groningen Geographically constrained information retrieval Andogah, Geoffrey IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from

More information

UBGI and Address Standards

UBGI and Address Standards Workshop on address standards UBGI and Address Standards 2008. 5.25 Copenhagen, Denmark Sang-Ki Hong Convenor, WG 10 1 Evolution of Geographic Information Visualization Feature (Contents) Context Accessibility

More information

GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research

GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research GIS Visualization: A Library s Pursuit Towards Creative and Innovative Research Justin B. Sorensen J. Willard Marriott Library University of Utah justin.sorensen@utah.edu Abstract As emerging technologies

More information

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME:

GEOGRAPHY 350/550 Final Exam Fall 2005 NAME: 1) A GIS data model using an array of cells to store spatial data is termed: a) Topology b) Vector c) Object d) Raster 2) Metadata a) Usually includes map projection, scale, data types and origin, resolution

More information

CyberGIS: What Still Needs to Be Done? Michael F. Goodchild University of California Santa Barbara

CyberGIS: What Still Needs to Be Done? Michael F. Goodchild University of California Santa Barbara CyberGIS: What Still Needs to Be Done? Michael F. Goodchild University of California Santa Barbara Progress to date Interoperable location referencing coordinate transformations geocoding addresses point-of-interest

More information

Your web browser (Safari 7) is out of date. For more security, comfort and. the best experience on this site: Update your browser Ignore

Your web browser (Safari 7) is out of date. For more security, comfort and. the best experience on this site: Update your browser Ignore Your web browser (Safari 7) is out of date. For more security, comfort and lesson the best experience on this site: Update your browser Ignore Political Borders Why are the borders of countries located

More information
