Least Common Subsumers and Most Specific Concepts in a Description Logic with Existential Restrictions and Terminological Cycles*

Similar documents
Restricted role-value-maps in a description logic with existential restrictions and terminological cycles

Role-depth Bounded Least Common Subsumers by Completion for EL- and Prob-EL-TBoxes

Hybrid Unification in the Description Logic EL

LTCS Report. A finite basis for the set of EL-implications holding in a finite model

Computing Least Common Subsumers in Description Logics with Existential Restrictions*

Complexity of Subsumption in the EL Family of Description Logics: Acyclic and Cyclic TBoxes

Completing Description Logic Knowledge Bases using Formal Concept Analysis

Computing the Least Common Subsumer w.r.t. a Background Terminology

A Goal-Oriented Algorithm for Unification in EL w.r.t. Cycle-Restricted TBoxes

Technische Universität Dresden. Fakultät Informatik EMCL Master s Thesis on. Hybrid Unification in the Description Logic EL

CEX and MEX: Logical Diff and Semantic Module Extraction in a Fragment of OWL

Subsumption of concepts in FL 0 for (cyclic) terminologies with respect to descriptive semantics is PSPACE-complete.

Dismatching and Local Disunification in EL

Extending Unification in EL towards General TBoxes

Chapter 2 Background. 2.1 A Basic Description Logic

Tractable Extensions of the Description Logic EL with Numerical Datatypes

COMPUTING LOCAL UNIFIERS IN THE DESCRIPTION LOGIC EL WITHOUT THE TOP CONCEPT

Phase 1. Phase 2. Phase 3. History. implementation of systems based on incomplete structural subsumption algorithms

LTCS Report. Decidability and Complexity of Threshold Description Logics Induced by Concept Similarity Measures. LTCS-Report 16-07

Proof-theoretic Approach to Deciding Subsumption and Computing Least Common Subsumer in EL w.r.t. Hybrid TBoxes

Completing Description Logic Knowledge Bases using Formal Concept Analysis

Fuzzy Description Logics

LTCS Report. Exploring finite models in the Description Logic EL gfp. Franz Baader, Felix Distel. LTCS-Report 08-05

An Introduction to Description Logics

Reasoning in ELH w.r.t. General Concept Inclusion Axioms

The Concept Difference for EL-Terminologies using Hypergraphs

The Complexity of Lattice-Based Fuzzy Description Logics

Rewriting Concepts Using Terminologies Revisited Franz Baader, Ralf Kusters, and Ralf Molitor LuFg Theoretical Computer Science, RWTH Aachen fb

Reasoning About Typicality in ALC and EL

Optimisation of Terminological Reasoning

High Performance Absorption Algorithms for Terminological Reasoning

From Tableaux to Automata for Description Logics

ten into an \related better" description E by using (some of) the names dened in T? In this paper, related will mean equivalent, and better will mean

Gödel Negation Makes Unwitnessed Consistency Crisp

Description Logics. an introduction into its basic ideas

On Subsumption and Instance Problem in ELH w.r.t. General TBoxes

Description logics. Description Logics. Applications. Outline. Syntax - AL. Tbox and Abox

The Bayesian Ontology Language BEL

Incremental Learning of TBoxes from Interpretation Sequences with Methods of Formal Concept Analysis

Uncertainty in Description Logics: a Lattice-based Approach

Quasi-Classical Semantics for Expressive Description Logics

Expressive number restrictions in Description Logics

Finite Model Reasoning in Horn-SHIQ

Axiom Pinpointing in General Tableaux

Pushing the EL Envelope Further

Decidability of SHI with transitive closure of roles

On the Decidability Status of Fuzzy ALC with General Concept Inclusions

Foundations for the Logical Difference of EL-TBoxes

Structured Descriptions & Tradeoff Between Expressiveness and Tractability

Fuzzy DLs over Finite Lattices with Nominals

A Description Logic with Concrete Domains and a Role-forming Predicate Operator

An Introduction to Description Logic III

DL-Lite Contraction and Revision

Fuzzy Ontologies over Lattices with T-norms

Description Logics. Glossary. Definition

Just: a Tool for Computing Justifications w.r.t. ELH Ontologies

ALC Concept Learning with Refinement Operators

LTCS Report. Subsumption in Finitely Valued Fuzzy EL. Stefan Borgwardt Marco Cerami Rafael Peñaloza. LTCS-Report 15-06

Non-Gödel Negation Makes Unwitnessed Consistency Undecidable

Description Logics: an Introductory Course on a Nice Family of Logics. Day 2: Tableau Algorithms. Uli Sattler

Complexity of Axiom Pinpointing in the DL-Lite Family

Bayesian Description Logics

Keys, Nominals, and Concrete Domains

Łukasiewicz Fuzzy EL is Undecidable

Adaptive ALE-TBox for Extending Terminological Knowledge

LTCS Report. Blocking and Pinpointing in Forest Tableaux. LTCS-Report 08-02

An Introduction to Description Logic VIII Inherent intractability of terminological reasoning

Exploiting Pseudo Models for TBox and ABox Reasoning in Expressive Description Logics

The Inclusion Problem for Weighted Automata on Infinite Trees

An Automata-Based Approach for Subsumption w.r.t. General Concept Inclusions in the Description Logic FL 0

A Tableau Algorithm for Fuzzy Description Logics over Residuated De Morgan Lattices

Inverting Proof Systems for Secrecy under OWA

Forgetting and uniform interpolation in large-scale description logic terminologies

A Crisp Representation for Fuzzy SHOIN with Fuzzy Nominals and General Concept Inclusions

Knowledge Bases in Description Logics

A New Approach to Knowledge Base Revision in DL-Lite

University of Oxford. Consequence-based Datatype Reasoning in EL: Identifying the Tractable Fragments

Representability in DL-Lite R Knowledge Base Exchange

Belief Contraction for the Description Logic EL

Adding Context to Tableaux for DLs

Modular Reuse of Ontologies: Theory and Practice

Tight Complexity Bounds for Reasoning in the Description Logic BEL

A Similarity Measure for the ALN Description Logic

Lightweight Description Logics: DL-Lite A and EL ++

DESCRIPTION LOGICS. Paula Severi. October 12, University of Leicester

Unification in Description Logic EL without top constructor

Technische Universität Dresden. Fakultät Informatik MCL Master s Thesis on

Reasoning with Annotated Description Logic Ontologies

Entailment with Conditional Equality Constraints (Extended Version)

PSpace Automata for Description Logics

On Terminological Default Reasoning about Spatial Information: Extended Abstract

Inconsistencies, Negations and Changes in Ontologies

Exact Learning of TBoxes in EL and DL-Lite

Modeling Ontologies Using OWL, Description Graphs, and Rules

Interpreting Low and High Order Rules: A Granular Computing Approach

Reasoning in the Description Logic BEL using Bayesian Networks

The Fuzzy Description Logic ALC F LH

A Tractable Rule Language in the Modal and Description Logics that Combine CPDL with Regular Grammar Logic

Elementary Sets for Logic Programs

Transcription:

Least Common Subsumers and Most Specific Concepts in a Description Logic with Existential Restrictions and Terminological Cycles* Franz Baader Theoretical Computer Science TU Dresden D-01062 Dresden, Germany baader@inf.tu-dresden.dea Abstract Computing least common subsumers (Ics) and most specific concepts (msc) are inference tasks that can support the bottom-up construction of knowledge bases in description logics. In description logics with existential restrictions, the most specific concept need not exist if one restricts the attention to concept descriptions or acyclic TBoxes. In this paper, we extend the notions les and msc to cyclic TBoxes. For the description logic EC (which allows for conjunctions, existential restrictions, and the top-concept), we show that the les and msc always exist and can be computed in polynomial time if we interpret cyclic definitions with greatest fixpoint semantics. 1 Introduction Computing the most specific concept of an individual and the least common subsumer of concepts can be used in the bottom-up construction of description logic (DL) knowledge bases. Instead of defining the relevant concepts of an application domain from scratch, this methodology allows the user to give typical examples of individuals belonging to the concept to be defined. These individuals are then generalized to a concept by first computing the most specific concept of each individual (i.e., the least concept description in the available description language that has this individual as an instance), and then computing the least common subsumer of these concepts (i.e., the least concept description in the available description language that subsumes all these concepts). The knowledge engineer can then use the computed concept as a starting point for the concept definition. The least common subsumer (les) in DLs with existential restrictions was investigated in [Baader et a/., 1999]. In particular, it was shown there that the les in the small DL EC (which allows for conjunctions, existential restrictions, and the top-concept) always exists, and that the binary les can be computed in polynomial time. Unfortunately, the most specific concept (msc) of a given ABox individual need not exist in languages allowing for existential restrictions or number restrictions. As a possible solution to this problem, Kiisters * Partially supported by the DFG under grant BA 1122/4-3. and Molitor [2001] show how the most specific concept can be approximated in EC and some of its extensions. Here, we follow an alternative approach: we extend the language by cyclic terminologies with greatest fixpoint (gfp) semantics, and show that the msc always exists in this setting. Of course, then one must also be able to compute the les w.r.t. cyclic terminologies with gfp-semantics. For the DL ACM (which allows for conjunctions, value restrictions, and number restrictions) it was shown in [Baader and Kusters, 1998] that the most specific concept always exists if one adds cyclic concept definitions with gfp-semantics. One reason for Kusters and Molitor to choose an approximation approach rather than an exact characterization of the most specific concept using cyclic definitions was that the impact of cyclic definitions in description logics with existential restrictions was largely unexplored. The paper [Baader, 2003a] is a first step toward overcoming this deficit. It considers cyclic terminologies in EC w.r.t. the three types of semantics (greatest fixpoint, least fixpoint, and descriptive semantics) introduced by Nebel [1991], and shows that the subsumption problem can be decided in polynomial time in all three cases. This is in stark contrast to the case of DLs with value restrictions. Even for the small DL F Lo (which allows conjunctions and value restrictions only), adding cyclic terminologies increases the complexity of the subsumption problem from polynomial (for concept descriptions) to PSPACE. The main tool in the investigation of cyclic definitions in EC is a characterization of subsumption through the existence of so-called simulation relations on the graph associated with an EL-terminology, which can be computed in polynomial time [Henzinger et al, 1995]. This characterization of subsumption can be used to characterize the les w.r.t. gfp-semantics via the product of this graph with itself (Section 4). This shows that, w.r.t. gfp semantics, the les always exists, and that the binary les can be computed in polynomial time. (The n-ary les may grow exponentially even in EC without cyclic terminologies [Baader et al., 1999].) The characterization of subsumption w.r.t. gfp-semantics can be extended to the instance problem in EC. This allows us to show that the msc in EC with cyclic terminologies interpreted with gfp semantics always exists, and can be computed in polynomial time (Section 5). In the next section, we introduce E L with cyclic terminolo- DESCRIPTION LOGICS 319

gics as well as the lcs and the msc. Then we recall the important definitions and results from [Baader, 2003a]. Section 4 formulates and proves the new results for the lcs, and Section 5 does the same for the msc. 2 Cyclic terminologies, least common subsumers, and most specific concepts Concept descriptions arc inductively defined with the help of a set of constructors, starting with a set of concept names and a set NR of role names. The constructors determine the expressive power of the DL. In this paper, wc restrict the attention to the whose concept descriptions arc formed using the constructors top-concept (T), conjunction, and existential restriction. The semantics of E L-concept descriptions is defined in terms of an interpretation. The d o m a i n i s a non-empty set of individuals and the interpretation function I maps each concept name to a subset and each role to a binary relation on. The extension of to arbitrary concept descriptions is inductively defined, as shown in the third column of Table 1. A terminology (or TBox for short) is a finite set of concept definitions of the form where A is a concept name and D a concept description. In addition, we require that TBoxcs do not contain multiple definitions, i.e., there cannot be two distinct concept descriptions such that both belongs to the TBox. Concept names occurring on the left-hand side of a definition are called defined concepts. All other concept names occurring in the TBox are called primitive concepts. Note that we allow for cyclic dependencies between the defined concepts, i.e., the definition of A may refer (directly or indirectly) to A itself. An interpretation I is a model of the TBox T iff it satisfies all its concept definitions, i.e., for all definitions in T. An ABox is a finite set of assertions of the form A(a) and r(a,b), where A is a concept name, r is a role name, and a, b are individual names from a set. Interpretations of ABoxes must additionally map each individual name to an element. An interpretation I is a model of the ABox A iff it satisfies all its assertions,for all concept assertions A(a) in A and x for all role assertions in A. The interpretation I is a model of the A Box A together with the iff it is a model of both T and A. The semantics of (possibly cyclic) we have defined above is called descriptive semantic by Nebel [1991]. For some applications, it is more appropriate to interpret cyclic concept definitions with the help of a fixpoint semantics. Example 1 To illustrate this, let us recall an example from LBaader, 2003a]: Here the intended interpretations arc graphs where we have nodes (elements of the concept Node) and edges (represented by the role edge), and we want to define the concept I node of all nodes lying on an infinite (possibly cyclic) path of the graph. In order to capture this intuition, the above definition must be interpreted with greatest fixpoint semantics. Before we can define greatest fixpoint semantics (gfpsemantics), we must introduce some notation. Let T be an E L-TBox containing the rolesthe primitive concepts and the defined concepts. A primitive interpretations J for T is given by a domain, an interpretation of the roles r N role by binary relations and an interpretation of the primitive concepts by subsets Obviously, a primitive interpretation differs from an interpretation in that it does not interpret the defined concepts in. We say that the interpretation X is based on the primitive interpretation J iff it has the same domain as J and coincides with J on N role and N prim. For a fixed primitive interpretation J, the interpretations X based on it are uniquely determined by the tuple of the interpretations of the defined concepts in We define is an interpretation based on Interpretations based on J can be compared by the following ordering, which realizes a pairwise inclusion test between the respective interpretations of the defined concepts: if Int(J), then It is easy to see that induces a complete lattice on Int (J), i.e., every subset of Jnt(J) has a least upper bound (lub) and a greatest lower bound (gib). Using Tarski's fixpoint theorem LTarski, 1955] for complete lattices, it is not hard to show LNebel, 1991] that, for a given primitive interpretation J, there always is a greatest model of T based on J. We call this models the greatest fixpoint model (gfp-model) of T. Greatest fixpoint semantics considers only gfp-models as admissible models. Definition 2 Let T be an let A, B be defined concepts occurring in T> and a an individual name occurring in A. Then, A is subsumed by B w.r.t. gfp-semantics iff holds for all gfp-models a is an instance of A w.r.t. gfp-semantics holds for all models X of A that are gfp-models of T. On the level of concept descriptions, the least common subsumer of two concept descriptions C, D is the least concept 320 DESCRIPTION LOGICS

description E that subsumes both C and D. An extensions of this definition to the level of (possibly cyclic) TBoxes is not completely trivial. In fact, assume that A 1,A 2 are concepts defined in the TBox T. It should be obvious that taking Proposition 5 Let T\ be an l-tbox containing the defined concepts A, B. Assume that T2 and T 2 are conservative extensions of T 1 such that the defined concept E in T 2 is a gfp-lcs of A, B in T\; the defined concept E' in T 2 is a gfp-lcs of A, B in T\; the sets of newly defined concepts in respectively T 2 and T 2 are disjoint. The notion "most specific concept" can be extended in a similar way from concept descriptions to concepts defined in a TBox. The intended interpretation is similar to the one in Example 1, with the only difference that now nodes may have colors, and we are interested in blue (red) nodes lying on an infinite path consisting of blue (red) nodes. Intuitively, the les of Bluelnode and Redlnode describes nodes lying on an infinite path (without any restriction on their color), i.e., the concept I node from Example 1 should be a definition of this les. However, this cannot be expressed by a simple concept description. It requires a new cyclic definition. Uniqueness up to equivalence of the most specific concept can be formulated and shown like uniqueness of the least common subsumer. 3 Characterizing subsumption In this section, we recall the characterizations of subsumption w.r.t. gfp-semantics developed in [.Baader, 2003a]. To this purpose, we must represent TBoxes and primitive interpretations by description graphs, and introduce the notion of a simulation on description graphs. In the case of concept descriptions, the les is unique up As shown in [Baader, 2003a], one can (without loss of generality) restrict the attention to normalized TBox. In the following, we thus assume that all TBoxes are normalized. Normalized L-TBoxes can be viewed as graphs whose nodes are the defined concepts, which are labeled by sets of primitive concepts, and whose edges are given by the existential restrictions. For the rest of this section, we fix a normalized TBoxes, this uniqueness property also holds (though its formulation is more complicated). DESCRIPTION LOGICS 321

2. There is a simulation Z: Gr ~ GT such that (B, A) Z. The theorem together with Proposition 9 shows that subsumption w.r.t. gfp-semantics in C is tractable. The proof of the theorem given in [Baader, 2003a] depends on a characterization of when an individual of a gfp-model belongs to a defined concept in this model. This proposition is also important in the proof of correctness of our characterization of the instance problem (Theorem 17). 4 Computing the Ics Conversely, every L-description graph can be viewed as representing either an C-T Box or a primitive interpretation. Simulations are binary relations between nodes of two Ldescription graphs that respect labels and edges in the sense defined below. It is easy to see that the set of all simulations from G\ to G 2 is closed under arbitrary unions. Consequently, there always exists a greatest simulation from G 1 to G 2 - If G 1, G2 are finite, then this greatest simulation can be computed in polynomial time [Henzinger et al, 1995]. As an easy consequence of this fact, the following proposition is proved in [Baader, 2003a]. Proposition 9 Let G 1, G 2 be two finite EL-description graphs, v 1 a node of G\ and V 2 a node of G 2 - Then we can decide in polynomial time whether there is a simulation Subsumption w.r.t. gfp-semantics corresponds to the existence of a simulation relation such that the subsumee simulates the subsumer: 5 Instance and most specific concepts One motivation for considering cyclic terminologies in L is the fact that the most specific concept of an ABox individual need not exist in C (without cyclic terminologies). An example is the simple cyclic ABox A := {r(b,b)}, where b has no most specific concept, i.e., there is no least Lconcept description D such that b is an instance of D w.r.t. A [Kiisters and Molitor, 2001]. However, if one allows for cyclic TBoxes with gfp-semantics, then the defined concept 322 DESCRIPTION LOGICS

DESCRIPTION LOGICS 323

6 Conclusion In [Baader, 2003a] we have shown that subsumption in remains polynomial if one allows for cyclic terminologies with greatest fixpoint (gfp) semantics. In Section 5.1 of this paper we have complemented this result by showing that the instance problem in with cyclic terminologies interpreted with gfp-semantics is also polynomial. Thus, all the standard inferences in remain polynomial if one allows for cyclic terminologies with gfp-semantics. Our main motivation for considering cyclic terminologies with gfp-semantics in was that the most specific concept of an ABox-individual then always exists. In fact, we have shown in this paper that both the least common subsumer (les) and the most specific concept (msc) can be computed in polynomial time in with cyclic terminologies interpreted with gfp-semantics. Thus, also two of the most important non-standard inferences in DLs [Kusters, 2001] remain polynomial in this context. It should be noted that there are indeed applications where the expressive power of the small DL appears to be sufficient. In fact, SNOMED, the Systematized Nomenclature of Medicine [Cote et al, 1993] uses [Spackman, 2000; 2001]. Subsumption [Baader, 2003a] and the instance problem [Baader, 2003b] are also polynomial w.r.t. descriptive semantics. For the les, descriptive semantics is not that wellbehaved: in [Baader, 2003] we have shown that w.r.t. descriptive semantics the les need not exist in with cyclic terminologies. In addition, we could only give a sufficient condition for the existence of the les. If this condition applies, then the les can be computed in polynomial time. In [Baader, 2003b] similar results are shown for the msc w.r.t. descriptive semantics. One problem left for future research is the question of how to obtain a decidable characterization of the cases in which the les (msc) exists w.r.t. descriptive semantics, and to determine whether in these cases it can always be computed in polynomial time. References [Baader, 2002] F. Baader. Least common subsumers, most specific concepts, and role-value-maps in a description logic with existential restrictions and terminological cycles. LTCS-Report 02-07, TU Dresden, 2002. See http://lat.inf.tu-dresden.de/research/reports.html. [Baader, 2003] F. Baader. Computing the least common subsumer in the description logic L w.r.t. terminological cycles with descriptive semantics. In Proc. ICCS'03, Springer LNAI, 2003. [Baader, 2003a] F. Baader. Terminological cycles in a description logic with existential restrictions. In Proc. IJ- CAl'03,2003. [Baader, 2003b] Franz Baader. The instance problem and the most specific concept in the description logic C w.r.t. terminological cycles with descriptive semantics. LTCS- Report 03-01, TU Dresden, 2003. See http://lat.inf.tudresden.de/research/reports.html. [Baader and Kusters, 1998] F. Baader and R. Kiisters. Computing the least common subsumer and the most specific concept in the presence of cyclic ALN-concept descriptions. In Proc. KI'98, Springer LNAI 1504, 1998. [Baader et al, 1999] F. Baader, R. Kusters, and R. Molitor. Computing least common subsumers in description logics with existential restrictions. In Proc. IJCAI'99, 1999. [Cote et al, 1993] R. Cote, D. Rothwell, J. Palotay, R. Beckett, and L. Brochu. The systematized nomenclature of human and veterinary medicine. Technical report, SNOMED International, Northfield, IL: College of American Pathologists, 1993. [Henzinger et a/., 1995] M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. Computing simulations on finite and infinite graphs. In 36th Annual Symposium on Foundations of Computer Science, 1995. IEEE Computer Society Press. [Kusters, 2001] R. Kusters. Non-standard Inferences in Description Logics, Springer LNAI 2100, 2001. [Kusters and Molitor, 2001] R. Kusters and R. Molitor. Approximating most specific concepts in description logics with existential restrictions. In Proc. KI 2001, Springer LNAI 2174,2001. [Nebel, 1991] B. Nebel. Terminological cycles: Semantics and computational properties. In J. F. Sowa, editor, Principles of Semantic Networks. Morgan Kaufmann, 1991. [Spackman, 2000] K. Spackman. Managing clinical terminology hierarchies using algorithmic calculation of subsumption: Experience with SNOMED-RT. J. of the American Medical Informatics Association, 2000. Fall Symposium Special Issue. [Spackman, 2001] K. Spackman. Normal forms for description logic expressions of clinical concepts in SNOMED RT. J. of the American Medical Informatics Association, 2001. Symposium Supplement. [Tarski, 1955] A. Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics, 5, 1955. 324 DESCRIPTION LOGICS