Module-theoretic properties of reachability modules for SRIQ

Similar documents
A normal form for hypergraph-based module extraction for SROIQ

Axiom Dependency Hypergraphs for Fast Modularisation and Atomic Decomposition

Modular Combination of Reasoners for Ontology Classification

Improved Algorithms for Module Extraction and Atomic Decomposition

Modular Reuse of Ontologies: Theory and Practice

Role-depth Bounded Least Common Subsumers by Completion for EL- and Prob-EL-TBoxes

The Concept Difference for EL-Terminologies using Hypergraphs

MORe: Modular Combination of OWL Reasoners for Ontology Classification

Ontology Reuse: Better Safe than Sorry

Extracting Modules from Ontologies: A Logic-based Approach

Root Justifications for Ontology Repair

High Performance Absorption Algorithms for Terminological Reasoning

CEX and MEX: Logical Diff and Semantic Module Extraction in a Fragment of OWL

Just: a Tool for Computing Justifications w.r.t. ELH Ontologies

Tractable Extensions of the Description Logic EL with Numerical Datatypes

Dismatching and Local Disunification in EL

A Logical Framework for Modularity of Ontologies

Extending Logic Programs with Description Logic Expressions for the Semantic Web

Complexity of Axiom Pinpointing in the DL-Lite Family

Minimal Module Extraction from DL-Lite Ontologies using QBF Solvers

Ontology Partitioning Using E-Connections Revisited

Description Logics. Glossary. Definition

Finite Model Reasoning in Horn-SHIQ

Revision of DL-Lite Knowledge Bases

How to Contract Ontologies

Modularity in DL-Lite

Belief Contraction for the Description Logic EL

Quasi-Classical Semantics for Expressive Description Logics

Fuzzy Description Logics

Armstrong Relations for Ontology Design and Evaluation

Consequence-Based Reasoning beyond Horn Ontologies

Complexities of Horn Description Logics

Complexities of Horn Description Logics

Chapter 2 Background. 2.1 A Basic Description Logic

LTCS Report. Module Extraction and Incremental Classification: A Pragmatic Approach for EL + Ontologies. Boontawee Suntisrivaraporn.

Completing Description Logic Knowledge Bases using Formal Concept Analysis

Phase 1. Phase 2. Phase 3. History. implementation of systems based on incomplete structural subsumption algorithms

A New Approach to Knowledge Base Revision in DL-Lite

Optimisation of Terminological Reasoning

Closed World Reasoning for OWL2 with Negation As Failure

Tight Complexity Bounds for Reasoning in the Description Logic BEL

Concept Model Semantics for DL Preferential Reasoning

Completeness Guarantees for Incomplete Reasoners

Gödel Negation Makes Unwitnessed Consistency Crisp

Revising General Knowledge Bases in Description Logics

The Bayesian Ontology Language BEL

Conservative Extensions in Expressive Description Logics

Nonmonotonic Reasoning in Description Logic by Tableaux Algorithm with Blocking

A Refined Tableau Calculus with Controlled Blocking for the Description Logic SHOI

Complexity of Subsumption in the EL Family of Description Logics: Acyclic and Cyclic TBoxes

Adding Context to Tableaux for DLs

Ontology Module Extraction via Datalog Reasoning

Extending Unification in EL towards General TBoxes

Context-based defeasible subsumption for dsroiq

Generating Armstrong ABoxes for ALC TBoxes

On the Complexity of Semantic Integration of OWL Ontologies

Modularity and Web Ontologies

Restricted role-value-maps in a description logic with existential restrictions and terminological cycles

All Elephants are Bigger than All Mice

Pushing the EL Envelope Further

Bayesian Description Logics

A Goal-Oriented Algorithm for Unification in EL w.r.t. Cycle-Restricted TBoxes

Principles of Knowledge Representation and Reasoning

Inconsistencies, Negations and Changes in Ontologies

Analysing Multiple Versions of an Ontology: A Study of the NCI Thesaurus

LTCS Report. Subsumption in Finitely Valued Fuzzy EL. Stefan Borgwardt Marco Cerami Rafael Peñaloza. LTCS-Report 15-06

Contraction and Revision over DL-Lite TBoxes

Semantic Import: An Approach for Partial Ontology Reuse ( )

Decidability of Description Logics with Transitive Closure of Roles in Concept and Role Inclusion Axioms

DL-Lite Contraction and Revision

Technische Universität Dresden. Fakultät Informatik EMCL Master s Thesis on. Hybrid Unification in the Description Logic EL

On Concept Forgetting in Description Logics with Qualified Number Restrictions

Ontology Evolution Under Semantic Constraints

Adaptive ALE-TBox for Extending Terminological Knowledge

Unification in Description Logic EL without top constructor

OntoRevision: A Plug-in System for Ontology Revision in

University of Oxford. Consequence-based Datatype Reasoning in EL: Identifying the Tractable Fragments

Ontology Evolution Under Semantic Constraints

Towards defeasible SROIQ

LTCS Report. A finite basis for the set of EL-implications holding in a finite model

About Subsumption in Fuzzy EL

Formal Properties of Modularisation

An Automata-Based Approach for Subsumption w.r.t. General Concept Inclusions in the Description Logic FL 0

Lethe: Saturation-Based Reasoning for Non-Standard Reasoning Tasks

Modeling Ontologies Using OWL, Description Graphs, and Rules

Completing Description Logic Knowledge Bases using Formal Concept Analysis

The Description Logic ABox Update Problem Revisited

OntoMerge: A System for Merging DL-Lite Ontologies

LTCS Report. Decidability and Complexity of Threshold Description Logics Induced by Concept Similarity Measures. LTCS-Report 16-07

Lightweight Description Logics: DL-Lite A and EL ++

Exact Learning of TBoxes in EL and DL-Lite

Reasoning in the Description Logic BEL using Bayesian Networks

Yet Another Proof of the Strong Equivalence Between Propositional Theories and Logic Programs

Reasoning in Expressive Gödel Description Logics

On Subsumption and Instance Problem in ELH w.r.t. General TBoxes

COMPUTING LOCAL UNIFIERS IN THE DESCRIPTION LOGIC EL WITHOUT THE TOP CONCEPT

Ecco: A Hybrid Diff Tool for OWL 2 ontologies

Knowledge Representation for the Semantic Web Lecture 2: Description Logics I

Foundations for the Logical Difference of EL-TBoxes

Transcription:

Module-theoretic properties of reachability modules for SRIQ Riku Nortje, Katarina Britz, and Thomas Meyer Center for Artificial Intelligence Research, University of KwaZulu-Natal and CSIR Meraka Institute, South Africa Email: {rnortje;abritz;tmeyer}@csir.co.za Abstract. In this paper we investigate the module-theoretic properties of and -reachability modules in terms of inseparability relations for the DL SRIQ. We show that, although these modules are not depleting or self-contained, they share the robustness properties of syntactic locality modules and preserve all justifications for an entailment. Introduction Modularization plays an important part in the design and maintenance of large scale ontologies. Modules are loosely defined as subsets of ontologies that cover some topic of interest, where the topic of interest is defined by a set of symbols. Extracting minimal modules is computationally expensive and even undecidable for expressive DLs [3, 4]. Therefore, the use of approximation techniques and heuristics plays an important role in the efficient design of algorithms. Syntactic locality [3, 4], because of its excellent model theoretic properties, has become an ideal heuristic and is widely used in a diverse set of algorithms [4, 2, 5]. Suntisrivaraporn [4] showed that, for the DL EL +, -locality module extraction is equivalent to the reachability problem in directed hypergraphs. Nortjé et al. [0, ] extended the reachability problem to include -locality and introduced bidirectional reachability modules as a subset of -locality modules. This work was further extended to the DL SROIQ Nortje et al. [2] who showed that extracting -reachability modules is equivalent to extracting frontier graphs in hypergraphs. Reachability modules are not only of importance in hypergraph-based reasoning support for TBoxes [2], but are potentially smaller than syntactic locality modules. In this paper we investigate the module-theoretic properties of reachability modules for the DL SRIQ. We show that these modules are not self-contained or depleting but they are robust under vocabulary restrictions, vocabulary extensions, replacement and joins. By showing that reachability modules preserve all justifications for entailments, we show that depleting modules are sufficient for preserving all justifications but not necessary. In Section 2 we give a brief introduction to the DL SRIQ and modularization as defined by inseparability relations. Section 3 introduces a normal form for SRIQ TBoxes as well as the rules necessary to transform any such TBox to

normal form. In Section 4 we introduce both - and reachability modules and investigate all their module theoretic properties in terms of inseperability relations. All proofs of the work presented appears in the accompanying appendix. Lastly in Section 5 we conclude this paper with a short summary of the results. 2 Background In this section we give a brief introduction to modularization and the DL SRIQ [7] with its syntax and semantics listed in Table 2. N C and N R denote disjoint sets of atomic concept names and role names. The set N R includes the universal role whilst N C excludes the and concepts. For a complete definition of SRIQ, refer to Horrocks et al. [7], and for Description Logics refer to []. Constructs Syntax Semantics atomic concept C C I I, C N C role R R I I I, R N R inverse role R R I = {(y, x) (x, y) R I }, R N R universal role U U I = I I role composition R... R n {(x, z) (x, y ) R I (y, y 2) R2 I... (y n, z) Rn, I n 2, R i N R} top I = I bottom I = negation C ( C) I = I \ C I conjunction C C 2 (C C 2) I = C I C2 I disjunction C C 2 (C C 2) I = C I C2 I exist restriction R.C {x ( y)[(x, y) R I y C I ]} value restriction R.C {x ( y)[(x, y) R I y C I ]} self restriction R.Self {x (x, x) R I } atmost restriction nr.c {x #{y (x, y) R I y C I } n} atleast restriction nr.c {x #{y (x, y) R I y C I } n} Axiom Syntax Semantics concept inclusion C C 2 C I C2 I role inclusion R... R n R n+ (R... R n) I R I, n reflexivity Ref(R) {(x, x) x I } R I irreflexivity Irr(R) {(x, x) x I } R I = disjointness Dis(R, S) S I R I = Table. Syntax and semantics of SRIQ Module extraction is the process of extracting subsets of axioms from TBoxes that are self contained with respect to some criteria. These sets of axioms, called modules, may be used for various purposes such as reuse, optimization and error pinpointing amongst others [4, 4]. Definition. (Module for the arbitrary DL L) Let L be an arbitrary description language, O an L ontology, and σ a statement formulated in L. Then, 2

O O is a module for σ in O(a σ-module in O) whenever: O = σ if and only if O = σ. Definition is sufficiently general so that any subset of an ontology preserving a statement of interest is considered a module, the entire ontology is therefore a module in itself. Different use cases usually result in different notions of what the definition and characteristics of a module should be. Modules are often defined via the notion of conservative extensions. Given some signature (a set of concept and role names) and a set of axioms, a conservative extension of this set is simply one that implies all the same consequences over the signature. More formally: Definition 2. (Conservative extension [4]) Let T and T be two TBoxes such that T T, and let be a signature. Then T is a -conservative extension of T if, for every α with Sig(α), we have T = α iff T = α. T is a conservative extension of T if T is a -conservative extension of T for = Sig(T ). Given that both sets of axioms imply the same consequences for a given signature we may then use the smaller set whenever we wish to reason over this signature. A closely related notion to conservative extensions is that of inseparability. Definition 3. [3] T and T 2 are -concept name inseparable, written T c T 2, if for all - concept names C, D, it holds that T = C D if and only if T 2 = C D. Definition 4. [3] T and T 2 are -subsumption inseparable, written T s T 2, if for all terms X, Y that are concepts or roles over, it holds that T = X Y if and only if T 2 = X Y. Definition 5. [3] Let T be a TBox, M T, S an inseparability relation and a signature. We call M an S -module of T if M S T. a self-contained S -module of T if M S Sig(M) T. a depleting S -module of T if S Sig(M) T \ M. Modules may therefore be characterized by some inseparability criteria. It is of interest how modules defined this way would behave under different use case scenarios. For this purpose, several properties of inseparability relations [8] have been investigated in the literature, which allows us to compare different definitions of modules. Given a TBox T and a module M T for a signature, we are interested in the following inseparability properties: 3

Robustness under vocabulary restrictions implies that when we wish to restrict the symbols from further we do not need to import a different module and may continue to use M. Robustness under vocabulary extension implies that should we wish to add new symbols to that do not appear in T we do not need to use a different module but may use M. Robustness under replacement ensures that the result of importing M into a TBox T is a module of the result of importing T into T. This is also called module coverage and refers to the fact that importing a module does not affect its property of being a module. Robusness under joins implies that if T and T are inseparable w.r.t. and all the terms they share are from, then each of them are inseparable with their union w.r.t.. More formally: Definition 6. [8] The inseparability relation S is called robust under vocabulary restrictions if, for all TBoxes T, T 2 and all signatures, with, the following holds: if T S T 2, then T S T 2. robust under vocabulary extensions if, for all TBoxes T, T 2 and all signatures, with (Sig(T ) Sig(T 2 )), the following holds: if T S T 2, then T S T 2. robust under replacement if, for all TBoxes T, T 2 and all signatures and every TBox T with Sig(T ) (Sig(T ) Sig(T 2 )), the following holds: if T S T 2 then T T S T 2 T. robust under joins if, for all TBoxes T, T 2 and all signatures with Sig(T ) Sig(T 2 ), if T S T 2 then T i S T T 2, for i =, 2. 3 Normal Form In this section we introduce a normal form for SRIQ TBoxes. We utilize normalization in order to simplify the definitions, to ease the understanding of the work that follows, as well as to simplify the presentation of proofs. Definition 7. Given B i (N C { }), C i (N C { }), D { R.B, nr.b, R.Self}, with R, S, R i, S i role names from N R or their inverses and n, a SRIQ TBox T is in normal form if every axiom α T is in one of the following forms: α : B... B n C... C m α 2 : D C... C m α 3 : B... B n D α 4 : R... R n R n+ α 5 : R R 2 α 6 : D D 2 α 7 : Dis(R, R 2 ) 4

In order to normalize a SRIQ TBox T we repeatedly apply the normalization rules from Table 2. Each application of a rule rewrites an axiom into its equivalent normal form. It is easy to see that the application of every rule ensures that the normalized TBox is a conservative extension of the original. We note that the SRIQ axiom Ref(R) is represented by its equivalent R.Self and Irr(R) by R.Self []. Table 2. SRIQ normalization rules NR ˆB Ĉ 2 Ĉ ˆB Ĉ Ĉ2 NR2 ˆB Ĉ ˆB 2 ˆB ˆB 2 Ĉ NR3 ˆB ˆD Ĉ ˆB A Ĉ, ˆD A, A ˆD NR4 ˆB Ĉ ˆD ˆB Ĉ A, ˆD A, A ˆD NR5 ˆB Ĉ Ĉ2 ˆB Ĉ, ˆB Ĉ 2 NR6 ˆB ˆB 2 Ĉ ˆB Ĉ, ˆB2 Ĉ NR7... R.Ĉ...... R.A..., A Ĉ, A Ĉ NR8... R. ˆD...... R.A..., ˆD A, A ˆD NR9... nr. ˆD...... nr.a..., ˆD A, A ˆD NR0... nr.ĉ...... ( (n + )R.Ĉ)... NR ˆB Ĉ ˆB Ĉ,Ĉ ˆB NR2 0R.B Ĉ Ĉ NR3 ˆB R. ˆB NR4 ˆB nr. ˆB NR5 ˆB 0R.B NR6 nr. Ĉ NR7 R. Ĉ NR8 ˆB Ĉ NR9 Ĉ NR20 ˆB Ĉ NR2 ˆB Above A is a new concept name not in N C, ˆBi and Ĉi are possibly complex concept descriptions and ˆD a complex concept description. R N R or it s inverse, n 0 Theorem. Exhaustively applying the rules from Table 2 to any SRIQ TBox T results in a SRIQ TBox T in normal form. The normalization process can be completed in linear time in the number of axioms. Example. Let α = B C, and α 2 = A B. Then, α may be normalized by application of rule NR2 to α N = B C since C = C. α 2 may be normalized by application of rule NR to α2 N = B A since A = A. For the rest of this paper we assume that all TBoxes are in normal form. 5

4 Reachability Modules Deciding conservative extensions has been shown to be computationally expensive or even undecidable for relatively inexpressive DLs. Therefore, an approximation of these modules, based on syntax, called syntactic locality modules [4] has been introduced. Given a normalized TBox T, the definition of syntactic locality can be simplified to the following: Definition 8. (Normalized Syntactic Locality) Let be a signature and T a normalized SRIQ TBox. An axiom α is -local w.r.t. ( -local w.r.t ) if α Ax() (α Ax() ), as defined in the grammar: -Locality Ax() ::= C C w R Dis(S, S) Dis(S, S ) Con () ::= A C C C C R.C R.C R.Self nr.c nr.c -Locality Ax() ::= C C w R Con () ::= A C C C C R.C nr.c R.Self In the grammar, we have that A, A is an atomic concept, R (resp. S ) is either an atomic role (resp. a simple atomic role) not in or the inverse of an atomic role (resp. of a simple atomic role) not in, C is any concept, R is any role, S is any simple role, and C Con (), C Con (). We also denote by w a role chain w = R... R n such that for some i with i n, we have that R i is (possibly inverse of) an atomic role not in. A TBox T is -local ( -local) w.r.t. if α is -local ( -local) w.r.t. for all α T. Algorithm may be used to extract both the minimal and -locality based modules for a signature S. A variant of -syntactic locality modules called -reachability based modules [4] is based on the reachability problem in directed hypergraphs. Hypergraphs [9, 5] are a generalization of graphs and have been studied extensively since the 970s as a powerful tool for modelling many problems in Discrete Mathematics. We extend the work done by Nortje et al.[] and define reachability for SRIQ TBoxes. We then continue to show that these modules share all the robustness properties of locality modules and therefore is well suited to be used in the ontology reuse scenario. Definition 9. ( -Reachability) Let T be a SRIQ TBox in normal form and Sig(T ) a signature. The set of -reachable names in T w.r.t., denoted by T, is defined inductively as follows: For every x ( { }) we have x T. For every inclusion axiom (α L α R ) T, if Sig(α L ) T y Sig(α R ) is also in T. 6 then every

Algorithm (Module Extraction Algorithm for SRIQ [2]) Procedure extract module(o, S) Input: O: ontology; S: signature; Output: O : a module for S in O : O :=, O 2 := O 2 : while not empty(o 2) do 3 : α := select axiom(o 2) 4 : if local(α, S Sig(O )) then 5 : O 2 := O 2 \ {α} 6 : else 7 : O := O {α} 8 : O 2 := O \ O 9 : end if 0 : end while : return O Every axiom α := α L α R such that Sig(α L ) T Axioms of the form Dis(R, S) T are T The set of all T reachability module for T over. we call T -reachable. -reachable whenever {R, S} T. -reachable axioms is denoted by T and is called the - It is self-evident from Definition 8 that an axiom is -reachable w.r.t exactly when it is not -local w.r.t.. Similarly we define an axiom to be -reachable exactly when it is not -local. Definition 0. ( -Reachability) Let T be a SRIQ TBox in normal form and Sig(T ) a signature. The set of -reachable names in T w.r.t., denoted by T, is defined inductively as follows: For every x ( ) we have that x T. For all inclusion axioms (α L α R ) T, if α R =, or α R is of the form A... A n and all A i T, or α R has any other form and there exists some x Sig(α R ) T then every y Sig(α L ) is also in T. Every axiom α := α L α R such that, α R =, or α R is of the form A... A n and all A i T, or α R has any other form and there exists some x Sig(α R ) T, we call T -reachable. All axioms of the form Dis(R, S) T are always T -reachable and {R, S} T. The set of all T -reachable axioms is denoted by T and is called the -reachability module for T over. It is easy to show that -reachability modules are equivalent to -locality modules. However, by the definition of -reachability we observe that these are not equivalent to -locality modules. 7

Example 2. Let T be a TBox such that T = {α, α 2, α 3, α 4 }, with α := A r.d, α 2 := B nr.d 2, α 3 := r. C, α 4 := D D 2 and let = {C}. Then T = {α, α 2, α 3 } but the -locality module for T w.r.t. is {α, α 2, α 3, α 4 }. The difference stems from the fact that in α and α 2 the -reachability of r does not ensure the -reachability of D and D 2 respectively. This occurs because, given an axiom α = α L α R, -locality ensure that if α is -local then so are all of the symbols in Sig(α), whereas -reachability is defined such that the -reachability of α only guarantees that all symbols of α L and only some symbols of α R will be -reachable. Thus -reachability based modules are at most the size of -locality modules but in general could be substantially smaller. Similar to -locality modules we note that reachability module extraction may also be alternated until a fixpoint is reached. These modules are denoted by T. In order to investigate the module-theoretic properties of reachability modules, we follow a similar approach to Sattler et al. [3] and define inseparability different from that of conservative extensions. We say that T and T 2 are inseparable if their modules are equivalent, that is, a module extraction algorithm returns the same output for each of them. We define the following inseparability relations for reachability modules: Definition. Let T and T 2 be TBoxes and a signature. Then T and T 2 are: reachability inseparable, denoted by T T 2, if T = T 2 reachability inseparable, denoted by T T 2, if T = T 2 reachability inseparable, denoted by T T 2. ; ; T 2, if T = Firstly we show that -reachability modules are subsumption inseparable. Concept inseparability follows as a special case of subsumption inseparability. Lemma. Let T be a SRIQ TBox, and Sig(T ) a signature. Then T = C D if and only if T = C D for arbitrary SRIQ concept descriptions C and D such that Sig(C) Sig(D). Corollary. Let T be a normalized SRIQ TBox, Sig(T ) a signature and S an inseparability relation from Definitions 3 and 4. Then T S T. T is therefore a S -module of T. We show by way of counter example that T depleting S module of T when T Sig(T ). is not a self-contained or Example 3. Let T be a TBox such that T = {α = A r.d, α 2 = B nr.d 2, α 3 = r. C, α 4 = D D 2 }, and let = {C}. Then T = {α, α 2, α 3 }, δ = Sig(T ) = {A, B, C, r, D, D 2 } T. But T = D D 2 and T = D D 2. Therefore T is not a self-contained c -module of T. Similarly, T \ T = α 4 with = D, D 2 and D, D 2 δ. Therefore, T is not a depleting c -module of T. 8

Before investigating the robustness properties of reachability modules we introduce some lemmas to aid us in the proofs that follow. Lemma 2. Let α be an axiom, and be signatures, x {, } and T a SRIQ TBox. Then:. If and α is not T reachable, then α is not T reachable. 2. If Sig(α) and α is not reachable then α is not reachable. Lemma 3. Let α be an axiom, and be signatures, x {, } and T, T SRIQ TBoxes. Then:. Given T = T, if then T 2. If Sig(T ), then T T 3. If T T, then T T.. = T. In particular T T. Lemma 4. Let be an signature, T and T 2 be SRIQ TBoxes with Sig(T ) Sig(T 2 ) and x {, }. Then (T T 2 ) = T T 2. Proposition. For x {, }, x-reachability is robust under replacement. Proposition 2. For x {, }, x-reachability is robust under vocabulary extensions. Proposition 3. For x {, }, x-reachability is robust under vocabulary restrictions. Proposition 4. For x {, }, x-reachability is robust under joins. Reachability modules therefore share all the robustness properties listed. However, we have seen that these modules are neither depleting nor self-contained modules. Amongst other things, the depleting and self-contained nature of modules are utilised in order to find all MinAs (justifications) for an entailment [6]. Definition 2. Let T be a SRIQ Tbox and M T. M is a MinA (justification) for T = C D if M = C D and there exists no M M such that M = C D. We show that although our modules do not share these properties they do contain all MinAs for a given signature. Theorem 2. Let T be a normalized SRIQ TBox and a signature such that Sig(T ). Then for arbitrary concept descriptions C, D, such that T = C D and Sig(C) Sig(D) T we have that T contains all MinAs for T = C D. The proof to show that T modules share all the robustness properties of T modules follows from the above lemmas and follows the proof for - locality modules by Sattler, et al. [3]. 9

5 Conclusion We have investigated the module-theoretic properties of reachability modules for SRIQ TBoxes. Reachability modules differ from syntactic locality modules in that they are not self-contained or depleting. One application of the self-contained and depleting nature of locality modules is the finding of all justifications for entailments. However, in terms of finding justifications, by showing that reachability modules do preserve all justifications for entailments, we have shown that these properties are sufficient but that they are not necessary. We did preliminary investigations into the size difference between locality and reachability modules. We extracted a random sample of 000 modules from each of the Pizza, Nci, Nap and Xylocopa v4 ontologies. Reachability modules were between 2.5% and 33% smaller than locality modules with an average of 22% reduction in size across all ontologies tested. Our focus for future research is to do an in-depth empirical evaluation on differences with respect to size and performance between extracting reachability modules for SRIQ and existing syntactic locality methods. We also plan to extend these results to SROIQ. Acknowledgments We wish to thank the reviewers for their contributions and insightful comments. This work was partially funded by Project number 24760, Net2: Network for Enabling Networked Knowledge, from the FP7-PEOPLE- 2009-IRSES call. This work is based upon research supported in part by the National Research Foundation of South Africa (UID 85482, IFR2003270008). References. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The description logic handbook: theory, implementation, and applications. Cambridge University Press, New York, NY, USA (2003) 2. Cuenca Grau, B., Halaschek-Wiener, C., Kazakov, Y., Suntisrivaraporn, B.: Incremental Classification of Description Logic Ontologies. Tech. rep. (202) 3. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Just the right amount: extracting modules from ontologies. In: Williamson, C., Zurko, M. (eds.) Proceedings of the 6th International Conference on World Wide Web (WWW 07). pp. 77 726. ACM, New York, NY, USA (2007) 4. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular Reuse of Ontologies: Theory and Practice. Journal of Artificial Intelligence Research (JAIR) 3, 273 38 (2008) 5. Del Vescovo, C., Parsia, B., Sattler, U., Schneider, T.: The modular structure of an ontology: atomic decomposition and module count. In: O. Kutz, T.S. (ed.) Proc. of WoMO-. Frontiers in AI and Appl., vol. 230, pp. 25 39. IOS Press (20) 6. Horridge, M., Parsia, B., Sattler, U.: Laconic and precise justifications in owl. In: In Proc. of ISWC-08, volume 538 of LNCS. pp. 323 338 (2008) 7. Horrocks, I., Kutz, O., Sattler, U.: The irresistible SRIQ. In: In Proc. of OWL: Experiences and Directions (2005) 0

8. Konev, B., Lutz, C., Walther, D., Wolter, F.: Modular ontologies. chap. Formal Properties of Modularisation, pp. 25 66. Springer-Verlag, Berlin, Heidelberg (2009) 9. Nguyen, S., Pretolani, D., Markenzon, L.: On some path problems on oriented hypergraphs. Theoretical Informatics and Applications 32(-2-3), 20 (998) 0. Nortjé, R.: Module extraction for inexpressive description logics. Master s thesis, University of South Africa (20). Nortjé, R., Britz, K., Meyer, T.: Bidirectional reachability-based modules. In: Proceedings of the 20 International Workshop on Description Logics (DL20). CEUR Workshop Proceedings, CEUR-WS (20), http://ceur-ws.org 2. Nortjé, R., Britz, K., Meyer, T.: A normal form for hypergraph-based module extraction for SROIQ. In: Gerber, A., Taylor, K., Meyer, T., Orgun, M. (eds.) Australasian Ontology Workshop 2009 (AOW 2009). Ceur-ws, vol. 969, pp. 40 5. CEUR, Melbourne, Australia (202), http://ceur-ws.org/vol-969/proceedings.pdf 3. Sattler, U., Schneider, T., Zakharyaschev, M.: Which kind of module should I extract? In: Grau, B.C., Horrocks, I., Motik, B., Sattler, U. (eds.) Description Logics. CEUR Workshop Proceedings, vol. 477. CEUR-WS.org (2009) 4. Suntisrivaraporn, B.: Polynomial-Time Reasoning Support for Design and Maintenance of Large-Scale Biomedical Ontologies. Ph.D. thesis, Technical University of Dresden (2009) 5. Thakur, M., Tripathi, R.: Complexity of Linear Connectivity Problems in Directed Hypergraphs. Linear Connectivity Conference pp. 2 (200)

A Proofs for Theorems and Lemmas Theorem Exhaustively applying the rules from Table 2 to any SRIQ TBox T results in a SRIQ TBox T in normal form. The normalization process can be completed in linear time in the number of axioms. Proof: We show that any SRIQ TBox can be converted to an equivalent normal form as follows: Step - -elimination: Rule NR may be applied at most once for each axiom in the TBox. No other rule introduces new axioms that contain equivalences. Therefore the elimination of all equivalences from the TBox will require linear time and add at most a linear number of axioms. Step 2 - -elimination: Applying rule NR7 to every occurrence of a universal restriction in any axiom, irrespective of order, will result in the elimination of all universal restrictions within that axiom. Nested restrictions are handled recursively as they are removed and inserted into the added axioms as Ĉ. There are a constant number of universal restrictions per axiom and therefore the application of rule NR7 will run in constant time, with each application adding a constant number of new axioms. Therefore eliminating all universal restrictions across all axioms will require linear time and add at most a linear number of new axioms. Step 3 - -elimination: Applying rule NR0 to every occurrence of an atmost restriction in any axiom, irrespective of order, will result in the elimination of all atmost restrictions within that axiom. Nested restrictions are handled recursively as needed. There are a constant number of atmost restrictions per axiom and therefore the application of rule NR0 will run in constant time. Therefore eliminating all atmost restrictions across all axioms will require linear time. Step 4 - Complex role-filler elimination: At the start of this step there are no universal or at most restrictions. We eliminate all complex role fillers by applying rules NR8 and NR9 to all axioms. There are a constant number of complex role fillers per axiom, therefore the application of these rules requires constant time per axiom and will add at most a constant number of axioms. Therefore, removing all complex role fillers from an ontology by using rules NR8 and NR9 will require at most linear time and add a linear number of new axioms. Step 5 - -elimination and, -simplification: At this step there are no existential restriction or at-least restriction with complex role fillers. Given any axiom α = (α L α R ), in a left to right fashion we apply the rules as follows:. Apply rules NR, NR3, NR6 to α L until α L consists of an existential restriction, an at-least restriction or the conjunction of concept names. There are a constant number of complex concept description, negations, disjunctions and conjunctions in α L, each of these rules either eliminates a negation, removes a complex concept description or eliminates a disjunction. There are a constant number of times each of these operations 2

may be applied. Rules NR3 and NR6 each add a constant number of axioms. Therefore, α L can be processed in constant time for each axiom. 2. Apply rules NR2, NR4, NR5 to α R until α R consists of an existential restriction, an at-least restriction or a disjunction of concept names. There are a constant number of times each of these operations may be applied. Rules NR4 and NR5 each add a constant number of axioms. Therefore, α L can be processed in constant time for each axiom. These rules are applied repeatedly until no further processing may be applied to either α L and α R. Since each step can be completed in constant time and add at most a constant number of new axioms, normalization can be completed in linear time in the number of axioms with at most a linear increase in the number of axioms. Lemma Let T be a SRIQ TBox, and Sig(T ) a signature. Then T = C D if and only if T = C D for arbitrary SRIQ concept descriptions C and D such that Sig(C) Sig(D). Proof: We have to prove two parts. First: If T = C D then T = C D. This follows directly from the fact that T T and that SRIQ is monotonic. Conversely, we show that, if T = C D then T = C D. Assume the contrary, that is, assume T = C D but that T = C D. Then there must exist an interpretation I and an individual w I such that I is a model of T and w C I \ D I. Modify I to I by setting x I := I for all concept names x Sig(T ) \ T, and r I := I I for all roles names r Sig(T )\ and leaving everything else unchanged. We show that I is a model of. For all α := α L α R, with α T, we have that: T If α R is such that Sig(α R ) T we have that (α R ) I = (α R ) I since it does not change the interpretation of any symbols. If α R is an existential restriction of the form r.a with y Sig(α R ) \ T, then (y) I = I or (y) I = I I depending on whether y is a role or concept name. In both cases we have that (α R ) I (α R ) I. If α R is an at-least restriction of the form nr.a with y Sig(α R ) \ T, then (y) I = I or (y) I = I I depending on whether y is a role or concept name. In both cases we have that (α R ) I (α R ) I. If α R is of the form R.Self with R T we have that (α R ) I = (α R ) I since it does not change the interpretation of the symbol R. If α is of the form Dis(R, S) then by definition it is always in T, thus R, S T. Therefore, the interpretation of alpha does not change. In all cases (α L ) I = (α L ) I since α T (α L ) I (α R ) I. Thus, I is a model for T T \ T we have: α R is a concept name and αr I = I, or α R is a role name and αr I = I I, or 3 and Sig(α L ) T and thus. Now for every α = (α L α R )

α R is a disjunction of the form A... A n with at least one A i T, thus A I i = I and αr I = AI... I... A I n = I, or α R is an existential restriction r.a, thus r I = I I and A I = I so that ( r.a ) I = I, or α R is r.self, thus r I = I I so that ( r.self) I = I, or α R is an atleast restriction nr.a 2, thus r I = I I, A I 2 = I and I n so that ( nr.a 2 ) I = I. This follows from the fact that for any concept description nr.a, I (r.a) I n for it to be satisfiable. Since for all cases αl I αi R, we conclude that I is a model for T. But I and I correspond on all symbols y (Sig(D) Sig(C)) T and therefore D I = D I and C I = C I. Now since C I = C I and w C I we have that w C I \ D I and hence T = C D, contradicting the assumption. Lemma 2 Let α be an axiom, and be signatures, x {, } and T a SRIQ TBox. Then:. If and α is not T reachable, then α is not T reachable. 2. If Sig(α) and α is not reachable then α is not reachable. Proof:. By the inductive definition of x-reachability if then T T. Thus if α is not T reachable it can also not be T -reachable. 2. Assume that α is not reachable but it is reachable. Then there is some symbol y Sig(α) such that y and y is required for α to be reachable. α is reachable so it must be the case that y. But this contradicts our assumption that Sig(α). Thus, α is not reachable. Lemma 3 Let α be an axiom, and be signatures, x {, } and T, T SRIQ TBoxes. Then:. Given T = T, if then T 2. If Sig(T ), then T T 3. If T T, then T T. Proof: T. = T. In particular T T.. Assume that there is some axiom α T such that α T. Therefore, we have that α is not T reachable but that it is T reachable. But this is a contradiction by Lemma 2. since. Thus, T T. A similar argument is used to show that T T and T T. 2. For every α T we have that Sig(α). Therefore, from Lemma 2.2 we have that whenever α T is not reachable it is also not reachable and we have that T contains at most all those axioms in T. Thus, T T. 3. Let δ = T, δ = T and α (T T ). Assume α is δ reachable but not δ reachable. Since T T and Sig(T ) Sig(T ) we have by the inductive definition of x reachability that δ δ. But by Lemma 2. we have that whenever α is not δ reachable then it is also not δ reachable. Therefore, contains at most all those axioms in T. Thus, T T. 4

Lemma 4 Let be an signature, T and T 2 be SRIQ TBoxes with Sig(T ) Sig(T 2 ) and x {, }. Then (T T 2 ) = T T 2. Proof: Let M = (T T 2 ), M = T, M 2 = T 2. Now T T T 2 thus by Lemma 3.3 we have that M M. Similarly M 2 M and thus M M 2 M M which gives us M M 2 M. Let = T T 2. To show that M M M 2 we note that, when extracting these modules, the order in which axioms are extracted are irrelevant. We therefore assume that any algorithm first extracts axioms in M M 2 then tests all other axioms for T T 2 -reachability. Consider any axiom α (T T 2 ) \ (M M 2 ). If α T then α T \ M and α is not T reachable. Now precondition Sig(T 2 ) Sig(T ) implies T 2 Sig(α), taken that α is not T reachable we manipulate this statement to derive ( T 2 T ) Sig(α) T. Thus by Lemma 2.2 we have that α is not T 2 T reachable. The case where α T 2 is treated analogously. Proposition For x {, }, x-reachability is robust under replacement. Proof: Let Sig(T ) (Sig(T ) Sig(T 2 )). This implies that Sig(T ) Sig(T i ), for i =, 2. Now we have: (T T ) = T T (Lemma 4) = T 2 T (Precondition) = (T 2 T ) (Lemma 4) Proposition 2 For x {, }, x-reachability is robust under vocabulary extensions. Proof: Let (Sig(T ) Sig(T 2 )) and T x T 2 i.e., T = T 2. Let = (which implies and ). Then we have that: T = T ( ) = (T = (T 2 = T 2 = T 2 ) ) (Lemma 3.) (Precondition) (Lemma 3.) ( ) As for equality ( ), set inclusion is due to Sig(T ) = and the combination of Lemma 3.2 and Lemma 3.3, and the converse is due to and Lemma 3.. Equality ( ) is justified analogously. Proposition 3 For x {, }, x-reachability is robust under vocabulary restrictions. Proof: Follows from the fact that we have robustness under vocabulary extensions. 5

Proposition 4 For x {, }, x-reachability is robust under joins. Proof: For i =, 2, let M i = T i with Sig(T ) Sig(T 2 ), and let M = (T T 2 ). The precondition says that M = M 2. It is clear from Lemma 3.3 that M M i. It suffices to show M M. Take any axiom α (T T 2 ) \ M. It remains to show that α is not M reachable. In case α T \ M, then α is not M reachable since M = T. In case α T 2 \ M, we also have that α T 2 \ M 2 because M = M 2. This means that α is not M 2 reachable therefore not M reachable. Theorem 2 Let T be a normalized SRIQ TBox and a signature such that Sig(T ). For arbitrary concept descriptions C, D such that T = C D and Sig(C) Sig(D) T we have that T contains all MinAs for T = C D. Proof: Assume that T = C D for some Sig(C) Sig(D) T, but there is a MinA M for T = C D that is not contained in T. If C D is a tautology then M must be empty with M T. Thus, we assume that C D is not a tautology. Since M T, there must be an axiom α M \ T. Define M := M T. M is a strict subset of M since α M. There are two cases, either M = or it contains at least one axiom. In the case where M =, define T = T \ T with M T. Now since M = C D we have by monotinocity that T = C D. Since T T we have by Lemma 3.3 that T T and thus that T =. But by Lemma we have that T = C D if, and only if, T = C D. Since C D is not a tautology we have that T = C D and thus that M = C D. In the case where M we claim that M = C D, which contradicts the fact that M is a MinA for T = C D. We use proof by contraposition to show this. Assume that M = C D, i.e., there is a model I of M such that C I D I. We modify I to I by setting y I := I for all concept names y Sig(T ) \ T, and r I := I I for all roles names r Sig(T ) \ T. We have D I = D I since Sig(D) T, and C I = C I since Sig(C) T. It follows that C I D I. It remains to be shown that I is indeed a model of M, and therefore satisfies all axioms β = (β L β R ) in M, including α. If β = Dis(R r, R 2 ) then by definition Sig(β) T so that (β) I = (β) I. Otherwise there are two possibilities: β M. Since M T, all symbols in Sig(β L) are also in T and possibly some symbols of Sig(β R ) may not be in T. Consequently, I and I coincide on the names occurring in β L and since I is a model of M, we have that (β L ) I = (β L ) I and (β R ) I (β R ) I. Therefore (β L ) I (β R ) I. β M. Since β M, we have that β T, and hence β is not T - reachable. Thus, β R is a concept name and βr I = I, or β R is a role name and βr I = I I, or β R is a disjunction of the form A... A n with at least one A i T, thus A I i = I and βr I = AI... I... A I n = I, or 6

β R is an existential restriction r.a, thus r I = I I and A I = I so that ( r.a ) I = I, or β R is r.self, thus r I = I I so that ( r.self) I = I, or β R is an atleast restriction nr.a 2, thus r I = I I, A I 2 = I and I n so that ( nr.a 2 ) I = I. This follows from the fact that for any concept description nr.a, I (r.a) I n for it to be satisfiable. By definition of I, (β R ) I = I. Hence (β L ) I (β R ) I. Therefore I is a model for M. But since C I D I we have that M = C D proving the contrapositive. 7