The Combined Approach to Query Answering in DL-Lite Roman Kontchakov Department of Computer Science and Inf. Systems, Birkbeck College, London http://www.dcs.bbk.ac.uk/~roman joint work with Carsten Lutz, David Toman, Frank Wolter and Michael Zakharyaschev
Pure Query Rewriting DL-Lite family includes the first DLs specifically tailored for effective query answering over large amounts of instances. D. Calvanese et al., 2007 effective = in AC 0 for data complexity conjunctive query q TBox T + union of conjunctive queries q ABox A ABox A KRDB, Bolzano 23.02.10 0
Pure Query Rewriting: an Example of PerfectRef q(x) TeachesTo(x, y), HasTutor(y, z) q(x) TeachesTo(x, y), Student(y) Student HasTutor HasTutor(x 1, y 1 ) Student(x 1 ) TeachesTo Student Student(x 2 ) TeachesTo(y 2, x 2 ) q(x) TeachesTo(x, y), TeachesTo(x 2, y) q(x) TeachesTo(x, y) q(x) Professor(x) q(x) HasTutor(y 4, x) unification Professor TeachesTo TeachesTo(x 3, y 3 ) Professor(x 3 ) HasTutor Professor Professor(x 4 ) HasTutor(y 4, x 4 ) Intuitive! NB. what if Student has many subclasses? TeachesTo? O(( T q ) q ) subqueries KRDB, Bolzano 23.02.10 1
Combined Approach in EL (Lutz, Toman & Wolter, 2008) query answering in EL is PTIME-complete for data complexity conjunctive query q + FO query q TBox T + ABox A ABox A A is computed in polytime in A and only when A is updated KRDB, Bolzano 23.02.10 2
Variants of DL-Lite R ::= P P C ::= A kr TBox concept inclusions DL-Lite N horn : C 1 C n C DL-Lite N core : C 1 C 2, C 1 C 2 ABox assertions: C(a), R(a, b) DL-Lite F α = DL-LiteN α with R and 2 R only DL-Lite (HN ) α = DL-Lite N α DL-Lite (HF) α = DL-Lite F α with (restricted) role inclusions, role disjointness, etc. with (restricted) role inclusions, role disjointness, etc. In all these languages, answering positive existential queries (under UNA) is in AC 0 for data complexity positive existential formulas are built from A(x) and R(x, y) using, and KRDB, Bolzano 23.02.10 3
ABox Expansion in DL-Lite canonical interpretation I K : I = Ind(A) {c R R is generating in K} a c R1 c Rn R n is generating K = R 1 (a) but R 1 (a, b) / A for all b Ind(A) T = R i R i+1 and R i R i+1 A I K = {a K = A(a)} {c R T = R A} P I K = {(a, b) P (a, b) A} {(d, c P ) d c P } {(c P, d) d c P } I K is not a model T = {A P, 2 P }, A = {A(a), A(b)} I K does not give the right answers q = v P (v, v), T = {A P, P P }, A = {A(a)} q = v 2 (P (v 1, v 2 ) P (v 3, v 2 )), T = {A P }, A = {A(a), A(b)} The unravelling U K is almost a (canonical) model of I K and does give the right answers KRDB, Bolzano 23.02.10 4
Query Rewriting for DL-Lite N horn (1) we rewrite a given CQ q into an FO query q such that answers to q in U K = answers to q in I K q = O( q T ) q = u (ϕ ϕ 1 ϕ 2 ϕ 3 ) ϕ 1 = v / u (v c R ) R is a role in T all answer variables must get ABox values NB. if ϕ 1 is replaced with ϕ 1 = aux(v), where aux is a new relation containing all c R, v / u then q = O( q ) KRDB, Bolzano 23.02.10 5
Query Rewriting for DL-Lite N horn (2) answers to q in U K = answers to q in I K U K is a forest model, so if t is matched to a non-abox element then a part of q containing t must be homomorphically embeddable into a tree a tree witness f R,t : term(q) (N R ) (finite words over roles) f R,t (t) = ε if f R,t (s) = ε and R(s, s ) q then f R,t (s ) = R if f R,t (s) = w S and S (s, s ) q with S S then f R,t (s ) = w S S if f R,t (s) = w S and S (s, s ) q then f R,t (s ) = w q = v P (v, v): f P,v does not exist q = v 2 (P (v 1, v 2 ) P (v 3, v 2 )): P P,v1 (v 3 ) = ε q = t 1 t 2 t 3 t 4 (R(t 1, t 2 ) S(t 2, t 3 ) S(t 4, t 3 )): f R,t1 (t 2 ) = R, f R,t1 (t 1 ) = ε, f R,t1 (t 3 ) = R S, f R,t1 (t 4 ) = R, f S,t4 (t 3 ) = S, f S,t4 (t 4 ) = ε, f S,t4 (t 2 ) = ε, f S,t4 (t 1 ) is not defined KRDB, Bolzano 23.02.10 6
Query Rewriting for DL-Lite N horn (3) ϕ 2 = R(t,t ) q f R,t does not exist (t c R ) if no tree witness exists then t cannot be mapped to a non-abox element ϕ 3 = R(t,t ) q f R,t exists ( (s = c R ) R(s,s ) q f R,t (s)=ε f R,t (s)=ε ) (s = t) if both s and t are labelled with ε for role R and s is mapped onto c R, for R(s, s ) q, then s = t NB. in fact, f R,t (s) = ε induces an equivalence relation R q, and so, ϕ 3 = O( q ) KRDB, Bolzano 23.02.10 7
Canonical Interpretation by FO Queries regard the ABox as a relational instance and then define (domain-independent) FO-queries q T A (x) and qt P (x, y) constructing I K 1. for each concept C, define queries exp T,j C (x): e.g., (extension of concept C on step j of the SLD derivation) exp T,0 (x) = A(x) A exp T,j+1 C (x) = exp T,j no more than T steps required C (x) C 1...C n C 1 i n exp T,j C i (x) 2. q T P (x, y) = P (x, y) ( gen T P (x) (y = c P ) ) ( gen T P (y) (x = c P ) ) 3. q T A (x) = expt A (x) D(x), where D(x) = c R N T I ( (x = cr ) z gen T R (z)) such queries can be implemented as materialised views (updates!) Example: h(x, y) = h(x, y) h = hastutor, t = teachesto ( ( y h(x, y ) S(x) (x = c t ) y t(y, x)) y h(x, y ) (y = c h ) ) z ( ( y t(z, y ) P(z) (z = c h ) y h(y, z)) y t(z, y ) (x = c t ) (y = c h ) ) KRDB, Bolzano 23.02.10 8
Combining the two Rewriting Steps polynomial pure query rewriting for DL-Lite F core and even for DL-Lite N core (if the aggregation function COUNT is available) otherwise exp T,0 k R (x) = O(k2 ), which is exponential in T if binary coding of k is used Example: ( q(x) = (x c h ) (x c t ) t(x, y) ( (P(x) y h(y, x)) y t(x, y ) (y = c t ) ) w ( (S(w) y t(y, w)) y h(w, y ) (x = c h ) (y = c t ) )) ( h(y, z) ( (S(y) z t(z, y)) z h(y, z ) (z = c h ) ) w ( (P(w ) z h(z, w )) z t(w, z ) (y = c t ) (z = c h ) )) which is equivalent to q(x) = t(x, y) P(x) y h(y, x) KRDB, Bolzano 23.02.10 9
Other Applications of the Technique only exponential blowup for positive existential query answering in DL-Lite (HN ) horn without the UNA, the technique is applicable to query answering in DL-Lite (HF) horn experiments show that the approach is competitive (and this is P-complete for data complexity) with executing the original query over the data (the formulas ϕ 1 ϕ 3 introduce additional selection conditions on top of the original query) Open Questions is the exponential blowup unavoidable for role inclusions? is the exponential blowup unavoidable for positive existential queries? are there other fragments with pure polynomial rewriting? more at http://www.dcs.bbk.ac.uk/~roman/ KRDB, Bolzano 23.02.10 10