Preliminaries / DTD Document Type Definition / References
January 29, 2009

Structure

  Preliminaries
    Unranked Trees
    Recognizable Languages
  DTD Document Type Definition
    Simple DTDs
    Specialized DTDs
    Strong Validation
    Validating well-formed XML Documents
  References

Unranked Trees: From XML to unranked trees

  <bookCollection>
    <book>
      <title>the Lord of the Rings</title>
    </book>
    <book>
      <related>
        <title>the Lord of the Rings</title>
      </related>
      <title>the History of Middle earth</title>
    </book>
  </bookCollection>

The document as an unranked tree:

  bookcollection
  ├─ book
  │  └─ title
  └─ book
     ├─ related
     │  └─ title
     └─ title

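Abstracting the element names to single letters ($r$ = bookcollection, $a$ = book, $b$ = related, $c$ = title) gives the tree

  r
  ├─ a
  │  └─ c
  └─ a
     ├─ b
     │  └─ c
     └─ c

Formal representation: $\Sigma = \{r, a, b, c\}$ and $t = r(a(c()),\ a(b(c()),\ c())) \in T_\Sigma$.

String representation: $[t] = r\,a\,c\,\bar{c}\,\bar{a}\,a\,b\,c\,\bar{c}\,\bar{b}\,c\,\bar{c}\,\bar{a}\,\bar{r} \in [T_\Sigma]$, where $\bar{x}$ denotes the closing tag of $x$.
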
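The passage from a tree to its string representation can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: trees are modelled as `(label, children)` pairs, the helper names `tree` and `to_string` are hypothetical, and a closing tag $\bar{x}$ is written as `/x`.

```python
def tree(label, *children):
    """Build an unranked tree as a (label, list of children) pair."""
    return (label, list(children))


def to_string(t):
    """Return the string representation [t] as a list of tags.

    An opening tag is the label itself; the closing tag is "/" + label,
    standing in for the overbar notation.
    """
    label, children = t
    tokens = [label]
    for child in children:
        tokens.extend(to_string(child))
    tokens.append("/" + label)
    return tokens


# The example tree t = r(a(c()), a(b(c()), c()))
t = tree("r", tree("a", tree("c")), tree("a", tree("b", tree("c")), tree("c")))
print(" ".join(to_string(t)))
# r a c /c /a a b c /c /b c /c /a /r
```
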
Recognizable Languages: Myhill-Nerode Theorem

Let $L$ be a language over an alphabet $\Sigma$. We define the Nerode relation ${\equiv_L} \subseteq \Sigma^* \times \Sigma^*$ as follows: for every $u, v \in \Sigma^*$,

  $u \equiv_L v \iff \forall w \in \Sigma^*: (uw \in L \iff vw \in L)$.

The Nerode relation partitions $\Sigma^*$ into equivalence classes.

Theorem (Myhill-Nerode Theorem). A language $L$ is recognizable iff the Nerode relation partitions $\Sigma^*$ into finitely many equivalence classes. [Baader, 2007]

Simple DTDs: Definition

A DTD is a tuple $(\Sigma, r, P)$ where $\Sigma$ is an alphabet, $r \in \Sigma$ is called the root label, and $P \subseteq \{a \to R \mid a \in \Sigma,\ R \in \mathrm{Reg}_\Sigma\}$ is a finite set of so-called productions.

Notation:
  $D_d$: the set of trees satisfying the DTD $d$
  $L(d) = [D_d]$: the set of string representations of the trees in $D_d$

Simple DTDs: Example

A DTD which is satisfied by the tree $r(a(c),\ a(b(c),\ c))$ can be $d = (\Sigma, r, P)$ where $\Sigma = \{r, a, b, c\}$ and

  $P = \{r \to a^*,\ a \to bc + c,\ b \to c,\ c \to \varepsilon\}$.

So $L(d) = \{r\}\,\{a c \bar{c} \bar{a},\ a b c \bar{c} \bar{b} c \bar{c} \bar{a}\}^*\,\{\bar{r}\}$.

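To make "a tree satisfies a DTD" concrete, here is a minimal Python sketch. It assumes single-character labels, trees given as `(label, children)` pairs, and productions written as Python regular expressions over child labels (so `bc + c` becomes `bc|c` and $\varepsilon$ becomes the empty pattern). The function name `satisfies` is hypothetical.

```python
import re


def satisfies(t, root, productions):
    """Check whether tree t = (label, children) satisfies the DTD.

    productions maps each label to a Python regex that the word formed
    by the labels of its children must match completely.
    """
    def ok(node):
        label, children = node
        word = "".join(child[0] for child in children)
        rule = productions.get(label)
        if rule is None or re.fullmatch(rule, word) is None:
            return False
        return all(ok(child) for child in children)

    return t[0] == root and ok(t)


# Productions r -> a*, a -> bc + c, b -> c, c -> epsilon
P = {"r": "a*", "a": "bc|c", "b": "c", "c": ""}
c = ("c", [])
t = ("r", [("a", [c]), ("a", [("b", [c]), c])])

print(satisfies(t, "r", P))           # True
print(satisfies(("r", [c]), "r", P))  # False: r may only have a-children
```
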
Specialized DTDs: Definition

Definition (specialized DTD). A specialized DTD over $\Sigma$ is a tuple $d = (\Sigma, \Sigma', d', \mu)$ where $\Sigma$ and $\Sigma'$ are alphabets, $d'$ is a DTD over $\Sigma'$, and $\mu: \Sigma' \to \Sigma$ is a mapping.

Specialized DTDs: Example

A specialized DTD which is satisfied only by the tree $r(a(c),\ a(b(c),\ c))$ is $d = (\Sigma, \Sigma', d', \mu)$ where

  $\Sigma = \{r, a, b, c\}$, $\Sigma' = \{r, x, y, b, c\}$,
  $d' = (\Sigma', r, P')$,
  $P' = \{r \to xy,\ x \to c,\ y \to bc,\ b \to c,\ c \to \varepsilon\}$,
  $\mu(\alpha) = a$ if $\alpha \in \{x, y\}$, and $\mu(\alpha) = \alpha$ otherwise.

So $L(d') = \{r\,x\,c\,\bar{c}\,\bar{x}\,y\,b\,c\,\bar{c}\,\bar{b}\,c\,\bar{c}\,\bar{y}\,\bar{r}\}$ and $L(d) = \{r\,a\,c\,\bar{c}\,\bar{a}\,a\,b\,c\,\bar{c}\,\bar{b}\,c\,\bar{c}\,\bar{a}\,\bar{r}\}$.

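The effect of the mapping $\mu$ can be illustrated with a short Python sketch: the tree is labelled over $\Sigma'$ first and relabelled to $\Sigma$ afterwards. The representation of trees as `(label, children)` pairs, the `/x` spelling for closing tags, and the helper names `to_string` and `relabel` are assumptions made for this illustration.

```python
def to_string(t):
    """String representation of a tree as a list of opening/closing tags."""
    label, children = t
    tokens = [label]
    for child in children:
        tokens.extend(to_string(child))
    tokens.append("/" + label)
    return tokens


def relabel(t, mu):
    """Apply the mapping mu to every label; unmapped labels stay unchanged."""
    label, children = t
    return (mu.get(label, label), [relabel(child, mu) for child in children])


# mu sends the specialized labels x and y to a and is the identity otherwise.
mu = {"x": "a", "y": "a"}
c = ("c", [])
t_prime = ("r", [("x", [c]), ("y", [("b", [c]), c])])  # the tree in D_{d'}

print(" ".join(to_string(t_prime)))
# r x c /c /x y b c /c /b c /c /y /r
print(" ".join(to_string(relabel(t_prime, mu))))
# r a c /c /a a b c /c /b c /c /a /r
```
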
Strong Validation: Definition

We call a (specialized) DTD $d$ strongly recognizable iff $L(d)$ is recognizable.

Strong Validation: Example (non-recursive DTD)

Again consider the DTD $d = (\Sigma, r, P)$ where $\Sigma = \{r, a, b, c\}$ and $P = \{r \to a^*,\ a \to bc + c,\ b \to c,\ c \to \varepsilon\}$.

The DTD $d$ is not recursive and the language $L(d)$ can be represented by the regular expression $r\,(a c \bar{c} \bar{a} + a b c \bar{c} \bar{b} c \bar{c} \bar{a})^*\,\bar{r}$. Hence, this DTD is strongly recognizable.

Strong Validation: Example (recursive DTD)

Let $d = (\Sigma, r, P)$ where $\Sigma = \{r, a\}$ and $P = \{r \to a,\ a \to a + \varepsilon\}$.

The DTD $d$ is obviously recursive, since $a$ may occur below $a$. Moreover $L(d) = \{r\,a^n\,\bar{a}^n\,\bar{r} \mid n \geq 1\}$, which is not a regular language. Hence, $d$ is not strongly recognizable.

Strong Validation: Theorem

Theorem. A specialized DTD is strongly recognizable iff it is non-recursive. [Segoufin & Vianu, 2002]

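Whether a DTD is recursive can be tested by looking for a cycle in the "may occur as a child of" graph. The sketch below is an illustration only: it assumes single-character labels, productions given as regex strings, and that every label mentioned in a production can actually occur in some tree. The function name `is_recursive` is hypothetical.

```python
def is_recursive(productions):
    """Return True if some label can occur below itself.

    The dependency graph has an edge a -> b whenever b appears in the
    production of a. A cycle in this graph means the DTD is recursive.
    """
    graph = {a: {ch for ch in rule if ch.isalpha()}
             for a, rule in productions.items()}
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {a: WHITE for a in graph}

    def visit(a):
        color[a] = GREY
        for b in graph[a]:
            if color.get(b, WHITE) == GREY:
                return True  # back edge: b is an ancestor of itself
            if b in graph and color[b] == WHITE and visit(b):
                return True
        color[a] = BLACK
        return False

    return any(color[a] == WHITE and visit(a) for a in graph)


print(is_recursive({"r": "a*", "a": "bc|c", "b": "c", "c": ""}))  # False
print(is_recursive({"r": "a", "a": "a|"}))                        # True
```

By the theorem, the first DTD is strongly recognizable and the second is not.
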
Strong Validation: Proof, Step 1

Let $d = (\Sigma, \Sigma', d', \mu)$ be a specialized DTD.

Step 1: $d$ is strongly recognizable $\Rightarrow$ $d$ is non-recursive.

Let $d$ be strongly recognizable. Then there exists an FSA $A$ which accepts $L(d)$.

Suppose $a \in \Sigma'$ is recursive with respect to $d'$. Then $d'$ and $d$ are recursive and there exists a tree $t \in D_{d'}$ such that $a$ repeats on a path of $t$. So $[t]$ has the form

  $[t] = r\,u_1\,a\,v_1\,a\,w\,\bar{a}\,v_2\,\bar{a}\,u_2\,\bar{r}$

where $u_1 u_2$ and $v_1 v_2$ are well-balanced words.

Since $a$ is recursive we can repeat the parts $a v_1$ and $v_2 \bar{a}$, and for all $n > 0$ the trees $t_n$ with

  $[t_n] = r\,u_1\,(a\,v_1)^n\,a\,w\,\bar{a}\,(v_2\,\bar{a})^n\,u_2\,\bar{r}$

are also in $L(d')$, and $A$ accepts $\mu([t_n])$.

However, with the Myhill-Nerode theorem we can show that $L(d)$ is not regular. There is an infinite number of equivalence classes of strings over $\Sigma \cup \bar{\Sigma}$ because

  $\forall i, j \geq 1: \quad i \neq j \ \Rightarrow\ \mu(r\,u_1\,(a\,v_1)^i) \not\equiv_{L(d)} \mu(r\,u_1\,(a\,v_1)^j)$.

This contradicts the assumption that $L(d)$ is regular and that $A$ recognizes $L(d)$. Hence, $d$ and $d'$ cannot be recursive.

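As a concrete instance of this argument, consider the language $L = \{r\,a^n\,\bar{a}^n\,\bar{r} \mid n \geq 1\}$ of the recursive DTD with productions $r \to a$ and $a \to a + \varepsilon$. Here $\mu$ is the identity and $u_1$ and $v_1$ are empty; the suffix chosen to separate the two prefixes is an illustration, not taken from the slides.

```latex
\[
  \text{For } i \neq j,\ i, j \geq 1: \qquad
  r\,a^{i} \cdot \bar{a}^{\,i}\,\bar{r} \in L
  \quad\text{but}\quad
  r\,a^{j} \cdot \bar{a}^{\,i}\,\bar{r} \notin L,
  \qquad\text{so}\qquad
  r\,a^{i} \not\equiv_{L} r\,a^{j}.
\]
```

The prefixes $r\,a^1, r\,a^2, r\,a^3, \dots$ therefore lie in pairwise different equivalence classes, so there are infinitely many classes.
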
Strong Validation: Proof, Step 2: FSA Construction

Let $d = (\Sigma, \Sigma', d', \mu)$ be a specialized DTD where

  $\Sigma = \{r, a, b\}$, $\Sigma' = \{\rho, \alpha, \beta\}$,
  $d' = (\Sigma', \rho, P')$,
  $P' = \{\rho \to \alpha,\ \alpha \to \beta + \varepsilon,\ \beta \to \varepsilon\}$,
  $\mu(\rho) = r$, $\mu(\alpha) = a$, $\mu(\beta) = b$.

Since $d$ is not recursive there exists a strongly validating FSA. Our automata $A_b$ for every $b \in \Sigma'$ are:

  $A_\rho$: start state $q_{0,\rho}$, reads $r$, then $\alpha$, then $\bar{r}$
  $A_\alpha$: start state $q_{0,\alpha}$, reads $a$, then optionally $\beta$, then $\bar{a}$
  $A_\beta$: start state $q_{0,\beta}$, reads $b$, then $\bar{b}$

Now we build the target automaton step by step. Our $A_0$ is equal to $A_\rho$. In each step, every transition labelled with a symbol of $\Sigma'$ is replaced by $\varepsilon$-transitions into and out of a copy of the automaton for that symbol:

  $A_0 = A_\rho$: reads $r\ \alpha\ \bar{r}$
  $A_1$ (the $\alpha$-transition replaced by $A_\alpha$): reads $r\ \varepsilon\ a\ (\beta + \varepsilon)\ \bar{a}\ \varepsilon\ \bar{r}$
  $A_2$ (the $\beta$-transition replaced by $A_\beta$): reads $r\ \varepsilon\ a\ (\varepsilon\ b\ \bar{b}\ \varepsilon + \varepsilon)\ \bar{a}\ \varepsilon\ \bar{r}$

Since $A_2$ contains no symbols from $\Sigma'$ anymore, the following automata $A_3, A_4, \dots$ will be the same. So $A_2$ is the desired automaton.

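The substitution idea behind $A_0, A_1, A_2$ can be imitated with regular expressions instead of automata. The Python sketch below is a regex-based stand-in, not the automaton construction itself. It assumes that the uppercase letters `R`, `A`, `B` stand for $\rho$, $\alpha$, $\beta$, that tags are written as `<x>` and `</x>`, and that productions are Python regex strings; the function name `expand` is hypothetical.

```python
import re

# Productions of d' (R, A, B stand in for rho, alpha, beta) and the mapping mu.
P = {"R": "A", "A": "B|", "B": ""}
mu = {"R": "r", "A": "a", "B": "b"}


def expand(symbol):
    """Regex for the string representations of all trees rooted at symbol.

    Every specialized symbol in the production is replaced by its own
    expansion, wrapped in the tags of its image under mu. The recursion
    terminates only because the DTD is non-recursive.
    """
    body = "".join(expand(ch) if ch in P else ch for ch in P[symbol])
    tag = mu[symbol]
    return f"(?:<{tag}>(?:{body})</{tag}>)"


validator = re.compile(expand("R"))

print(bool(validator.fullmatch("<r><a><b></b></a></r>")))  # True
print(bool(validator.fullmatch("<r><a></a></r>")))          # True
print(bool(validator.fullmatch("<r><a><a></a></a></r>")))   # False
```

For a recursive DTD, `expand` would not terminate, which mirrors the fact that the sequence of automata would never stabilize.
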
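Validating well-formed XML Documents: Example (recognizable DTD)

Consider again the DTD $d = (\Sigma, r, P)$ where $\Sigma = \{r, a\}$ and $P = \{r \to a,\ a \to a + \varepsilon\}$.

There is a regular language $L_R$ such that $L(d) = [T_\Sigma] \cap L_R$.

Let $L_R$ for example be $L(r\,a^+\,\bar{a}^*\,\bar{r})$. Then $L_R = \{r\,a^m\,\bar{a}^n\,\bar{r} \mid m \geq 1,\ n \geq 0\}$ and $[T_\Sigma] \cap L_R = \{r\,a^n\,\bar{a}^n\,\bar{r} \mid n \geq 1\} = L(d)$.

Or let $L_R$ for example be $L(r\,a^+\,\bar{a}^+\,\bar{r})$, which has the same intersection with $[T_\Sigma]$. So $L_R$ is not uniquely determined.
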
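The identity $L(d) = [T_\Sigma] \cap L_R$ can be checked on small inputs with a Python sketch. It assumes that tags are written as `<x>` and `</x>`, that the input consists of tags only, and that $L_R = L(r\,a^+\,\bar{a}^*\,\bar{r})$; the names `well_formed` and `in_L_d` are hypothetical.

```python
import re

# L_R = L(r a+ abar* rbar), written over <x> / </x> tags.
L_R = re.compile(r"<r>(?:<a>)+(?:</a>)*</r>")


def well_formed(s):
    """Stack-based check that opening and closing tags are properly nested."""
    stack = []
    for closing, name in re.findall(r"<(/?)(\w+)>", s):
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            return False
    return not stack


def in_L_d(s):
    """Membership in L(d), decided as membership in [T_Sigma] and in L_R."""
    return well_formed(s) and L_R.fullmatch(s) is not None


print(in_L_d("<r><a><a></a></a></r>"))                 # True
print(L_R.fullmatch("<r><a><a></a></r>") is not None)  # True: the word is in L_R
print(in_L_d("<r><a><a></a></r>"))                     # False: it is not well-formed
```

If the input is already known to be well-formed, only the regular-expression test remains, which a finite automaton can perform.
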
Validating well-formed XML Documents: Example (not recognizable DTD)

Let $d = (\Sigma, b, P)$ be a DTD where $\Sigma = \{a, b, c\}$ and $P = \{b \to ab + bc + \varepsilon,\ a \to \varepsilon,\ c \to \varepsilon\}$.

[Figure: Graphical representation for a tree in $D_d$.]

References

Baader, Prof. Dr.-Ing. Franz. 2007 (March). Skript zur Lehrveranstaltung Grundlagen der Theoretischen Informatik.

Segoufin, Luc, & Vianu, Victor. 2002. Validating streaming XML documents. Pages 53-64 of: Symposium on Principles of Database Systems. Association for Computing Machinery.