XML and Databases. Lecture 5 XML Validation using Automata. Sebastian Maneth NICTA and UNSW


1 XML and Databases. Lecture 5: XML Validation using Automata. Sebastian Maneth, NICTA and UNSW. CSE@UNSW -- Semester 1, 2009

2 Outline
1. Recap: deterministic regular expressions / Glushkov automaton
2. Complexity of DTD validation
3. Beyond DTDs: XML Schema and RELAX NG
4. Static methods, based on tree automata

3 Previous Lecture
XML type definition languages: we want to specify a certain subset of XML documents (= a type of XML documents).
Remember: the specification / type definition should be simple, so that a validator can be built automatically (and efficiently), and so that the validator runs efficiently on any XML input (similar demands as for a parser).
The type definition language must be SIMPLE! (Similarly, parser generators use EBNF or smaller subclasses such as LL / LR, because general context-free parsing takes O(n^3) time.)

4 XML Type Definition Languages
DTD (Document Type Definition, W3C): originated from SGML and is now part of XML. A DTD may appear at the beginning of an XML document.
XML Schema (W3C): now at version 1.1. A HUGE language with many built-in simple types. Regular expressions must be deterministic (= 1-unambiguous), the same as for DTDs; XML Schema calls this Unique Particle Attribution. Schemas themselves are written in XML. See the Schema Primer.
RELAX NG (OASIS): for tree structure definition, more powerful than Schemas and DTDs.

5 XML Type Definition Languages: DTD (Document Type Definition)
<!DOCTYPE root-element [ doctype declaration ]>
<!ELEMENT element-name content-model>
Content models: EMPTY, ANY, (#PCDATA | elem-name_1 | ... | elem-name_n)*, or a deterministic regular expression over element names. The deterministic regular expressions are the most interesting / challenging aspect of DTDs.
<!ATTLIST element-name attr-name attr-type attr-default ...>
Types: CDATA, (v1 | ...), ID, IDREFs. Defaults: #REQUIRED, #IMPLIED, value, #FIXED.

7 Summary
In order to check whether a (large) document is valid with respect to a given DTD ("it validates"), you need to check whether the children lists match the given regular expressions. This can be done efficiently, using finite automata (FAs)!
To check whether a regular expression e is allowed in a DTD, we have to construct a particular finite automaton: the Glushkov automaton Glu(e). Glu(e) must be deterministic.
Note: If Glu(e) is deterministic, then its size (# transitions) is linear in size(e)!
Question: Can you explain why this is the case?
Answer: The note is not quite correct: the size is linear in size(e) * #letters(e). A deterministic automaton has at most one outgoing transition per state and letter, and Glu(e) has one state per position of e plus the begin state.

10 More Notes
(1) From a deterministic FA you cannot necessarily obtain a deterministic (= 1-unambiguous) regular expression!! Example: e = (a | b)* a (a | b). NO 1-unambiguous regular expression exists for the language of e.
(2) Glu(e) is closely related to Thompson(e) [remove ε-transitions], to Berry/Sethi(e) [same], and to Brzozowski(e).
For more details, see the paper by Brüggemann-Klein, linked from the course web page.

12 Glushkov automaton Glu(e)
Each letter position in the regular expression e becomes one state of Glu(e); in addition, Glu(e) has one extra begin state.
FIRST( e ) = all possible begin positions of words matching e, e.g. FIRST( R (E | G) (E X)* ) = { R_1 }.

13 Following slides from:
14-18 [Figures: the letter positions of the example expression are numbered (position 1, position 2, ...), and FIRST( e ) is marked.]

19 Glushkov automaton G(e)
Each position in the regular expression e becomes one state of G(e); in addition, G(e) has one extra begin state.
FIRST( e ) = all possible begin positions of words matching e, e.g. FIRST( R (E | G) (E X)* ) = { R_1 }.
FOLLOW( e, x ) = all possible positions following position x in e, e.g. FOLLOW( R (E | G) (E X)*, R_1 ) = { E_2, G_3 }. From state R_1: add an E-transition to E_2 and a G-transition to G_3.

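20-25 [Figures: the FOLLOW set of each position of e = R (E | G) (E X)*, with positions R_1, E_2, G_3, E_4, X_5.]
FOLLOW( e, R_1 ) = { E_2, G_3 }
FOLLOW( e, E_2 ) = { E_4 }
FOLLOW( e, G_3 ) = { E_4 }
FOLLOW( e, E_4 ) = { X_5 }
FOLLOW( e, X_5 ) = { E_4 }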

26 Glushkov automaton G(e), continued
LAST( e ) = all possible end positions of words matching e, e.g. LAST( R (E | G) (E X)* ) = { E_2, G_3, X_5 }. The states in LAST( e ) are the accepting states of G(e) (together with the begin state, if e matches the empty word).
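
The FIRST / FOLLOW / LAST definitions above can be turned into a short program. The following is a minimal sketch, not the lecture's own code: the tuple encoding of regular expressions and the names glushkov and is_deterministic are assumptions made for illustration. It computes the three sets in one pass over the expression and then checks determinism by looking for two positions with the same letter in FIRST( e ) or in some FOLLOW( e, x ).

```python
from itertools import count


def glushkov(expr):
    """Compute FIRST, LAST, FOLLOW for a regular expression.

    Assumed AST encoding: ("sym", a), ("cat", e1, e2), ("alt", e1, e2),
    ("star", e1). Positions are numbered 1, 2, ... from left to right.
    """
    counter = count(1)
    letter = {}   # position -> letter at that position
    follow = {}   # position -> set of positions that may follow it

    def walk(e):
        kind = e[0]
        if kind == "sym":
            p = next(counter)
            letter[p] = e[1]
            follow[p] = set()
            return {p}, {p}, False
        if kind == "star":
            first, last, _ = walk(e[1])
            for p in last:            # loop back to the start of the body
                follow[p] |= first
            return first, last, True
        f1, l1, n1 = walk(e[1])
        f2, l2, n2 = walk(e[2])
        if kind == "alt":
            return f1 | f2, l1 | l2, n1 or n2
        # kind == "cat": every end of e1 may be followed by a start of e2
        for p in l1:
            follow[p] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return first, last, n1 and n2

    first, last, nullable = walk(expr)
    return first, last, follow, nullable, letter


def is_deterministic(first, follow, letter):
    """G(e) is deterministic iff no set offers the same letter twice."""
    def clash(positions):
        letters = [letter[p] for p in positions]
        return len(letters) != len(set(letters))
    return not clash(first) and not any(clash(s) for s in follow.values())


def sym(a):
    return ("sym", a)


# e = R (E | G) (E X)*, the running example of the slides
example = ("cat",
           ("cat", sym("R"), ("alt", sym("E"), sym("G"))),
           ("star", ("cat", sym("E"), sym("X"))))
first, last, follow, nullable, letter = glushkov(example)
print(is_deterministic(first, follow, letter))   # True

# (a | b)* a (a | b): FIRST contains two a-positions
bad = ("cat",
       ("cat", ("star", ("alt", sym("a"), sym("b"))), sym("a")),
       ("alt", sym("a"), sym("b")))
f, l, fo, n, le = glushkov(bad)
print(is_deterministic(f, fo, le))               # False
```

This sketch recomputes set unions at every concatenation and star, so it corresponds to the simple construction rather than to an optimized one.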

27-28 [Figure: the resulting automaton.] Question: Is this automaton deterministic?

29 Glushkov automaton G(e)
Another example: (a* | b)*. [Figure: its Glushkov automaton, with states 1, 2, 3 and transitions labeled a and b.] This FA is deterministic.
Which of these is deterministic? (ab) | (ac), a (b | c), (a | b)* c

30 Glushkov automaton G(e): cost of the construction
Naive implementation: O(n^3) time, where n = size( e ). For each position, computing FOLLOW goes through every position, and at each step it needs to compute a union: O( n*n*n ).
This is not really needed: the construction can be improved to O(n^2), and further to O( size(e) + size(G(e)) ).

33 Glushkov automaton G(e): size
Note: If G(e) is deterministic, then its size (# transitions) is at most quadratic in size(e)! More precisely, it is linear in size(e) * #letters(e), i.e. O( size(e) * #letters(e) ), if G(e) is deterministic.

34 How can you implement a regular expression?
Input: regular expression e, string w. Question: Does w match e? Let n = length(w) and m = size(e).
Algorithm: FA = BuildFA(e); DFA = BuildDFA(FA); then run the DFA on w. The size of FA is linear in size(e) = m; the size of DFA is exponential in m; running a deterministic FA on w takes time linear in length(w).
Unrestricted regular expression e: total running time O(n + 2^m). Other alternative: O(nm).
To avoid these expensive running times, DTD requires that FA = G(e) must be deterministic! Total running time: O(n + m) if s = #letters(e) is assumed fixed (not part of the input); otherwise O(n + ms).
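
The two ways of running an automaton compared on this slide can be sketched as follows. This is an illustration under assumptions: the transition tables are written out by hand (state 0 is the begin state, states 1 to 5 are the positions), and the function names run_dfa and run_nfa are made up. The first table is the deterministic Glushkov automaton of R (E | G) (E X)*; the second is the nondeterministic Glushkov automaton of (a | b)* a (a | b), simulated by tracking the set of reachable states, which is the O(nm) alternative.

```python
def run_dfa(delta, accepting, word, start=0):
    """One table lookup per input letter: time linear in len(word)."""
    state = start
    for letter in word:
        state = delta.get((state, letter))
        if state is None:          # no transition: reject
            return False
    return state in accepting


def run_nfa(delta, accepting, word, start=0):
    """Track all reachable states: O(len(word) * number of states) set steps."""
    current = {start}
    for letter in word:
        current = {t for s in current for t in delta.get((s, letter), ())}
    return bool(current & accepting)


# G(e) for e = R (E | G) (E X)*: positions R_1, E_2, G_3, E_4, X_5
G_DET = {(0, "R"): 1, (1, "E"): 2, (1, "G"): 3,
         (2, "E"): 4, (3, "E"): 4, (4, "X"): 5, (5, "E"): 4}
LAST_DET = {2, 3, 5}

# G(e) for e = (a | b)* a (a | b): positions a_1, b_2, a_3, a_4, b_5
G_NONDET = {(0, "a"): {1, 3}, (0, "b"): {2},
            (1, "a"): {1, 3}, (1, "b"): {2},
            (2, "a"): {1, 3}, (2, "b"): {2},
            (3, "a"): {4}, (3, "b"): {5}}
LAST_NONDET = {4, 5}

print(run_dfa(G_DET, LAST_DET, ["R", "G", "E", "X"]))   # True
print(run_dfa(G_DET, LAST_DET, ["R", "E", "E"]))        # False
print(run_nfa(G_NONDET, LAST_NONDET, "bab"))            # True
print(run_nfa(G_NONDET, LAST_NONDET, "ba"))             # False
```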

35 Summary
Deterministic (1-unambiguous) content models give rise to efficient matching algorithms (they avoid the O(nm) or O(n + 2^m) complexities).
Disadvantages: It is hard to know whether a given regular expression is OK (deterministic). Deterministic regular expressions are NOT closed under union (not so nice).
Question: Can you see why? Hint: find deterministic regular expressions e1 and e2 such that their union is equal to (a | b)* a (a | b).

36 Now that we know how to check all the different content models (in particular deterministic regular expressions), how do we build a full validator for a DTD?
Each element name elem-name_1, elem-name_2, ..., elem-name_k has a content model RegExpr_1, RegExpr_2, ..., RegExpr_k, which gives automata A_1, A_2, ..., A_k.
The Validation Problem: Given a DTD T and a document D, is D valid with respect to T?
Top-down implementation: at an element node with label elem-name_i, run automaton A_i on its children list; check attribute constraints; check ID/IDREF constraints.
Total running time (given A_1, A_2, ..., A_k): linear in the sum of the sizes of the DTD and the document, O( size(T) + size(D) ).
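
The top-down implementation can be sketched in a few lines. The sketch makes several assumptions that are not part of the lecture: the DTD is given as a Python dictionary CONTENT_MODELS (a made-up name) from element names to content models, Python's re module stands in for the automata A_i (it is a backtracking matcher, not a Glushkov automaton), and attribute and ID/IDREF checks are left out. Each children list is written as a string of element names, each followed by a space.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD for a car dealer document: element name -> content model
CONTENT_MODELS = {
    "dealer": re.compile(r"used new "),
    "used": re.compile(r"(car )*"),
    "new": re.compile(r"(car )*"),
    "car": re.compile(r"model (year )?"),
    "model": re.compile(r""),   # no child elements (text only)
    "year": re.compile(r""),
}


def validate(root, models, root_name):
    """Visit every element once and match its children list."""
    if root.tag != root_name:
        return False
    stack = [root]
    while stack:
        node = stack.pop()
        model = models.get(node.tag)
        if model is None:               # undeclared element name
            return False
        children = "".join(child.tag + " " for child in node)
        if not model.fullmatch(children):
            return False
        stack.extend(node)              # continue with the children
    return True


good = ET.fromstring(
    "<dealer><used><car><model>x</model><year>1999</year></car></used>"
    "<new><car><model>y</model></car></new></dealer>")
bad = ET.fromstring("<dealer><new/><used/></dealer>")
print(validate(good, CONTENT_MODELS, "dealer"))   # True
print(validate(bad, CONTENT_MODELS, "dealer"))    # False
```

Every element is visited once and every children list is read once, which mirrors the linear running time stated above as long as each content model is matched in linear time.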

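37 DTDs have the label-guarded subtree exchange property: let t1, t2 be trees in a DTD language T, v1 a node in t1 labeled lab, and v2 a node in t2 labeled lab. Then the trees obtained by exchanging the subtrees rooted at v1 and v2 are also in T. [Figure: trees t1 and t2 with nodes v1 and v2, both labeled lab.] This property is also known as "local": the content model only depends on the label of the parent.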

38 Beyond DTDs
Often, the expressive power of DTDs is not sufficient. Problem: each element name has precisely one content model in a DTD. We would like to distinguish, depending on the context (parent).
[Figure: a dealer with children used and new; a car below used has children model and year, a car below new has only model.] A car has a different structure in different contexts. Specialization: treat the two kinds of car as car_used and car_new.

40 Specialized DTDs
dealer -> used, new
used -> (car_used)*
new -> (car_new)*
car_used -> model, year
car_new -> model
New notation, using capitalized TYPE names:
Dealer -> dealer [Used, New]
Used -> used [(Car_used)*]
New -> new [(Car_new)*]
Car_used -> car [Model, Year]
Car_new -> car [Model]
[Figure: the dealer tree, once labeled with the specialized names car_used and car_new and once with the plain element name car.]

42 Let us call this new concept a grammar. The dealer grammar, with its two rules Car_used and Car_new for the element name car, is not local.
The local restriction: A grammar G is local if for any label[RegExpr_1], label[RegExpr_2] present in G it holds that RegExpr_1 = RegExpr_2. By definition: every DTD is a local grammar, and vice versa.
The single-type restriction, first attempt: A grammar G is single-type if for any label[RegExpr_1], label[RegExpr_2] occurring in the same rule of G it holds that RegExpr_1 = RegExpr_2. This formulation is WRONG.

44 Alternatively: call two TYPE names T1 and T2 competing if they have the same element name (but not identical rules). In the dealer grammar, Car_used and Car_new are competing.
Classes of Grammars:
local: no competing TYPE names! (DTDs)
single-type: TYPE names in the same content model do not compete! (XML Schemas)
regular: no restriction (RELAX NG)
Question: Are there single-type grammars (XML Schemas) which cannot be expressed by local grammars (DTDs)?

46 Are there single-type grammars (XML Schemas) which cannot be expressed by local grammars (DTDs)? YES! For example:
Person -> person [PersonName, Gender, Spouse?, Pet*]
PersonName -> name [First, Last]
Pet -> pet [Kind, PetName]
PetName -> name [#PCDATA]
PersonName and PetName are competing, but they are not in the same content model.
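
The three classes can be told apart mechanically from the competing TYPE names. The sketch below is an illustration only: the dictionary encoding of a grammar and the name classify are assumptions, content models are reduced to the set of TYPE names they mention, start symbols are ignored, and two different TYPE names with the same element name are always treated as competing (that is, identical rules are assumed to have been merged beforehand).

```python
from itertools import combinations


def classify(rules):
    """rules maps TYPE name -> (element name, set of TYPE names in its content)."""
    competing = {
        frozenset((t1, t2))
        for t1, t2 in combinations(rules, 2)
        if rules[t1][0] == rules[t2][0]
    }
    if not competing:
        return "local"
    for _label, content in rules.values():
        # two competing TYPE names inside one content model
        if any(pair <= content for pair in competing):
            return "regular"
    return "single-type"


DEALER = {
    "Dealer": ("dealer", {"Used", "New"}),
    "Used": ("used", {"CarUsed"}),
    "New": ("new", {"CarNew"}),
    "CarUsed": ("car", {"Model", "Year"}),
    "CarNew": ("car", {"Model"}),
    "Model": ("model", set()),
    "Year": ("year", set()),
}
PERSON = {
    "Person": ("person", {"PersonName", "Gender", "Spouse", "Pet"}),
    "PersonName": ("name", {"First", "Last"}),
    "Pet": ("pet", {"Kind", "PetName"}),
    "PetName": ("name", set()),
}
print(classify(DEALER))   # single-type
print(classify(PERSON))   # single-type
```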

47 Through the use of TYPE names (nonterminals / states) you can distinguish deep context! In the dealer grammar, the specialization of car into Car_used and Car_new is determined through the parent.
Can we model context that is far away from the specialized node? Sure! [Figure: a document root -> db with a dealer followed by a special user; the car below used then has the children model, year and #owners.] Here the specialization is determined through a following node.

50 Through the use of TYPE names (nonterminals / states) you can distinguish deep context! Can we model context that is far away from the specialized node? Sure!
DB -> db [Dealer, User]
Dealer -> dealer [Used, New]
Used -> used [(Car_used)*]
New -> new [(Car_new)*]
Car_used -> car [Model, Year]
Car_new -> car [Model]
Doc -> root [DB | sDB]
sDB -> db [sDealer, sUser]
sDealer -> dealer [sUsed, New]
sUsed -> used [(sCar_used)*]
sCar_used -> car [Model, Own, Year]
[Figure: the document tree annotated with the special TYPE names sDB, sDealer, sUsed, sCar_used and sUser.]
Question: Sure, this grammar is not local (DTD). But is it single-type?

52 [Figure: two trees side by side with nodes root, db, dealer, used and new; one of them contains a special node.] Question: Is this grammar single-type?

53 Previous example: probably not expressible in single-type (XML Schema).
Other example (a person's spouse must have the opposite gender):
Person -> MPerson | FPerson
MPerson -> person[Name, gender[Male], FSpouse?, Children?]
FPerson -> person[Name, gender[Female], MSpouse?, Children?]
Male -> male[]
Female -> female[]
FSpouse -> spouse[Name, gender[Female]]
MSpouse -> spouse[Name, gender[Male]]
Children -> children[Person+]
Note: This example and the Pet example are taken from Hosoya's book (see course web page).
Here MPerson and FPerson are competing, but not inside a content model. BUT, is this even a grammar in our sense? NO! A regular expression is only allowed inside content ("under an element name"). With the first rule written as
Person -> root[MPerson | FPerson]
the competing TYPE names occur in the same content model.

56 Classes of XML Type Formalisms, in order of increasing expressiveness for defining sets of trees ("tree languages"):
local: no competing TYPE names! (DTDs)
single-type: TYPE names in the same content model do not compete! (XML Schemas)
regular: no restriction (RELAX NG)
Questions:
Given two DTDs D1 and D2, can we check whether all documents valid for D1 are also valid for D2 (DTD inclusion problem)? Whether D1 and D2 describe the same set of documents (DTD equality problem)?
Given a RELAX NG grammar G, can we check whether there exists any document that is valid for G (emptiness problem)? Whether there is a document valid for G and valid for G2 (intersection & emptiness)?
If we can do it for regular tree grammars, then it also works for single-type / local grammars!!

58 All of these checks can be done automatically for regular tree grammars, which are equivalent to tree automata! Tree automata are a very powerful framework: they have all the good properties of string automata, yet they are more expressive!

59 Note: String automata are not sufficient to check DTDs / Schemas! Even if we only consider well-bracketed strings!
Example 1: c -> c[a, c, b]; a -> empty; b -> empty; c -> empty
Example 2: a -> a[c, a] | a[a, b]; a / b / c -> empty

60 Finite-state automata are important: constant memory computation.
Think you are in a maze, with only a fixed memory, and you can only read the maze (you cannot mark anything). [Figure: a maze grid with wall cells and empty cells.]
Model this by a finite automaton: in state q1, depending on which of the neighbouring cells (N, S, E, W) are walls, it switches to a state q2 and moves one step N, S, E or W; likewise from q2 to q3, and so on.
Can an automaton search the maze? No!! It needs markers ("pebbles"). How many? 5? 2?

62 Finite-state automata are important (constant memory computation). In our context, e.g., for
KMP (efficient string matching) [Knuth/Morris/Pratt] and its generalization using automata, used, e.g., in grep;
compression;
static analysis of schemas & queries (= everything you can do *before* running over the actual data).

63 4. Static Methods, based on Tree Automata
Person -> MPerson | FPerson
MPerson -> person [Name, gender[Male], FSpouse?]
FPerson -> person [Name, gender[Female], MSpouse?]
Regular tree grammar: rules of the form TypeName -> Tree, where leaves may be labeled by TypeNames. [Figure: the right-hand side of the MPerson rule drawn as a tree: person with children Name, gender (with child Male) and FSpouse.]
Alternatively, regular tree languages are defined by tree automata, with transitions of the form (state, element-name) -> (state1, state2); conventionally, these are defined for binary / ranked trees.

64 4. Static Methods, based on Tree Automata
Given grammars D1 and D2, can we check whether
all documents valid for D1 are also valid for D2 (inclusion problem)?
D1 and D2 describe the same set of documents (equality problem)?
there exists any document that is valid for D1 (emptiness problem)?
there is a document valid for D1 *and* valid for D2 (intersection & emptiness)?
ALL these checks are possible for regular tree grammars!! Hence, they are also solvable for DTDs / XML Schemas / RELAX NG: (1) use binary tree encodings; (2) translate the XML type definition to a tree grammar (easy).
The checks above give rise to very powerful optimization procedures for XML databases! For example: documents d_1, d_2, ..., d_n are valid for your schema Small_xhtml. Are they also valid for schema XHTML? Check the inclusion problem for Small_xhtml and XHTML!
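
Of the four checks, emptiness is the simplest to sketch. The code below is an illustration under simplifying assumptions, not the construction used in the lecture: a grammar is a dictionary from TYPE names to alternatives, each alternative is an element name with a fixed list of child TYPE names (no regular expressions in the content), and the names productive_types and is_empty are made up. A TYPE name is productive if it derives at least one finite tree; the grammar is empty if its start TYPE is not productive.

```python
def productive_types(rules):
    """rules: TYPE name -> list of (element name, [child TYPE names])."""
    productive = set()
    changed = True
    while changed:                      # iterate until nothing new is found
        changed = False
        for name, alternatives in rules.items():
            if name in productive:
                continue
            if any(all(c in productive for c in children)
                   for _label, children in alternatives):
                productive.add(name)
                changed = True
    return productive


def is_empty(rules, start):
    return start not in productive_types(rules)


# A -> a[A] | a[] derives finite trees; B -> b[B] never terminates
NONEMPTY = {"Doc": [("root", ["A"])], "A": [("a", ["A"]), ("a", [])]}
EMPTY = {"Doc": [("root", ["B"])], "B": [("b", ["B"])]}
print(is_empty(NONEMPTY, "Doc"))   # False
print(is_empty(EMPTY, "Doc"))      # True
```

Intersection would additionally need a product construction on the two grammars, which is not shown here.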

66-70 Automata on words
[Figure: a run of an automaton with states 1, 2, 3 on an input word over the letters a and b, shown step by step; at the end the word is accepted by the automaton.]
Expressiveness: deterministic = nondeterministic (determinization can cause an exponential blow-up; nondeterministic automata can be exponentially more succinct); left-to-right = right-to-left.

71-75 Automata on trees
1. bottom-up: LABEL( state1, state2 ) -> state
0() -> F
1() -> T
OR( F, F ) -> F
OR( F, T ) -> T
AND( T, F ) -> F
Accepting States = { T }
[Figure: a bottom-up run on a tree with inner nodes labeled OR and AND and leaves labeled 0 and 1; each node is annotated with its state F or T, and the root reaches T.] The tree is accepted by the automaton.
This automaton is deterministic. Nondeterminism: LABEL( st1, st2 ) -> { st3, ... }

76 Question: How much memory do you need exactly, to run such a bottom-up tree automaton?
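
A deterministic bottom-up tree automaton is a lookup table from (label, child states) to a state. The sketch below follows the Boolean example of the slides; the slides list only some of the transitions, so the remaining OR and AND entries are filled in here as an assumption, and the tree encoding (label, list of children) and the function names are made up for illustration.

```python
# (label, states of the children) -> state
DELTA = {
    ("0", ()): "F",
    ("1", ()): "T",
    ("OR", ("F", "F")): "F",
    ("OR", ("F", "T")): "T",
    ("OR", ("T", "F")): "T",    # not on the slide, assumed
    ("OR", ("T", "T")): "T",    # not on the slide, assumed
    ("AND", ("F", "F")): "F",   # not on the slide, assumed
    ("AND", ("F", "T")): "F",   # not on the slide, assumed
    ("AND", ("T", "F")): "F",
    ("AND", ("T", "T")): "T",   # not on the slide, assumed
}
ACCEPTING = {"T"}


def run_bottom_up(tree):
    """Return the state reached at the root of tree = (label, children)."""
    label, children = tree
    child_states = tuple(run_bottom_up(c) for c in children)
    return DELTA[(label, child_states)]


def accepts(tree):
    return run_bottom_up(tree) in ACCEPTING


# OR( AND(1, 0), OR(0, 1) )
TREE = ("OR", [("AND", [("1", []), ("0", [])]),
               ("OR", [("0", []), ("1", [])])])
print(run_bottom_up(TREE))   # T
print(accepts(TREE))         # True
```

In this recursive formulation the only memory needed beyond the input is the call stack, whose depth equals the height of the tree, with one state remembered per finished child of each node on the current path.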

77 Similarly as for word automata: For every nondeterministic bottom-up tree automaton there is an equivalent deterministic bottom-up tree automaton. Again, the construction can cause an exponential size blow-up.
2. top-down: (state, LABEL) -> (state1, state2)
Example, with a, b = binary node labels and e, $ = leaf node labels: the top-most a-node on the left-most path must have a right subtree which contains a $-leaf.
begin, a -> (any, find$)
begin, b -> (begin, any)
find$, a/b -> { (find$, any), (any, find$) }
find$, $ -> ACC
find$, e -> REJ
any, a/b -> (any, any)
any, $/e -> ACC
This automaton is nondeterministic: accept a tree if there exists an accepting run (= all leaves go to ACC).
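
The nondeterministic top-down run can be sketched by trying every choice of child states. This is an illustration under assumptions: leaves are encoded as the strings "e" and "$", binary nodes as tuples (label, left, right), and the table and function names are made up. The transitions are the ones listed above; a (state, leaf) pair that is not listed as ACC is rejected.

```python
# (state, label of a binary node) -> possible (left state, right state) pairs
BINARY = {
    ("begin", "a"): [("any", "find$")],
    ("begin", "b"): [("begin", "any")],
    ("find$", "a"): [("find$", "any"), ("any", "find$")],
    ("find$", "b"): [("find$", "any"), ("any", "find$")],
    ("any", "a"): [("any", "any")],
    ("any", "b"): [("any", "any")],
}
# (state, leaf label) pairs that go to ACC
LEAF_ACCEPT = {("find$", "$"), ("any", "$"), ("any", "e")}


def accepts(tree, state="begin"):
    """True if some run started in `state` sends all leaves to ACC."""
    if isinstance(tree, str):                      # leaf node
        return (state, tree) in LEAF_ACCEPT
    label, left, right = tree
    return any(
        accepts(left, s1) and accepts(right, s2)
        for s1, s2 in BINARY.get((state, label), [])
    )


# b( a( e, b(e, $) ), e ): the top-most a on the left-most path
# has the right subtree b(e, $), which contains a $-leaf
GOOD = ("b", ("a", "e", ("b", "e", "$")), "e")
# a( $, e ): the right subtree of the a-node contains no $-leaf
BAD = ("a", "$", "e")
print(accepts(GOOD))   # True
print(accepts(BAD))    # False
```

The search over choices can take exponential time in the worst case; it is meant to mirror the definition of acceptance, not to be an efficient evaluator.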

79 Question: Can you find an equivalent bottom-up automaton for this top-down example? Yes, you can!
For every nondeterministic bottom-up tree automaton there is an equivalent deterministic bottom-up tree automaton, and there is an equivalent nondeterministic top-down tree automaton.
Question: Is there an equivalent deterministic top-down automaton??

82 Question: Is there an equivalent deterministic top-down automaton?? NO! [Figure: the two trees name(first, last) and name(last, first).] This set of two trees cannot be recognized by any deterministic top-down tree automaton!! Why?
Questions: What about local tree languages (defined by DTDs)? Can they be accepted by deterministic top-down automata? What about single-type tree languages (defined by XML Schemas)? Can they be accepted by deterministic top-down automata? Yes! Hence, there is no DTD / Schema for { name[first,last], name[last,first] }.

85 For every deterministic bottom-up tree automaton there exists a unique minimal equivalent one! Equivalence is decidable.
In fact, YOU have already produced minimal bottom-up tree automata! The minimal DAG of a tree t can be seen as the unique minimal tree automaton that accepts only the tree t.
Question: How expensive (complexity) is it to find the minimal one? Same as for word automata?
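
The connection between the minimal DAG and a tree automaton can be made concrete with a small sketch. The encoding of trees as (label, list of children) and the name minimal_dag are assumptions for illustration. Equal subtrees are given the same node id; each table entry (label, child ids) -> id then reads as one bottom-up transition LABEL( state1, ..., statek ) -> state, with the root id as the only accepting state.

```python
def minimal_dag(tree):
    """Return (root id, table), where table maps (label, child ids) -> id."""
    table = {}

    def share(node):
        label, children = node
        key = (label, tuple(share(c) for c in children))
        if key not in table:           # first time this subtree is seen
            table[key] = len(table)
        return table[key]

    return share(tree), table


# f( g(a), g(a) ): five tree nodes, but only three distinct subtrees
T = ("f", [("g", [("a", [])]), ("g", [("a", [])])])
root, table = minimal_dag(T)
print(root)         # 2
print(len(table))   # 3
```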

87 Tree automata are a very useful concept in CS!
Heavily used in verification: derive a property of a complex object from the properties of its constituents, e.g. g = glue_1( g_1, glue_2( h_1, h_2 ) ). [Figure: the graph g and its construction tree with inner nodes glue_1, glue_2 and leaves g_1, h_1, h_2.]
Do all graphs / chip layouts produced in this way have property P? Use the hierarchical construction history of an object in order to work on a parse tree instead of a complex graph. From there, use tree automata. Many NP-complete graph problems become tractable on bounded-treewidth graphs!

88 XML: tree automata play a crucial role for
Efficient validators against XML types. Optimizations: if doc1 is of TYPE1, then there is no need to validate it against TYPE2 if we know that TYPE1 is included in TYPE2; if the types are only slightly different, then we only need to validate where they differ; incremental validation against updates; etc.
Efficient query evaluators, which use richer automata that can select nodes and produce query answers. Optimizations: if the answer of QUERY1 is in the cache, then there is no need to evaluate QUERY2 from scratch if it is included in QUERY1; if every possible answer set to QUERY1 (on documents of TYPE X) is EMPTY, then there is no need to evaluate it on the real data!
XML type checking for programming languages.

89 The Future
In 5-10 years from now, you can write a function in programming language X:
Function foo(xml document D: TYPE1): TYPE2 { traverse D & compute output; ... return output }
The compiler (XML type checker) will complain if your function does not compute documents of TYPE2. If there is no complaint, then it is guaranteed that ALL outputs are ALWAYS of the correct type!
Experimental programming languages in this direction: CDuce, XDuce.
Compilers will have to be able to give static guarantees about the input/output behaviour of a program!

91 END Lecture 5


More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1 Non-Deterministic Finite Automt Fll 2018 Costs Busch - RPI 1 Nondeterministic Finite Automton (NFA) Alphbet ={} q q2 1 q 0 q 3 Fll 2018 Costs Busch - RPI 2 Nondeterministic Finite Automton (NFA) Alphbet

More information

3 Regular expressions

3 Regular expressions 3 Regulr expressions Given n lphet Σ lnguge is set of words L Σ. So fr we were le to descrie lnguges either y using set theory (i.e. enumertion or comprehension) or y n utomton. In this section we shll

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton 25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) Anton Setzer (Bsed on book drft by J. V. Tucker nd K. Stephenson)

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

Convert the NFA into DFA

Convert the NFA into DFA Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:

More information

1 Nondeterministic Finite Automata

1 Nondeterministic Finite Automata 1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you

More information

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51 Non Deterministic Automt Linz: Nondeterministic Finite Accepters, pge 51 1 Nondeterministic Finite Accepter (NFA) Alphbet ={} q 1 q2 q 0 q 3 2 Nondeterministic Finite Accepter (NFA) Alphbet ={} Two choices

More information

Java II Finite Automata I

Java II Finite Automata I Jv II Finite Automt I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz Finite Automt I p.1/13 Processing Regulr Expressions We lredy lerned out Jv s regulr expression

More information

CMSC 330: Organization of Programming Languages. DFAs, and NFAs, and Regexps (Oh my!)

CMSC 330: Organization of Programming Languages. DFAs, and NFAs, and Regexps (Oh my!) CMSC 330: Orgniztion of Progrmming Lnguges DFAs, nd NFAs, nd Regexps (Oh my!) CMSC330 Spring 2018 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All

More information

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38 Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont. NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

On Determinisation of History-Deterministic Automata.

On Determinisation of History-Deterministic Automata. On Deterministion of History-Deterministic Automt. Denis Kupererg Mich l Skrzypczk University of Wrsw YR-ICALP 2014 Copenhgen Introduction Deterministic utomt re centrl tool in utomt theory: Polynomil

More information

Deterministic Finite Automata

Deterministic Finite Automata Finite Automt Deterministic Finite Automt H. Geuvers nd J. Rot Institute for Computing nd Informtion Sciences Version: fll 2016 J. Rot Version: fll 2016 Tlen en Automten 1 / 21 Outline Finite Automt Finite

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

Homework 3 Solutions

Homework 3 Solutions CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.

More information

Foundations of XML Types: Tree Automata

Foundations of XML Types: Tree Automata 1 / 43 Foundtions of XML Types: Tree Automt Pierre Genevès CNRS (slides mostly sed on slides y W. Mrtens nd T. Schwentick) University of Grenole Alpes, 2017 2018 2 / 43 Why Tree Automt? Foundtions of XML

More information

7.2 The Definite Integral

7.2 The Definite Integral 7.2 The Definite Integrl the definite integrl In the previous section, it ws found tht if function f is continuous nd nonnegtive, then the re under the grph of f on [, b] is given by F (b) F (), where

More information

CS375: Logic and Theory of Computing

CS375: Logic and Theory of Computing CS375: Logic nd Theory of Computing Fuhu (Frnk) Cheng Deprtment of Computer Science University of Kentucky 1 Tble of Contents: Week 1: Preliminries (set lgebr, reltions, functions) (red Chpters 1-4) Weeks

More information

Nondeterminism. Nondeterministic Finite Automata. Example: Moves on a Chessboard. Nondeterminism (2) Example: Chessboard (2) Formal NFA

Nondeterminism. Nondeterministic Finite Automata. Example: Moves on a Chessboard. Nondeterminism (2) Example: Chessboard (2) Formal NFA Nondeterminism Nondeterministic Finite Automt Nondeterminism Subset Construction A nondeterministic finite utomton hs the bility to be in severl sttes t once. Trnsitions from stte on n input symbol cn

More information

XPath Node Selection over Grammar-Compressed Trees

XPath Node Selection over Grammar-Compressed Trees XPth Node Selection over Grmmr-Compressed Trees Sebstin Mneth University of Edinburgh with Tom Sebstin (INRIA Lille) XPth Node Selection Given: XPth query Q nd tree T Question: wc-time to evlute Q over

More information

CS:4330 Theory of Computation Spring Regular Languages. Equivalences between Finite automata and REs. Haniel Barbosa

CS:4330 Theory of Computation Spring Regular Languages. Equivalences between Finite automata and REs. Haniel Barbosa CS:4330 Theory of Computtion Spring 208 Regulr Lnguges Equivlences between Finite utomt nd REs Hniel Brbos Redings for this lecture Chpter of [Sipser 996], 3rd edition. Section.3. Finite utomt nd regulr

More information

Module 9: Tries and String Matching

Module 9: Tries and String Matching Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer

More information

Module 9: Tries and String Matching

Module 9: Tries and String Matching Module 9: Tries nd String Mtching CS 240 - Dt Structures nd Dt Mngement Sjed Hque Veronik Irvine Tylor Smith Bsed on lecture notes by mny previous cs240 instructors Dvid R. Cheriton School of Computer

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

Minimal DFA. minimal DFA for L starting from any other

Minimal DFA. minimal DFA for L starting from any other Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA

More information

a,b a 1 a 2 a 3 a,b 1 a,b a,b 2 3 a,b a,b a 2 a,b CS Determinisitic Finite Automata 1

a,b a 1 a 2 a 3 a,b 1 a,b a,b 2 3 a,b a,b a 2 a,b CS Determinisitic Finite Automata 1 CS4 45- Determinisitic Finite Automt -: Genertors vs. Checkers Regulr expressions re one wy to specify forml lnguge String Genertor Genertes strings in the lnguge Deterministic Finite Automt (DFA) re nother

More information

Regular Languages and Applications

Regular Languages and Applications Regulr Lnguges nd Applictions Yo-Su Hn Deprtment of Computer Science Yonsei University 1-1 SNU 4/14 Regulr Lnguges An old nd well-known topic in CS Kleene Theorem in 1959 FA (finite-stte utomton) constructions:

More information

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science CSCI 340: Computtionl Models Kleene s Theorem Chpter 7 Deprtment of Computer Science Unifiction In 1954, Kleene presented (nd proved) theorem which (in our version) sttes tht if lnguge cn e defined y ny

More information

Part 5 out of 5. Automata & languages. A primer on the Theory of Computation. Last week was all about. a superset of Regular Languages

Part 5 out of 5. Automata & languages. A primer on the Theory of Computation. Last week was all about. a superset of Regular Languages Automt & lnguges A primer on the Theory of Computtion Lurent Vnbever www.vnbever.eu Prt 5 out of 5 ETH Zürich (D-ITET) October, 19 2017 Lst week ws ll bout Context-Free Lnguges Context-Free Lnguges superset

More information

Some Theory of Computation Exercises Week 1

Some Theory of Computation Exercises Week 1 Some Theory of Computtion Exercises Week 1 Section 1 Deterministic Finite Automt Question 1.3 d d d d u q 1 q 2 q 3 q 4 q 5 d u u u u Question 1.4 Prt c - {w w hs even s nd one or two s} First we sk whether

More information

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun: CMPU 240 Lnguge Theory nd Computtion Spring 2019 NFAs nd Regulr Expressions Lst clss: Introduced nondeterministic finite utomt with -trnsitions Tody: Prove n NFA- is no more powerful thn n NFA Introduce

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 2013 Outline 1 Riemnn Sums 2 Riemnn Integrls 3 Properties

More information

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation Strong Bisimultion Overview Actions Lbeled trnsition system Trnsition semntics Simultion Bisimultion References Robin Milner, Communiction nd Concurrency Robin Milner, Communicting nd Mobil Systems 32

More information

Lecture 6 Regular Grammars

Lecture 6 Regular Grammars Lecture 6 Regulr Grmmrs COT 4420 Theory of Computtion Section 3.3 Grmmr A grmmr G is defined s qudruple G = (V, T, S, P) V is finite set of vribles T is finite set of terminl symbols S V is specil vrible

More information

Lexical Analysis Part III

Lexical Analysis Part III Lexicl Anlysis Prt III Chpter 3: Finite Automt Slides dpted from : Roert vn Engelen, Florid Stte University Alex Aiken, Stnford University Design of Lexicl Anlyzer Genertor Trnslte regulr expressions to

More information

CS5371 Theory of Computation. Lecture 20: Complexity V (Polynomial-Time Reducibility)

CS5371 Theory of Computation. Lecture 20: Complexity V (Polynomial-Time Reducibility) CS5371 Theory of Computtion Lecture 20: Complexity V (Polynomil-Time Reducibility) Objectives Polynomil Time Reducibility Prove Cook-Levin Theorem Polynomil Time Reducibility Previously, we lernt tht if

More information

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh Finite Automt Informtics 2A: Lecture 3 Mry Cryn School of Informtics University of Edinburgh mcryn@inf.ed.c.uk 21 September 2018 1 / 30 Lnguges nd Automt Wht is lnguge? Finite utomt: recp Some forml definitions

More information

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 203 Outline Riemnn Sums Riemnn Integrls Properties Abstrct

More information

Formal Language and Automata Theory (CS21004)

Formal Language and Automata Theory (CS21004) Forml Lnguge nd Automt Forml Lnguge nd Automt Theory (CS21004) Khrgpur Khrgpur Khrgpur Forml Lnguge nd Automt Tle of Contents Forml Lnguge nd Automt Khrgpur 1 2 3 Khrgpur Forml Lnguge nd Automt Forml Lnguge

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)

More information

Fundamentals of Computer Science

Fundamentals of Computer Science Fundmentls of Computer Science Chpter 3: NFA nd DFA equivlence Regulr expressions Henrik Björklund Umeå University Jnury 23, 2014 NFA nd DFA equivlence As we shll see, it turns out tht NFA nd DFA re equivlent,

More information

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages Deprtment of Computer Science, Austrlin Ntionl University COMP2600 Forml Methods for Softwre Engineering Semester 2, 206 Assignment Automt, Lnguges, nd Computility Smple Solutions Finite Stte Automt nd

More information

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy:

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy: Scnner Specifying ptterns source code tokens scnner prser IR A scnner must recognize the units of syntx Some prts re esy: errors mps chrcters into tokens the sic unit of syntx x = x + y; ecomes

More information

First Midterm Examination

First Midterm Examination Çnky University Deprtment of Computer Engineering 203-204 Fll Semester First Midterm Exmintion ) Design DFA for ll strings over the lphet Σ = {,, c} in which there is no, no nd no cc. 2) Wht lnguge does

More information

Chapter 2 Finite Automata

Chapter 2 Finite Automata Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht

More information

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-* Regulr Expressions (RE) Regulr Expressions (RE) Empty set F A RE denotes the empty set Opertion Nottion Lnguge UNIX Empty string A RE denotes the set {} Alterntion R +r L(r ) L(r ) r r Symol Alterntion

More information

Lecture 08: Feb. 08, 2019

Lecture 08: Feb. 08, 2019 4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny

More information

1.3 Regular Expressions

1.3 Regular Expressions 56 1.3 Regulr xpressions These hve n importnt role in describing ptterns in serching for strings in mny pplictions (e.g. wk, grep, Perl,...) All regulr expressions of lphbet re 1.Ønd re regulr expressions,

More information

Finite-State Automata: Recap

Finite-State Automata: Recap Finite-Stte Automt: Recp Deepk D Souz Deprtment of Computer Science nd Automtion Indin Institute of Science, Bnglore. 09 August 2016 Outline 1 Introduction 2 Forml Definitions nd Nottion 3 Closure under

More information

Streamed Validation of XML Documents

Streamed Validation of XML Documents Preliminries DTD Document Type Definition References Jnury 29, 2009 Preliminries DTD Document Type Definition References Structure Preliminries Unrnked Trees Recognizble Lnguges DTD Document Type Definition

More information

Finite Automata-cont d

Finite Automata-cont d Automt Theory nd Forml Lnguges Professor Leslie Lnder Lecture # 6 Finite Automt-cont d The Pumping Lemm WEB SITE: http://ingwe.inghmton.edu/ ~lnder/cs573.html Septemer 18, 2000 Exmple 1 Consider L = {ww

More information

Faster Regular Expression Matching. Philip Bille Mikkel Thorup

Faster Regular Expression Matching. Philip Bille Mikkel Thorup Fster Regulr Expression Mtching Philip Bille Mikkel Thorup Outline Definition Applictions History tour of regulr expression mtching Thompson s lgorithm Myers lgorithm New lgorithm Results nd extensions

More information

1 Probability Density Functions

1 Probability Density Functions Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

CSE : Exam 3-ANSWERS, Spring 2011 Time: 50 minutes

CSE : Exam 3-ANSWERS, Spring 2011 Time: 50 minutes CSE 260-002: Exm 3-ANSWERS, Spring 20 ime: 50 minutes Nme: his exm hs 4 pges nd 0 prolems totling 00 points. his exm is closed ook nd closed notes.. Wrshll s lgorithm for trnsitive closure computtion is

More information

1 From NFA to regular expression

1 From NFA to regular expression Note 1: How to convert DFA/NFA to regulr expression Version: 1.0 S/EE 374, Fll 2017 Septemer 11, 2017 In this note, we show tht ny DFA cn e converted into regulr expression. Our construction would work

More information

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30 Tlen en Automten Test 1, Mon 7 th Dec, 2015 15h45 17h30 This test consists of four exercises over 5 pges. Explin your pproch, nd write your nswer to ech exercise on seprte pge. You cn score mximum of 100

More information

Where did dynamic programming come from?

Where did dynamic programming come from? Where did dynmic progrmming come from? String lgorithms Dvid Kuchk cs302 Spring 2012 Richrd ellmn On the irth of Dynmic Progrmming Sturt Dreyfus http://www.eng.tu.c.il/~mi/cd/ or50/1526-5463-2002-50-01-0048.pdf

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Learning Moore Machines from Input-Output Traces

Learning Moore Machines from Input-Output Traces Lerning Moore Mchines from Input-Output Trces Georgios Gintmidis 1 nd Stvros Tripkis 1,2 1 Alto University, Finlnd 2 UC Berkeley, USA Motivtion: lerning models from blck boxes Inputs? Lerner Forml Model

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed

More information

Closure Properties of Regular Languages

Closure Properties of Regular Languages Closure Properties of Regulr Lnguges Regulr lnguges re closed under mny set opertions. Let L 1 nd L 2 e regulr lnguges. (1) L 1 L 2 (the union) is regulr. (2) L 1 L 2 (the conctention) is regulr. (3) L

More information

20 MATHEMATICS POLYNOMIALS

20 MATHEMATICS POLYNOMIALS 0 MATHEMATICS POLYNOMIALS.1 Introduction In Clss IX, you hve studied polynomils in one vrible nd their degrees. Recll tht if p(x) is polynomil in x, the highest power of x in p(x) is clled the degree of

More information

Lexical Analysis Finite Automate

Lexical Analysis Finite Automate Lexicl Anlysis Finite Automte CMPSC 470 Lecture 04 Topics: Deterministic Finite Automt (DFA) Nondeterministic Finite Automt (NFA) Regulr Expression NFA DFA A. Finite Automt (FA) FA re grph, like trnsition

More information

Converting Regular Expressions to Discrete Finite Automata: A Tutorial

Converting Regular Expressions to Discrete Finite Automata: A Tutorial Converting Regulr Expressions to Discrete Finite Automt: A Tutoril Dvid Christinsen 2013-01-03 This is tutoril on how to convert regulr expressions to nondeterministic finite utomt (NFA) nd how to convert

More information

Non-deterministic Finite Automata

Non-deterministic Finite Automata Non-deterministic Finite Automt Eliminting non-determinism Rdoud University Nijmegen Non-deterministic Finite Automt H. Geuvers nd T. vn Lrhoven Institute for Computing nd Informtion Sciences Intelligent

More information

DISCRETE MATHEMATICS HOMEWORK 3 SOLUTIONS

DISCRETE MATHEMATICS HOMEWORK 3 SOLUTIONS DISCRETE MATHEMATICS 21228 HOMEWORK 3 SOLUTIONS JC Due in clss Wednesdy September 17. You my collborte but must write up your solutions by yourself. Lte homework will not be ccepted. Homework must either

More information
