Validating Streaming XML Documents


Luc Segoufin, INRIA-Rocquencourt
Victor Vianu, UC San Diego

ABSTRACT

This paper investigates the on-line validation of streaming XML documents with respect to a DTD, under memory constraints. We first consider validation using constant memory, formalized by a finite-state automaton (fsa). We examine two flavors of the problem, depending on whether or not the XML document is assumed to be well-formed. The main results of the paper provide conditions on the DTDs under which validation of either flavor can be done using an fsa. For DTDs that cannot be validated by an fsa, we investigate two alternatives. The first relaxes the constant memory requirement by allowing a stack bounded in the depth of the XML document, while maintaining the deterministic, one-pass requirement. The second approach consists in refining the DTD to provide additional information that allows validation by an fsa.

1 INTRODUCTION

The Extensible Markup Language (XML) is emerging as the standard for data exchange on the Web. Many applications, ranging from e-commerce and B2B to scientific applications monitoring sensor or satellite data, increasingly require on-line processing of large amounts of data in XML format using limited memory. Such processing includes querying XML documents, computing running aggregates of streams of numerical data, and validating XML documents against given Document Type Definitions (DTDs). In this paper we take a first step towards a formal investigation of processing streaming XML documents, by studying the validation question. This is an important practical problem, which is already being tackled in industry, with some commercial products developed (see related work below).

Footnote: This author supported in part by the NSF under grant number IIS

In its most restrictive form, the problem of validating streaming XML is to verify that an XML document is valid with respect to a given DTD in a single pass and using a fixed amount of memory, depending on the DTD but not on the size of the XML document. In other words, validation is done by a finite-state automaton (fsa) performing a pass on the XML document as it streams through the network, with
constant memory. The problem comes in two flavors, depending on whether or not validation includes checking that the input is a well-formed XML document. Validation that includes checking well-formedness is referred to as strong validation. Checking satisfaction of the DTD under the assumption that the input is a well-formed XML document is referred to simply as validation. It is easy to see that validation of either flavor is not possible for all DTDs using an fsa. DTDs for which (strong) validation can be done using an fsa are referred to as (strongly) recognizable DTDs. The main results of the paper provide conditions on DTDs under which they are (strongly) recognizable. The characterization of strongly recognizable DTDs is straightforward: the DTD has to be non-recursive. Characterizing recognizable DTDs is much more intricate and technically difficult. To put the problem in perspective, note that validation with respect to a DTD amounts to checking membership of the tree associated with the XML document in a regular tree language, while validation by an fsa amounts to acceptance of the tree by a restricted form of tree-walking automaton. Thus, the connection between fsa and DTDs can be viewed as a variant (albeit simpler) of the connection between tree-walking automata and regular tree languages, a long-standing open problem [8, 9].

We obtain several kinds of results. First, we precisely characterize recognizable DTDs when the DTDs are "fully recursive", i.e. all element tags that lead to recursive tags are mutually recursive. The condition we provide can be tested in exptime with respect to the size of the DTD, and in polynomial time for DTDs using 1-unambiguous regular expressions, as required by XML-Schema [4]. As a side effect, we obtain an algorithm for
constructing from a fully recursive DTD a standard fsa that (i) always accepts all documents valid wrt the DTD (but possibly more), and (ii) accepts precisely the documents valid wrt the DTD, whenever the DTD is recognizable. The standard fsa can be constructed in time exponential in the DTD. For DTDs that are not fully recursive, a precise characterization of recognizability remains an open question. We provide a set of necessary conditions for recognizability, as well as an extension yielding a sufficient condition. Furthermore, the construction of the standard fsa can be extended from the fully recursive case to arbitrary DTDs. It turns out that the sufficient condition is a characterization of the DTDs for which the standard fsa accepts precisely the documents valid with respect to the DTD.

For the case when validation using an fsa is not possible, we consider several alternatives. First, we relax the constant memory requirement and allow as auxiliary memory a stack of depth linear in the depth of the XML document. This is often reasonable in practice, since XML documents are typically fairly shallow, although they may be very large. We show that every DTD can be validated by a deterministic pushdown automaton whose stack is linear in the depth of the input document. Moreover, this holds even for DTDs extended with specialization, a form of element subtyping present in recent proposals such as XML-Schema. An orthogonal approach is to explore whether non-recognizable DTDs can be tweaked in reasonable ways so as to become recognizable. We show that for every DTD one can find a specialization of it which is recognizable. Intuitively, this is obtained by refining the tags of the original DTD to include more information useful for quick validation. This provides a trade-off between "accuracy" of the tags and the ability to perform efficient streaming validation.

Although limited to validation, this paper provides necessary groundwork for further investigating the problem of querying streaming XML documents. Indeed, the technical machinery developed here is likely to be useful for the more complex querying problem.

Related work. As far as we know, there is no formal work
on validating or querying streaming XML. Heuristics for the evaluation of regular path queries in streaming XML documents are considered in [12]. This is part of a larger prototype, called Tukwila, designed for processing streaming XML documents and currently developed at the University of Washington [13]. The Streaming XML Validator is a commercial product from TIBCO that performs validation of streaming XML with respect to a DTD. To our understanding, their approach is based on traditional parsing techniques enhanced with heuristics geared towards streaming inputs. A lot of work has been done on continuous queries over the Internet [6, 3] and on query subscription [16, 14]. In this scenario the query is fixed and outputs a stream of data produced on-line from an incoming stream of data. The emphasis is on filtering and incremental maintenance of views, including aggregate functions. Another large body of work focuses on numerical data streams such as sensor data.

The paper is organized as follows. Our abstraction of XML documents and DTDs, as well as basic notions on tree automata, are reviewed in the Preliminaries. Section 3 concerns strongly recognizable DTDs. Section 4 presents the results on recognizable DTDs. Alternative approaches to validation are described in Section 5. Finally, brief conclusions are provided in Section 6.

2 PRELIMINARIES

We introduce here the basic formalism used throughout the paper, including our abstraction of XML documents and DTDs. We also recall informally some basic notions related to tree automata and languages. Let $\Sigma$ be a finite alphabet.

Tree document. We abstract XML documents by "tree documents" capturing the nesting structure of elements in the document. A tree document $t$ over $\Sigma$ is a finite unranked tree with labels in $\Sigma$ and an order on the children of each node. The following represents a simple tree document:

[Figure: a simple tree document with root $r$]

String representation. XML documents are a string representation of trees using opening and closing tags for each element. A streaming XML processor sees the sequence of opening and closing tags in the order in which they appear in the document. It is therefore useful to consider explicitly this
string representation of an XML document. For each $a \in \Sigma$ let $a$ itself represent the opening tag and $\bar a$ represent the closing tag for $a$. Let $\bar\Sigma = \{ \bar a \mid a \in \Sigma \}$. With this notation, the string associated to the tree document above is $r \ldots c\bar c \ldots c\bar c \ldots c\bar c \ldots \bar r$. More generally, we associate to each tree document $t$ a string representation denoted $[t]$ and defined inductively as follows: if $t$ is a single root labeled $a$, then $[t] = a\,\bar a$; if $t$ consists of a root labeled $a$ and subtrees $t_1 \ldots t_k$ then $[t] = a\,[t_1] \ldots [t_k]\,\bar a$. Note that $a$ and $\bar a$ can be viewed as opening and closing multisorted parentheses, and for each tree document $t$ the string $[t]$ is a well-balanced string over $\Sigma \cup \bar\Sigma$ corresponding to a depth-first traversal of $t$. If $T$ is a set of tree
documents, we denote by $L(T)$ the language consisting of the string representations of the tree documents in $T$.

Tree types and DTDs. DTDs and their variants provide a typing mechanism for XML documents. We will use several notions of types for trees. The first corresponds closely to the DTDs proposed for XML documents, and we therefore (by slight abuse) continue to use the same term. A DTD consists of an extended context-free grammar¹ over alphabet $\Sigma$ (we make no distinction between terminal and non-terminal symbols). A tree document over $\Sigma$ satisfies a DTD $d$ (or is valid wrt $d$) if it is a derivation tree of the grammar. For example, the tree document above is valid wrt the DTD² with rules $r \to \ldots$, $a \to \ldots c$, $b \to c?$, $c \to \epsilon$. Since regular expressions are closed under union, we can assume wlog that each DTD has a unique rule $a \to R_a$ for each symbol $a \in \Sigma$. In the following $R_a$ will denote both the regular expression and the corresponding regular language. The set of tree documents satisfying a DTD $d$ is denoted by $SAT(d)$. We also denote by $L(d)$ the language over $\Sigma \cup \bar\Sigma$ consisting of the string representations of all tree documents in $SAT(d)$, that is $\{ [t] \mid t \in SAT(d) \}$. Clearly, $L(d)$ is a context-free language for every DTD $d$. In fact, such languages of well-balanced strings of multisorted parentheses have been studied in formal language theory under the name of Dyck languages [10].

The most recent DTD proposal, called XML-Schema, imposes a restriction on the regular expressions associated with each symbol: the expressions have to be 1-unambiguous. This property guarantees that the deterministic fsa for the regular expression is polynomial in the expression. Such regular expressions and other variants are studied formally in [4].

We next consider an extension of basic DTDs, also present in XML-Schema. This is motivated by a severe limitation of DTDs: their definition of the type of a given tag depends only on the tag itself and not on the context in which it occurs. For example, this means that the singleton set containing the tree document represented above cannot be described by a DTD, because the "type" of the first occurrence of a tag differs from that of the second. This naturally leads to an extension of DTDs with
specialization (also called decoupled types) which, intuitively, allows defining the type of a tag by several "cases" depending on the context. Specialized DTDs have been studied in [17] and are equivalent to formalisms proposed in [2, 7]. They are present in a restricted form in XML-Schema. Formally, we have:

Definition 2.1 A specialized DTD over $\Sigma$ is a tuple $d = (\Sigma, \Sigma', d', \mu)$ where $\Sigma$ and $\Sigma'$ are finite alphabets; $d'$ is a DTD over $\Sigma'$; and $\mu$ is a mapping from $\Sigma'$ to $\Sigma$. A tree document $t$ over $\Sigma$ satisfies a specialized DTD $d$ if $t \in \mu(SAT(d'))$.

Intuitively, $\Sigma'$ provides for some $a$'s in $\Sigma$ a set of specializations of $a$, namely those $a' \in \Sigma'$ for which $\mu(a') = a$. We also denote by $\mu$ the homomorphism induced on strings and trees by $\mu$, extended whenever needed to symbols in $\bar\Sigma'$ by $\mu(\bar{a'}) = \overline{\mu(a')}$.

¹ In an extended cfg, the right-hand sides of productions are regular expressions over the terminals and nonterminals.
² $c?$ is an abbreviation for $(c \mid \epsilon)$.

Tree automata. We assume familiarity with basic notions of language theory, including (nondeterministic) finite-state automata ((n)fsa), context-free grammar (cfg) and language (cfl), and (deterministic) push-down automaton ((d)pda) (e.g., see [11]). We also use results on regular tree languages and tree automata. Regular tree languages are natural extensions to trees of the familiar string regular languages, and are classically defined for binary trees. A nondeterministic top-down regular tree automaton over $\Sigma$ has a finite set $Q$ of states, including a distinguished initial state $q_0$ and an accepting state $q_f$. In a computation, the automaton labels the nodes of the tree with states, according to a set of rules, called transitions. An internal node transition is of the form $(a, q) \to (q', q'')$, for $a \in \Sigma$. It says that, if an internal node has symbol $a$ and is labeled by state $q$, then its left and right children may be labeled by $q'$ and $q''$, respectively. A leaf transition is of the form $(a, q) \to q_f$ for $a \in \Sigma$. It allows changing the label of a leaf with symbol $a$ from $q$ to the accepting state $q_f$. Each computation starts by labeling the root with the start state $q_0$, and proceeds by labeling the nodes of the tree non-deterministically according to the transitions. The input tree is accepted if some computation results in labeling all leaves by $q_f$. A set of complete binary trees is regular iff it is accepted by some top-down tree automaton. Regular languages of finite binary trees are surveyed in [18]. The extension to the unranked case is discussed in [5]. Regular tree languages have similar closure properties to regular string languages, in both the ranked and unranked cases.

It is worth noting that regular tree languages can be defined by many other equivalent formalisms, including bottom-up (non)deterministic automata and Monadic Second-Order logic (MSO) on the standard structures associated to trees. Interestingly, it turns out that specialized DTDs are precisely equivalent to top-down nondeterministic tree automata over unranked trees [5, 17]. Thus, they define precisely the regular tree languages.
This is more evidence that specialized DTDs are a robust and natural specification mechanism.

Another useful kind of automata on trees are the tree-walking automata (defined by [1] for the ranked case). These are more sequential in nature than the automata described earlier: there is a head that resides at any time at a single given node. In the unranked version, transitions depend on the current label and the state, and consist of moving the head up, down (on the leftmost child), or horizontally to the left or right neighbor. It is easily seen that trees accepted by tree-walking automata can be defined in MSO, so are regular tree languages. Conversely, it is conjectured that tree-walking automata can only define a strict subset of the regular tree languages [8, 9].

3 STRONG VALIDATION OF XML DOCUMENTS

We begin with the strong validation problem for streaming tree documents. Recall that checking well-formedness of the XML document is now part of the validation problem. More formally, let $d$ be a DTD (possibly specialized) over $\Sigma$ and consider the associated string language $L(d)$ over $\Sigma \cup \bar\Sigma$. We wish to characterize the DTDs $d$ for which $L(d)$ can be recognized by an fsa, i.e. $L(d)$ is regular. Such DTDs are called strongly recognizable. We first illustrate the problem with two examples.

Example 3.1: Consider the DTD $d$: $r \to a$, $a \to a?$, which defines the trees with root $r$ containing a single branch of arbitrary length of nodes labeled $a$. Thus, $L(d) = \{ r\,a^n\,\bar a^n\,\bar r \mid n \in \mathbb{N} \}$, which is not regular. So, $d$ cannot be strongly validated by an fsa and is not strongly recognizable.

Example 3.2: Consider the DTD $d$: $r \to a^*$, $a \to b \mid c$. Now $L(d) = r\,(a\,(b\bar b \mid c\bar c)\,\bar a)^*\,\bar r$, which is regular. So, $d$ is strongly recognizable.

We provide a complete characterization of the strongly recognizable (specialized) DTDs: they are precisely the non-recursive ones, defined next together with other related notions used throughout the paper.

Definition 3.1 Let $d$ be a DTD over $\Sigma$ and $G_d$ the graph constructed as follows: its set of vertices is $\Sigma$, and for each rule $a \to R_a$ in $d$ there is an edge from $a$ to $b$ for each $b$ occurring in some word in $R_a$. We call $G_d$ the dependency graph of $d$. Two labels $a$ and $b$ are mutually recursive if they belong to some cycle of $G_d$, and $a$ is recursive if it is mutually recursive with itself. The DTD $d$ is non-recursive iff $G_d$ is acyclic. Similarly, a specialized DTD $d = (\Sigma, \Sigma', d', \mu)$ is non-recursive iff the DTD $d'$ over $\Sigma'$ is non-recursive. Finally, a DTD $d$ is fully recursive if all labels from which recursive labels are reachable in $G_d$ are mutually recursive.

We can now show:

Theorem 3.1: A specialized DTD is strongly recognizable iff it is non-recursive.

Proof: Let $d = (\Sigma, \Sigma', d', \mu)$ be a specialized DTD. Suppose first that $d$ is strongly recognizable, i.e. $L(d)$ is regular.³ Then there exists an fsa $A$ recognizing exactly $L(d)$. Suppose towards a contradiction that $d'$ is recursive and let $a \in \Sigma'$ be a recursive label in $d'$. Hence there exists a tree $t$ in $SAT(d')$ where $a$ repeats along one path. The string $[t]$ is of the form $r\,u_1\,a\,v_1\,a\,w\,\bar a\,v_2\,\bar a\,u_2\,\bar r$ where $u_1u_2$ and $v_1v_2$ are well-balanced words corresponding to subtrees (or forests) of $t$. By iterating the recursive part of the derivation from $a$ to $a$, we obtain that $[t_n] = r\,u_1\,(a\,v_1)^n\,a\,w\,\bar a\,(v_2\,\bar a)^n\,u_2\,\bar r$ is also in $L(d')$ for each $n > 0$. Thus, all words $\mu([t_n])$ are accepted by the fsa $A$. A simple pumping argument then shows that $\mu(r\,u_1\,(a\,v_1)^{n+k}\,a\,w\,\bar a\,(v_2\,\bar a)^n\,u_2\,\bar r)$ is also accepted by $A$ for some $k > 0$. This is a contradiction, since the string is not well-balanced.

Assume now that $d$ is non-recursive. We can assume wlog that $\Sigma \cap \Sigma' = \emptyset$. For each $a \in \Sigma'$ construct an fsa $A_a$ recognizing $\mu(a)\,R_a\,\overline{\mu(a)}$, where $R_a$ is the regular expression associated to $a$ in $d'$. An fsa $A$ recognizing $L(d)$ is constructed inductively as follows. Let $A_0$ be $A_r$ where $r$ is the root label. For $i \ge 0$, $A_{i+1}$ is obtained by modifying $A_i$ as follows. For each transition $e = (p, a, q)$ of $A_i$, where $a \in \Sigma'$:
1. add a copy $A_e$ of $A_a$;
2. add the transitions $(p, \epsilon, i_e)$ where $i_e$ is the start state of $A_e$, and $(f_e, \epsilon, q)$ for each accepting state $f_e$ of $A_e$;
3. remove $e$.
Because $d$ is non-recursive this process is sure to terminate. Note that the resulting fsa is over alphabet $\Sigma \cup \bar\Sigma$. It is easy to verify that the fsa
recognizes $L(d)$. □

To conclude the section, we consider a somewhat surprising converse to Theorem 3.1. One might legitimately wonder if there are type systems other than specialized DTDs that define families $T$ of trees that can be strongly validated by an fsa. Interestingly, the answer turns out to be negative, as shown next.

Theorem 3.2: Let $T$ be a set of trees over $\Sigma$. The language $L(T)$ is regular iff there exists a non-recursive specialized DTD $d$ such that $T = SAT(d)$.

³ Recall that $L(d)$ is a language over $\Sigma \cup \bar\Sigma$.
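Definition 3.1 and the inlining step in the proof of Theorem 3.1 can be made concrete with a small sketch. It is only an illustration under several assumptions that are not in the paper: labels are single lower-case letters, the closing tag of `x` is written as the upper-case letter `X`, rules are given as Python regular expressions, and the leaves `b` and `c` of Example 3.2 are given the empty rule. All function names are hypothetical.

```python
import re

def dependency_graph(dtd):
    """Edges a -> b for each label b occurring in the rule of a (Definition 3.1).

    Assumption: every label written in a rule occurs in some word of it,
    which holds for the union/concatenation/star expressions used here.
    """
    return {a: {ch for ch in rule if ch.isalpha()} for a, rule in dtd.items()}

def reachable(graph, start):
    """Labels reachable from `start` along one or more edges."""
    seen, todo = set(), list(graph[start])
    while todo:
        b = todo.pop()
        if b not in seen:
            seen.add(b)
            todo.extend(graph[b])
    return seen

def recursive_labels(dtd):
    """Labels lying on some cycle of the dependency graph."""
    graph = dependency_graph(dtd)
    return {a for a in graph if a in reachable(graph, a)}

def tag_regex(dtd, label):
    """Regular expression over tags for the trees rooted at `label`.

    Each label in a rule is replaced by  open-tag  rule  close-tag,
    in the spirit of the proof of Theorem 3.1. The substitution only
    terminates for non-recursive DTDs, so recursive ones are rejected.
    """
    if recursive_labels(dtd):
        raise ValueError("recursive DTD: L(d) is not regular")
    body = "".join(tag_regex(dtd, ch) if ch.isalpha() else ch
                   for ch in dtd[label])
    return f"(?:{label}(?:{body}){label.upper()})"

EXAMPLE_3_1 = {"r": "a", "a": "a?"}
EXAMPLE_3_2 = {"r": "a*", "a": "b|c", "b": "", "c": ""}  # b, c assumed leaves

print(recursive_labels(EXAMPLE_3_1))                 # {'a'}
strong = re.compile(tag_regex(EXAMPLE_3_2, "r"))
print(strong.fullmatch("rabBAacCAR") is not None)    # True
print(strong.fullmatch("rabAR") is not None)         # False
```

Because the resulting expression is matched against the raw tag string, it checks well-formedness and validity at once, which is what strong validation asks for. For the recursive DTD of Example 3.1 the sketch refuses to build an expression, in line with Theorem 3.1.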

Proof: The "if" part follows from Theorem 3.1. For the "only if" part, suppose $L(T)$ is regular, so it is recognized by some fsa $A$. From $A$ we can easily construct a tree-walking automaton $A'$ that performs a depth-first traversal of its input, simulating at each step the corresponding move in $A$ and recognizing $T$. Since tree-walking automata define regular tree languages, and since specialized DTDs define all regular tree languages (see Preliminaries), there exists a specialized DTD $d$ such that $T = SAT(d)$. By Theorem 3.1, $d$ is non-recursive. □

4 VALIDATING WELL-FORMED XML DOCUMENTS

We now consider the problem of validating an XML document with respect to a given DTD $d$, assuming that the XML document is well formed. As before, we would like to perform the validation using an fsa. The previous requirement that $L(d)$ be regular is now too strong, because the fsa only needs to work correctly on well-balanced strings representing trees. The problem can be formalized as follows. Let $L(Tree)$ denote the language consisting of all string representations of trees over $\Sigma$. The DTD $d$ can be validated by an fsa iff there exists some regular language $R$ such that $L(d) = L(Tree) \cap R$. Such DTDs are called recognizable. The characterization of recognizable DTDs turns out to be a non-trivial problem. In order to develop some intuition, we start with several examples.

Example 4.1: Let us revisit the DTD $d$ of Example 3.1: $r \to a$, $a \to a?$. Recall that $d$ is not strongly recognizable. However, it is recognizable. Indeed, if the input is known to be well balanced, it is sufficient for an fsa to check that the string is of the form $r\,a^+\,\bar a^+\,\bar r$. In other words, $L(d) = L(Tree) \cap r\,a^+\,\bar a^+\,\bar r$.

We provide two more examples of recognizable DTDs.

Example 4.2: Consider the DTD $r \to a?$, $a \to b$, $b \to a?$ with root $r$, which defines trees that are vertical alternations of $a$ and $b$ under root $r$. This DTD can be validated because $L(d) = L(Tree) \cap r\,(ab)^*\,(\bar a \mid \bar b \mid \bar r)^*$.

Example 4.3: Consider the DTD $a \to b^*$, $b \to a^*$. This can be validated by the fsa that only allows the valid transitions between consecutive tags, namely $ab$, $a\bar a$, $ba$, $b\bar b$, $\bar a a$, $\bar a \bar b$, $\bar b b$, $\bar b \bar a$, rejecting all the others.

We next provide an example of a DTD that is not recognizable.

Example 4.4: Let $d$ be the DTD: $a \to (ab \mid ca \mid \epsilon)$, $b \to \epsilon$, $c \to \epsilon$. This DTD defines trees of the form:

[Figure: a tree of nested $a$ nodes with $b$ and $c$ leaves]
This DTD is not recognizable. Intuitively, even if the document is assumed to be well balanced, an fsa cannot store enough information to recall, when it reads an $\bar a$, whether the corresponding node had a left sibling labeled $c$ (in which case $b$ is not allowed to its right). The formal proof follows from Lemma 4.2, see Example 4.5.

Also by way of technical warm-up, it is worth noting that conventional wisdom relating to fsa does not necessarily apply when inputs are restricted to well-balanced strings. Basic issues such as equivalence or minimization are quite different in this setting. To illustrate, consider again Example 4.2. The minimal deterministic fsa corresponding to the regular expression $r\,(ab)^*\,(\bar a \mid \bar b \mid \bar r)^*$ has five states, and it is easily seen that this is minimal among all fsa validating the DTD. However, it is by no means unique: another deterministic fsa with five states, equivalent to the first on well-balanced strings but non-isomorphic to it, is the minimal one for the regular expression $r\,(a \mid b)^*\,(\bar b \bar a)^*\,\bar r$. Both fsa have the same number of states and agree on the well-balanced strings. However, the two fsa disagree on the non well-balanced words. For instance, the regular expression of Example 4.2 accepts the string $r$ while the one above does not. Thus there is no unique minimal fsa on well-balanced inputs, unlike in the classical setting. In particular, it is not clear how to minimize an fsa validating a given recognizable DTD. However, equivalence of fsa on well-balanced inputs is decidable in exptime (by reduction to equivalence of top-down tree automata). It is open whether this can be improved.
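The claim that two different regular expressions can validate the same DTD on well-balanced inputs can be checked by brute force on short strings. The sketch below is an illustration only. It assumes that the DTD of Example 4.2 reads r → a?, a → b, b → a?, and that the two expressions are r(ab)\*(ā|b̄|r̄)\* and r(a|b)\*(b̄ā)\*r̄; the closing tag of `x` is written as the upper-case letter `X`. The stack-based checker `satisfies` and all other names are hypothetical helpers, not constructions from the paper.

```python
import re
from itertools import product

OPEN, CLOSE = "rab", "RAB"                # closing tag of x is x.upper()
DTD = {"r": "a?", "a": "b", "b": "a?"}    # assumed rules of Example 4.2

def is_tree_string(s):
    """True iff s is the tag string [t] of a single tree."""
    stack = []
    for i, ch in enumerate(s):
        if ch in OPEN:
            if i > 0 and not stack:       # a second root would start here
                return False
            stack.append(ch)
        elif not stack or stack.pop() != ch.lower():
            return False
    return bool(s) and not stack

def satisfies(s, dtd, root):
    """Reference check of s against the DTD, using a stack of open elements."""
    if not is_tree_string(s) or s[0] != root:
        return False
    stack = []
    for ch in s:
        if ch in OPEN:
            stack.append((ch, []))
        else:
            label, children = stack.pop()
            if re.fullmatch(dtd[label], "".join(children)) is None:
                return False
            if stack:
                stack[-1][1].append(label)
    return True

FSA_1 = re.compile(r"r(ab)*[ABR]*")       # expression of Example 4.2
FSA_2 = re.compile(r"r[ab]*(BA)*R")       # the alternative expression

def compare(max_len):
    """Do both expressions agree with the DTD on every tree string?

    Also returns the first non-tree string on which they differ.
    """
    agree, witness = True, None
    for n in range(1, max_len + 1):
        for letters in product(OPEN + CLOSE, repeat=n):
            s = "".join(letters)
            m1 = FSA_1.fullmatch(s) is not None
            m2 = FSA_2.fullmatch(s) is not None
            if is_tree_string(s):
                agree = agree and (m1 == m2 == satisfies(s, DTD, "r"))
            elif m1 != m2 and witness is None:
                witness = s
    return agree, witness

print(compare(6))   # (True, 'r')
```

On all strings of length at most 6 the two expressions agree with the DTD whenever the input is a tree string, yet they already differ on the one-letter string `r`. This is only a bounded check, not a proof of equivalence on all well-balanced inputs.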

Before proceeding, we make the following useful observation.

Lemma 4.1: Let $T$ be a set of tree documents over alphabet $\Sigma$. If $L(T) = L(Tree) \cap R$ for some regular language $R$, then $T = SAT(d)$ for some specialized DTD $d$ computable in ptime from the fsa for $R$.

Proof. The construction of $d$ is similar to the classical construction of a cfg for the intersection of another cfg with a regular language, used to show closure of cfl's under intersection with regular languages [10]. The specialized alphabet consists of triples $(p, a, q)$ where $a \in \Sigma$ and $p, q$ are states of the fsa $A_R$ for $R$. The specializations of the root $r$ are of the form $(q_0, r, q_f)$ where $q_0$ is the start state and $q_f$ an accepting state of $A_R$. The regular language associated to $(p, a, q)$ is

$\{ (q_1, a_1, q_2)(q_2, a_2, q_3) \ldots (q_k, a_k, q_{k+1}) \mid k > 0;\ a_1 \ldots a_k \in R_a;\ q_i$ are states of $A_R$, $(p, a, q_1)$ and $(q_{k+1}, \bar a, q)$ are transitions in $A_R \}$ $\cup$ $\{ \epsilon \mid \epsilon \in R_a$ and $(p, a\bar a, q)$ is a transition in $A_R \}$. □

We now attempt to characterize recognizable DTDs. Our basic roadmap is the following. We already know from the previous section that non-recursive DTDs are recognizable, since they are strongly recognizable. We manage to obtain a precise characterization of recognizable DTDs in the case of fully recursive DTDs. The characterization in the general case remains open. However, we make partial progress by providing necessary conditions and then extending them to sufficient conditions for recognizability. Our conjecture is that the necessary conditions we provide are actually also sufficient.

We begin with a first necessary condition in order for a DTD to be recognizable. As will be seen shortly, this condition is not sufficient in general. However, we show in Theorem 4.1 that the condition becomes sufficient in the special case of fully recursive DTDs.

Lemma 4.2: Let $d$ be a recognizable DTD. Then the following holds, where $\alpha, \beta, u, v, w$ are words over $\Sigma$ while $x, y, z$ (possibly subscripted) are individual symbols: Let $k$ be a positive integer and $x_i, z_i$, $1 \le i \le k$, be mutually recursive symbols of $d$ (not necessarily distinct). If $\alpha x_1 \beta \in R_{z_1}$, $\alpha' x_k \beta' \in R_{z_1}$ and $u_i x_{i-1} v_i x_i w_i \in R_{z_i}$ for $1 < i \le k$, then $\alpha x_1 v_2 x_2 \cdots v_k x_k \beta'$ must be in $R_{z_1}$.

The proof of the lemma relies on a rather involved pumping argument and is sketched below. We first provide some intuition and examples. The condition relates to the inability of an fsa to enforce non-trivial horizontal constraints on the structure of trees when they concern mutually recursive symbols. This stems from the inability to remember the depth of elements, and therefore to determine when nodes are siblings. Very roughly, the rule states that what is allowed at some depth must also be allowed at any depth, modulo limited local constraints that can be enforced. More specifically, if $x_1$ and $x_k$ are allowed to occur at the same level (under $z_1$) and $x_{i-1}$ can be "connected" to $x_i$ via $v_i$ at some horizontal level for $1 < i \le k$, then $x_1$ may be "connected" to $x_k$ via the path $x_1 v_2 x_2 \ldots v_k x_k$ at the same level under $z_1$.

Remark: Note that the condition above can be formulated as follows for $k = 1$. If $x$ and $z$ are mutually recursive, $\alpha x \beta \in R_z$ and $\alpha' x \beta' \in R_z$, then $\alpha x \beta'$ must also be in $R_z$.

We next consider a few examples.

Example 4.5: Recall the DTD of Example 4.4. It is not recognizable because it does not satisfy the condition in the above lemma for $k = 1$. Indeed, $a$ is recursive in the DTD, $R_a$ contains $ab$ and $ca$, but it does not contain $cab$ as required by Lemma 4.2.

Example 4.6: Consider the DTD $a \to a \mid b$, $b \to (ab)?$. This is not recognizable because it does not satisfy the condition of Lemma 4.2 for $k = 2$. Indeed, $a$ and $b$ are mutually recursive, $R_a$ contains $a$ and $b$, $R_b$ contains $ab$ but $R_a$ does not contain $ab$ as required.

Proof of Lemma 4.2 (sketch). Suppose $d$ is validated by an fsa $A$ with $p$ states. For each $a \in \Sigma$ we fix a tree $\hat a$ rooted at $a$ and valid wrt $d$. For simplicity, when the context is clear, we also denote by $\hat a$ the string $[\hat a]$. If $\alpha$ is a word $a_1 \cdots a_m$ of $\Sigma^*$, $\hat\alpha$ denotes the sequence of trees $\hat a_1 \cdots \hat a_m$. We will need the following fact, whose proof is a straightforward application of the pumping lemma for regular languages.

Fact 1: Let $A$ be a deterministic fsa over $\Sigma$, $u \in \Sigma^*$, and $p$ the number of states of $A$. Let $q$ be the state of $A$ reached after reading $u^k$, $k \ge p$, starting from some state $s$. Then the same state $q$ is reached after reading $u^{k+p!}$
starting from state $s$.

We can assume wlog that $z_1$ is the root of the documents accepted by $d$. The proof has two steps. We first construct a tree $T$ in $SAT(d)$, assuming the hypothesis of the condition of the lemma. Then we modify $T$ and obtain another tree $T'$ that is also accepted by $A$ and where the pattern required in the conclusion occurs under a node labeled $z_1$. The construction of $T$ is somewhat tricky, as we
have to ensure that a pumping-like argument can be made to show that $T'$ is also accepted. We start by giving some intuition for the construction. Recall that for each $a \in \Sigma$, $\hat a$ denotes a fixed tree rooted at $a$ and valid for $d$, as well as its string representation.

We first define some "pieces" used in the construction of $T$. Since $x_i$ and $z_i$ are mutually recursive, there is a derivation in $d$ with a path containing $x_i$ followed by $z_i$ and followed again by $x_i$. Let $\hat x_i^{p!}$ be the tree depicted below, which consists of $p!$ iterations of the derivation of $x_i$ from $x_i$ via $z_i$.

[Figure 1: The trees $\hat x_i^{p!}$]

Next, note that each $z_i$ can be used to "connect" $\hat x_i^{p!}$ to $\hat x_{i-1}^{p!}$ by expanding $z_i$ into $\hat u_i\, x_{i-1}\, \hat v_i\, \hat x_i^{p!}\, \hat w_i$ and further expanding $x_{i-1}$ into $\hat x_{i-1}^{p!}$. Also, $\hat x_1^{p!}$ can connect to $\hat x_k^{p!}$ by expanding $z_1$ into $\hat\alpha'\, x_k\, \hat\beta'$. This allows us to define by induction the trees $t_i$, depicted in Figure 2. Let $T$ be $t_1$. Thus, $T$ is obtained by expanding $t_1$ with $t_k$, which in turn is expanded with $t_{k-1}$, etc. The iteration ends by expanding $t_2$ with $\hat x_1^{p!}$. Next, let $T'$ be the tree depicted in Figure 3.

[Figure 3: The tree $T'$]

As we will prove formally, the fsa $A$ (which has $p$ states) cannot distinguish $T$ from $T'$. The basic intuition is as follows. Consider the computation of $A$ on $T$ and $T'$. The computation can be broken down into two phases: a descending phase from the root consuming all left subtrees along a specified path in each tree, followed by an ascending phase back to the root. In $T$ the path is the one going through the roots of the subtrees $t_k$. In $T'$ it is the one going through the root of $\hat x_1^{p!}$. The fsa $A$ reaches the same state after its descending phase in both $T$ and $T'$. This is a consequence of Fact 1, and is shown formally below. For the ascending phase, it is enough to show that $A$ must be in the same state after reading the substrings corresponding to $\hat x_i^{p!}$ in $T'$ and $t_i$ in $T$. The argument is inductive. The basis holds because the same state is reached in the descending phase. Suppose next that $A$ is in the same state $q_i$ after reading the substrings corresponding to $\hat x_{i-1}^{p!}$ in $T'$ and $t_{i-1}$ in $T$. Next, $A$ reads $\hat v_i$ in both trees. This is followed in $T'$ by $\hat x_i^{p!}$, and in $T$ by $\hat x_i^{p!}$ followed by an additional ascending portion to the root of $t_i$. However, the extra ascending string leaves $A$ in the same state, again as a consequence of Fact 1. This argument can be iterated to show that $A$ returns to the root of $T$ and $T'$ in the same state, so $T$ and $T'$ are not distinguished. The formal proof is omitted. □

We next show a converse of Lemma 4.2: the necessary condition stated there in order for a DTD to be recognizable is also sufficient when the DTD is fully recursive. To do this, we first show how to construct, from any given DTD $d$, a standard fsa $A_d$ that accepts all words in $L(d)$ (and possibly more). We then show that for fully recursive DTDs $d$ satisfying the conditions of Lemma 4.2, $A_d$ accepts precisely the words in $L(d)$. Although we are primarily interested for the time being in fully recursive DTDs, we provide for later use the construction of $A_d$ for arbitrary DTDs.

Construction of the standard fsa. We now outline the construction of the fsa $A_d$. The construction extends the simpler one involved in the proof of Theorem 3.1. Let $d$ be an arbitrary DTD over alphabet $\Sigma$. We will use the dependency graph $G_d$ of $d$. Consider the equivalence relation $\equiv$ on $\Sigma$ whose equivalence classes are the strongly connected components of $G_d$. Let $\le$ be the partial order on the classes of $\equiv$ where $A \le B$ iff for some $a \in A$ and $b \in B$ there is an edge from $a$ to $b$ in $G_d$. Note that $\le$ has a minimum element: the class of the root label. There are generally several maximal elements. We construct $A_d$ by induction on $\le$ starting from the maximal elements.

Let $C$ be a maximal element of $\le$. This means that for every $c \in C$, $R_c = \{\epsilon\}$ or words in $R_c$ contain only symbols that are mutually recursive with $c$. Let $A_c$ be an fsa corresponding to the regular expression $R_c$. Since $A_c$ is non-deterministic, we can assume wlog that $A_c$ has no "sink
states", i.e. some accepting state is reachable from every state. We can also assume that the sets of states of the fsa's $A_c$ are disjoint for different $c$'s. Let $A_C$ be the fsa whose set of states is the union of the sets of states of the fsa's $A_c$ for $c \in C$. We do not need to specify at this point initial and final states
for $A_C$, but we mark the initial and final states of each of the participating fsa's $A_c$ (the initial state for $A_c$ is $q_0^c$ and the final states are $f_1^c, f_2^c, \ldots$). The transitions are defined as follows. For each transition $(q, a, q')$ of $A_c$ we add to $A_C$ the transitions $(q, a, q_0^a)$ and $(f, \bar a, q')$ for the initial state $q_0^a$ and for each final state $f$ of $A_a$.

[Figure 2: The trees $t_i$]

Now suppose that $C$ is a class of $\equiv$ for which all fsa's $A_D$ corresponding to classes $D$ such that $C \le D$ are already constructed. We construct $A_C$ as follows. Again, for each $c \in C$, let $A_c$ be an fsa corresponding to $R_c$ (with disjoint states for distinct $c$'s). The set of states of $A_C$ is the union of the sets of states of the fsa's $A_c$ for $c \in C$, similarly for the final states, and the initial state is again left unspecified. The transitions of $A_C$ are defined as follows. As in the base case, for each $a \in C$ and transition $(q, a, q')$ in $A_c$ we add to $A_C$ the transitions $(q, a, q_0^a)$ and $(f, \bar a, q')$ for the initial state $q_0^a$ and for each final state $f$ of $A_a$. Unlike the base case, we now have to take care of symbols $b$ belonging to some class $B$ for which $C \le B$. For each such $b$ we add to $A_C$ a new disjoint copy of the already constructed $A_B$, together with the transitions $(q, b, q_0^b)$ and $(f, \bar b, q')$ for the copy of the initial state $q_0^b$ and for the copies of each final state $f$ of $A_b$.

This induction allows us to construct an fsa $A_C$ for the minimum class $C$ containing the root label $r$. The final fsa $A_d$ is obtained by adding a new start state $s$ and a final state $g$ together with transitions $(s, r, q_0^r)$ and $(f, \bar r, g)$ for the start state $q_0^r$ and each final state $f$ of $A_r$.

We illustrate the construction of $A_d$ with some examples.

Example 4.7: Consider the DTD $d$: $r \to aa$, $a \to a?$. The dependency graph $G_d$ has the edges $(r, a)$ and $(a, a)$. The classes of $\equiv$ are $\{r\}$, $\{a\}$, and $\{r\} \le \{a\}$. The fsa $A_r$ has the marked states $q_0^r$ and $f^r$, and the fsa $A_a$ has the marked states $q_0^a$, $f_1^a$ and $f_2^a$. From these one obtains the fsa associated to the equivalence class $\{a\}$. This yields the fsa $A_d$ depicted in Figure 4.

[Figure 4: $A_d$]

Notice that $A_d$ recognizes all the well-balanced words of $L(d)$. But it also recognizes additional well-balanced words such as $r\,a\,a\,\bar a\,a\,\bar a\,\bar a\,\bar r$. It turns out that this is unavoidable: there is no automaton that recognizes the above DTD. This will be shown in Lemma 4.4.

Example 4.8: Revisit now the DTD $d$ of Example 4.3: $a \to b^*$, $b \to a^*$. This induces one equivalence class of symbols: $\{a, b\}$. Each of the fsa's $A_a$ and $A_b$ has a single state that is both initial and final ($q_0^a = f^a$ and $q_0^b = f^b$), and the fsa associated to the equivalence class $\{a, b\}$ is built from these two states. If $a$ is assumed to be the root, this yields the fsa $A_d$. Note that $A_d$ is equivalent to the fsa of Example 4.3 and the only well-balanced strings it accepts are those in $L(d)$.

As expected, we can easily show, by construction, the following property of $A_d$.

Lemma 4.3: For each DTD $d$, let $A_d$ be the automaton constructed above. We have:
(i) every word in $L(d)$ is accepted by $A_d$;
(ii) $A_d$ can be constructed from $d$ in exponential time.

The construction of $A_d$ in the general case takes time $O(|d|^{|\le|})$ where $|d|$ is the maximum size of an fsa for a regular expression of $d$, and $|\le|$ is the depth of the partial order $\le$. The exponential is due to the replication of fsa's carried out in the construction.

Remark: The construction of the fsa $A_d$ can be straightforwardly extended to specialized DTDs. Note also that $A_d$ is non-deterministic even if the fsa's $A_a$ associated to $R_a$ are deterministic. The non-determinism stems from the fact that a closing tag may lead to several states.

We can now prove a converse of Lemma 4.2, which yields a precise characterization of recognizable fully recursive DTDs.

Theorem 4.1: The following are equivalent for each fully recursive DTD $d$:
(i) $d$ is recognizable,
(ii) $d$ satisfies the conditions of Lemma 4.2, and
(iii) the set of well-balanced strings accepted by the fsa $A_d$ is precisely $L(d)$.

As a consequence of Theorem 4.1, we can show that it is decidable whether a fully recursive DTD $d$ is recognizable, and therefore whether the standard associated fsa $A_d$ can be used to validate it.

Theorem 4.2: Given a specialized fully recursive DTD $d$ over a fixed alphabet $\Sigma$, it is decidable in exptime whether $d$ is recognizable.

Remark: The exponential complexity above is due to the construction of deterministic fsa for the regular expressions used by the DTD. If the DTD only uses 1-unambiguous regular expressions, such as required by XML-Schema [4], the complexity goes down to ptime.

Let us now consider DTDs that are not fully recursive. Consider again the DTD of Example 4.7. Intuitively, the DTD cannot be recognized because an fsa cannot keep track of the depth in the tree and thus might allow the transition from $\bar a$ to $a$ at depth different than 1. The next lemma formalizes this intuition and provides a second necessary condition
for recognizability.

Lemma 4.4: Let $d$ be a recognizable DTD. Then the following holds, where $\alpha, \beta, u, v, w$ are words over $\Sigma$ while $x, y, z$ (possibly subscripted) are individual symbols: Let $x_1, x_2, y, z$ be symbols such that $x_1$, $x_2$ and $z$ are mutually recursive in $d$. If $u x_1 v x_2 w \in R_y$ and $u' x_1 v' x_2 w' \in R_z$, then $u x_1 v' x_2 w$ must be in $R_y$ and $u' x_1 v x_2 w'$ must be in $R_z$.

Remark: The condition of Lemma 4.4 can be formulated as follows in the special case when $x_1 = x_2$:

(i) Suppose the occurrences of $x_1$ and $x_2$ below $z$ are identical. Let $x$ and $z$ be mutually recursive in $d$. If $u x v x w \in R_y$ and $u' x w' \in R_z$, then $u' x v x w' \in R_z$ and $u x w \in R_y$.

(ii) Suppose the occurrences of $x_1$ and $x_2$ below $y$ are identical. If $u x w \in R_y$ and $u' x v' x w' \in R_z$, then $u x v' x w \in R_y$ and $u' x w' \in R_z$.

Example 4.9: Consider the DTD of Example 4.7 r!!? This DTD is not recognizable because it does not satisfy the condition of Lemma 4.4. Indeed, […] is recursive, $R_r$ contains […], $R_{[\ldots]}$ contains […] but $R_{[\ldots]}$ does not contain […]. This violates the condition of the lemma.

Example 4.10: Consider the DTD r!!! This DTD is not recognizable because it does not satisfy the condition of Lemma 4.4. Indeed, […] and […] are mutually recursive, $R_r$ contains […], $R_{[\ldots]}$ contains […] but $R_r$ does not contain […]. Note that, if we replace the first rule by r!, the conditions of Lemmas 4.2 and 4.4 are satisfied and the resulting DTD is recognized by $A_d$.

We conjecture that the necessary conditions provided by Lemmas 4.2 and 4.4 are in fact a precise characterization for DTD recognizability. However, this remains open. Short of a complete characterization of recognizable DTDs, we provide a characterization of when a DTD $d$ is validated by the standard fsa $A_d$. The conditions are those of Lemmas 4.2 and 4.4, together with an additional condition stated next:

(*) Let $\alpha, \beta, u, v, w$ be words over $\Sigma$ and $x, y, z$ (possibly subscripted) be individual symbols. Let $k$ and $k'$ be positive integers. Let $(x_i)_{1 \le i \le k}$, $(z_i)_{2 \le i \le k}$, $(x'_i)_{1 \le i \le k'}$, $(z'_i)_{2 \le i \le k'}$, and $y$ be symbols of $\Sigma$ such that $x_1 = x'_1$, $x_k = x'_{k'}$, and all the $x_i, x'_i, z_i, z'_i$ are mutually recursive in $d$ (not necessarily distinct). If $u x_1 v_1 x_2 \cdots v_{k-1} x_k w \in R_y$, and for each $2 \le i \le k$ we have $\{\alpha_i x_{i-1} \beta_i,\ \alpha'_i x_i \beta'_i\} \subseteq R_{z_i}$, and for each $2 \le i \le k'$ we have $u'_i x'_{i-1} v'_{i-1} x'_i w'_i \in R_{z'_i}$, then $u x'_1 v'_1 \cdots x'_{k'} w$ must be in $R_y$ and, for each $2 \le i \le k$, $\alpha_i x_{i-1} v_{i-1} x_i \beta'_i$ must be in $R_{z_i}$.

The next result provides a precise characterization of the DTDs $d$ that are validated by the standard fsa $A_d$.

Theorem 4.3: Let $d$ be a DTD. The following are equivalent: (i) $d$ satisfies (*) and the conditions of Lemmas 4.2 and 4.4, and (ii) the set of well-balanced strings accepted by the fsa $A_d$ is precisely $L(d)$.

We note that the conditions of Theorem 4.3 can be verified in time doubly exponential with respect to $d$. This is done by checking directly that $A_d$ validates $d$, as follows. We first build a specialized DTD $d'$ such that $SAT(d')$ consists of the trees accepted by $A_d$. This can be done in exptime by Lemma 4.1. Next, the equivalence of $d$ and $d'$ can be checked in exptime using a tree automata equivalence test.

To understand why the conditions in Theorem 4.3 are not a complete characterization of recognizable DTDs, consider the following example, which provides a recognizable DTD $d$ violating (*). For this DTD, we will exhibit an fsa, different from
the standard $A_d$, that validates it.

Example 4.11: Consider the DTD $d$: r! c! dj d! dcj! jj c! jcj

First notice that $A_d$ does not recognize this DTD because $d$ violates (*). Indeed the DTD satisfies the premise of (*) but not its conclusion. For example, dc is not in $R_r$ as required. However, consider the fsa that works like $A_d$, but additionally counts the number of transitions d and d modulo 2 and accepts only if the two are equal. It can be verified that this fsa validates $d$.

In summary, the conditions of Lemmas 4.2 and 4.4 are necessary in order for a DTD to be recognizable. The conditions of Theorem 4.3 are sufficient, and in particular provide a precise characterization of when the standard fsa works. The complete characterization of recognizable DTDs remains open.

5 ALTERNATIVE APPROACHES TO VALIDATION

We next consider two alternative approaches for validating DTDs that are not recognizable. The first is to relax the constant memory requirement. The second consists in refining the original DTD by adding information allowing it to be validated by an fsa.

Validation with bounded stack

We begin with relaxing the memory requirement. Specifically, we allow as auxiliary memory a stack whose depth is bounded in the depth of the XML document. The requirement that validation be done in a single, deterministic pass is maintained. This approach is appealing in practice, because many XML documents tend to be shallow even if their DTDs are recursive. We start with a simple example.

Example 5.1: Consider the DTD of Example 4.7 r!!?
which is not recognizable. However, a deterministic pda can validate the DTD by allowing only transitions […] and […] and remembering the current depth using the stack. In addition, the pda allows a single transition […] and only at depth one. Note that this pda is deterministic and its stack never exceeds the depth of the tree represented by the well-balanced input string.

Rather surprisingly, we can show that every specialized DTD can be strongly validated by a deterministic pda. When the input string is well-balanced, the stack of the pda is bounded in the depth of the tree represented by the input string.

Theorem 5.1: Let $d$ be a specialized DTD. There exists a deterministic pda that accepts precisely $L(d)$ using a stack of depth bounded by the maximum number of unmatched open tags occurring as the input is read from left to right. In particular, if the input string is well-balanced, the depth of the stack is bounded by the depth of the tree represented by the input string.

Proof: Let $d = (\Sigma, \Sigma', d', \mu)$. Recall that $d'$ is a DTD over $\Sigma'$ and $\mu$ is the associated specialization mapping. We wish to check whether a string $w$ over $\Sigma \cup \bar{\Sigma}$ represents a tree satisfying $d$. The stack is used to check that the string represents a tree and to keep information about the path from the root to the currently visited node in the tree. For each node along the path, the stack keeps a set of candidate specializations for the node label, compatible with the information seen so far. Intuitively, a candidate specialization $a'$ is acceptable if there are acceptable specializations of its children whose sequence forms a word in the regular language $R_{a'}$ associated to $a'$ by $d'$. The pda must verify this recursively, and accept the input if the root is left with at least one acceptable specialization.

To achieve this, the pda simulates the run of the fsa for $R_{a'}$ on the children of a given node with candidate specialization $a'$. This is done by keeping on the stack, together with each such $a'$, the set of states reached in the fsa for $R_{a'}$ after reading the sequence of children seen so far, with their respective allowed specializations. This can be done because the stack symbol containing this information for a given node becomes the top of the stack every time one of its subtrees has been completely read. After reading the entire sequence of its children with their allowed specializations, a candidate specialization $a'$ for the node is discounted unless the associated set of states contains some accept state in the fsa for $R_{a'}$.

We now describe the pda in more detail. For each $a' \in \Sigma'$ let $A_{a'}$ be the standard non-deterministic fsa for $R_{a'}$, with start state $q_0^{a'}$. Let $Q$ be the disjoint union of the sets of states of the fsa's $A_{a'}$. The stack alphabet of the pda, denoted $V$, consists of symbols of the form $(a, S)$ where $a \in \Sigma$, and $S$ is a set of elements $\langle a', H \rangle$ such that $a' \in \Sigma'$, $\mu(a') = a$, and $H$ is a subset of the states $Q$. Thus, $V$ is a subset of
$\Sigma \times 2^{\Sigma' \times 2^{Q}}$. The transitions work as follows. When $a \in \Sigma$ is read, the symbol $(a, \{\langle a', \{q_0^{a'}\} \rangle \mid a = \mu(a')\})$ of $V$ is pushed on the stack. When a symbol $\bar{a} \in \bar{\Sigma}$ is read, the pda pops the current stack symbol. If the input string is well-balanced, the top of the stack must be of the form $(a, S)$; otherwise the input is rejected. Note that, since the subtree rooted at $a$ has been completely processed, we now know which of the candidate specializations of $a$ are acceptable: they are the $a'$ such that $\langle a', H \rangle \in S$ and $H$ contains some accepting state of $A_{a'}$. At this point the new top-of-stack symbol, say $(b, T)$, needs to be updated. The symbol is popped and replaced at the top of the stack by $(b, new(T))$ where $new(T)$ contains, for each $\langle b', B' \rangle \in T$, the pair $\langle b', new(B') \rangle$, where $new(B')$ contains the states $q'$ such that $(q, a', q')$ is a transition of the fsa $A_{b'}$ for some $q \in B'$ and some allowed specialization $a'$ of $a$ occurring in $S$. Finally, the pda accepts if the root node labeled $r$ has at least one acceptable specialization $r'$. This information is available in the last symbol popped from the stack before it becomes empty. It is straightforward to check that the above pda accepts $L(d)$. □

Refining the DTD

We finally consider an approach to validation orthogonal to the ones examined so far. It consists of refining the given DTD by providing in the tags additional information that can be used for validation. The refinement is formalized by a specialization of the original DTD. More precisely, we can show the following.

Theorem 5.2: For every DTD $d$ over $\Sigma$ there exists an equivalent specialized DTD $\hat{d} = (\Sigma, \Sigma', d', \mu)$ of size quadratic in $d$ such that $d'$ is recognizable.

Proof: For each $a \in \Sigma$, let $A_a$ be a standard nondeterministic fsa for the regular language $R_a$ specified for $a$ by the DTD $d$. The idea for constructing the specialized DTD $\hat{d}$ is straightforward: keep track in the tags of the children of a node $a$ of the state of $A_a$ in an accepting computation on the sequence of children tags. More precisely, let $Q$ be the disjoint union of the sets of states of the fsa's $A_a$ and let $\Sigma' = \Sigma \times Q$. The DTD $d'$ associates to each symbol $(a, q)$ in $\Sigma'$ the regular language consisting of all words of the form $(a_1, q_1)(a_2, q_2) \cdots (a_k, q_k)$ such that $a_1 a_2 \cdots a_k \in R_a$ and $(q_{i-1}, a_i, q_i)$ are valid transitions in $A_a$, $1 \le i \le k$, where $q_0$ is the start state and $q_k$ an accept state for $A_a$. Clearly, the specialized DTD $\hat{d}$ is equivalent to $d$. An fsa can validate well-balanced input strings with respect to the DTD $d'$ by allowing only the following transitions:

1. $(a, q)(a_1, q_1)$ where $(q_0, a_1, q_1)$ is a transition in $A_a$ and $q_0$ is the start state of $A_a$;
2. $\overline{(a, q)}\,(b, p)$ where $(q, b, p)$ is a transition in the fsa to which $q$ belongs;
3. $\overline{(a, q)}\,\overline{(b, p)}$ where $q$ is an accepting state in the fsa to which it belongs. □

Example 5.2: Revisit the DTD of Example 4.7 r!!? which is not recognizable. However the following DTD r! 1 2 1!? 1 2!? 2 is recognizable (by the regular expression r r) and defines a similar family of tree documents.

6 CONCLUSIONS

This paper provides a first step towards the formal investigation of processing streaming XML. We focused on the problem of on-line validation of streaming XML documents with respect to a DTD, under memory constraints. The main results provide conditions under which validation can be done in a single pass and constant memory, using an fsa. We also considered alternative approaches by relaxing the constant memory requirement or by enriching the DTD with additional information that can be used in validation.

Several questions remain open. Mainly, a precise characterization of recognizable DTDs is not yet available, except in the fully recursive case. For the general case, we conjecture that (i) the necessary conditions we provided for a DTD to be recognizable are also sufficient, and (ii) whenever a DTD $d$ is recognizable it can be validated by the standard fsa $A_d$ augmented with counting certain patterns modulo 2, as discussed in Example 4.11.

Another interesting open problem concerns characterizing the specialized DTDs that are recognizable. It can be seen that the conditions we provided for recognizable DTDs no longer work when specialization is allowed. Indeed, the problem seems considerably harder in this case. Note that, since every recognizable family of trees is necessarily definable by a specialized DTD (Lemma 4.1), characterizing the recognizable specialized DTDs would essentially close the problem of understanding which families of trees can be validated by fsa.

Finally, it would be useful to exhibit natural classes of DTDs that can always be validated by an fsa, by providing restricted specification languages for document structure that are powerful enough for a wide range of applications of practical interest. Beyond the immediate focus on validation, we expect that the techniques developed here will also be useful in investigating the more complex problem of querying streaming XML documents.

Acknowledgment

We wish to thank Bertram Ludaescher and Yannis Papakonstantinou for pointing to us the problem of processing streaming XML. We are also grateful to Serge Abiteboul and Tova Milo for interesting discussions on the topic.

7 REFERENCES

[1] A. V. Aho and J. D. Ullman. Translations on a context free grammar. Information and Control, 19(19):439-475, 1971.
[2] C. Beeri and T. Milo. Schemas for integration and translation of structured and semi-structured data. In Int'l Conf. on Database Theory, pages 296-313, 1999.
[3] S. Babu and J. Widom. Continuous Queries over Data Streams. In SIGMOD Record, Sept. 2001.
[4] A. Bruggemann-Klein and D. Wood. One-unambiguous regular languages. Information and Computation, 142(2):182-206, May 1998.
[5] A. Bruggemann-Klein, M. Murata, and D. Wood. Regular tree and regular hedge languages over non-ranked alphabets. Hong Kong Univ. of Science and Technology Computer Science Center Research Report HKUST-TCSC , 2001. Available at
[6] J. Chen et al. NiagaraCQ: A Scalable Continuous Query System for Internet Databases. In Proc. ACM SIGMOD Conf., Dallas, TX, June 2000.
[7] S. Cluet, C. Delobel, J. Simeon, and K. Smaga. Your mediators need data conversion! In Proc. ACM SIGMOD Conf., pages 177-188, 1998.
[8] J. Engelfriet and H. J. Hoogeboom. Tree-walking pebble automata. In J. Karhumäki, H. Maurer, G. Paun and G. Rozenberg, eds., Jewels are forever, contributions to Theoretical Computer Science in honor of Arto Salomaa, pp. 72-83, Springer-Verlag, 1999.
[9] J. Engelfriet, H. J. Hoogeboom and J.-P. van Best. Trips on trees. Acta Cybernetica, 14, pp. 51-64, 1999.
[10] S. Ginsburg. The Mathematical Theory of Context-Free Languages. McGraw-Hill, 1966.
[11] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
[12] Z. Ives, Alon Levy and D. Weld. Efficient Evaluation of Regular Path Expressions on Streaming XML Data. Technical Report, University of Washington, 2000.
[13] Z. Ives, Alon Levy and D. Weld. Integrating Network-Bound XML Data. Data Engineering Bulletin, 24(2), 2001.
[14] Ling Liu, Calton Pu, Wei Tang and Wei Han. Conquer: A continual query system for update monitoring in the WWW. In International Journal of Computer Systems, Science and Engineering, 2000.
[15] F. Neven and T. Schwentick. On the Power of Tree-Walking Automata. ICALP 2000:
[16] Benjamin Nguyen, Serge Abiteboul, Gregory Cobena, Mihai Preda. Monitoring XML data on the Web. In Proc. ACM SIGMOD Conf., pp. , 2001.
[17] Y. Papakonstantinou and V. Vianu. DTD inference for views of XML data. In Proc. ACM PODS, pp. 35-46, 2000.
[18] G. Rozenberg and A. Salomaa. Handbook of Formal Language, vol. 3. Springer Verlag, 1997.
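
The bounded-stack validation discussed in Section 5 can be illustrated by a minimal sketch for the simpler case of an ordinary (non-specialized) DTD. It is not the construction used in the proof of Theorem 5.1, which additionally tracks sets of candidate specializations and sets of fsa states. The sketch assumes each content model is given as a hand-written deterministic automaton, opening tags are written `a` and closing tags `/a`, and the DTD shown (`r -> a b`, `a -> a?`, `b -> empty`) is a hypothetical one chosen only for illustration; the names `DTD` and `validate` are likewise invented here.

```python
from typing import Dict, List, Set, Tuple

# A content model as a small DFA: (start state, accepting states, transitions).
Dfa = Tuple[int, Set[int], Dict[Tuple[int, str], int]]

# Hypothetical DTD, for illustration only:  r -> a b,  a -> a?,  b -> (empty)
DTD: Dict[str, Dfa] = {
    "r": (0, {2}, {(0, "a"): 1, (1, "b"): 2}),
    "a": (0, {0, 1}, {(0, "a"): 1}),
    "b": (0, {0}, {}),
}


def validate(tokens: List[str], dtd: Dict[str, Dfa], root: str = "r") -> Tuple[bool, int]:
    """Single left-to-right pass; returns (is_valid, maximum stack depth reached)."""
    # Each stack entry is (tag, state of that tag's content-model DFA so far).
    stack: List[Tuple[str, int]] = []
    max_depth = 0
    seen_root = False
    for tok in tokens:
        if not tok.startswith("/"):  # opening tag
            if tok not in dtd:
                return False, max_depth
            if not stack:
                # Only one root element, and it must carry the root tag.
                if seen_root or tok != root:
                    return False, max_depth
                seen_root = True
            stack.append((tok, dtd[tok][0]))
            max_depth = max(max_depth, len(stack))
        else:  # closing tag
            tag = tok[1:]
            if not stack or stack[-1][0] != tag:
                return False, max_depth  # input is not well-balanced
            _, state = stack.pop()
            if state not in dtd[tag][1]:
                return False, max_depth  # children do not form a word of R_tag
            if stack:
                # The finished subtree counts as one symbol for its parent.
                parent, parent_state = stack.pop()
                nxt = dtd[parent][2].get((parent_state, tag))
                if nxt is None:
                    return False, max_depth
                stack.append((parent, nxt))
    return (seen_root and not stack), max_depth


if __name__ == "__main__":
    print(validate("r a a /a /a b /b /r".split(), DTD))
    print(validate("r a b /b /a /r".split(), DTD))
```

In this sketch the stack holds one entry per unmatched open tag, so its depth never exceeds the depth of the tree encoded by a well-balanced input, which mirrors the bound stated in Theorem 5.1.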


More information

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018 CS 301 Lecture 04 Regulr Expressions Stephen Checkowy Jnury 29, 2018 1 / 35 Review from lst time NFA N = (Q, Σ, δ, q 0, F ) where δ Q Σ P (Q) mps stte nd n lphet symol (or ) to set of sttes We run n NFA

More information

Homework Solution - Set 5 Due: Friday 10/03/08

Homework Solution - Set 5 Due: Friday 10/03/08 CE 96 Introduction to the Theory of Computtion ll 2008 Homework olution - et 5 Due: ridy 10/0/08 1. Textook, Pge 86, Exercise 1.21. () 1 2 Add new strt stte nd finl stte. Mke originl finl stte non-finl.

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers 80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES 2.6 Finite Stte Automt With Output: Trnsducers So fr, we hve only considered utomt tht recognize lnguges, i.e., utomt tht do not produce ny output on ny input

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

PART 2. REGULAR LANGUAGES, GRAMMARS AND AUTOMATA

PART 2. REGULAR LANGUAGES, GRAMMARS AND AUTOMATA PART 2. REGULAR LANGUAGES, GRAMMARS AND AUTOMATA RIGHT LINEAR LANGUAGES. Right Liner Grmmr: Rules of the form: A α B, A α A,B V N, α V T + Left Liner Grmmr: Rules of the form: A Bα, A α A,B V N, α V T

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

CS103 Handout 32 Fall 2016 November 11, 2016 Problem Set 7

CS103 Handout 32 Fall 2016 November 11, 2016 Problem Set 7 CS103 Hndout 32 Fll 2016 Novemer 11, 2016 Prolem Set 7 Wht cn you do with regulr expressions? Wht re the limits of regulr lnguges? On this prolem set, you'll find out! As lwys, plese feel free to drop

More information

1.3 Regular Expressions

1.3 Regular Expressions 56 1.3 Regulr xpressions These hve n importnt role in describing ptterns in serching for strings in mny pplictions (e.g. wk, grep, Perl,...) All regulr expressions of lphbet re 1.Ønd re regulr expressions,

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automt Theory nd Forml Lnguges TMV027/DIT321 LP4 2018 Lecture 10 An Bove April 23rd 2018 Recp: Regulr Lnguges We cn convert between FA nd RE; Hence both FA nd RE ccept/generte regulr lnguges; More

More information

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science CSCI 340: Computtionl Models Kleene s Theorem Chpter 7 Deprtment of Computer Science Unifiction In 1954, Kleene presented (nd proved) theorem which (in our version) sttes tht if lnguge cn e defined y ny

More information

Domino Recognizability of Triangular Picture Languages

Domino Recognizability of Triangular Picture Languages Interntionl Journl of Computer Applictions (0975 8887) Volume 57 No.5 Novemer 0 Domino Recognizility of ringulr icture Lnguges V. Devi Rjselvi Reserch Scholr Sthym University Chenni 600 9. Klyni Hed of

More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-* Regulr Expressions (RE) Regulr Expressions (RE) Empty set F A RE denotes the empty set Opertion Nottion Lnguge UNIX Empty string A RE denotes the set {} Alterntion R +r L(r ) L(r ) r r Symol Alterntion

More information

The Regulated and Riemann Integrals

The Regulated and Riemann Integrals Chpter 1 The Regulted nd Riemnn Integrls 1.1 Introduction We will consider severl different pproches to defining the definite integrl f(x) dx of function f(x). These definitions will ll ssign the sme vlue

More information

Lecture 9: LTL and Büchi Automata

Lecture 9: LTL and Büchi Automata Lecture 9: LTL nd Büchi Automt 1 LTL Property Ptterns Quite often the requirements of system follow some simple ptterns. Sometimes we wnt to specify tht property should only hold in certin context, clled

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010 CS 311 Homework 3 due 16:30, Thursdy, 14 th Octoer 2010 Homework must e sumitted on pper, in clss. Question 1. [15 pts.; 5 pts. ech] Drw stte digrms for NFAs recognizing the following lnguges:. L = {w

More information

The size of subsequence automaton

The size of subsequence automaton Theoreticl Computer Science 4 (005) 79 84 www.elsevier.com/locte/tcs Note The size of susequence utomton Zdeněk Troníček,, Ayumi Shinohr,c Deprtment of Computer Science nd Engineering, FEE CTU in Prgue,

More information

Lecture 3: Equivalence Relations

Lecture 3: Equivalence Relations Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts

More information

Some Theory of Computation Exercises Week 1

Some Theory of Computation Exercises Week 1 Some Theory of Computtion Exercises Week 1 Section 1 Deterministic Finite Automt Question 1.3 d d d d u q 1 q 2 q 3 q 4 q 5 d u u u u Question 1.4 Prt c - {w w hs even s nd one or two s} First we sk whether

More information

Bases for Vector Spaces

Bases for Vector Spaces Bses for Vector Spces 2-26-25 A set is independent if, roughly speking, there is no redundncy in the set: You cn t uild ny vector in the set s liner comintion of the others A set spns if you cn uild everything

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

Java II Finite Automata I

Java II Finite Automata I Jv II Finite Automt I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz Finite Automt I p.1/13 Processing Regulr Expressions We lredy lerned out Jv s regulr expression

More information

CS 330 Formal Methods and Models

CS 330 Formal Methods and Models CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2017 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 2 1. Prove ((( p q) q) p) is tutology () (3pts) y truth tle. p q p q

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

Exercises Chapter 1. Exercise 1.1. Let Σ be an alphabet. Prove wv = w + v for all strings w and v.

Exercises Chapter 1. Exercise 1.1. Let Σ be an alphabet. Prove wv = w + v for all strings w and v. 1 Exercises Chpter 1 Exercise 1.1. Let Σ e n lphet. Prove wv = w + v for ll strings w nd v. Prove # (wv) = # (w)+# (v) for every symol Σ nd every string w,v Σ. Exercise 1.2. Let w 1,w 2,...,w k e k strings,

More information

CM10196 Topic 4: Functions and Relations

CM10196 Topic 4: Functions and Relations CM096 Topic 4: Functions nd Reltions Guy McCusker W. Functions nd reltions Perhps the most widely used notion in ll of mthemtics is tht of function. Informlly, function is n opertion which tkes n input

More information

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh Lnguges nd Automt Finite Automt Informtics 2A: Lecture 3 John Longley School of Informtics University of Edinburgh jrl@inf.ed.c.uk 22 September 2017 1 / 30 Lnguges nd Automt 1 Lnguges nd Automt Wht is

More information

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun: CMPU 240 Lnguge Theory nd Computtion Spring 2019 NFAs nd Regulr Expressions Lst clss: Introduced nondeterministic finite utomt with -trnsitions Tody: Prove n NFA- is no more powerful thn n NFA Introduce

More information

This lecture covers Chapter 8 of HMU: Properties of CFLs

This lecture covers Chapter 8 of HMU: Properties of CFLs This lecture covers Chpter 8 of HMU: Properties of CFLs Turing Mchine Extensions of Turing Mchines Restrictions of Turing Mchines Additionl Reding: Chpter 8 of HMU. Turing Mchine: Informl Definition B

More information

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3 2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is

More information

Streamed Validation of XML Documents

Streamed Validation of XML Documents Preliminries DTD Document Type Definition References Jnury 29, 2009 Preliminries DTD Document Type Definition References Structure Preliminries Unrnked Trees Recognizble Lnguges DTD Document Type Definition

More information

SWEN 224 Formal Foundations of Programming WITH ANSWERS

SWEN 224 Formal Foundations of Programming WITH ANSWERS T E W H A R E W Ā N A N G A O T E Ū P O K O O T E I K A A M Ā U I VUW V I C T O R I A UNIVERSITY OF WELLINGTON Time Allowed: 3 Hours EXAMINATIONS 2011 END-OF-YEAR SWEN 224 Forml Foundtions of Progrmming

More information

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2016 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 9 1. (4pts) ((p q) (q r)) (p r), prove tutology using truth tles. p

More information

Lexical Analysis Finite Automate

Lexical Analysis Finite Automate Lexicl Anlysis Finite Automte CMPSC 470 Lecture 04 Topics: Deterministic Finite Automt (DFA) Nondeterministic Finite Automt (NFA) Regulr Expression NFA DFA A. Finite Automt (FA) FA re grph, like trnsition

More information

Handout: Natural deduction for first order logic

Handout: Natural deduction for first order logic MATH 457 Introduction to Mthemticl Logic Spring 2016 Dr Json Rute Hndout: Nturl deduction for first order logic We will extend our nturl deduction rules for sententil logic to first order logic These notes

More information

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA)

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA) Finite Automt (FA or DFA) CHAPTER Regulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, equivlence of NFAs DFAs, closure under regulr

More information

2.4 Linear Inequalities and Interval Notation

2.4 Linear Inequalities and Interval Notation .4 Liner Inequlities nd Intervl Nottion We wnt to solve equtions tht hve n inequlity symol insted of n equl sign. There re four inequlity symols tht we will look t: Less thn , Less thn or

More information

Centrum voor Wiskunde en Informatica REPORTRAPPORT. Supervisory control for nondeterministic systems

Centrum voor Wiskunde en Informatica REPORTRAPPORT. Supervisory control for nondeterministic systems Centrum voor Wiskunde en Informtic REPORTRAPPORT Supervisory control for nondeterministic systems A. Overkmp Deprtment of Opertions Reserch, Sttistics, nd System Theory BS-R9411 1994 Supervisory Control

More information