Regular Expressions and NFAs without ε-transitions


Georg Schnitger
Institut für Informatik, Johann Wolfgang Goethe-Universität, Robert-Mayer-Straße 11-15, Frankfurt am Main, Germany
georg@thi.informatik.uni-frankfurt.de

Abstract. We consider the problem of converting regular expressions into ε-free NFAs with as few transitions as possible. If the regular expression has length n and is defined over an alphabet of size k, then the previously best construction uses $O(n \cdot \min\{k, \log_2 n\} \cdot \log_2 n)$ transitions. We show that $O(n \cdot \log_2 2k \cdot \log_2 n)$ transitions suffice. For small alphabets, for instance if $k = O(\log_2 \log_2 n)$, we further improve the upper bound to $O(k^{1+\log^* n} \cdot n)$. In particular, $O(2^{\log_2^* n} \cdot n)$ transitions and hence almost linear size suffice for the binary alphabet! Finally we show the lower bound $\Omega(n \cdot \log_2^2 2k)$ and as a consequence the upper bound $O(n \cdot \log_2^2 n)$ of [7] for general alphabets is best possible. Thus the conversion problem is solved for large alphabets ($k = n^{\Omega(1)}$) and almost solved for small alphabets ($k = O(1)$).

Keywords: Automata and formal languages, descriptional complexity, nondeterministic automata, regular expressions.

1 Introduction

One of the central tasks on the border between formal language theory and complexity theory is to describe infinite objects such as languages by finite formalisms such as automata, grammars, expressions among others and to investigate the descriptional complexity and capability of these formalisms. Formalisms like expressions and finite automata have proven to be very useful in building compilers, and techniques converting one formalism into another were used as basic tools in the design of computer systems such as UNIX ([12] and [5], p. 123). A typical application in lexicographical analysis starts with a regular expression that has to be converted into an ε-free nondeterministic finite automaton.
Here, the descriptional complexity of an expression R is its length and the descriptional complexity of a nondeterministic finite automaton (NFA) is the number of its edges or transitions, where identical edges with distinct labels are differentiated. All classical conversions [1, 3, 9, 12] produce ε-free NFAs with worst-case size quadratic in the length of the given regular expression and for some time this was assumed to be optimal [10]. But then Hromkovic, Seibert and Wilke [7] constructed ε-free NFAs with surprisingly only $O(n (\log_2 n)^2)$ transitions for regular expressions of length n and this transformation can even be implemented to run in time $O(n \cdot \log_2 n + m)$, where m is the size of the output [4]. Subsequently Geffert [2] showed that even ε-free NFAs with $O(n \cdot k \cdot \log_2 n)$ transitions suffice for alphabets of size k, improving the bound of [7] for small alphabets. We considerably improve the upper bound of [2] for alphabets of small size k. In particular we show

(Work supported by DFG grant SCHN 503/4-1.)

Theorem 1. Every regular expression R of length n over an alphabet of size k can be recognized by an ε-free NFA with at most

$$O(n \cdot \min\{\log_2 n \cdot \log_2 2k,\; k^{1+\log^* n}\})$$

transitions.

As a first consequence we obtain ε-free NFAs of size $O(n \cdot \log_2 n \cdot \log_2 2k)$ for regular expressions of length n over an alphabet of size k. For small alphabets, for instance if $k = O(\log_2 \log_2 n)$, the upper bound $O(n \cdot k^{1+\log^* n})$ is better. In particular, $O(n \cdot 2^{\log_2^* n})$ transitions and hence almost linear size suffice for the binary alphabet. A first lower bound was also given in [7], where it is shown that the regular expression

$$E_n = (1 + \varepsilon) \cdot (2 + \varepsilon) \cdots (n + \varepsilon)$$

over the alphabet {1, ..., n} requires NFAs of size at least $\Omega(n \cdot \log_2 n)$. Lifshits [8] improves this bound to $\Omega(n (\log_2 n)^2 / \log_2 \log_2 n)$. We use ideas developed in [8] to prove the following optimal asymptotic bound for $E_n$.

Theorem 2. There are regular expressions of length n over an alphabet of size k such that any equivalent ε-free NFA has at least $\Omega(n \cdot \log_2^2 2k)$ transitions.

Thus the construction of [7] is optimal for large alphabets, i.e., if $k = n^{\Omega(1)}$. Since Theorem 1 is almost optimal for alphabets of fixed size, only improvements for alphabets of intermediate size, i.e., $\omega(1) = k = n^{o(1)}$, are still required.

In Section 2 we show how to construct small ε-free NFAs for a given regular expression R using ideas from [2, 7]. We obtain in Lemma 1 the upper bound $O(n \cdot \log_2 n \cdot \log_2 2k)$ by short-cutting ε-paths within the canonical NFA. Whereas the canonical NFA (with ε-transitions) is derived from the expression tree of R, the shortcuts are derived from a decomposition tree which is a balanced version of the expression tree. The subsequent improvement for small alphabets is based on repeatedly applying the previous upper bound to larger and larger subexpressions. We show the lower bound for the regular expression $E_n$ in Section 3. Conclusions and open problems are stated in Section 4.
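To get a feeling for how the stated upper bounds relate, the following minimal sketch evaluates the three bound expressions with every hidden constant set to 1. That choice is an assumption made purely for illustration, since the O-notation fixes no constants; the helper names `log_star` and `bounds` are hypothetical and not taken from the paper.

```python
import math

def log_star(n):
    """Iterated logarithm: how often log2 must be applied until the value is <= 1."""
    count = 0
    x = float(n)
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

def bounds(n, k):
    """Upper-bound expressions with all hidden constants set to 1 (illustration only)."""
    log_n = math.log2(n)
    return {
        "previous": n * min(k, log_n) * log_n,    # n * min{k, log n} * log n
        "lemma1": n * math.log2(2 * k) * log_n,   # n * log 2k * log n
        "small_k": n * k ** (1 + log_star(n)),    # n * k^(1 + log* n)
    }
```

Because the iterated logarithm grows extremely slowly, the third expression is essentially linear in n for fixed k, which is the sense in which the text speaks of almost linear size.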
2 Small ε-free NFAs for Regular Expressions

We first describe the canonical construction of an NFA (with ε-transitions), given the expression tree of a given regular expression R. We then proceed in the next section by defining a decomposition tree for R. After showing in Section 2.2 how to obtain a small ε-free NFA from the decomposition tree we then recursively apply our recipe in Section 2.3 to obtain close to optimal ε-free NFAs.

First observe that $R = R_1 + R_2$, $R = R_1 \circ R_2$ or $R = S^*$ for subexpressions $R_1, R_2, S$ of R. Fig. 1 shows this recursive expansion in NFA-notation. Thus, after completing this recursive expansion, we arrive at the canonical NFA $N_R$ with a unique initial state $q_0$ and a unique final state $q_f$. Moreover, no transition enters $q_0$ and no transition leaves $q_f$; thus $q_f$ is a trap state. Finally observe that $N_R$ has at most O(n) transitions for any regular expression R of length n.

2.1 Decomposition Trees

Let $T_R$ be the expression tree of R with root r. To define decomposition trees we first introduce partial cuts, where a partial cut C is a set of nodes of $T_R$ with no two nodes of C being ancestors or descendants of each other. We define $T_R^v(C)$, for

a node v and a partial cut C of $T_R$, as the subtree of $T_R$ with root v and all children-links for nodes in C removed. Hence nodes in C which are also descendants of v are (artificial) leaves in $T_R^v(C)$. Moreover we label the leaf for $x \in C$ with an artificial symbol denoting the regular subexpression determined by x in $T_R$: if $R^v(C)$ denotes the subexpression specified by $T_R^v(C)$, then $R^v(C)$ contains for each artificial leaf the corresponding artificial symbol. Finally $N_R^v(C)$ is the NFA obtained by recursively expanding $R^v(C)$ except for the artificial symbols of $R^v(C)$; any artificial symbol S is modeled by an artificial transition $p \xrightarrow{S} q$, where p is the unique initial state and q is the unique final state of the canonical NFA for S.

Fig. 1. The initial step in determining the NFA $N_R$ for a regular expression R (panels: concatenation, union, star). The undirected version of $N_R$ is a series-parallel graph with a unique source and a unique sink. The sink is a trap state.

Fig. 2. Expression tree and canonical NFA with artificial transitions for $R = (R_3 \circ (R_1 + R_2)) + R_4$.

We introduce the decomposition tree $T_R^*$ for R as a balanced, small depth version of $T_R$. We begin by determining a separating node v of $T_R$, namely a node of $T_R$ with a subtree of at least n/3, but less than 2n/3 leaves. Then $T_1 = T_R^v(\emptyset)$ is the subtree of $T_R$ with root v and $T_1$ determines the regular expression $S = R^v(\emptyset)$. We remove the edge connecting v to its parent, reattach v as an artificial leaf labeled with the artificial symbol S and obtain the second subtree $T_2 = T_R^r(\{v\})$ specifying the regular expression $R^r(\{v\})$. We obtain the original expression R after replacing the artificial symbol S in $R^r(\{v\})$ by the expression $R^v(\emptyset)$. $N_R^r(\{v\})$ contains a unique transition $q_1 \xrightarrow{S} q_2$ with label S. We obtain $N_R$ from $N_R^r(\{v\})$ after identifying the unique initial and final states of $N_R^v(\emptyset)$ with $q_1$ and $q_2$ respectively and then replacing the transition $q_1 \xrightarrow{S} q_2$ by $N_R^v(\emptyset)$. To define the decomposition tree $T_R^*$ we create a new root s.
We say that $q_1 \xrightarrow{S} q_2$ is the artificial transition of s and label s with the quadruple $(r, \emptyset, q_1 \xrightarrow{S} q_2, v)$. In general, if we label a node t of $T_R^*$ with $(u, C, q_1 \xrightarrow{S} q_2, v)$,

- then we say that t represents the expression tree $T_R^u(C)$ as well as the NFA $N_R^u(C)$. When expanding node t, the NFA $N_R^u(C)$ is decomposed using the separating node v.
- The left child of t represents the expression tree $T_R^v(C)$ as well as the NFA $N_R^v(C)$ which has the unique initial state $q_1$ and the unique final state $q_2$.
- The right child of t represents the expression tree $T_R^u(C \cup \{v\})$ and the NFA $N_R^u(C \cup \{v\})$.
- One obtains the NFA $N_R^u(C)$ from the NFA $N_R^u(C \cup \{v\})$ of the right child after replacing the artificial transition $q_1 \xrightarrow{S} q_2$ by the NFA $N_R^v(C)$ of the left child.

Fig. 3. The first expansion step for $T_R$ and the corresponding NFAs.

We recursively repeat this expansion process for the left child of s representing $T_R^v(\emptyset)$ as well as the right child representing $T_R^r(\{v\})$. Observe that separating nodes have to have at least N/3, but less than 2N/3 original (i.e., non-artificial) leaves in their subtrees, where N is the current number of original leaves. We continue to expand until all trees contain at most one original leaf. If we have reached a node t of $T_R^*$ whose expression tree $T_R^u(C)$ has exactly one leaf l (representing the original transition $q_1 \xrightarrow{a} q_2$), then we label t by the quadruple $(u, C, q_1 \xrightarrow{a} q_2, l)$ and stop the expansion. We summarize the important properties of $T_R^*$.

Proposition 1. Let R be a regular expression of length n.
(a) Each ε-free transition of $N_R$ appears exactly once as an artificial transition of a leaf of $T_R^*$.
(b) Assume that node t is labeled with $(u, C, q_1 \xrightarrow{S} q_2, v)$. Then any ε-path in $N_R$ from a state of the left child NFA $N_R^v(C)$ to a state outside of $N_R^v(C)$ has to traverse $q_2$. Any ε-path in $N_R$ from a state outside of $N_R^v(C)$ to a state of $N_R^v(C)$ traverses $q_1$.
(c) Let $p \xrightarrow{a} q$ and $r \xrightarrow{b} s$ be two ε-free transitions of $N_R$ corresponding to leaves $l_1$, resp. $l_2$ of $T_R^*$. Moreover let t be the lowest common ancestor of $l_1$ and $l_2$ in $T_R^*$. If there is an ε-path $q \xrightarrow{*} r$ in $N_R$ and if $q_1 \xrightarrow{S} q_2$ is the artificial transition of t, then the path traverses $q_1$ or $q_2$.
(d) The depth of $T_R^*$ is bounded by $O(\log_2 n)$.

Proof. (a) Any ε-free transition of $N_R$ appears exactly once as a leaf of the expression tree $T_R$. The claim follows, since each expansion step for the decomposition tree $T_R^*$ decomposes $T_R$. (b) When growing the canonical NFA while expanding the expression tree $T_R$, we always replace an artificial transition by an NFA N with a unique initial and final state. No state outside of N will ever be linked directly with a state inside of

N. Moreover N can only be entered through its initial state and left through its final state. (c) The lowest common ancestor t represents the expression $R^u(C)$ for some node u and cut C. If w is the separating node of t, then $R^u(C)$ is decomposed into the expressions $R^w(C)$, recognized by $N_R^w(C)$, and $R^u(C \cup \{w\})$, recognized by $N_R^u(C \cup \{w\})$. Assume that both endpoints of, say, $p \xrightarrow{a} q$ belong to $N_R^w(C)$, the endpoints of $r \xrightarrow{b} s$ lie outside. But according to part (b), the left child NFA $N_R^w(C)$ can only be entered through its initial or final state which coincides with an endpoint of the artificial transition of t. (d) follows, since the number of original leaves is reduced each time by at least the factor 2/3.

2.2 Constructing ε-free NFAs from the Decomposition Tree

Property (c) of Proposition 1 is of particular importance when building a small ε-free NFA from the regular expression R, resp. from the NFA $N_R$: assume for instance that there are ε-paths $q_1 \xrightarrow{*} p_2$, $q_2 \xrightarrow{*} p_3$ and $q_3 \xrightarrow{*} p_4$ as well as ε-free transitions $p_i \xrightarrow{a_i} q_i$ for i = 1, ..., 4 in $N_R$. Then there is a path P from $p_1$ to $q_4$ built from the ε-transitions and the four ε-free transitions. How can we simulate P without ε-transitions? Assume that within the decomposition tree u is the lowest common ancestor (lca) of $p_1 \xrightarrow{a_1} q_1$ and $p_2 \xrightarrow{a_2} q_2$, v the lca of $p_2 \xrightarrow{a_2} q_2$ and $p_3 \xrightarrow{a_3} q_3$ and finally that w is the lca of $p_3 \xrightarrow{a_3} q_3$ and $p_4 \xrightarrow{a_4} q_4$. Moreover let $q_1^u$ and $q_2^u$, $q_1^v$ and $q_2^v$, $q_1^w$ and $q_2^w$ be the endpoints of the artificial transitions of u, v and w respectively. Path P has to traverse one of the endpoints for all three lca's according to Proposition 1 (c). In particular, P may have the form

$$p_1 \xrightarrow{a_1} q_1 \xrightarrow{*} q_2^u \xrightarrow{*} p_2 \xrightarrow{a_2} q_2 \xrightarrow{*} q_1^v \xrightarrow{*} p_3 \xrightarrow{a_3} q_3 \xrightarrow{*} q_1^w \xrightarrow{*} p_4 \xrightarrow{a_4} q_4.$$

We concentrate on the two path fragments $q_2^u \xrightarrow{*} p_2 \xrightarrow{a_2} q_2 \xrightarrow{*} q_1^v$ and $q_1^v \xrightarrow{*} p_3 \xrightarrow{a_3} q_3 \xrightarrow{*} q_1^w$. If we introduce new ε-free transitions $q_2^u \xrightarrow{a_2} q_1^v$ and $q_1^v \xrightarrow{a_3} q_1^w$, then disregarding the very first and the very last ε-free transitions of P, we have utilized the lca's as shortcuts in the ε-free equivalent $q_2^u \xrightarrow{a_2} q_1^v \xrightarrow{a_3} q_1^w$ of P. We now analyze this procedure in general.
In particular we improve upon the conversion of [2], where $O(n \cdot \log_2 n \cdot k)$ transitions are shown to suffice. Our approach combines ideas in [2] with Proposition 1 (c).

Lemma 1. Let R be a regular expression of length n over an alphabet of size k. Then there is an ε-free NFA N for R with $O(n \cdot \log_2 n \cdot \log_2 2k)$ transitions. N has a unique initial state. If $\varepsilon \notin L(R)$, then N has one final state and otherwise N has at most two final states.

Proof. Assume that $q_0$ is the unique initial state of $N_R$ and $q_f$ its unique final state. We moreover assume without loss of generality that all states of $N_R$ have an ε-loop. We choose $q_0$ and $q_f$ as well as all endpoints of artificial transitions assigned to nodes of $T_R^*$ as states for our ε-free NFA. $q_0$ is still the initial state and $q_f$ is (jointly with $q_0$, whenever $\varepsilon \in L(R)$) the only final state. Thus the ε-free NFA results from $N_R$ after removing all ε-transitions (and all states which are incident to ε-transitions only) and inserting new ε-free transitions.

Let $p \xrightarrow{a} q$ be an ε-free transition of $N_R$ and let l be the corresponding leaf of $T_R^*$. We define the set A to contain $q_0$, $q_f$ as well as all ancestors of l including l itself. We assume that $q_0$ and $q_f$ are roots of imaginary trees such that all leaves of $T_R^*$ belong to the right subtree of $q_0$ as well as to the left subtree of $q_f$. Consider any two nodes $v, w \in A$ and let $q_1^v, q_2^v$ and $q_1^w, q_2^w$ be the endpoints of the artificial transitions for v and w respectively. (We set $q_1^v = q_2^v = q_0$ for $v = q_0$ and $q_1^v = q_2^v = q_f$ for $v = q_f$.) We insert the transition $q_i^v \xrightarrow{a} q_j^w$ for $i, j \in \{1, 2\}$, if there are ε-paths $q_i^v \xrightarrow{*} p$ and $q \xrightarrow{*} q_j^w$ in $N_R$. Let N be the NFA obtained from $N_R$ after these insertions and after removing all ε-transitions.

Fig. 4. Introducing transitions between ancestors.

Obviously any accepting path from $q_0$ to $q_f$ in N can be extended via ε-transitions to an accepting path in $N_R$ and hence $L(N) \subseteq L(N_R)$. Now consider an accepting path

$$q_0 \xrightarrow{*} p_1 \xrightarrow{a_1} q_1 \xrightarrow{*} \cdots \xrightarrow{*} p_r \xrightarrow{a_r} q_r \xrightarrow{*} q_f$$

for the word $a_1 \cdots a_r$ in $N_R$. Since all states of $N_R$ have ε-loops, we may assume that all ε-free transitions are separated by ε-paths. Let $l_i$ be the leaf of $T_R^*$ corresponding to the transition $p_i \xrightarrow{a_i} q_i$. To obtain an accepting path in N, let $v_0 = q_0, v_1, \ldots, v_{r-1}, v_r = q_f$ be a sequence of nodes, where $v_i$ ($1 \le i \le r-1$) is the lowest common ancestor of $l_i$ and $l_{i+1}$ in $T_R^*$. By Proposition 1 (c) the ε-paths from $q_{i-1}$ to $p_i$ and from $q_i$ to $p_{i+1}$ have to hit endpoints $q_{j_{i-1}}^{v_{i-1}} \in \{q_1^{v_{i-1}}, q_2^{v_{i-1}}\}$ and $q_{j_i}^{v_i} \in \{q_1^{v_i}, q_2^{v_i}\}$ of the respective artificial transition. But then N contains the transition $q_{j_{i-1}}^{v_{i-1}} \xrightarrow{a_i} q_{j_i}^{v_i}$ and

$$q_0 \xrightarrow{a_1} q_{j_1}^{v_1} \xrightarrow{a_2} \cdots \xrightarrow{a_{i-1}} q_{j_{i-1}}^{v_{i-1}} \xrightarrow{a_i} q_{j_i}^{v_i} \xrightarrow{a_{i+1}} q_{j_{i+1}}^{v_{i+1}} \xrightarrow{a_{i+2}} \cdots \xrightarrow{a_r} q_f$$

is an accepting path in N. Hence $L(N_R) \subseteq L(N)$ and N and $N_R$ are equivalent.

We still have to count the number of transitions of N. We introduce transitions $q_i^s \xrightarrow{a} q_j^t$ (resp. $q_i^t \xrightarrow{a} q_j^s$) only for transitions $p \xrightarrow{a} q$ which are represented by leaves belonging to the subtrees of s and t. Hence s must be an ancestor or descendant of t in $T_R^*$.
Thus, for given nodes s, t we introduce at most $\min\{|s|, |t|, k\}$ transitions, where |s| and |t| are the number of leaves in the subtrees of s and t respectively. We fix s. There are O(|s|/k) descendants t of s with $|t| \ge k$ and at most $O(\frac{|s|}{k} \cdot k) = O(|s|)$ newly introduced transitions connect s with a "high" node t. The remaining "low" nodes are partitioned into $O(\log_2 k)$ levels, where one level produces at most O(|s|) transitions, since at most |t| transitions connect s with node t of the level. Thus the number of transitions between $q_i^s$ and $q_j^t$ for descendants t of s is bounded by $O(|s| \cdot (1 + \log_2 k))$ and hence by $O(|s| \cdot \log_2 2k)$. Finally we partition all nodes s of $T_R^*$ into $O(\log_2 n)$ levels, where one level requires $O(n \cdot \log_2 2k)$ transitions, and overall $O(n \cdot \log_2 n \cdot \log_2 2k)$ transitions suffice.
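The counting argument can be checked numerically on an idealized case. The sketch below assumes a perfectly balanced decomposition tree with n = 2^d leaves (an assumption; an actual decomposition tree is only balanced up to the 1/3 to 2/3 split) and sums min{|s|, |t|, k} over all pairs in which t is s itself or a descendant of s. The function name `pair_bound` is hypothetical.

```python
def pair_bound(d, k):
    """Sum of min(|s|, |t|, k) over all nodes s and all t in the subtree of s
    (including s itself) of a perfect binary tree with 2**d leaves, where |x|
    denotes the number of leaves below x."""
    total = 0
    for h in range(d + 1):            # height of s, so |s| = 2**h
        nodes_s = 2 ** (d - h)        # number of nodes at height h
        for h2 in range(h + 1):       # height of t, so |t| = 2**h2 <= |s|
            desc = 2 ** (h - h2)      # nodes at height h2 below one fixed s
            total += nodes_s * desc * min(2 ** h2, k)
    return total
```

For this idealized tree the sum equals n times the double sum of min{1, k / 2^{h'}} over heights h' ≤ h ≤ d, and each inner sum is at most log_2 2k + 2. The total therefore stays below n · (d + 1) · (log_2 2k + 2), in line with the bound $O(n \cdot \log_2 n \cdot \log_2 2k)$.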

Fig. 5. The construction of an equivalent path.

2.3 A Recursive Construction of Small ε-free NFAs

How can we come up with even smaller ε-free NFAs? Assume that we have partitioned the regular expression R into (very small) subexpressions of roughly the same size η. We apply the construction of Lemma 1 to all subexpressions and introduce at most $O(n \cdot \log_2 \eta \cdot \log_2 2k)$ transitions, a significant reduction if η is drastically smaller than n. However now we have to connect different subexpressions with "global" transitions and Lemma 1 inserts the vast majority of transitions, leading to a total of $O(n \cdot \log_2 n \cdot \log_2 2k)$ transitions. But we can do far better, if we are willing to increase the size of ε-free NFAs for every subexpression.

Definition 1. Let N be an ε-free NFA with initial state $q_0$ and let F be the set of final states. We say that any transition $(q_0, r)$ is an initial transition and that r is a post-initial state. Analogously any transition (r, s) is a final transition, provided $s \in F$, and r is a pre-final state.

Observe that it suffices to connect a global transition for a subexpression S with a post-initial or pre-final state of an ε-free NFA for S. As a consequence, the number of global transitions for S is reduced drastically, provided we have only few post-initial and pre-final states. But, given an ε-free NFA, how large are equivalent ε-free NFAs with relatively few initial or final states?

Proposition 2. Let N be an ε-free NFA with s transitions over an alphabet Σ of size k. Then there is an equivalent ε-free NFA N' with $O(k^2 + k \cdot s)$ transitions and at most $3k + k^2$ initial or final transitions. N' has one initial state. If $\varepsilon \notin L(N)$, then N' has one final state and otherwise at most two final states.

Proof. Assume that $q_0$ is the initial state of N, F is the set of final states and Σ = {1, ..., k}. Let $\rho_1, \ldots, \rho_p$ be the post-initial states of N and $\sigma_1, \ldots, \sigma_q$ be the pre-final states of N. Moreover let $R_i$ be the set of post-initial states in N receiving an i-transition from $q_0$ and let $S_i$ be the set of pre-final states of N sending an i-transition into a state of F.
We introduce a new initial state $q_0'$ and a new final state $q_f$. ($q_0'$ is the second accepting state, if $\varepsilon \in L(N)$.) For every $a \in \Sigma \cap L(N)$ we insert the transition $q_0' \xrightarrow{a} q_f$. We introduce new post-initial states $r_1, \ldots, r_k$, new pre-final states $s_1, \ldots, s_k$ and insert i-transitions from $q_0'$ to $r_i$ as well as from $s_i$ to $q_f$. If $\rho_j \in R_i$ and if $(\rho_j, s)$ is a transition with label b, then insert the transition $(r_i, s)$ with label b. Analogously, if $\sigma_j \in S_i$ and if $(r, \sigma_j)$ is a transition with label b, then we insert the transition $(r, s_i)$ with label b. Thus the new states $r_i$ and $s_i$ inherit their outgoing respectively incoming transitions from the states they are

responsible for. Finally, to accept all words of length two in L(N), we introduce at most $k^2$ further initial and final transitions incident with $q_0'$, $q_f$ and post-initial states. Observe that the new NFA N' is equivalent with N, since, after leaving the new states $r_i$ and before reaching the new states $s_j$, N' works like N. The states $q_0'$ and $q_f$ are incident with at most $k + 2 \cdot k + k^2$ transitions: up to k transitions link $q_0'$ and $q_f$, $2 \cdot k$ transitions connect $q_0'$ and the $r_i$'s (or the $s_i$'s and $q_f$) and at most $k^2$ transitions accept words of length two. Finally at most $k \cdot s$ transitions leave states $r_i$, not more than $k \cdot s$ transitions enter states $s_j$ and hence the number of transitions increases from s to at most $s \cdot (2k + 1) + 3 \cdot k + k^2$.

We now observe that the combination of Lemma 1 and Proposition 2 provides significant savings for alphabets of small size.

Proof of Theorem 1. Let $T_R$ be the expression tree and $T_R^*$ be the decomposition tree. If u is a node of $T_R^*$, then let $T_R^*(u)$ denote the subtree of $T_R^*$ with root u. We begin with a sketch of the construction of a small ε-free NFA for the expression R. We proceed iteratively. In a first phase we process a cut of "low" nodes of $T_R^*$, where any such node u has at most $L_1$ original transitions in its subtree $T_R^*(u)$, where $L_1$ will be fixed later. We insert additional transitions between descendants of u according to Lemma 1 and obtain an ε-free NFA $N_1(u)$. We repeat this procedure in phase j; this time the cut consists of nodes u with at most $L_j$ original transitions in their respective subtree $T_R^*(u)$. The parameters $L_j$ will be fixed later; here we only assume that $L_1 < \cdots < L_{j-1} < L_j < \cdots$ holds. Thus we process cuts of nodes of increasing height until we reach the root of $T_R^*$. Let D be the set of descendants w of u which we processed in the previous phase. At the beginning of phase j we assume that all ε-free NFA $N_{j-1}(w)$ for $w \in D$ have been constructed. When constructing an ε-free NFA $N_j(u)$ we are now facing a more complicated situation than in Lemma 1.
Firstly, when building $N_j(u)$ in phase j > 1, we have to merge the ε-free NFA $N_{j-1}(w)$ for all $w \in D$. As opposed to the case of original transitions, any such NFA $N_{j-1}(w)$ may accept arbitrary strings instead of just single letters. To prepare for this more complicated scenario we first apply Proposition 2 to replace $N_{j-1}(w)$ by an equivalent ε-free NFA with few post-initial and pre-final states: all additional transitions can now be connected with one of these relatively few states. Assume that u represents the expression tree $T_R^v(C)$. Then the ε-free NFA $N_j(u)$ will be built from $N_R^v(C \cup D)$. In particular, the artificial transitions corresponding to any $w \in D$ are replaced by the ε-free NFA $N_{j-1}(w)$. But, and this is the second difference to the situation of Lemma 1, the artificial transitions $q_1 \xrightarrow{S} q_2$ corresponding to a node in C are kept: this procedure allows us to plug in the ε-free NFA N recognizing S, whenever N is constructed. In summary, $N_j(u)$ results from its "base NFA" $N_R^v(C \cup D)$ by removing all ε-transitions, keeping all artificial transitions for nodes in C, replacing artificial transitions for nodes in D by the previously determined ε-free NFA and by adding new ε-free transitions. We have to deal with the last point in detail.

Our construction utilizes the following invariant: if $w \in D$ represents the expression tree $T_R^x(C')$, then the NFA $N_{j-1}(w)$ is equivalent with the NFA $N_R^x(C')$. Here we assume, for $N_{j-1}(w)$ as well as for $N_R^x(C')$, that any artificial transition $q_1 \xrightarrow{S} q_2$ produces all words of the regular expression S. We do not differentiate between the first and subsequent phases, since subsequent phases are more general: in phase 1 only ε-free transitions are to be merged. But we may interpret any such transition as an ε-free NFA $N_0(w)$ of phase 0 consisting of a single transition only. In particular, the invariant holds initially.

As in the proof of Lemma 1 we assume that all states of $N_R$ have an ε-loop. We now begin the formal description of our construction.

Phase j. We consider all nodes u of $T_R^*$ which have at most $L_j$ original transitions in their subtree $T_R^*(u)$, whereas their parent has more than $L_j$ original transitions. Obviously these nodes define a cut in $T_R^*$. Let D be the set of descendants w of u which we processed in the previous phase. We build an ε-free NFA $N_j(u)$ from the ε-free NFAs $N_{j-1}(w)$ for all $w \in D$. For any such descendant w we apply Proposition 2 to $N_{j-1}(w)$ and obtain an ε-free NFA $N_{j-1}^*(w)$ with at most $O(k^2)$ initial or final transitions; moreover $\mathrm{size}(N_{j-1}^*(w)) = O(k^2 + k \cdot \mathrm{size}(N_{j-1}(w)))$. We utilize the few initial or final transitions to cheaply interconnect $N_{j-1}^*(w)$ with (endpoints of artificial transitions assigned to) ancestors of w within $T_R^*(u)$.

Assume again that u represents the expression tree $T_R^v(C)$. Then the ε-free NFA $N_j(u)$ is obtained from $N_R^v(C \cup D)$ by removing all ε-transitions, replacing the artificial transitions assigned to a node $w \in D$ by the ε-free NFA $N_{j-1}^*(w)$, keeping all artificial transitions for nodes in C and adding additional ε-free transitions. Only the last point has to be explained. As in the construction of Lemma 1, $N_j(u)$ keeps the initial and final states $q_1^u$ and $q_2^u$ of $N_R^v(C \cup D)$. The insertion of new transitions is now a more complex task than in the situation of Lemma 1, since we are working with full-fledged ε-free NFAs instead of ε-free transitions. In particular we have to differentiate three cases, namely firstly the new case $\varepsilon \in L(N_{j-1}^*(w))$, then the original case considered in Lemma 1, namely $a \in L(N_{j-1}^*(w))$ for some letter a, and finally the second new case, namely that $L(N_{j-1}^*(w))$ contains words of length at least two. Let $q_1, q_2$ be the unique initial and final states of $N_{j-1}^*(w)$.

(0) Assume $\varepsilon \in L(N_{j-1}^*(w))$. This case establishes ε-paths and is of interest only for the two remaining cases where reachability by ε-paths is crucial.
(1) Assume $a \in L(N_{j-1}^*(w))$ for some letter a.
For any ancestors $t_1, t_2$ of w in $T_R^*(u)$, for any endpoints $q^{t_1}, q^{t_2}$ of their respective artificial transitions and for any ε-paths $q^{t_1} \xrightarrow{*} q_1$ and $q_2 \xrightarrow{*} q^{t_2}$, introduce the transition $q^{t_1} \xrightarrow{a} q^{t_2}$. (This procedure is completely analogous to Lemma 1.)
(2) Let $q_1 \xrightarrow{a} r$ be an arbitrary initial transition of $N_{j-1}^*(w)$. Then, for any ancestor t of w in $T_R^*(u)$, for any endpoint $q^t$ of its artificial transition and for any ε-path $q^t \xrightarrow{*} q_1$ introduce the transition $q^t \xrightarrow{a} r$. Analogously, if $r \xrightarrow{a} q_2$ is an arbitrary final transition of $N_{j-1}^*(w)$ and if there is an ε-path $q_2 \xrightarrow{*} q^t$, then introduce the transition $r \xrightarrow{a} q^t$.

We treat the artificial transitions of $T_R^v(C)$, respectively their ε-free NFA which will replace the artificial transitions at a later time, in exactly the same way.

We have to show that $N_j(u)$ satisfies the invariant, i.e., that $N_j(u)$ and $N_R^v(C)$ are equivalent whenever any artificial transition $q_1 \xrightarrow{S} q_2$ produces all words of the regular expression S. If we insert a transition $p \xrightarrow{a} q$, then there are states r, s and a path $p \xrightarrow{*} r \xrightarrow{a} s \xrightarrow{*} q$ in $N_R^v(C)$. Thus $L(N_j(u)) \subseteq L(N_R^v(C))$. Any accepting path P in $N_R^v(C)$ traverses ε-free NFAs corresponding to some sequence $(w_1, \ldots, w_s)$ for $w_1, \ldots, w_s \in D$; we may require that at least one letter is read for each $w_i$. Since all states of $N_R$ have ε-loops, we may assume that all ε-free transitions are separated by ε-paths. Hence, if $w_i$ represents an NFA with initial state $q_1^{w_i}$ and final state $q_2^{w_i}$, then there is an ε-path from $q_2^{w_i}$ to $q_1^{w_{i+1}}$ in $N_R^v(C)$. Assume that s and t are the least common ancestors in $T_R^*(u)$ of $w_{i-1}$ and $w_i$ and of $w_i$ and $w_{i+1}$ respectively. We apply Proposition 1 (c) and obtain ε-paths $q^s \xrightarrow{*} q_1^{w_i}$ and $q_2^{w_i} \xrightarrow{*} q^t$ for appropriate endpoints $q^s, q^t$ of the artificial transitions of s and t

respectively. If at least two letters are read for $w_i$ then the corresponding accepting path Q in $N_j(u)$ jumps from $q^s$ to a post-initial state of $N_{j-1}^*(w_i)$, then traverses $N_{j-1}^*(w_i)$ and finally jumps from a pre-final state of $N_{j-1}^*(w_i)$ to $q^t$; otherwise Q jumps from $q^s$ directly to $q^t$. Thus $L(N_R^v(C)) \subseteq L(N_j(u))$: $N_j(u)$ and $N_R^v(C)$ are equivalent.

Accounting. In phase j we are interconnecting not only the ε-free automata $N_{j-1}(w)$ of the previous phase, but also (the NFAs constructed for) the artificial transitions. We therefore begin our analysis by comparing the number of artificial transitions used for phase j with the size of the cut for phase j. Remember that node u belongs to the cut for phase j iff u has at most $L_j$ original transitions in its subtree $T_R^*(u)$, but its parent p has more than $L_j$ original transitions in its subtree $T_R^*(p)$.

Proposition 3.
(a) The ancestors of cut nodes define a binary tree $C_j$ with cut nodes as the set of leaves.
(b) The number of different artificial transitions generated at a proper ancestor of a cut node is smaller than the size of the cut.
(c) The total number of artificial transitions used when constructing all ε-free NFA $N_j(u)$ of phase j is not larger than the size of the cut processed in phase j - 1.

Proof. (a) Let v be an ancestor of a cut node. If v does not belong to the cut, then v has more than $L_j$ original transitions and hence v has two children which are ancestors of cut nodes. (Here we assume that a node is its own ancestor.) (b) Since an ancestor generates exactly one artificial transition, there are no more artificial transitions than there are inner nodes of $C_j$. According to part (a), $C_j$ is a binary tree and hence the number of inner nodes is smaller than the number of leaves, i.e., smaller than the size of the cut. (c) Observe first that an artificial transition occurs in at most one cut node: if an artificial transition is a descendant of the separating node, it appears only in the left subtree and otherwise only in the right subtree. The claim follows now from part (b).
Thus we may only count the number of ε-free transitions introduced because of the ε-free NFA $N_{j-1}(w)$ and may disregard artificial transitions altogether. We have partitioned the new ε-free transitions in two classes. To count transitions from class (1), observe that $T_R^*(u)$ has at most $O(\log_2 L_j)$ levels and the NFA $N_j(u)$ has to merge at most $O(L_j / L_{j-1})$ descendant NFAs $N_{j-1}(w)$. We may now apply the analysis developed for Lemma 1 to receive the upper bound $O(\frac{L_j}{L_{j-1}} \cdot \log_2 L_j \cdot \log_2 2k)$ on the number of class (1) transitions introduced for $N_j(u)$.

The transitions in class (2) connect one of the $O(k^2)$ post-initial and pre-final states of some $N_{j-1}^*(w)$ with endpoints of artificial transitions for at most $O(\log_2 L_j)$ ancestors within $T_R^*(u)$. Thus for each NFA $N_j(u)$ we have introduced at most $O(k^2 \cdot \frac{L_j}{L_{j-1}} \cdot \log_2 L_j)$ transitions from the second class. If each descendant NFA $N_{j-1}(w)$ has size at most $s_{j-1}$, then all descendants contribute not more than $O(k \cdot \frac{L_j}{L_{j-1}} \cdot s_{j-1})$ transitions including the blow-up due to Proposition 2. Hence $N_j(u)$ has at most $O(s_j)$ transitions, where

$$s_j = k \cdot \frac{L_j}{L_{j-1}} \cdot s_{j-1} + \frac{L_j}{L_{j-1}} \cdot \log_2 L_j \cdot \log_2 2k + k^2 \cdot \frac{L_j}{L_{j-1}} \cdot \log_2 L_j \;\le\; k \cdot \frac{L_j}{L_{j-1}} \cdot s_{j-1} + 2k^2 \cdot \frac{L_j}{L_{j-1}} \cdot \log_2 L_j. \qquad (1)$$

We iterate recurrence (1) and get for $1 \le r \le j$

$$s_j \le k^r \cdot \frac{L_j}{L_{j-r}} \cdot s_{j-r} + \sum_{t=0}^{r-1} 2k^{t+2} \cdot \frac{L_j}{L_{j-1-t}} \cdot \log_2 L_{j-t}.$$

Assume that the regular expression R has length n. Thus, if we assume that $n = L_i$ and set r = j = i, then we introduce at most $O(s_i)$ transitions, where

$$s_i \le k^i \cdot \frac{n}{L_0} \cdot s_0 + \sum_{t=0}^{i-1} 2k^{t+2} \cdot \frac{n}{L_{i-t-1}} \cdot \log_2 L_{i-t}. \qquad (2)$$

Since phase 1 starts from phase 0 NFAs consisting of a single transition, we may set $L_0 = s_0 = 1$ and the first term of (2) coincides with $O(k^i \cdot n)$. We set $L_j = 2^{L_{j-1}}$ and the sum in (2) is bounded by $O(k^{i+1} \cdot n)$. Thus $i \le \log^* n$ and the regular expression R is recognized by an ε-free NFA with at most $s_i = O(k^{1+\log^* n} \cdot n)$ transitions.

3 The Lower Bound

We consider the regular expression $E_n = (1 + \varepsilon) \cdot (2 + \varepsilon) \cdots (n + \varepsilon)$ of strictly increasing sequences. The following lower bound is asymptotically optimal and improves upon the $\Omega(n \log_2^2 n / \log_2 \log_2 n)$ bound of [8].

Lemma 2. ε-free NFAs for $E_n$ have at least $\Omega(n \cdot \log_2^2 n)$ transitions.

Before giving a proof we show that Theorem 2 is a consequence of Lemma 2. We concatenate $E_k$ exactly n/k times with itself to obtain $R_{n,k} = (E_k)^{n/k}$. Now assume that $N_{n,k}$ is an ε-free NFA recognizing $R_{n,k}$. We say that a transition e of $N_{n,k}$ belongs to copy i iff e is traversed by an accepting path with label sequence $(1 \cdot 2 \cdots k)^{i-1} \, \sigma \, (1 \cdot 2 \cdots k)^{n/k-i}$ while reading the string $\sigma \ne \varepsilon$. Now assume that there is a transition e which belongs to two different copies i, j with i < j. Then we can construct an accepting path with label sequence $(1 \cdot 2 \cdots k)^{j-1} \, \tau \, (1 \cdot 2 \cdots k)^{n/k-i}$ and $N_{n,k}$ accepts a word outside of $R_{n,k}$. Thus any transition belongs to at most one copy. $N_{n,k}$ has, as a consequence of Lemma 2, at least $\Omega(k \cdot \log_2^2 k)$ transitions for each copy and hence $N_{n,k}$ has at least $\Omega(\frac{n}{k} \cdot k \cdot \log_2^2 k)$ transitions. Observe that the unary regular expression $1^n$ requires NFAs of linear size and hence we actually get the lower bound $\Omega(\frac{n}{k} \cdot k \cdot \log_2^2 2k)$ also for k = 1.

3.1 An outline of the argument

Let $n = 2^k$ and let $N_n$ be an arbitrary ε-free NFA for $E_n$. Our basic approach follows the argument of [8].
In particular, we may assume that $N_n$ is in normal form, i.e., {0, 1, ..., n} is the set of states of $N_n$ and any transition $i \xrightarrow{l} j$ satisfies $i < l \le j$. To study the behavior of transitions we introduce the ordered complete binary tree $T_n$ with nodes {1, ..., n - 1} and depth k - 1. We assign names to nodes such that an inorder traversal of $T_n$ produces the sequence (1, ..., n - 1). Finally we label the root r of $T_n$ with the set L(r) = {1, ..., n}. If node v is labeled with the set L(v) = {i + 1, ..., i + 2t}, then we label its left child $v_l$ with $L(v_l)$ = {i + 1, ..., i + t} and its right child $v_r$ with $L(v_r)$ = {i + t + 1, ..., i + 2t}. Finally define |v| = |L(v)| as the size of v. Observe that |v| = 2 holds for every leaf v and we interpret its children $v_l$, resp. $v_r$ as virtual leaves.
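The labeling of $T_n$ can be made concrete with a few lines of code. The following minimal sketch computes, for n a power of two, the label interval of every node; it relies on the observation (derived here, not stated in the text) that under inorder naming a node labeled {i + 1, ..., i + 2t} receives the name i + t. The function name `tree_labels` is hypothetical.

```python
def tree_labels(n):
    """Map each node name of T_n to its label set, given as an inclusive
    interval (low, high). Assumes n is a power of two with n >= 2."""
    labels = {}

    def expand(i, size):
        # The current node is labeled {i + 1, ..., i + size}.
        if size < 2:          # virtual leaf below a real leaf: not a node of T_n
            return
        t = size // 2
        labels[i + t] = (i + 1, i + size)   # inorder name of this node is i + t
        expand(i, t)                        # left child: {i + 1, ..., i + t}
        expand(i + t, t)                    # right child: {i + t + 1, ..., i + 2t}

    expand(0, n)
    return labels
```

For n = 8 the root is node 4 with label interval (1, 8), its children are nodes 2 and 6, and the leaves are the odd-numbered nodes, each with a label set of size 2.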

Example 1. Again set $n = 2^k$. We recursively construct a family of ε-free NFAs $A_n$ to recognize $E_n$. {0, 1, ..., n - 1, n} is the set of states of $A_n$; state 0 is the initial and state n is the final state of $A_n$. To obtain $A_n$ place two copies of $A_{n/2}$ in sequence: {0, 1, ..., n/2 - 1, n/2} and {n/2, n/2 + 1, ..., n - 1, n} are the sets of states of the first and second copy respectively, where the final state n/2 of the first copy is also the initial state of the second copy. If $(a_1, \ldots, a_r, a_{r+1}, \ldots, a_s)$ is any increasing sequence with $a_r \le n/2 < a_{r+1}$, then the sequence has an accepting path which starts in 0, reaches state n/2 when reading $a_r$ and ends in state n when reading $a_s$. But increasing sequences ending in a letter $a \le n/2$, resp. starting in a letter $a > n/2$ have to be accepted as well. Therefore direct all transitions, ending in the final state n/2 of the first copy, also into the final state n. Analogously, direct all transitions, starting from the initial state n/2 of the second copy, also out of initial state 0. Now unroll the recursion and visualize $A_n$ on the tree $T_n$ (after disregarding the initial state 0 and the final state n). The root of $T_n$ plays the role of state n/2. In particular, for any node v there are |v| many transitions with labels from the set L(v) between v and the root. Thus the root is the target of $n \cdot k = n \cdot \log_2 n$ transitions, implying that $A_n$ has $n \cdot \log_2^2 n$ transitions if transitions incident with states 0 or n are disregarded.

Definition 2. We say that a node v of $T_n$ is crossed from the left in $N_n$ iff for all $i \in L(v_l)$ and all sequences σ with $\sigma \cdot i \in E_n$ there is a path in $N_n$ with label sequence $\sigma \cdot i$ which ends in a state $y \in L(v_r)$. If all sequences $i \cdot \tau \in E_n$ with arbitrary $i \in L(v_r)$ have a path which starts in some state $x \in L(v_l)$, then we say that v is crossed from the right. In particular the last transition of the path crosses v, since it ends in $L(v_r)$ and is labeled with a letter from $L(v_l)$.

Fig. 6. An i-transition crossing v from the left.

Proposition 4. [8] Let v be an arbitrary node of $T_n$.
Then for any ε-free NFA in normal form, $v$ is crossed from the left or $v$ is crossed from the right.

Proof. Assume that $v$ is not crossed from the left. Then there is a word $\sigma i \in E_n$ with $i \in L(v_l)$ such that no path in $N_n$ with label sequence $\sigma i$ has a final transition crossing $v$. If $v$ is also not crossed from the right, then there is a word $j \tau \in E_n$ with $j \in L(v_r)$ such that no path in $N_n$ with label sequence $j \tau$ has an initial transition crossing $v$. But then $N_n$ rejects $\sigma i j \tau \in E_n$.

Let $C$ be the set of nodes $v \in T_n$ which are crossed from the left. We assume that "more" nodes are crossed from the left and hence we concentrate on $C$. (See (6) in Section 3.3 for a formal definition of "more".)
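The recursive construction of $A_n$ from Example 1 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the author's implementation: it assumes that $E_n$ is the set of non-empty strictly increasing sequences over $\{1, \ldots, n\}$, and it assumes a base case $A_1$ consisting of the single transition $0 \stackrel{1}{\to} 1$, which the text does not spell out. The names `build_A`, `accepts` and `matches_E` are hypothetical.

```python
from itertools import product


def build_A(lo, hi):
    """Transitions (x, a, y) of the copy with states lo..hi and letters lo+1..hi."""
    if hi - lo == 1:
        return {(lo, hi, hi)}          # assumed base case: one transition lo --hi--> hi
    mid = (lo + hi) // 2
    first, second = build_A(lo, mid), build_A(mid, hi)
    trans = first | second
    # transitions ending in the final state mid of the first copy also enter hi
    trans |= {(x, a, hi) for (x, a, y) in first if y == mid}
    # transitions leaving the initial state mid of the second copy also leave lo
    trans |= {(lo, a, y) for (x, a, y) in second if x == mid}
    return trans


def accepts(trans, word, start, final):
    """Standard subset simulation of an epsilon-free NFA."""
    states = {start}
    for a in word:
        states = {y for (x, b, y) in trans if x in states and b == a}
    return final in states


def matches_E(n, max_len):
    """Compare A_n with the assumed E_n on all words of length 1..max_len."""
    trans = build_A(0, n)
    for length in range(1, max_len + 1):
        for word in product(range(1, n + 1), repeat=length):
            increasing = all(p < q for p, q in zip(word, word[1:]))
            if accepts(trans, word, 0, n) != increasing:
                return False
    return True
```

Every transition $(x, a, y)$ produced this way satisfies $x < a \le y$, so the sketch stays within the normal form used in this section.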

Assume that $w \in T_n$ belongs to $C$ and that node $v$ belongs to $\mathrm{Left}(w)$, the set of nodes of $T_n$ which belong to the left subtree of $w$. Then any sequence $\sigma j \in E_n$ with $j \in L(v_r)$, and hence $j \in L(w_l)$, has a path $p_{\sigma,j}$ in $N_n$ with label sequence $\sigma j$ which ends in a state $y \in L(w_r)$. Observe that the last transition $e = (x, y)$ of $p_{\sigma,j}$ identifies $w$ as the unique tree node with $j \in L(w_l)$ and $y \in L(w_r)$. Moreover, if $x \in L(v_l)$, then $e$ also identifies $v$ as the unique tree node with $x \in L(v_l)$ and $j \in L(v_r)$.

We now observe that $N_n$ has $\Omega(n \log_2^2 n)$ transitions if a majority of pairs $(v, w)$ with $v \in \mathrm{Left}(w)$ is identified for too many labels $j \in L(v_r)$. In particular, define $N(h, h')$ for $h < h' \le k-1$ as the number of pairs $(j, w)$, where $w \in T_n$ has height $h'$ and $j$ belongs to the right subtree of a node $v \in \mathrm{Left}(w)$ with height $h$. Then $N(h, h') = n/4$, since for any $w$ exactly one fourth of all labels $j$ is counted. But then $\sum_{h < h' \le k-1} N(h, h') = \Omega(n \log_2^2 n)$ holds and it suffices to show that each pair $(v, w)$ with $w \in C$ and $v \in \mathrm{Left}(w)$ has $\Omega(|v_r|)$ transitions which identify $v$ as well as $w$.

Labels $j \in L(v_r)$ are problematic if all $j$-transitions $e = (x, y)$ with $x \in L(v)$ and $y \in L(w_r)$ are short for $(v, w)$, i.e., any such transition $e$ starts in $x \in L(v_r)$. If label $j \in L(v_r)$ is short, then $j$-transitions into $L(w_r)$ depart close to $j$. But, since $w$ is crossed from the left, a preceding $i$-transition, for $i \in L(u)$ with $u \in \mathrm{Left}(v)$, has to reach one of these starting points, and if these starting points are "close to home" for many short labels $j$, then consequently many copies of $i$-transitions are required.

To formalize this intuition we determine how far to the left short $j$-transitions extend, not with respect to transitions starting in $L(v)$ and ending in $L(w_r)$ for some specific $w$, but rather with respect to a worst-case sequence $\tau = j \sigma k \in E_n$ (with $k \in L(w_l)$ for an arbitrary $w \in C$) such that any path with sequence $\tau$ starts very close to $j$, if we require the path to start in $L(v)$ and to end in $L(w_r)$.

Definition 3. Vertices $v \in T_n$, $w \in C$ (with $v < w$) as well as labels $j \in L(v_r)$ and $k \in L(w_l)$ are given.
Define
\[ d_{v,w}(j, k) = \min_{\tau = j \sigma k \in E_n} \; \max_{x \in L(v),\, y \in L(w_r)} \{\, j - x \mid \text{there is a path } x \stackrel{*}{\to} y \text{ with sequence } \tau \,\} \]
and
\[ d_v(j) = \min_{w \in C,\; k \in L(w_l)} d_{v,w}(j, k). \]

Fig. 7. Measuring the minimal distance of starting points $x$ of $j$-transitions from $j$.

Observe first that $d_v(j)$ is defined iff there is a node $w \in C$ with $v < w$. But we can assume that the node $w = n-1$ is crossed from the left: $L(w_l) = \{n-1\}$ holds and any transition with label $n-1$ has to either end in node $w$ or in the virtual

leaf $w_r$. Thus if we copy all transitions with label $n-1$ from $w$ to $w_r$, then $w$ is crossed from the left at the cost of at most doubling the size of the NFA.

For any $i \in L(u)$ with $u \in \mathrm{Left}(v)$ there has to be an $i$-transition which approaches $j$ within distance at most $d_v(j)$. Next we determine how close the majority of labels $j \in L(v_r)$ have to be approached.

Definition 4. Let $s$ be maximal with the property that at least $|v_r|/2$ labels $j \in L(v_r)$ satisfy $d_v(j) \le |v_r|/s$. Set $s(v) = s$ and call label $j \in L(v_r)$ regular for $v$ iff $d_v(j) \le |v_r|/s(v)$ holds.

If $j \in L(v_r)$, then $d_{v,w}(j, k) \le 2|v_r|$, since the starting point $x$ of a $j$-transition belongs to $L(v)$. But then $d_v(j) \le 2|v_r|$ holds for all labels $j \in L(v_r)$ and
\[ s(v) \ge 1/2 \tag{3} \]
follows. At least one half of all labels $j \in L(v_r)$ are regular, i.e., have $j$-transitions which are forced by some node $w \in C$ to have starting points within distance at most $|v_r|/s(v)$ from $j$.

Now, if $v$ is crossed from the left and if $u$ belongs to $\mathrm{Left}(v)$, then at least $\Omega(s(v))$ $i$-transitions end in $v_r$ for all labels $i \in L(u)$: all regular labels $j \in L(v_r)$ have to be approached within distance at most $|v_r|/s(v)$. All in all $|u| \cdot s(v)$ transitions are required for fixed $u$ and $v$. Any such transition identifies $v$, however the same transition may be counted for several nodes $u \in \mathrm{Left}(v)$. In particular we show

Lemma 3. $N_n$ has at least
\[ \Omega\Big( \sum_{v \in C} \; \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \Big) \]
transitions.

We prove Lemma 3 in the next section and show in Section 3.3 that Lemma 2 is a consequence of Lemma 3.

3.2 Short Transitions

Let $u$ be an arbitrary node. Then less than $|u_r|/2$ labels $i \in L(u_r)$ satisfy $d_u(i) \le \frac{|u_r|}{2 s(u)}$ and hence
\[ \frac{|u_r|}{2 s(u)} < d_u(i) \tag{4} \]
holds for more than one half of all labels $i \in L(u_r)$.

Proof of Lemma 3. We arbitrarily pick nodes $u \in T_n$, $v \in C$ with $u \in \mathrm{Left}(v)$ and a regular label $j$ for $v$. Then there is a node $w \in C$ with $v < w$ and a label sequence $\tau = j \tau' k \in E_n$ with $k \in L(w_l)$ such that each path for $\tau$, which begins in $x \in L(v)$ and ends in $L(w_r)$, satisfies $j - x \le |v_r|/s(v)$. Let $h$ be the smallest label in $L(u_l)$.
If $i$ belongs to $L(u_r)$, then any path with label sequence $h\, i\, \tau$ which ends in $L(w_r)$ has to have an $i$-transition $e$ which starts in $L(u)$ (since $h \in L(u_l)$) and ends in $L(v)$ (since $u \in \mathrm{Left}(v)$ and $j \in L(v)$). From all $i$-transitions which belong to a path $u \stackrel{*}{\to} w_r$ with label sequence $i\, \tau$, we select an $i$-transition $e = (x, y)$ with smallest possible left endpoint $x \in L(u)$ and call $e$ distinguished (for $(u, v)$). When counting transitions of $N_n$ we restrict ourselves to distinguished transitions.

Firstly we determine the number of distinguished $i$-transitions for $i \in L(u_r)$ which start in $L(u)$ and end in $L(v_r)$. Secondly we bound the effect of multiple counting: all $i$-transitions do start in $L(u) \subseteq L(v_l)$ and hence they identify $v$, whenever they end in $L(v_r)$. However $i$-transitions may not identify $u$.

At most $|v_r|/s(v)$ regular labels $j$ for $v$ have $j$-transitions with a common left endpoint. Moreover at most $|v_r|/s(v)$ regular labels have transitions with left endpoint in $L(v_l)$ and therefore at least
\[ \frac{|v_r|/2 - |v_r|/s(v)}{|v_r|/s(v)} = \frac{|v_r|/2}{|v_r|/s(v)} - 1 = \frac{s(v)}{2} - 1 \]
different left endpoints in $L(v_r)$ are required. As a consequence, for all $i \in L(u_r)$, at least $s(v)/2 - 1$ distinguished $i$-transitions start in $L(u)$ and end in $L(v_r)$. This result is meaningless if $s(v) < 2$, but since $v \in C$, $N_n$ has for every label $i \in L(u_r)$ a path with label sequence $h\, i$, where the $i$-transition starts in $L(u)$ and ends in $L(v_r)$. Thus for all $i \in L(u_r)$ at least $\max\{s(v)/2 - 1,\, 1\} \ge s(v)/4$ distinguished $i$-transitions start in $L(u)$ and end in $L(v_r)$. All these transitions identify $v$ by their left and right endpoint.

Let $E(u, v)$ be the set of transitions of $N_n$ which are distinguished for $u$ and $v$. We have just seen that $|E(u, v)| \ge |u_r| \cdot s(v)/4$ holds. Transitions in $E(u, v)$ identify $v$, however they may not identify $u$. In particular, for $i \in L(u_r)$ let $e = (x, y)$ be a distinguished $i$-transition for $u$ and $v$. If $v \in C$, then
\[ d_u(i) \le d_{u,v}(i, j) \le i - x \tag{5} \]
holds, since distinguished transitions maximize the difference between their label $i$ and their left endpoint (among all $i$-transitions participating in a path $u \stackrel{*}{\to} w_r$ which ends in a $j$-transition).

Furthermore let $\mu$ be a left descendant of $v$ of smallest depth such that the distinguished $i$-transition $e$ is also distinguished for $\mu$ and $v$. Observe that $u$ has to be a descendant of $\mu$ and hence $e$ is distinguished for at most $\log_2 \frac{|\mu|}{i - x}$ nodes $u$. Thus, in order to control multiple counting, we have to bound $i - x$ from below. We apply (4) and (5) and obtain
\[ \frac{|\mu_r|}{2 s(\mu)} < d_\mu(i) \le i - x \]
for at least one half of all labels $i \in L(\mu_r)$. Therefore transition $e$ belongs to at most
\[ \log_2 \frac{|\mu|}{i - x} \le \log_2 \frac{|\mu|}{|\mu_r| / (2 s(\mu))} = \log_2(4 s(\mu)) \]
sets $E(u, v)$. To avoid multiple counting we assign the weight $1/\log_2^2(4 s(u))$ to transition $e \in E(u, v)$.
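If $\mu_1, \ldots, \mu_{r-1}, \mu_r = \mu$ are all the tree nodes in $\mathrm{Left}(v)$ for which $e$ is distinguished and if $\mu_i$ is a descendant of $\mu_{i+1}$, then $i \le \log_2(4 s(\mu_i))$ and hence
\[ \sum_{i=1}^{r} \frac{1}{\log_2^2(4 s(\mu_i))} \le \sum_{i=1}^{r} \frac{1}{i^2} = O(1). \]
To summarize: we have $|E(u, v)| \ge |u_r| \cdot s(v)/4$ and there is no multiple counting, if we assign the weight $1/\log_2^2(4 s(u))$ to transitions in $E(u, v)$. Hence $N_n$ has asymptotically at least
\[ \sum_{v \in C} \; \sum_{u \in \mathrm{Left}(v)} \frac{|E(u, v)|}{\log_2^2(4 s(u))} = \Omega\Big( \sum_{v \in C} \; \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \Big) \]
transitions and the claim follows.

3.3 Accounting

We have assumed in the proof sketch that "more" nodes are crossed from the left. We now formalize this to mean
\[ \sum_{v \in C,\; \mathrm{depth}(v) \ge (\log_2 n)/2} \frac{|v|}{2} \log_2 |v| \;\ge\; \sum_{v \notin C,\; \mathrm{depth}(v) \ge (\log_2 n)/2} \frac{|v|}{2} \log_2 |v|. \tag{6} \]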

If (6) does not hold, then we work instead with the set of nodes which are crossed from the right. Thus we may assume (6). We set
\[ T = \sum_{v:\; \mathrm{depth}(v) \ge (\log_2 n)/2} \frac{|v|}{2} \log_2 |v| \]
and observe that
\[ T = \sum_{d = (\log_2 n)/2}^{\log_2 n} \; \sum_{\mathrm{depth}(v) = d} \frac{|v|}{2} \log_2 |v| = \sum_{d = (\log_2 n)/2}^{\log_2 n} \frac{n}{2} \cdot d = \Omega(n \log_2^2 n) \]
holds. Thus Lemma 2 is a consequence of Lemma 3, if we show
\[ \sum_{v \in C} \; \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} = \Omega(T), \tag{7} \]
which in turn follows, if
\[ \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} = \Omega(|v| \log_2 |v|) \tag{8} \]
holds for sufficiently many nodes $v$. We first formalize what "sufficiently many nodes" means.

Definition 5. (a) $w(u) = \frac{|u|}{2} \log_2 |u|$ is the weight of $u$ and
\[ q_v(u) = \frac{w(u)}{\sum_{u' \in \mathrm{Left}(v)} w(u')} \]
is the probability of $u$ with respect to $v$, provided $u$ belongs to $\mathrm{Left}(v)$. We define the probability $\mathrm{prob}_v[E]$ of an event $E \subseteq \mathrm{Left}(v)$ by the distribution $q_v$.
(b) $K$ is a suitably large constant with $\sqrt{k/\log_2 k} \ge 2 k^{1/3}$ for all $k \ge K$. Finally set $p(v) = 16/\log_2 K$, if $s(v) \le K$, and $p(v) = 16/\log_2 s(v)$ otherwise.

We begin by verifying (8) for all nodes $v$ which qualify: we disqualify $v$ iff $\mathrm{depth}(v) < (\log_2 n)/2$ or $v \notin C$ or if
\[ \mathrm{prob}_v\big[\, s(u) \le 2^{\sqrt{p(v)\, s^*(v)}}/4 \;\big|\; u \in \mathrm{Left}(v) \,\big] < p(v), \tag{9} \]
where $s^*(v) = \max\{s(v), K\}$. If we disqualify $v$ for the last reason, then we also disqualify all descendants $u \in \mathrm{Left}(v)$ with $s(u) \le 2^{\sqrt{p(v)\, s^*(v)}}/4$. Thus $v$ is disqualified either for obvious reasons (i.e., $\mathrm{depth}(v) < (\log_2 n)/2$ or $v \notin C$) or if too few left descendants $u$ have sufficiently small $s$-values and hence if (8) seems to be false for $v$.

In a second step we have to show that a node $v$ is disqualified with sufficiently small probability. This is not surprising, since, if $v$ is disqualified for non-obvious reasons, then $s(u)$ is extremely large in comparison to $s(v)$ for an overwhelming majority of left descendants $u$ of $v$. To later disqualify a left "majority-descendant" $u$ is now even harder, since $p(u)$ is inversely proportional to $\log_2 s(u)$. We begin by investigating nodes which qualify.

Lemma 4. Assume that $v$ qualifies. Then
\[ \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \ge \frac{w(v)}{8K}. \]

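Proof. Since $v$ qualifies, we know that $v$ belongs to $C$ and $\mathrm{depth}(v) \ge (\log_2 n)/2$ holds. Moreover $s(u) \le 2^{\sqrt{p(v)\, s^*(v)}}/4$ holds with probability $q \ge p(v)$ and hence $\log_2^2(4 s(u)) \le p(v) \cdot s^*(v)$ holds with probability $q$. We set
\[ p_d(v) = \mathrm{prob}_v\big[\, s(u) \le 2^{\sqrt{p(v)\, s^*(v)}}/4 \;\big|\; u \in \mathrm{Left}(v),\ \mathrm{depth}(u) = d \,\big] \]
and $p_d = \mathrm{prob}_v[\, \mathrm{depth}(u) = d \mid u \in \mathrm{Left}(v) \,]$. Then $q = \sum_d p_d(v) \cdot p_d \ge p(v)$. We obtain
\[ \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} = \sum_{d=0}^{\log_2 |v| - 1} \; \sum_{\substack{u \in \mathrm{Left}(v) \\ \mathrm{depth}(u) = d}} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \ge \sum_{d=0}^{\log_2 |v| - 1} p_d(v) \sum_{\substack{u \in \mathrm{Left}(v) \\ \mathrm{depth}(u) = d}} \frac{|u| \cdot s(v)}{p(v) \cdot s^*(v)}, \tag{10} \]
since $1/\log_2^2(4 s(u)) \ge 1/(p(v) \cdot s^*(v))$ holds with probability at least $p_d(v)$ for nodes $u \in \mathrm{Left}(v)$ with $\mathrm{depth}(u) = d$. Thus we can further simplify the right hand side of (10) and get
\[ \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \ge \frac{s(v)}{s^*(v)} \cdot \frac{|v|}{2} \cdot \sum_{d=0}^{\log_2 |v| - 1} \frac{p_d(v)}{p(v)}. \]
But
\[ p_d = \frac{(|v|/4) \cdot d}{\sum_{i=0}^{\log_2 |v| - 1} (|v|/4) \cdot i} = \frac{d}{\sum_{i=0}^{\log_2 |v| - 1} i} \le \frac{2 \log_2 |v|}{(\log_2 |v| - 1) \log_2 |v|} \le \frac{4}{\log_2 |v|} \]
and hence $\sum_d 4\, p_d(v)/\log_2 |v| \ge \sum_d p_d(v) \cdot p_d \ge p(v)$, resp. $\sum_d p_d(v)/p(v) \ge (\log_2 |v|)/4$. Thus we obtain
\[ \sum_{u \in \mathrm{Left}(v)} \frac{|u| \cdot s(v)}{\log_2^2(4 s(u))} \ge \frac{s(v)}{s^*(v)} \cdot \frac{|v|}{2} \cdot \sum_{d=0}^{\log_2 |v| - 1} \frac{p_d(v)}{p(v)} \ge \frac{s(v)}{s^*(v)} \cdot \frac{w(v)}{4}. \]
If $s(v) \ge K$, then $s(v) = s^*(v)$ and we gain the contribution $w(v)/4$. Otherwise $s(v) \le K$ and we obtain at least the contribution $\frac{1/2}{K} \cdot \frac{w(v)}{4} = \frac{w(v)}{8K}$, since $s(v) \ge 1/2$ according to (3). Thus we have reached our goal of a contribution of at least $\frac{w(v)}{8K}$ in both cases.

It suffices to show that sufficiently many nodes $v$ qualify. If $v$ is disqualified because of (9), then we lose the contribution $w(v) + p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u)$, since we not only lose $v$, but possibly also a $p(v)$-fraction of all left descendants of $v$. How large is this loss?

Proposition 5. For any node $v$,
\[ w(v) + p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u) \le 2 p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u). \tag{11} \]

Proof. We first observe
\[ \sum_{u \in \mathrm{Left}(v)} w(u) = \sum_{d=0}^{\log_2 |v| - 1} \; \sum_{\substack{u \in \mathrm{Left}(v) \\ \mathrm{depth}(u) = d}} \frac{|u|}{2} \cdot d = \sum_{d=0}^{\log_2 |v| - 1} \frac{|v|}{4} \cdot d = \frac{|v| \log_2 |v| \,(\log_2 |v| - 1)}{8} \ge \frac{w(v) \log_2 |v|}{8} \]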

and therefore $w(v) \le 8 \sum_{u \in \mathrm{Left}(v)} w(u) / \log_2 |v|$ follows. Hence the contribution we lose in a disqualification step for node $v$ is bounded by
\[ w(v) + p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u) \le \Big( \frac{8}{\log_2 |v|} + p(v) \Big) \cdot \sum_{u \in \mathrm{Left}(v)} w(u) \le 2 p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u), \]
since $8/\log_2 |v| \le 16/\log_2 n \le p(v)$.

We are now ready to bound the probability of disqualifying nodes.

Lemma 5. A node $v$ with $\mathrm{depth}(v) \ge (\log_2 n)/2$ is disqualified with probability at most $\frac{1}{2} + \frac{64}{\log_2 K}$.

Proof. Due to (6), a node $v$ (with $\mathrm{depth}(v) \ge (\log_2 n)/2$) is disqualified based on non-membership in $C$ with probability at most $1/2$. Thus it suffices to show that a disqualification because of (9) occurs with probability at most $64/\log_2 K$.

We order the disqualification steps for the nodes $w$ according to increasing depth of $w$. (If a node is disqualified as a consequence of an earlier disqualification, then this node is not listed.) Now assume that node $v$ is disqualified, followed by a later disqualification of a left descendant $u$ of $v$. Remember that with $v$ we also disqualify all left descendants $u$ with $s(u) \le 2^{\sqrt{p(v)\, s^*(v)}}/4$. But $p(v) \cdot s^*(v) = 16\, s^*(v)/\log_2 s^*(v)$ and, by the choice of $K$, $\sqrt{k/\log_2 k} \ge 2 k^{1/3}$ for all $k \ge K$. As a consequence, if the left descendant $u$ of $v$ has survived the disqualification step of $v$, then $s(u) > 2^{s^*(v)^{1/3}}$ and in particular $p(u) \le p(v)/2$ follows.

If node $v$ is disqualified, then we lose the contribution $2 p(v) \sum_{u \in \mathrm{Left}(v)} w(u)$ due to (11). But then all later disqualification steps for left descendants of $v$ result in a combined contribution of at most
\[ \Big( \sum_{i \ge 0} 2 p(v)/2^i \Big) \cdot \sum_{u \in \mathrm{Left}(v)} w(u) \le 4 p(v) \cdot \sum_{u \in \mathrm{Left}(v)} w(u). \]
We are considering a sequence of disqualified nodes, where the nodes are ordered according to increasing depth. If we double the "loss contribution" of node $v$, i.e., if we assume the loss $4 p(v) \sum_{u \in \mathrm{Left}(v)} w(u)$ for node $v$, then we may demand that no node in the sequence is a left descendant of another node in the sequence. The loss measured according to (11) is then maximized, if we disqualify all nodes of the rightmost path starting in the root $r$ and if we assume that $s(v) \le K$ for all nodes of the path. Obviously the overall loss is then bounded by
\[ \Big( 4 p(r) \cdot \sum_{i \ge 0} 2^{-i} \Big) \cdot \sum_{u \in \mathrm{Left}(r)} w(u) \le 4 \cdot \frac{16}{\log_2 K} \cdot 2 \cdot \sum_{u \in \mathrm{Left}(r)} w(u). \]
In other words, a node is disqualified with probability at most $4 \cdot \frac{16}{\log_2 K}$.

Lemma 2 is now an immediate consequence of Lemma 4 and Lemma 5, if $K$ is chosen sufficiently large.

4 Conclusions and Open Problems

We have shown that every regular expression $R$ of length $n$ over an alphabet of size $k$ can be recognized by an ε-free NFA with $O(n \cdot \min\{\log_2 n \cdot \log_2 2k,\; k^{1+\log^* n}\})$ transitions. For alphabets of fixed size (i.e., $k = O(1)$) our result implies that $O(n \cdot 2^{O(\log_2^* n)})$ transitions and hence almost linear size suffice. We have also shown the lower bound $\Omega(n \cdot \log_2^2 2k)$ and hence the construction of [7] is optimal for large alphabets, i.e., if $k = n^{\Omega(1)}$.
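To get a feeling for how slowly the iterated logarithm in the small-alphabet bound grows, the following sketch evaluates both terms of the upper bound. It is illustrative only: the functions `log_star` and `upper_bound_terms` are hypothetical helpers, the convention that $\log^*$ counts applications of $\log_2$ until the value drops to at most 1 is an assumption, and all constants hidden by the $O$-notation are set to 1, so the numbers indicate growth rates rather than actual transition counts.

```python
import math


def log_star(n):
    """Iterated logarithm: number of times log2 is applied until the value is <= 1."""
    count = 0
    x = float(n)
    while x > 1.0:
        x = math.log2(x)
        count += 1
    return count


def upper_bound_terms(n, k):
    """Both terms inside the min{...} of the upper bound, with hidden constants set to 1."""
    general = n * math.log2(n) * math.log2(2 * k)      # n * log2(n) * log2(2k)
    small_alphabet = n * k ** (1 + log_star(n))        # n * k^(1 + log* n)
    return general, small_alphabet
```

Since `log_star(n)` stays below 6 for every $n$ that fits into physical memory, the factor $k^{1+\log^* n}$ behaves like a constant for fixed $k$, which is the sense in which the bound is almost linear.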

A first important open question concerns the binary alphabet. Do ε-free NFAs of linear size exist or is it possible to show a super-linear size lower bound? Moreover, although we have considerably narrowed the gap between lower and upper bounds, the gap for alphabets of intermediate size, i.e., $\omega(1) = k = n^{o(1)}$, remains to be closed and this is the second important open problem. For instance, for $k = \log_2 n$ the lower bound $\Omega(n \cdot (\log_2 \log_2 n)^2)$ and the upper bound $O(n \cdot \log_2 n \cdot \log_2 \log_2 n)$ are still a factor of $\log_2 n / \log_2 \log_2 n$ apart.

Thirdly the size blowup when converting an NFA into an equivalent ε-free NFA remains to be determined. In [6] a family $N_n$ of NFAs is constructed which has equivalent ε-free NFAs of size $\Omega(n^2/\log_2^2 n)$ only. However the alphabet of $N_n$ has size $n/\log_2 n$ and the gap between this lower bound and the corresponding upper bound $O(n^2 \cdot |\Sigma|)$ remains considerable.

Acknowledgement: Thanks to Gregor Gramlich and Juraj Hromkovic for many helpful discussions.

References

1. R. Book, S. Even, S. Greibach, G. Ott, Ambiguity in graphs and expressions, IEEE Trans. Comput. 20.
2. V. Geffert, Translation of binary regular expressions into nondeterministic ε-free automata with O(n log n) transitions, J. Comput. Syst. Sci. 66.
3. V.M. Glushkov, The abstract theory of automata, Russian Math. Surveys 16, pp. 1-53. Translation by J. M. Jackson from Usp. Mat. Naut. 16, pp. 3-41.
4. C. Hagenah, A. Muscholl, Computing ε-free NFA from regular expressions in $O(n \log^2(n))$ Time, ITA, 34 (4).
5. J.E. Hopcroft, R. Motwani, J.D. Ullman, Introduction to Automata Theory, Languages and Computation, Addison-Wesley.
6. J. Hromkovič, G. Schnitger, Comparing the size of NFAs with and without ε-transitions, Theor. Comput. Sci., 380 (1-2).
7. J. Hromkovič, S. Seibert, T. Wilke, Translating regular expression into small ε-free nondeterministic automata, J. Comput. Syst. Sci., 62.
8. Y. Lifshits, A lower bound on the size of ε-free NFA corresponding to a regular expression, Inf. Process. Lett. 85(6).
9. M.O. Rabin, D. Scott, Finite automata and their decision problems, IBM J. Res. Develop. 3.
10. S. Sippu, E. Soisalon-Soininen, Parsing Theory, Vol. I: Languages and Parsing, Springer-Verlag.
11. G. Schnitger, Regular expressions and NFAs without ε transitions, Proc. of the 23rd STACS, Lecture Notes in Computer Science 3884.
12. K. Thompson, Regular expression search, Commun. ACM 11, 1968.


More information

Probabilistic Model Checking Michaelmas Term Dr. Dave Parker. Department of Computer Science University of Oxford

Probabilistic Model Checking Michaelmas Term Dr. Dave Parker. Department of Computer Science University of Oxford Probbilistic Model Checking Michelms Term 2011 Dr. Dve Prker Deprtment of Computer Science University of Oxford Long-run properties Lst lecture: regulr sfety properties e.g. messge filure never occurs

More information

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz University of Southern Cliforni Computer Science Deprtment Compiler Design Fll Lexicl Anlysis Smple Exercises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sciences Institute 4676 Admirlty Wy, Suite

More information

Java II Finite Automata I

Java II Finite Automata I Jv II Finite Automt I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz Finite Automt I p.1/13 Processing Regulr Expressions We lredy lerned out Jv s regulr expression

More information

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a).

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a). The Fundmentl Theorems of Clculus Mth 4, Section 0, Spring 009 We now know enough bout definite integrls to give precise formultions of the Fundmentl Theorems of Clculus. We will lso look t some bsic emples

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information

FABER Formal Languages, Automata and Models of Computation

FABER Formal Languages, Automata and Models of Computation DVA337 FABER Forml Lnguges, Automt nd Models of Computtion Lecture 5 chool of Innovtion, Design nd Engineering Mälrdlen University 2015 1 Recp of lecture 4 y definition suset construction DFA NFA stte

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)

More information

First Midterm Examination

First Midterm Examination Çnky University Deprtment of Computer Engineering 203-204 Fll Semester First Midterm Exmintion ) Design DFA for ll strings over the lphet Σ = {,, c} in which there is no, no nd no cc. 2) Wht lnguge does

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

Chapter 0. What is the Lebesgue integral about?

Chapter 0. What is the Lebesgue integral about? Chpter 0. Wht is the Lebesgue integrl bout? The pln is to hve tutoril sheet ech week, most often on Fridy, (to be done during the clss) where you will try to get used to the ides introduced in the previous

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

1 Online Learning and Regret Minimization

1 Online Learning and Regret Minimization 2.997 Decision-Mking in Lrge-Scle Systems My 10 MIT, Spring 2004 Hndout #29 Lecture Note 24 1 Online Lerning nd Regret Minimiztion In this lecture, we consider the problem of sequentil decision mking in

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.) CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts

More information

Turing Machines Part One

Turing Machines Part One Turing Mchines Prt One Hello Hello Condensed Condensed Slide Slide Reders! Reders! Tody s Tody s lecture lecture consists consists lmost lmost exclusively exclusively of of nimtions nimtions of of Turing

More information

1 Structural induction, finite automata, regular expressions

1 Structural induction, finite automata, regular expressions Discrete Structures Prelim 2 smple uestions s CS2800 Questions selected for spring 2017 1 Structurl induction, finite utomt, regulr expressions 1. We define set S of functions from Z to Z inductively s

More information

Goals: Determine how to calculate the area described by a function. Define the definite integral. Explore the relationship between the definite

Goals: Determine how to calculate the area described by a function. Define the definite integral. Explore the relationship between the definite Unit #8 : The Integrl Gols: Determine how to clculte the re described by function. Define the definite integrl. Eplore the reltionship between the definite integrl nd re. Eplore wys to estimte the definite

More information

GNFA GNFA GNFA GNFA GNFA

GNFA GNFA GNFA GNFA GNFA DFA RE NFA DFA -NFA REX GNFA Definition GNFA A generlize noneterministic finite utomton (GNFA) is grph whose eges re lele y regulr expressions, with unique strt stte with in-egree, n unique finl stte with

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016 CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

Homework 3 Solutions

Homework 3 Solutions CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.

More information

Fundamentals of Computer Science

Fundamentals of Computer Science Fundmentls of Computer Science Chpter 3: NFA nd DFA equivlence Regulr expressions Henrik Björklund Umeå University Jnury 23, 2014 NFA nd DFA equivlence As we shll see, it turns out tht NFA nd DFA re equivlent,

More information

DIRECT CURRENT CIRCUITS

DIRECT CURRENT CIRCUITS DRECT CURRENT CUTS ELECTRC POWER Consider the circuit shown in the Figure where bttery is connected to resistor R. A positive chrge dq will gin potentil energy s it moves from point to point b through

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

W. We shall do so one by one, starting with I 1, and we shall do it greedily, trying

W. We shall do so one by one, starting with I 1, and we shall do it greedily, trying Vitli covers 1 Definition. A Vitli cover of set E R is set V of closed intervls with positive length so tht, for every δ > 0 nd every x E, there is some I V with λ(i ) < δ nd x I. 2 Lemm (Vitli covering)

More information

Handout: Natural deduction for first order logic

Handout: Natural deduction for first order logic MATH 457 Introduction to Mthemticl Logic Spring 2016 Dr Json Rute Hndout: Nturl deduction for first order logic We will extend our nturl deduction rules for sententil logic to first order logic These notes

More information

A recursive construction of efficiently decodable list-disjunct matrices

A recursive construction of efficiently decodable list-disjunct matrices CSE 709: Compressed Sensing nd Group Testing. Prt I Lecturers: Hung Q. Ngo nd Atri Rudr SUNY t Bufflo, Fll 2011 Lst updte: October 13, 2011 A recursive construction of efficiently decodble list-disjunct

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

Spanning tree congestion of some product graphs

Spanning tree congestion of some product graphs Spnning tree congestion of some product grphs Hiu-Fi Lw Mthemticl Institute Oxford University 4-9 St Giles Oxford, OX1 3LB, United Kingdom e-mil: lwh@mths.ox.c.uk nd Mikhil I. Ostrovskii Deprtment of Mthemtics

More information

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral Improper Integrls Every time tht we hve evluted definite integrl such s f(x) dx, we hve mde two implicit ssumptions bout the integrl:. The intervl [, b] is finite, nd. f(x) is continuous on [, b]. If one

More information

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams

Chapter 4 Contravariance, Covariance, and Spacetime Diagrams Chpter 4 Contrvrince, Covrince, nd Spcetime Digrms 4. The Components of Vector in Skewed Coordintes We hve seen in Chpter 3; figure 3.9, tht in order to show inertil motion tht is consistent with the Lorentz

More information

State Minimization for DFAs

State Minimization for DFAs Stte Minimiztion for DFAs Red K & S 2.7 Do Homework 10. Consider: Stte Minimiztion 4 5 Is this miniml mchine? Step (1): Get rid of unrechle sttes. Stte Minimiztion 6, Stte is unrechle. Step (2): Get rid

More information

UNIFORM CONVERGENCE. Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3

UNIFORM CONVERGENCE. Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3 UNIFORM CONVERGENCE Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3 Suppose f n : Ω R or f n : Ω C is sequence of rel or complex functions, nd f n f s n in some sense. Furthermore,

More information

Converting Regular Expressions to Discrete Finite Automata: A Tutorial

Converting Regular Expressions to Discrete Finite Automata: A Tutorial Converting Regulr Expressions to Discrete Finite Automt: A Tutoril Dvid Christinsen 2013-01-03 This is tutoril on how to convert regulr expressions to nondeterministic finite utomt (NFA) nd how to convert

More information

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below. Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we

More information

CS375: Logic and Theory of Computing

CS375: Logic and Theory of Computing CS375: Logic nd Theory of Computing Fuhu (Frnk) Cheng Deprtment of Computer Science University of Kentucky 1 Tble of Contents: Week 1: Preliminries (set lgebr, reltions, functions) (red Chpters 1-4) Weeks

More information

Jim Lambers MAT 169 Fall Semester Lecture 4 Notes

Jim Lambers MAT 169 Fall Semester Lecture 4 Notes Jim Lmbers MAT 169 Fll Semester 2009-10 Lecture 4 Notes These notes correspond to Section 8.2 in the text. Series Wht is Series? An infinte series, usully referred to simply s series, is n sum of ll of

More information

CS 314 Principles of Programming Languages

CS 314 Principles of Programming Languages C 314 Principles of Progrmming Lnguges Lecture 6: LL(1) Prsing Zheng (Eddy) Zhng Rutgers University Ferury 5, 2018 Clss Informtion Homework 2 due tomorrow. Homework 3 will e posted erly next week. 2 Top

More information

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers. Mehryar Mohri Courant Institute and Google Research

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers. Mehryar Mohri Courant Institute and Google Research Speech Recognition Lecture 2: Finite Automt nd Finite-Stte Trnsducers Mehryr Mohri Cournt Institute nd Google Reserch mohri@cims.nyu.com Preliminries Finite lphet Σ, empty string. Set of ll strings over

More information

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA)

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA) Finite Automt (FA or DFA) CHAPTER Regulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, equivlence of NFAs DFAs, closure under regulr

More information

7.2 The Definite Integral

7.2 The Definite Integral 7.2 The Definite Integrl the definite integrl In the previous section, it ws found tht if function f is continuous nd nonnegtive, then the re under the grph of f on [, b] is given by F (b) F (), where

More information

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar)

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar) Lecture 3 (5.3.2018) (trnslted nd slightly dpted from lecture notes by Mrtin Klzr) Riemnn integrl Now we define precisely the concept of the re, in prticulr, the re of figure U(, b, f) under the grph of

More information

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation Strong Bisimultion Overview Actions Lbeled trnsition system Trnsition semntics Simultion Bisimultion References Robin Milner, Communiction nd Concurrency Robin Milner, Communicting nd Mobil Systems 32

More information

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers Speech Recognition Lecture 2: Finite Automt nd Finite-Stte Trnsducers Eugene Weinstein Google, NYU Cournt Institute eugenew@cs.nyu.edu Slide Credit: Mehryr Mohri Preliminries Finite lphet, empty string.

More information

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives

Properties of Integrals, Indefinite Integrals. Goals: Definition of the Definite Integral Integral Calculations using Antiderivatives Block #6: Properties of Integrls, Indefinite Integrls Gols: Definition of the Definite Integrl Integrl Clcultions using Antiderivtives Properties of Integrls The Indefinite Integrl 1 Riemnn Sums - 1 Riemnn

More information

Lecture 9: LTL and Büchi Automata

Lecture 9: LTL and Büchi Automata Lecture 9: LTL nd Büchi Automt 1 LTL Property Ptterns Quite often the requirements of system follow some simple ptterns. Sometimes we wnt to specify tht property should only hold in certin context, clled

More information