Informed learners DRAFT. Understanding is compression, comprehension is compression! Greg Chaitin (Chaitin, 2007)


12 Informed learners

Understanding is compression, comprehension is compression!
Greg Chaitin (Chaitin, 2007)

Compro. Habla de un juego donde las reglas no sean la línea de salida, sino el punto de llegada ¿No?
Arturo Pérez-Reverte, El pintor de batallas

Learning from an informant is the setting in which the data consists of labelled strings, each label indicating whether or not the string belongs to the target language. Of all the issues which grammatical inference scientists have worked on, this is probably the one on which most energy has been spent over the years. Algorithms have been proposed, competitions have been launched, theoretical results have been given. On one hand, the problem has been proved to be on a par with mighty theoretical computer science questions arising from combinatorics, number theory and cryptography, and on the other hand cunning heuristics and techniques employing ideas from artificial intelligence and language theory have been devised.

There would be a point in presenting this theme with a special focus on the class of context-free grammars with a hope that the theory for the particular class of the finite automata would follow, but the history and the techniques tell us otherwise. The main focus is therefore going to be on the simpler yet sufficiently rich question of learning deterministic finite automata from positive and negative examples. We shall justify this as follows: On one hand the task is hard enough, and, through showing what doesn't work and why, we will have a precious insight into more complex classes. On the other hand anything useful learnt on DFAs can be nicely transferred thanks to reductions (see Chapter 7) to other supposedly richer classes. Finally, there are historical reasons: on the specific question of learning DFAs from an informed presentation, some of the most important algorithms in grammatical inference have been invented, and many new ideas have been introduced due to this effort.

The specific question of learning context-free grammars and languages from an informant will be studied as a separate problem, in a chapter of its own.

12.1 The prefix tree acceptor (PTA)

We shall be dealing here with learning samples composed of labelled strings:

Definition. Let $\Sigma$ be an alphabet. An informed learning sample is made of two sets $S_+$ and $S_-$ such that $S_+ \cap S_- = \emptyset$. The sample will be denoted as $S = \langle S_+, S_- \rangle$. We will alternatively write $(x, 1) \in S$ for $x \in S_+$ and $(x, 0) \in S$ for $x \in S_-$.

Let $A = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$ be a DFA.

Definition. $A$ is weakly consistent with the sample $S = \langle S_+, S_- \rangle$ if $\forall x \in S_+,\ \delta(q_\lambda, x) \in F_A$ and $\forall x \in S_-,\ \delta(q_\lambda, x) \notin F_A$.

Definition. $A$ is strongly consistent with the sample $S = \langle S_+, S_- \rangle$ if $\forall x \in S_+,\ \delta(q_\lambda, x) \in F_A$ and $\forall x \in S_-,\ \delta(q_\lambda, x) \in F_R$.

Example. The DFA from Figure 12.1 is only weakly consistent with a sample made of one positive string and two negative strings: one of the negative strings ends in state $q_\lambda$, which is unlabelled (neither accepting nor rejecting), and the other cannot be entirely parsed. Therefore the DFA is not strongly consistent. On the other hand the same automaton can be shown to be strongly consistent with another sample, made of two positive strings and one negative string.

A prefix tree acceptor (PTA) is a tree-like DFA built from the learning sample by taking all the prefixes in the sample as states and constructing the smallest DFA which is a tree ($\forall q \in Q,\ |\{q' : \delta(q', a) = q\}| \le 1$), strongly consistent with the sample. A formal algorithm (BUILD-PTA) is given (Algorithm 12.1). An example of a PTA is shown in Figure 12.2. Note that we can also build a PTA from a set of positive strings only. This corresponds to building the PTA($\langle S_+, \emptyset \rangle$). In that case, for the same sample we would get the PTA represented in Figure 12.3.

[Figure 12.1: A DFA.]

Algorithm 12.1: BUILD-PTA.
Input: a sample $\langle S_+, S_- \rangle$
Output: $A = \mathrm{PTA}(\langle S_+, S_- \rangle) = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$
  $F_A \leftarrow \emptyset$; $F_R \leftarrow \emptyset$;
  $Q \leftarrow \{q_u : u \in \mathrm{PREF}(S_+ \cup S_-)\}$;
  for $q_{ua} \in Q$ do $\delta(q_u, a) \leftarrow q_{ua}$;
  for $q_u \in Q$ do
    if $u \in S_+$ then $F_A \leftarrow F_A \cup \{q_u\}$;
    if $u \in S_-$ then $F_R \leftarrow F_R \cup \{q_u\}$
  return $A$

[Figure 12.2: The PTA built from a sample of three positive and two negative strings.]
[Figure 12.3: The PTA built from the three positive strings only.]

In Chapter 6 we will consider the problem of grammatical inference as the one of searching inside a space of admissible biased solutions, in which case we will introduce a non-deterministic version of the PTA. Most algorithms will take the PTA as a starting point and try to generalise from it by merging states. In order not to get lost in the process (and not undo merges that have been made some time ago) it will be interesting to divide the states into three categories:

- The RED states, which correspond to states that have been analysed and which will not be revisited; they will be the states of the final automaton.
- The BLUE states, which are the candidate states: they have not been analysed yet and it should be from this set that a state is drawn in order to consider merging it with a RED state.
- The WHITE states, which are all the others. They will in turn become BLUE and then RED.

[Figure 12.4: Colouring of states: two RED states ($q_\lambda$ and one other), three BLUE states; all the other states are WHITE.]
[Figure 12.5: The DFA $A_q$.]

Example. We conventionally draw the RED states in dark grey and the BLUE ones in light grey as in Figure 12.4, where there are two RED states ($q_\lambda$ and one other) and three BLUE states.

We will need to describe the suffix language in any state $q$, consisting of the language recognised by the automaton when taking this state $q$ as initial. We denote this automaton formally by $A_q$ with $L(A_q) = \{w \in \Sigma^\star : \delta(q, w) \in F_A\}$. In Figure 12.5 we have used the automaton $A$ from Figure 12.1 and chosen one of its states as initial.

12.2 The basic operations

We first describe some operations common to many of the state merging techniques. State merging techniques iteratively consider an automaton and two of its states and aim to merge them. This will be done when these states are compatible. Sometimes, when noticing that a particular state cannot be merged, it gets promoted. Furthermore at any moment all states are either RED, BLUE or WHITE. Let us also suppose that the current automaton is consistent with the sample. The starting point is the prefix tree acceptor (PTA). Initially, in the PTA, the unique RED state is $q_\lambda$ whereas the BLUE states are the immediate successors of $q_\lambda$.
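The construction of the PTA and its initial colouring can be sketched in a few lines of Python. This is only an illustrative sketch: the representation (states named by the prefixes themselves, with the empty string standing for $\lambda$, and $\delta$ stored as a dictionary) and the function names `build_pta` and `initial_colouring` are assumptions made for the example, not part of the original algorithm.

```python
def build_pta(positive, negative=()):
    """Return (states, delta, accepting, rejecting) for PTA(<S+, S->).

    Assumption: a state q_u is represented by the string u itself,
    and "" plays the role of lambda.
    """
    positive, negative = set(positive), set(negative)
    if positive & negative:
        raise ValueError("S+ and S- must be disjoint")
    # Q <- {q_u : u in PREF(S+ U S-)}
    states = {w[:i] for w in positive | negative for i in range(len(w) + 1)}
    states.add("")
    # delta(q_u, a) <- q_ua for every non-empty prefix ua
    delta = {(u[:-1], u[-1]): u for u in states if u}
    accepting = states & positive
    rejecting = states & negative
    return states, delta, accepting, rejecting


def initial_colouring(delta):
    """RED = {q_lambda}; BLUE = the immediate successors of q_lambda."""
    red = {""}
    blue = {target for (source, _), target in delta.items() if source == ""}
    return red, blue


# Hypothetical sample, chosen only for illustration.
states, delta, accepting, rejecting = build_pta({"aa", "aba", "bba"}, {"ab", "abab"})
red, blue = initial_colouring(delta)
```

Every state other than the root has exactly one incoming transition, which is the tree property required of a PTA.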

[Figure 12.6: Incompatibility is not a local affair. (a) The PTA built from one positive string and $(\lambda, 0)$. (b) After merging $q_\lambda$ and its successor. (c) After merging all three states.]

There are three basic operations that shall be systematically used and need to be studied independently of the learning algorithms: COMPATIBLE, MERGE and PROMOTE.

COMPATIBLE: deciding equivalence between states

The question here is of deciding if two states are compatible or not. This is the same as deciding equivalence for the Nerode relation, but with only partial knowledge about the language. As we obviously do not have the entire language to help us decide upon this, but only the learning sample, the question is to know whether merging these two states will create confusion between accepting and rejecting states. Typically the compatibility might be tested by:

$q \simeq_A q' \iff L_{F_A}(A_q) \cap L_{F_R}(A_{q'}) = \emptyset$ and $L_{F_R}(A_q) \cap L_{F_A}(A_{q'}) = \emptyset$.

But this is usually not enough, as the following example (Figure 12.6) shows. Consider the three-state PTA (Figure 12.6(a)) built from a sample made of one positive string and $(\lambda, 0)$. Deciding equivalence between state $q_\lambda$ and its successor through the formula above is not sufficient. Indeed the languages of these two states are weakly consistent, but if the two states are merged together (Figure 12.6(b)), the third state must also be merged with these (Figure 12.6(c)) to preserve determinism. This results in a problem: is the new unique state accepting or rejecting? Therefore more complex operations will be needed, involving merging, folding and then testing consistency.

MERGE: merging two states

The merging operation takes two states from an automaton and merges them into a single state. It should be noted that the effect of the merge is that a deterministic automaton (see Figure 12.7(a)) will possibly lose the determinism property through this (Figure 12.7(b)). Indeed this is where the algorithms can reject a merge. Consider for instance automaton 12.8(a). If states $q_1$ and $q_2$ are merged, then to ensure determinism, states $q_3$ and $q_4$ will also have to be merged, resulting in automaton 12.8(b). If our learning sample contains a string in $S_-$ that the resulting automaton accepts, then the merge should be rejected. Algorithm 12.2 is given an NFA (with just one initial state, for simplicity), and two states. It updates the automaton.
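The cascade of merges needed to restore determinism can be illustrated as follows. This is a hedged sketch rather than Algorithm 12.2 itself: it works directly on a DFA stored as a dictionary, uses a small union-find structure (an implementation choice made here), and reports a rejected merge by returning `None`. The function name and the example strings are assumptions made for the illustration.

```python
def merge_and_determinise(delta, accepting, rejecting, q1, q2):
    """Merge q2 into q1, then keep merging until the result is deterministic.

    delta maps (state, symbol) -> state. Returns the new
    (delta, accepting, rejecting), or None when some merged state would be
    both accepting and rejecting, i.e. when the merge has to be rejected.
    """
    parent = {}

    def find(q):
        # Representative of the class of merged states containing q.
        while parent.get(q, q) != q:
            q = parent[q]
        return q

    pending = [(q1, q2)]
    while pending:
        p, q = (find(s) for s in pending.pop())
        if p == q:
            continue
        parent[q] = p
        # Two transitions on the same symbol may now leave the same class:
        # their targets have to be merged as well.
        seen = {}
        for (source, symbol), target in delta.items():
            if find(source) == p:
                if symbol in seen:
                    pending.append((seen[symbol], target))
                else:
                    seen[symbol] = target
    new_delta = {(find(s), a): find(t) for (s, a), t in delta.items()}
    new_accepting = {find(q) for q in accepting}
    new_rejecting = {find(q) for q in rejecting}
    if new_accepting & new_rejecting:
        return None
    return new_delta, new_accepting, new_rejecting


# A three-state PTA, assuming the sample {(aa, 1), (lambda, 0)}:
# merging q_lambda with q_a forces q_aa into the same state.
pta = {("", "a"): "a", ("a", "a"): "aa"}
result = merge_and_determinise(pta, {"aa"}, {""}, "", "a")
```

Here `result` is `None`: the single remaining state would have to be both accepting and rejecting, which is exactly the situation of Figure 12.6(c).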

[Figure 12.7: Merging two states may result in the automaton not remaining deterministic. (a) Before merging states $q_1$ and $q_2$. (b) After merging the states.]
[Figure 12.8: About merging. (a) Before merging states 1 and 2. (b) After merging the states recursively (from 1 and 2).]

Algorithm 12.2: MERGE.
Input: an NFA $A = \langle \Sigma, Q, I, F_A, F_R, \delta_N \rangle$, $q_1$ and $q_2$ in $Q$, with $q_2 \notin I$
Output: an NFA $A = \langle \Sigma, Q, I, F_A, F_R, \delta_N \rangle$ in which $q_1$ and $q_2$ have been merged into $q_1$
  for $q \in Q$ do
    for $a \in \Sigma$ do
      if $q_2 \in \delta_N(q, a)$ then $\delta_N(q, a) \leftarrow \delta_N(q, a) \cup \{q_1\}$;
      if $q \in \delta_N(q_2, a)$ then $\delta_N(q_1, a) \leftarrow \delta_N(q_1, a) \cup \{q\}$
  if $q_2 \in I$ then $I \leftarrow I \cup \{q_1\}$;
  if $q_2 \in F_A$ then $F_A \leftarrow F_A \cup \{q_1\}$;
  if $q_2 \in F_R$ then $F_R \leftarrow F_R \cup \{q_1\}$;
  $Q \leftarrow Q \setminus \{q_2\}$;
  return $A$

Since non-deterministic automata are in many ways cumbersome, we will attempt to avoid having to use these to define merging when manipulating only deterministic automata.

PROMOTE: promoting a state

Promotion is another deterministic and greedy decision. The idea here is that having decided, at some point, that a BLUE candidate state is different from all the RED states, it should become RED. We call this promotion and describe the process in Algorithm 12.3. The notations that are used here apply to the case where the states involved in a promotion are the basis of a tree. Therefore, the successors of node $q_u$ are named $q_{ua}$ with $a$ in $\Sigma$.

Algorithm 12.3: PROMOTE.
Input: a DFA $A = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$, a BLUE state $q_u$, sets RED, BLUE
Output: a DFA $A = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$, sets RED, BLUE updated
  RED $\leftarrow$ RED $\cup\ \{q_u\}$;
  for $a \in \Sigma$ : $q_{ua}$ is not RED do add $q_{ua}$ to BLUE;
  return $A$

12.3 Gold's algorithm

The first non-enumerative algorithm designed to build a DFA from informed data is due to E. Mark Gold, which is why we shall simply call this algorithm GOLD. The goal of the algorithm is to find the minimum DFA consistent with the sample. For that, there are two steps. The first is deductive: from the data, find a set of prefixes that have to lead to different states for the reasons given above, and therefore represent an incompressible set of states. The second step is inductive: alas, after finding the incompressible set of states, we are not done because it is not usually easy or even possible to fold in the rest of the states. Since a direct construction of the DFA from there is usually impossible, (contradictory) decisions have to be taken. This is where artificial intelligence techniques might come in (see Chapter 14) as the problems one has to solve are proved to be intractable (in Chapter 6). But as more and more strings become available to the learning algorithm (i.e. in the identification in the limit paradigm), the number of choices left will become more and more restricted, with, finally, just one choice. This is what allows convergence.

The key ideas

The main ideas of the algorithm are to represent the data (positive and negative strings) in a table, where each row corresponds to a string, some of which will correspond to the RED states and the others to the BLUE states. The goal is to create through promotion as many RED states as possible. For this to be of real use, the set of strings denoting the states will be closed by prefixes, i.e. if $q_{uv}$ is a state so is $q_u$. Formally:

Definition (Prefix- and suffix-closed sets). A set of strings $S$ is prefix-closed (respectively suffix-closed) if $uv \in S \implies u \in S$ (respectively if $uv \in S \implies v \in S$).

No inference is made during this phase of representation of the data: the algorithm is purely deductive.
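The two closure properties are straightforward to check mechanically. The following Python sketch is an illustration only; the function names are assumptions made for the example, and strings are ordinary Python strings with the empty string standing for $\lambda$.

```python
def is_prefix_closed(strings):
    """True if every prefix of every string is also in the set."""
    strings = set(strings)
    return all(w[:i] in strings for w in strings for i in range(len(w)))


def is_suffix_closed(strings):
    """True if every suffix of every string is also in the set."""
    strings = set(strings)
    return all(w[i:] in strings for w in strings for i in range(1, len(w) + 1))
```

For instance, the set containing the empty string, `a` and `ab` is prefix-closed but not suffix-closed, since the suffix `b` of `ab` is missing.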

The table then expresses an inequivalence relation between the strings and we should aim to complete this inequivalence related to the Nerode relation that defines the language:

$x \equiv y \iff [\forall w \in \Sigma^\star,\ xw \in L \iff yw \in L]$.

Once the RED states are decided, the algorithm chooses to merge the BLUE states that are left with the RED ones and then checks if the result is consistent. If it is not, the algorithm returns the PTA.

In the following we will deliberately accept a confusion between the states themselves and the names or labels of the states. The information is organised in a table $\langle$STA, EXP, OT$\rangle$, called an observation table, used to compare the candidate states by examining the data, where the three components are:

- STA is a finite set of (labels of) states. The states will be denoted by the indexes (strings) from a finite prefix-closed set. Because of this labelling, we will often conveniently use string terminology when referring to the states. Set STA will therefore refer both to the set of states and to the set of labels of these states, with context always allowing us to determine which. We partition STA as follows: STA = RED $\cup$ BLUE. The BLUE states (or state labels) are those $u$ in STA such that $uv \in$ STA $\implies v = \lambda$. The RED states are the others. BLUE $= \{ua \notin$ RED $: u \in$ RED$\}$ is the set of states that are successors of RED states and are not RED.
- EXP is the experiment set. This set is closed by suffixes, i.e. if $uv$ is an experiment, so is $v$.
- OT : STA $\times$ EXP $\to \{0, 1, \ast\}$ is a function that indicates if making an experiment in state $q_u$ is going to result in an accepting, rejecting or unknown situation: the value OT[u][e] is then respectively 1, 0 or $\ast$. In order to improve readability we will write it as a table indexed by two strings, the first indicating the label of the state from which the experiment is made and the second being the experiment itself:

  OT[u][e] = 1 if $ue \in L$; 0 if $ue \notin L$; $\ast$ otherwise (not known).

Obviously, the table should be redundant in the following sense. Given three strings $u$, $v$ and $w$, if OT[u][vw] and OT[uv][w] are defined (i.e. $u, uv \in$ STA and $w, vw \in$ EXP), then OT[u][vw] = OT[uv][w].

Example. In observation table 12.1 we can read that STA contains five labels, among which two (including $\lambda$) are RED. An entry 0 in column $\lambda$ means that the string labelling the row is not in $L$, whereas an entry 1 in column $\lambda$ means that it is in $L$. Note also that the table is redundant: two cells whose row label and experiment concatenate to the same string always hold the same value, whether this value is 0, 1 or $\ast$. This is only due to the fact that the table is an observation of the data. It does not compute or invent new information.

In the following, we will only be considering legal tables, i.e. those that are based on a prefix-closed set STA, a suffix-closed set EXP and a redundant table. Legality can be checked with Algorithm 12.4. The complexity of the algorithm can be easily improved (see Exercise 12.1). In practice, a good policy is to store the data in an association table, with the string as key and the label as value, and to manipulate in the observation table just the keys to the table. This makes the legality issue easy to deal with.

[Table 12.1: An observation table.]

Algorithm 12.4: GOLD-check legality.
Input: a table $\langle$STA, EXP, OT$\rangle$
Output: a Boolean indicating if the table is legal or not
  OK $\leftarrow$ true;
  for $s \in$ STA do /* check if STA is prefix-closed */
    if PREF($s$) $\not\subseteq$ STA then OK $\leftarrow$ false
  for $e \in$ EXP do /* check if EXP is suffix-closed */
    if SUFF($e$) $\not\subseteq$ EXP then OK $\leftarrow$ false
  for $s \in$ STA do /* check if all is legal */
    for $e \in$ EXP do
      for $p \in$ PREF($e$) do
        if $sp \in$ STA $\wedge$ OT[s][e] $\neq$ OT[sp][$p^{-1}e$] then OK $\leftarrow$ false
  return OK

Definition (Holes). A hole in a table $\langle$STA, EXP, OT$\rangle$ is a pair $(u, e)$ such that OT[u][e] $= \ast$. A hole corresponds to a missing observation.

Definition (Complete table). The table $\langle$STA, EXP, OT$\rangle$ is complete (or has no holes) if $\forall u \in$ STA, $\forall e \in$ EXP, OT[u][e] $\in \{0, 1\}$.
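The association-table policy mentioned above can be sketched as follows. The sketch assumes that the sample is stored as a Python dictionary from strings to labels and that a hole is encoded as `None`; the names `ot`, `suffixes` and `holes`, as well as the sample itself, are hypothetical and chosen only for illustration.

```python
HOLE = None  # assumed encoding of the "unknown" value of the table


def ot(sample, u, e):
    """OT[u][e]: 1 if ue is in S+, 0 if ue is in S-, HOLE otherwise."""
    return sample.get(u + e, HOLE)


def suffixes(sample):
    """EXP = SUFF(S): every suffix of every string of the sample."""
    return {w[i:] for w in sample for i in range(len(w) + 1)}


def holes(sample, states, experiments):
    """All pairs (u, e) whose table entry is a missing observation."""
    return {(u, e) for u in states for e in experiments
            if ot(sample, u, e) is HOLE}


# Hypothetical sample: "aa" is positive, "b" is negative.
sample = {"aa": 1, "b": 0}
experiments = suffixes(sample)
missing = holes(sample, {"", "a", "b"}, experiments)
```

Because every entry is looked up through the concatenation `u + e`, the table is redundant by construction: `ot(sample, "", "aa")` and `ot(sample, "a", "a")` necessarily agree.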

[Figure 12.9: (a) An observation table, which is not closed, because of one of its rows. (b) A complete and closed observation table.]

Definition (Rows). We will refer to OT[u] as the row indexed by $u$ and will say that two rows OT[u] and OT[v] are compatible for OT (or $u$ and $v$ are consistent for OT) if $\nexists e \in$ EXP : (OT[u][e] = 0 and OT[v][e] = 1) or (OT[u][e] = 1 and OT[v][e] = 0). We denote this by $u \simeq_{OT} v$.

The goal is not going to be to detect if two rows are compatible, but if they are not.

Definition (Obviously different rows). Rows $u$ and $v$ are obviously different (OD) for OT (we also write OT[u] and OT[v] are obviously different) if $\exists e \in$ EXP : OT[u][e], OT[v][e] $\in \{0, 1\}$ and OT[u][e] $\neq$ OT[v][e].

Example. Table 12.1 is incomplete since it has holes. Row OT[$\lambda$] and a second row are incompatible (and OD), but a third row is compatible with both of them.

Complete tables

We now consider the ideal (albeit unrealistic) setting where there are no holes in the table.

Definition (Closed table). A table OT is closed if $\forall u \in$ BLUE, $\exists s \in$ RED : OT[u] = OT[s].

The table presented in Figure 12.9(a) is not closed (because of one of its BLUE rows) but Table 12.9(b) is. Being closed means that every BLUE state can be matched with a RED one.

Building a DFA from a complete and closed table

Building an automaton from a table $\langle$STA, EXP, OT$\rangle$ can be done very easily as soon as certain conditions are met:

- The set of strings marking the states in STA is prefix-closed,
- The set EXP is suffix-closed,
- The table should be complete: holes correspond to undetermined pieces of information,
- The table should be closed.
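The row relations and the construction of a DFA from a complete and closed table can be sketched in Python as follows. This is a minimal sketch under stated assumptions: rows are tuples over a fixed ordering of EXP, a hole is encoded as `None`, the column of the empty experiment is given by `lam_index`, and the table used in the example (strings with an odd number of a's, experiments $\lambda$ and a) is hypothetical.

```python
HOLE = None  # assumed encoding of an unknown entry


def obviously_different(row_u, row_v):
    """Some experiment is known for both rows and gives different values."""
    return any(x is not HOLE and y is not HOLE and x != y
               for x, y in zip(row_u, row_v))


def is_complete(table):
    return all(x is not HOLE for row in table.values() for x in row)


def is_closed(table, red, blue):
    """Every BLUE row is equal to some RED row."""
    return all(any(table[u] == table[s] for s in red) for u in blue)


def build_automaton(table, red, alphabet, lam_index=0):
    """Build (delta, accepting, rejecting) from a complete, closed table."""
    accepting = {r for r in red if table[r][lam_index] == 1}
    rejecting = {r for r in red if table[r][lam_index] == 0}
    delta = {}
    for w in red:
        for a in alphabet:
            target = w + a
            if target not in red:
                # Closedness guarantees that a RED row equals the row of wa.
                target = next(u for u in sorted(red)
                              if table[u] == table[target])
            delta[(w, a)] = target
    return delta, accepting, rejecting


# Hypothetical table; the columns are the experiments "" and "a".
table = {"": (0, 1), "a": (1, 0), "b": (0, 1), "aa": (0, 1), "ab": (1, 0)}
red, blue = {"", "a"}, {"b", "aa", "ab"}
if is_complete(table) and is_closed(table, red, blue):
    delta, accepting, rejecting = build_automaton(table, red, "ab")
```

When two RED rows are equal the choice of the target is arbitrary; the sketch simply takes the first one in sorted order.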

[Figure 12.10: A table and the corresponding automaton. (a) The observation table. (b) The transition table. (c) The automaton.]

Once these conditions hold we can use Algorithm GOLD-BUILDAUTOMATON (12.5) and convert the table into a DFA.

Algorithm 12.5: GOLD-BUILDAUTOMATON.
Input: a closed and complete observation table $\langle$STA, EXP, OT$\rangle$
Output: a DFA $A = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$
  $Q \leftarrow \{q_r : r \in$ RED$\}$;
  $F_A \leftarrow \{q_{we} : we \in$ RED $\wedge$ OT[w][e] = 1$\}$;
  $F_R \leftarrow \{q_{we} : we \in$ RED $\wedge$ OT[w][e] = 0$\}$;
  for $q_w \in Q$ do
    for $a \in \Sigma$ do $\delta(q_w, a) \leftarrow q_u$ : $u \in$ RED $\wedge$ OT[u] = OT[wa]
  return $\langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$

Example. Consider Table 12.10(a). We can apply the construction from Algorithm 12.5 and obtain an automaton with two states, $q_\lambda$ and one other: the latter is the only accepting state, $F_R = \{q_\lambda\}$, and $\delta$ is given by the transition table 12.10(b). Automaton 12.10(c) can be built.

Definition (Consistent table). Given an automaton $A$ and an observation table $\langle$STA, EXP, OT$\rangle$, $A$ is consistent with $\langle$STA, EXP, OT$\rangle$ when the following holds:
  OT[u][e] = 1 $\implies ue \in L_{F_A}(A)$,
  OT[u][e] = 0 $\implies ue \in L_{F_R}(A)$.
$L_{F_A}(A)$ is the language recognised by $A$ by accepting states, whereas $L_{F_R}(A)$ is the language recognised by $A$ by rejecting states.

Theorem (Consistency theorem). Let $\langle$STA, EXP, OT$\rangle$ be an observation table closed and complete. If STA is prefix-closed and EXP is suffix-closed then GOLD-BUILDAUTOMATON($\langle$STA, EXP, OT$\rangle$) is consistent with the information in $\langle$STA, EXP, OT$\rangle$.

Proof. The proof is straightforward as GOLD-BUILDAUTOMATON builds a DFA directly from the data from $\langle$STA, EXP, OT$\rangle$.

Building a table from the data

The second question is that of obtaining a table from a sample. At this point we want the table to be consistent with the sample, and to be just an alternative representation of the sample. Given a sample $S$ and a prefix-closed set of states RED, it is always possible to build a set of experiments EXP such that the table $\langle$STA, EXP, OT$\rangle$ contains all the information in $S$ (and no other information!). There can be many possible tables, one corresponding to each set of RED states we wish to consider. And, of course, in most cases, these tables are going to have holes.

Algorithm 12.6: GOLD-BUILDTABLE.
Input: a sample $S = \langle S_+, S_- \rangle$, a prefix-closed set RED
Output: table $\langle$STA, EXP, OT$\rangle$
  EXP $\leftarrow$ SUFF($S$);
  BLUE $\leftarrow$ RED$\cdot\Sigma\ \setminus$ RED;
  for $p \in$ RED $\cup$ BLUE do
    for $e \in$ EXP do
      if $p.e \in S_+$ then OT[p][e] $\leftarrow 1$
      else if $p.e \in S_-$ then OT[p][e] $\leftarrow 0$
      else OT[p][e] $\leftarrow \ast$
  return $\langle$STA, EXP, OT$\rangle$

Algorithm GOLD-BUILDTABLE (12.6) builds the table corresponding to a given sample and a specific set RED.

Example. Table 12.2, constructed for a sample made of two positive strings and one negative string and for a set of two RED states ($\lambda$ and one other), is given here. We have not entered the $\ast$ symbols to increase readability (i.e. an empty cell denotes a symbol $\ast$). We notice the following:

Updating the table

Proposition. If $\exists t \in$ BLUE such that OT[t] is obviously different from any OT[s] (with $s \in$ RED), then whatever way we fill the holes in $\langle$STA, EXP, OT$\rangle$, the table will not be closed.

[Table 12.2: The result of GOLD-BUILDTABLE on a sample of two positive strings and one negative string, with two RED states ($\lambda$ and one other).]

In other words, if one BLUE state is obviously different from all the RED states, then even guessing each $\ast$ correctly is not going to be enough. This means that this BLUE state should be promoted before attempting to fill in the holes.

Algorithm GOLD

The general algorithm can now be described. It is composed of four steps requiring four sub-algorithms:

(i) Given sample $S$, Algorithm GOLD-BUILDTABLE (12.6) builds an initial table with RED $= \{\lambda\}$, BLUE $= \Sigma$ and EXP = SUFF($S$).
(ii) Find a BLUE state obviously different (OD) from all RED states. Promote this BLUE state to RED and repeat.
(iii) Fill in the holes that are left (using Algorithm GOLD-FILLHOLES). If the filling of the holes fails, return the PTA (using Algorithm BUILD-PTA (12.1)).
(iv) Using Algorithm GOLD-BUILDAUTOMATON (12.5), build the automaton. If it is inconsistent with the original sample, return the PTA instead.

A run of the algorithm

Example. We provide an example run of Algorithm GOLD (12.7). We use a sample made of four positive strings and four negative strings. We first build the observation table corresponding to RED $= \{\lambda\}$. Now, Table 12.3 is not closed because of one of its BLUE rows. So we promote the corresponding state and update the table, obtaining Table 12.4.

[Table 12.3: The table for the sample (four positive and four negative strings) and RED $= \{\lambda\}$.]

Algorithm 12.7: GOLD for DFA identification.
Input: a sample $S$
Output: a DFA consistent with the sample
  RED $\leftarrow \{\lambda\}$; BLUE $\leftarrow \Sigma$;
  $\langle$STA, EXP, OT$\rangle \leftarrow$ GOLD-BUILDTABLE($S$, RED);
  while $\exists x \in$ BLUE such that OT[x] is OD do
    RED $\leftarrow$ RED $\cup\ \{x\}$;
    BLUE $\leftarrow$ (BLUE $\setminus \{x\}$) $\cup\ \{xa : a \in \Sigma\}$;
    for $u \in$ STA do
      for $e \in$ EXP do
        if $ue \in S_+$ then OT[u][e] $\leftarrow 1$
        else if $ue \in S_-$ then OT[u][e] $\leftarrow 0$
        else OT[u][e] $\leftarrow \ast$
  OT $\leftarrow$ GOLD-FILLHOLES(OT);
  if fail then return BUILD-PTA($S$)
  else
    $A \leftarrow$ GOLD-BUILDAUTOMATON($\langle$STA, EXP, OT$\rangle$);
    if CONSISTENT($A$, $S$) then return $A$ else return BUILD-PTA($S$)

[Table 12.4: The table for the same sample and two RED states ($\lambda$ and one other).]

But Table 12.4 is not closed because of another BLUE row. Since the corresponding state is obviously different from both RED states (because of experiment $\lambda$ for $q_\lambda$, and because of another experiment for the second RED state), we promote it and update the table to Table 12.5. At this point there are no BLUE rows that are obviously different from the RED rows. Therefore all that is needed is to fill the holes. Algorithm GOLD-FILLHOLES is now used to make the table complete.
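The deductive phase of the run above (promoting BLUE states for as long as one of them is obviously different from every RED state) can be sketched as follows. This is an illustrative sketch, not the original algorithm: the sample is a hypothetical dictionary from strings to labels, holes are encoded as `None`, BLUE states are explored in length-lexicographic order (an assumption, since the order is left open here), and the function names are invented for the example.

```python
HOLE = None  # assumed encoding of an unknown entry


def row(sample, u, experiments):
    """The row OT[u], computed directly from the sample."""
    return tuple(sample.get(u + e, HOLE) for e in experiments)


def obviously_different(r1, r2):
    return any(x is not HOLE and y is not HOLE and x != y
               for x, y in zip(r1, r2))


def gold_red_states(sample, alphabet):
    """Return (RED, BLUE) once no BLUE row is OD from all the RED rows."""
    experiments = sorted({w[i:] for w in sample for i in range(len(w) + 1)})
    red = [""]
    while True:
        blue = sorted({r + a for r in red for a in alphabet} - set(red),
                      key=lambda u: (len(u), u))
        promoted = next(
            (x for x in blue
             if all(obviously_different(row(sample, x, experiments),
                                        row(sample, s, experiments))
                    for s in red)),
            None)
        if promoted is None:
            return red, blue
        red.append(promoted)


# Hypothetical sample, compatible with "odd number of a's".
sample = {"": 0, "a": 1, "aa": 0, "b": 0, "ab": 1, "ba": 1}
red, blue = gold_red_states(sample, "ab")
```

On this sample the state labelled `a` is promoted because of experiment $\lambda$, after which no BLUE row is obviously different from both RED rows, so the loop stops with holes still to be filled.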

[Table 12.5: The table for the same sample and three RED states ($\lambda$ and two others).]
[Table 12.6: The table for the same sample after running the first phase of GOLD-FILLHOLES.]
[Table 12.7: The table for the same sample after phase 2 of GOLD-FILLHOLES.]

This algorithm first fills the rows corresponding to the RED rows by using the information contained in the BLUE rows which are compatible (in the sense of the definition of compatible rows given earlier). In this case there are a number of possibilities which may conflict. For example, one BLUE row is compatible with row $\lambda$ but also with another RED row. And we are only considering pairs where the first prefix/state is BLUE and the second is RED. We suppose that in this particular case, the algorithm has selected four such pairs, two of which involve row $\lambda$. This results in building first Table 12.6. Then all the holes in the RED rows are filled by 1s (Table 12.7).

[Table 12.8: The complete table for the same sample after running GOLD-FILLHOLES.]

Algorithm 12.8: GOLD-FILLHOLES.
Input: a table $\langle$STA, EXP, OT$\rangle$
Output: the table OT updated, with holes filled
  for $p \in$ BLUE do /* First fill in all the RED lines */
    if $\exists r \in$ RED : $p \simeq_{OT} r$ then /* Find a compatible RED */
      for $e \in$ EXP do
        if OT[p][e] $\neq \ast$ then OT[r][e] $\leftarrow$ OT[p][e]
    else return fail
  for $r \in$ RED do
    for $e \in$ EXP do
      if OT[r][e] $= \ast$ then OT[r][e] $\leftarrow 1$
  for $p \in$ BLUE do /* Now fill in all the BLUE lines */
    if $\exists r \in$ RED : $p \simeq_{OT} r$ then /* Find a compatible RED */
      for $e \in$ EXP do
        if OT[p][e] $= \ast$ then OT[p][e] $\leftarrow$ OT[r][e]
    else return fail
  return $\langle$STA, EXP, OT$\rangle$

The second part of the algorithm again visits the BLUE states, tries to find a compatible RED state and copies the corresponding RED row. This results in a new table. Finally, the third part of Algorithm 12.8 fills the remaining holes of the BLUE rows. This results in Table 12.8.

[Figure 12.11: The final filling of the holes for GOLD. (a) The safe information. (b) Guessing the missing information.]

Next, Algorithm GOLD-BUILDAUTOMATON (12.5) is run on the table, with the resulting automaton depicted in Figure 12.11(b). The DFA accepts a string which is in $S_-$, therefore the PTA is returned instead.

One might consider that GOLD-FILLHOLES is far too trivial for such a complex task as that of learning automata. Indeed, when considering Table 12.5 a certain number of safe decisions about the automaton can be made. These are depicted in Figure 12.11(a). The others have to be guessed: for each of three remaining BLUE rows there are two candidate RED rows, $\lambda$ and one other, so each of the corresponding states could be merged either with $q_\lambda$ or with another RED state.

In the choices above not only should the possibilities before merging be considered, but also the interactions between the merges. For example, even if in theory two of these states could each be merged into state $q_\lambda$, they cannot both be merged together! We will not enter here into how this guessing can be done (greediness is one option). Therefore the algorithm, having failed, returns the PTA depicted in Figure 12.12(a). If it had guessed the holes correctly an automaton consistent with the data (see Figure 12.12(b)) might have been returned.

Proof of the algorithm

We first want to prove that, alas, filling in the holes is where the real problems start. There is no tractable strategy that will allow us to fill the holes easily:

Theorem (Equivalence of problems). Let RED be a prefix-closed set of states, and $S$ be a sample. Let $\langle$STA, EXP, OT$\rangle$ be an observation table consistent with all the data in $S$, with EXP suffix-closed.

[Figure 12.12: The result and the correct solution. (a) The PTA. (b) The correct solution.]

The question:

Name: Consistent
Instance: A sample $S$, a prefix-closed set RED
Question: Does there exist a DFA $= \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$ with $Q = \{q_u : u \in$ RED$\}$, and if $ua \in$ RED, $\delta(q_u, a) = q_{ua}$, consistent with $S$?

is equivalent to:

Name: Holes
Instance: An observation table $\langle$STA, EXP, OT$\rangle$
Question: Can we fill the holes in such a way as to have $\langle$STA, EXP, OT$\rangle$ closed?

Proof. If we can fill the holes and obtain a closed table, then a DFA can be constructed which is consistent with the data (by the consistency theorem). If there is a DFA with the states of RED then we can use the DFA to fill the holes.

From this we have:

Theorem. The problem:

Name: Minimum consistent DFA reachable from RED
Instance: A sample $S$, a prefix-closed set RED such that each OT[s] ($s \in$ RED) is obviously different from all the others with respect to $S$, and a positive integer $n$
Question: Is there a DFA $= \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$ with the conditions $\{q_u : u \in$ RED$\} \subseteq Q$, $|Q| = n$, consistent with $S$?

is NP-complete.

And as a consequence:

Corollary. Given $S$ and a positive integer $n$, the question:

Name: Minimum consistent DFA
Instance: A sample $S$ and a positive integer $n$
Question: Is there a DFA with $n$ states consistent with $S$?

is NP-complete.

Proof. We leave the proofs that these problems are NP-hard to Section 6.2. Proving that either of the problems belongs to NP is not difficult: simply producing a DFA and checking consistency is going to take polynomial time only, as it consists of parsing the strings in $S$.

On the positive side, identification in the limit can be proved:

Theorem. Algorithm GOLD, given any sample $S = \langle S_+, S_- \rangle$:
- outputs a DFA consistent with $S$,
- admits a polynomial-sized characteristic sample,
- runs in time and space polynomial in $\|S\|$.

Proof. Remember that in the worst case the PTA is returned. The characteristic sample can be constructed in such a way as to make sure that all the states are found to be OD. The number of such strings is quadratic in the size of the target automaton. Furthermore it can be proved that none of these strings needs to be of length more than $n^2$. It should be noted that for this to work, the order in which the BLUE states are explored for promotion matters. If the size of the canonical acceptor of the language is $n$, then there is a characteristic sample $CS_L$ of size $2n^2(|\Sigma| + 1)$, such that GOLD($S$) produces the canonical acceptor for all $S \supseteq CS_L$. Space complexity is in $\mathcal{O}(\|S\| \cdot n)$ whereas time complexity is in $\mathcal{O}(n^2 \cdot \|S\|)$. We leave for Exercise 12.5 the question of obtaining a faster algorithm.

Corollary. Algorithm GOLD identifies DFA($\Sigma$) in POLY-CS time.

Identification in the limit in POLY-CS polynomial time follows from the previous remarks.

12.4 RPNI

In Algorithm GOLD, described in Section 12.3, there is a very real chance that after many iterations the final problem of filling the holes is not solved at all (and perhaps cannot be solved unless more states are added) and the PTA is returned. Even if this is mathematically admissible (since identification in the limit is ensured), in practice one would prefer an algorithm that does some sort of generalisation in all circumstances, and not just in the favourable ones. This is what is proposed by algorithm RPNI (Regular Positive and Negative Inference). The idea is to greedily create clusters of states (by merging) in order to come up with a solution that is always consistent with the learning data. This approach ensures that some type of generalisation takes place and, in the best of cases (which we can characterise by giving sufficient conditions that permit identification in the limit), returns the correct target automaton.

The algorithm

We describe here a generic version of Algorithm RPNI. A number of variants have been published that are not exactly equivalent. These can be studied in due course. Basically, Algorithm RPNI (12.13) starts by building PTA($S_+$) from the positive data (Algorithm BUILD-PTA (12.1)), then iteratively chooses possible merges, checks if a given merge is correct and is made between two compatible states (Algorithm RPNI-COMPATIBLE (12.10)), makes the merge (Algorithm RPNI-MERGE (12.11)) if admissible and promotes the state if no merge is possible (Algorithm RPNI-PROMOTE (12.9)).

The algorithm has as a starting point the PTA, which is a deterministic finite automaton. In order to avoid problems with non-determinism, the merge of two states is immediately followed by a folding operation: the merge in RPNI always occurs between a RED state and a BLUE state. The BLUE states have the following properties:

- If $q$ is a BLUE state, it has exactly one predecessor, i.e. whenever $\delta(q_1, a_1) = \delta(q_2, a_2) = q$, then necessarily $q_1 = q_2$ and $a_1 = a_2$.
- $q$ is the root of a tree, i.e. if $\delta(q, u \cdot a) = \delta(q, v \cdot b)$ then necessarily $u = v$ and $a = b$.

Algorithm 12.9: RPNI-PROMOTE.
Input: a DFA $A = \langle \Sigma, Q, q_\lambda, F_A, F_R, \delta \rangle$, sets RED, BLUE $\subseteq Q$, $q_u \in$ BLUE
Output: $A$, RED, BLUE updated
  RED $\leftarrow$ RED $\cup\ \{q_u\}$;
  BLUE $\leftarrow$ BLUE $\cup\ \{\delta(q_u, a),\ a \in \Sigma\}$;
  return $A$, RED, BLUE

Algorithm RPNI-PROMOTE (12.9), given a BLUE state $q_u$, promotes this state to RED and all the successors in $A$ of this state become BLUE.
Algorithm RPNI-COMPATIBLE (12.10) returns YES if the current automaton accepts no string from $S_-$ but returns NO if some counter-example is accepted by the current automaton. Note that the automaton $A$ is deterministic.

Algorithm 12.10: RPNI-COMPATIBLE.
Input: $A$, $S_-$
Output: a Boolean, indicating if $A$ is consistent with $S_-$
  for $w \in S_-$ do
    if $\delta_A(q_\lambda, w) \cap F_A \neq \emptyset$ then return false
  return true

Algorithm RPNI-MERGE (12.11) takes as arguments a RED state $q$ and a BLUE state $q'$. It first finds the unique pair $(q_f, a)$ such that $q' = \delta_A(q_f, a)$. This pair exists and is unique because $q'$ is a BLUE state and therefore the root of a tree. RPNI-MERGE then redirects $\delta(q_f, a)$ to $q$. After that, the tree rooted in $q'$ (which is therefore disconnected from the rest of the DFA) is folded (RPNI-FOLD) into the rest of the DFA. The possible intermediate situations of non-determinism (see Figure 12.7) are dealt with during the recursive calls to RPNI-FOLD. This two-step process is shown in Figures 12.13 and 12.14.

Algorithm 12.11: RPNI-MERGE.
Input: a DFA $A$, states $q \in$ RED, $q' \in$ BLUE
Output: $A$ updated
  Let $(q_f, a)$ be such that $\delta_A(q_f, a) = q'$;
  $\delta_A(q_f, a) \leftarrow q$;
  return RPNI-FOLD($A$, $q$, $q'$)

Algorithm RPNI (12.13) depends on the choice of the function CHOOSE. Provided it is a deterministic function (such as one that chooses the minimal $u, a$ in the lexicographic order), convergence is ensured.

[Figure 12.13: The situation before merging.]

Algorithm 12.12: RPNI-FOLD.
    Input: a DFA A, states q, q′ ∈ Q, q′ being the root of a tree
    Output: A updated, where the subtree in q′ is folded into q
    if q′ ∈ F_A then F_A ← F_A ∪ {q};
    for a ∈ Σ do
        if δ_A(q′, a) is defined then
            if δ_A(q, a) is defined then A ← RPNI-FOLD(A, δ_A(q, a), δ_A(q′, a))
            else δ_A(q, a) ← δ_A(q′, a);
    return A

[Figure 12.14: The situation after merging and before folding. Labels: the RED states, q_λ, q, q_f, q′.]

Algorithm 12.13: RPNI.
    Input: a sample S = ⟨S_+, S_−⟩, functions COMPATIBLE, CHOOSE
    Output: a DFA A = ⟨Σ, Q, q_λ, F_A, F_R, δ⟩
    A ← BUILD-PTA(S_+);
    RED ← {q_λ};
    BLUE ← {q_a : a ∈ Σ ∩ PREF(S_+)};
    while BLUE ≠ ∅ do
        CHOOSE(q_b ∈ BLUE);
        BLUE ← BLUE \ {q_b};
        if ∃ q_r ∈ RED such that RPNI-COMPATIBLE(RPNI-MERGE(A, q_r, q_b), S_−) then
            A ← RPNI-MERGE(A, q_r, q_b);
            BLUE ← BLUE ∪ {δ(q, a) : q ∈ RED ∧ δ(q, a) ∉ RED};
        else
            A ← RPNI-PROMOTE(q_b, A)
    for q_r ∈ RED do    /* mark rejecting states */
        if λ ∈ (L(A_{q_r})^{-1} S_−) then F_R ← F_R ∪ {q_r}
    return A

The algorithm's proof

Theorem. Algorithm RPNI (12.13) identifies in the limit DFA(Σ) in POLY-CS time.

Proof. We use here the definition given on page 154. We first prove that the algorithm computes a consistent solution in polynomial time. First note that the size of the PTA (‖PTA‖ being the number of states) is polynomial in ‖S‖. The function CHOOSE can only be called at most ‖PTA‖ times. At each call, compatibility of the running BLUE state will be checked with each state in RED. This again is bounded by the number of states in the PTA. And checking compatibility is also polynomial.

Then, to prove that there exists a polynomial characteristic set, we constructively add the examples sufficient for identification to take place. Let A = ⟨Σ, Q, q_λ, F_A, F_R, δ⟩ be the complete minimal canonical automaton for the target language L. Let <_CHOOSE be the order relation associated with function CHOOSE. Then compute the minimum distinguishing string between two states q_u, q_v (MD), and the shortest prefix of a state q_u (SP):

    MD(q_u, q_v) = min_{<_CHOOSE} {w : (δ(q_u, w) ∈ F_A ∧ δ(q_v, w) ∈ F_R) ∨ (δ(q_u, w) ∈ F_R ∧ δ(q_v, w) ∈ F_A)}
    SP(q_u) = min_{<_CHOOSE} {w : δ(q_λ, w) = q_u}

RPNI-CONSTRUCTCS(A) (Algorithm 12.14) uses these definitions to build a characteristic sample for RPNI, for the order <_CHOOSE, and the target language.

MD(q_u, q_v) represents the minimum suffix allowing us to establish that states q_u and q_v should never be merged. For example, if we consider the automaton in Figure 12.15, in which the states are numbered in order to avoid confusion, this string is […] for q_1 and q_2.

SP(q_u) is the shortest prefix in the chosen order that leads to state q_u. Normally this string should be u itself. For example, for the automaton represented in Figure 12.15, SP(q_1) = λ, SP(q_2) = […], SP(q_3) = […] and SP(q_4) = […].

Algorithm 12.14: RPNI-CONSTRUCTCS.
    Input: A = ⟨Σ, Q, q_λ, F_A, F_R, δ⟩
    Output: S = ⟨S_+, S_−⟩
    S_+ ← ∅; S_− ← ∅;
    for q_u ∈ Q do
        for q_v ∈ Q do
            for a ∈ Σ such that L(A_{q_v}) […] ≠ ∅ and q_u ≠ δ(q_v, a) do
                w ← MD(q_u, q_v);
                if δ(q_λ, u·w) ∈ F_A then S_+ ← S_+ ∪ SP(q_u)·w; S_− ← S_− ∪ SP(q_v)·w
                else S_− ← S_− ∪ SP(q_u)·w; S_+ ← S_+ ∪ SP(q_v)·w
    return ⟨S_+, S_−⟩

[Figure 12.15: Shortest prefixes of a DFA.]

A run of the algorithm

To run RPNI we first have to select a function CHOOSE. In this case we use the lex-length order over the prefixes leading to a state in the PTA. This allows us to mark the states once and for all. With this order, state q_1 corresponds to q_λ in the PTA, q_2 to q_a, q_3 to q_b, q_4 to q_aa, and so forth. The data for the run are:

    S_+ = {[…], […], […], […]}
    S_− = {[…], […], […], […]}

From this we build PTA(S_+), depicted in Figure 12.16. We now try to merge states q_1 and q_2, by using Algorithm RPNI-MERGE with values A, q_1, q_2. Once transition δ_A(q_1, a) is redirected to q_1, we reach the situation represented in Figure 12.17. This is the point where Algorithm RPNI-FOLD is called, in order to fold the subtree rooted in q_2 into the rest of the automaton; the result is represented in Figure 12.18.
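The redirect-then-fold step just described can be sketched in Python. This is a minimal illustration, not the authors' code: the dictionary representation, the function names `merge` and `fold`, and the three-transition example are assumptions made here. Unlike the pseudocode of RPNI-FOLD, this sketch also deletes the transitions and the accepting mark of each folded state, so that no unreachable leftovers remain in the dictionary.

```python
def fold(delta, accepting, alphabet, q, q_blue):
    """Fold the tree rooted at q_blue into state q (in place)."""
    if q_blue in accepting:
        accepting.discard(q_blue)      # cleanup added by this sketch
        accepting.add(q)
    for a in alphabet:
        target = delta.pop((q_blue, a), None)   # detach q_blue's transition
        if target is None:
            continue
        if (q, a) in delta:
            # q already has an a-successor: keep folding one level down
            fold(delta, accepting, alphabet, delta[(q, a)], target)
        else:
            # q has no a-successor: graft the subtree as it is
            delta[(q, a)] = target


def merge(delta, accepting, alphabet, q_red, q_blue):
    """Redirect the unique transition entering q_blue to q_red, then fold."""
    source = next(key for key, target in delta.items() if target == q_blue)
    delta[source] = q_red
    fold(delta, accepting, alphabet, q_red, q_blue)


# Hypothetical PTA for the positive strings "aa" and "b";
# states are named by the prefix that reaches them ("" is the root).
DELTA = {("", "a"): "a", ("a", "a"): "aa", ("", "b"): "b"}
ACCEPTING = {"aa", "b"}

merge(DELTA, ACCEPTING, ["a", "b"], "", "a")
print(DELTA)              # {('', 'a'): '', ('', 'b'): 'b'}
print(sorted(ACCEPTING))  # ['', 'b']
```

Because the BLUE state is the root of a tree, the search for its predecessor finds exactly one transition, which is the property the text relies on.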

[Figure 12.16: PTA(S_+).]
[Figure 12.17: After δ_A(q_1, a) = q_1.]
[Figure 12.18: After merging q_2 and q_1.]
[Figure 12.19: The PTA with q_2 promoted.]

The resulting automaton can now be tested for compatibility, but if we try to parse the negative examples we notice that a counter-example is accepted. The merge is thus abandoned and we return to the PTA. State q_2 is now promoted to RED, and its successor q_4 is BLUE (Figure 12.19).

[Figure 12.20: Trying to merge q_3 and q_1.]
[Figure 12.21: After the folding.]
[Figure 12.22: Trying to merge q_3 and q_2.]

So the next BLUE state is q_3 and we now try to merge q_3 with q_1. The automaton in Figure 12.20 is built by considering the transition δ_A(q_1, b) = q_1. Then RPNI-FOLD(A, q_1, q_3) is called. After folding, we get an automaton (see Figure 12.21) which again parses a counter-example as positive. Therefore the merge {q_1, q_3} is abandoned and we must now check the merge between q_3 and q_2. After folding, we are left with the automaton in Figure 12.22, which this time parses the counter-example […] as positive. Since q_3 cannot be merged with any RED state, there is again a promotion: RED = {q_1, q_2, q_3}, and BLUE = {q_4, q_5}. The updated PTA is depicted in Figure 12.23.

The next BLUE state is q_4 and the merge we try is q_4 with q_1. But […] (which is the distinguishing suffix) is going to be accepted by the resulting automaton (Figure 12.24).

[Figure 12.23: The PTA with q_3 promoted.]
[Figure 12.24: Merging q_4 with q_1 and folding.]
[Figure 12.25: Automaton after merging q_4 with q_3.]

The merge between q_4 and q_2 is then tested and fails because of […] now being parsed. The next merge (q_4 with q_3) is accepted. The resulting automaton is shown in Figure 12.25.

The next BLUE state is q_5; notice that state q_6 has the shortest prefix at that point, but what counts is the situation in the original PTA. The next merge to be tested is q_5 with q_1: it is rejected because of string […], which is a counter-example that would be accepted by the resulting automaton (represented in Figure 12.26). Then the algorithm tries merging q_5 with q_2: this involves folding in the different states q_8, q_10 and q_11. The merge is accepted, and the automaton depicted in Figure 12.27 is constructed.

Finally, BLUE state q_6 is merged with q_1. This merge is accepted, resulting in the automaton represented in Figure 12.28. Last, the states are marked as final (rejecting). The final accepting ones are correct, but by parsing strings from S_−, state q_2 is marked as rejecting (Figure 12.29).
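To tie the steps of the run together, here is a minimal Python sketch of the whole loop. It is an illustration under stated assumptions rather than the original implementation: states are named by the PTA prefix that reaches them, CHOOSE is the lex-length order on those names, BLUE is recomputed after every step instead of being updated incrementally, every trial merge works on a deep copy, and the sample at the bottom is chosen here purely for illustration. All function names are hypothetical.

```python
from copy import deepcopy


def build_pta(positive):
    """Prefix tree acceptor: one state per prefix, named by the prefix."""
    delta, accepting = {}, set()
    for word in positive:
        for i, symbol in enumerate(word):
            delta[(word[:i], symbol)] = word[:i + 1]
        accepting.add(word)
    alphabet = sorted({symbol for _, symbol in delta})
    return {"delta": delta, "accepting": accepting, "alphabet": alphabet}


def parse(dfa, word):
    state = ""
    for symbol in word:
        state = dfa["delta"].get((state, symbol))
        if state is None:
            return None
    return state


def compatible(dfa, negative):
    return all(parse(dfa, w) not in dfa["accepting"] for w in negative)


def fold(dfa, q, q_blue):
    if q_blue in dfa["accepting"]:
        dfa["accepting"].discard(q_blue)
        dfa["accepting"].add(q)
    for a in dfa["alphabet"]:
        target = dfa["delta"].pop((q_blue, a), None)
        if target is None:
            continue
        if (q, a) in dfa["delta"]:
            fold(dfa, dfa["delta"][(q, a)], target)
        else:
            dfa["delta"][(q, a)] = target


def merge(dfa, q_red, q_blue):
    """Return a merged copy; the input automaton is left untouched."""
    new = deepcopy(dfa)
    source = next(k for k, t in new["delta"].items() if t == q_blue)
    new["delta"][source] = q_red
    fold(new, q_red, q_blue)
    return new


def blue_states(dfa, red):
    """Successors of RED states that are not RED, in lex-length order."""
    found = {t for (p, _), t in dfa["delta"].items() if p in red and t not in red}
    return sorted(found, key=lambda q: (len(q), q))


def rpni(positive, negative):
    dfa, red = build_pta(positive), [""]
    blue = blue_states(dfa, red)
    while blue:
        q_blue = blue[0]
        for q_red in red:
            candidate = merge(dfa, q_red, q_blue)
            if compatible(candidate, negative):
                dfa = candidate
                break
        else:
            red.append(q_blue)          # no merge possible: promotion
        blue = blue_states(dfa, red)
    rejecting = {parse(dfa, w) for w in negative} - {None}
    return dfa, red, rejecting


# Illustrative sample (an assumption of this sketch).
POSITIVE = ["aaa", "aaba", "bba", "bbaba"]
NEGATIVE = ["a", "bb", "aab", "aba"]
dfa, red, rejecting = rpni(POSITIVE, NEGATIVE)
print(red, sorted(dfa["accepting"]), sorted(rejecting))
# ['', 'a', 'b'] ['', 'b'] ['a']
```

Copying the automaton for each trial keeps the sketch short but is exactly the kind of cost a serious implementation would avoid, for instance by recording and undoing the changes made by a failed merge.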

[Figure 12.26: Automaton after merging q_5 with q_1, and folding.]
[Figure 12.27: Automaton after merging q_5 with q_2, and folding.]
[Figure 12.28: Automaton after merging q_6 with q_1.]
[Figure 12.29: Automaton after marking the final rejecting states.]

Some comments about implementation

The RPNI algorithm, if implemented as described above, does not scale up. It needs a lot of further work done to it before reaching a satisfactory implementation. It will be necessary to come up with a correct data structure for the PTA and the intermediate automata. One should consider the solutions proposed by the different authors working in the field.

The presentation we have followed here avoids the heavy non-deterministic formalisms that can be found in the literature and that add an extra (and mostly unnecessary) difficulty to the implementation.

12.5 Exercises

12.1 Algorithm 12.4 (page 245) has a complexity in O(|STA|² + |STA|·|E|²). Find an alternative algorithm, with a complexity in O(|STA|·|E|).

12.2 Run Gold's algorithm for the following data:
    S_+ = {[…], […], […], […]}
    S_− = {[…], […], […], […], […], […]}

12.3 Take RED = {q_λ}, EXP = {λ, […], […]}. Suppose S_+ = {λ, […]} and S_− = {[…]}. Construct the corresponding DFA, with Algorithm GOLD-BUILDAUTOMATON 12.5 (page 247). What is the problem?

12.4 Build an example where RED is not prefix-closed and for which Algorithm GOLD-BUILDAUTOMATON 12.5 (page 247) fails.

12.5 In Algorithm GOLD the complexity seems to depend on revisiting each cell in OT various times in order to decide if two lines are obviously different. Propose a data structure which allows the first phase of the algorithm (the deductive phase) to be in O(n·‖S‖), where n is the size (number of states) of the target DFA.

12.6 Construct the characteristic sample for the automaton depicted in Figure 12.30(a) with Algorithm RPNI-CONSTRUCTCS (12.14).

12.7 Construct the characteristic sample for the automaton depicted in Figure 12.30(b), as defined in Algorithm RPNI-CONSTRUCTCS (12.14).

[Figure 12.30: Target automata for the exercises, panels (a) and (b).]

12.8 Run Algorithm RPNI for the order relations ≤_alpha and ≤_lex-length on
    S_+ = {[…], […], […], […]}
    S_− = {[…], […], […], […], […], […]}

12.9 We consider the following definition in which a learning algorithm is supposed to learn from its previous hypothesis and the new example:

Definition. An algorithm A incrementally identifies a grammar class G in the limit if, by definition, given any T in G and any presentation φ of L(T), there is a rank n such that if i ≥ n, A(φ(i), G_i) ≡ T.

Can we say that RPNI is incremental? Can we make it incremental? What do you think of the following conjecture?

Conjecture of non-incrementality of the regular languages. There exists no incremental algorithm that identifies the regular languages in the limit from an informant. More precisely, let A be an algorithm that, given a DFA A_k (a current hypothesis) and a labelled string w_k (hence a pair ⟨w_k, l(w_k)⟩), returns an automaton A_{k+1}. In that case we say that A identifies in the limit an automaton T if, by definition, for no rank k above n, A_k ≢ T. Note that l(w) is the label of string w, i.e. 1 or 0.

12.10 Devise a collusive algorithm to identify DFAs from an informant. The algorithm should rely on an encoding of DFAs over the intended alphabet. The algorithm checks the data, and, if some string corresponds to the encoding of a DFA, this DFA is built and the sample is reconsidered: is the DFA minimal and compatible for this sample? If so, the DFA is returned. If not, the PTA is returned. Check that this algorithm identifies DFAs in POLY-CS time.

12.6 Conclusions of the chapter and further reading

Bibliographical background

In Section 12.1 we have tried to define in a uniform way the problem of learning DFAs from an informant. The notions developed here are based on the common notion of the prefix tree acceptor, sometimes called the augmented PTA, which has been introduced by various authors (Alquézar & Sanfeliu, 1994, Coste & Fredouille, 2003). It is customary to present learning in a more asymmetric way, as generalising from the PTA and controlling the generalisation (i.e. avoiding over-generalisation) through the negative examples. This approach is certainly justified by the capacity to define the search space neatly (Dupont, Miclet & Vidal, 1994): we will return to it in Chapters 6 and 14. Here the presentation consists of viewing the problem as a classification question and giving no advantage to one class over another.

Among the number of reasons for preferring this idea, there is a strong case for manipulating three types of states, some of which are of unknown label. There is a point to be made here: if in ideal conditions, and when convergence is reached, the hypothesis DFA (being exactly the target) will only have final states (some accepting and some rejecting), this will not be the case when the result is incorrect. In that case deciding that all the non-accepting states are rejecting is bound to be worse than leaving the question unsettled.

The problem of identifying DFAs from an informant has attracted a lot of attention: E. Mark Gold (Gold, 1978) and Dana Angluin (Angluin, 1978) proved the intractability of finding the smallest consistent automaton. Lenny Pitt and Manfred Warmuth extended this result to non-approximability (Pitt & Warmuth, 1993). Colin de la Higuera (de la Higuera, 1997) noticed that the notion of polynomial samples was non-trivial.

E. Mark Gold's algorithm (GOLD) (Gold, 1978) was the first grammatical inference algorithm with strong convergence properties. Because of its incapacity to do better than return the PTA, the algorithm is seldom used in practice. There is nevertheless room for improvement. Indeed, the first phase of the algorithm (the deductive step) can be implemented with a time complexity of O(‖S‖·n) and can be used as a starting point for heuristics.

Algorithm RPNI is described in Section 12.4. It was developed by Jose Oncina and Pedro García (Oncina & García, 1992). Essentially this presentation respects the original algorithm. We have only updated the notations and tried to use a terminology common to the other grammatical inference algorithms. There have been other alternative approaches based on similar ideas: work by Boris Trakhtenbrot and Ya Barzdin (Trakhtenbrot & Barzdin, 1973) and by Kevin Lang (Lang, 1992) can be checked for details.

A more important difference in this presentation is that we have tried to avoid the non-deterministic steps altogether. By replacing the symmetrical merging operation, which requires determinisation (through a cascade of merges), by the simpler asymmetric folding operation, NFAs are avoided. In the same line, an algorithm that doesn't construct the PTA explicitly is presented in (de la Higuera, Oncina & Vidal, 1996). The RED, BLUE terminology was introduced in (Lang, Pearlmutter & Price, 1998), even if it does not coincide exactly with previous definitions: in the original RPNI the authors use shortest prefixes to indicate the RED states and elements of the kernel for some prefixes leading to the BLUE states. Another analysis of merging can be found in (Lambeau, Damas & Dupont, 2008).
Algorithm RPNI has been successfully adapted to tree automata (García & Oncina, 1993), and infinitary languages (de la Higuera & Janodet, 2004). An essential reference for those wishing to write their own algorithms for this task is the datasets. Links about these can be found on the grammatical inference webpage (van Zaanen, 2003). One alternative is to generate one's own targets and samples. This can be done with the GOWACHIN machine (Lang, Pearlmutter & Coste, 1998).

Some alternative lines of research

Both GOLD and RPNI have been considered as good starting points for other algorithms. During the ABBADINGO competition, state merging was revisited, the order relation being built during the run of the algorithm. This led to heuristic EDSM (evidence driven state merging), which is described in Section […]. More generally the problem of learning DFAs from positive and negative strings has been tackled by a number of other techniques (some of which are presented in Chapter 14).

Open problems and possible new lines of research

There are a number of questions that still deserve to be looked into concerning the problem of learning DFAs from an informant, and the algorithms GOLD and RPNI:

Concerning the problem of learning DFAs from an informant. Both algorithms we have proposed are incapable of adapting correctly to an incremental setting. Even if a first try was made by Pierre Dupont (Dupont, 1996), there is room for improvement. Moreover, one aspect of incremental learning is that we should be able to forget some of the data we are learning from during the process. This is clearly not the case with the algorithms we have seen in this chapter.

Concerning the GOLD algorithm. There are two research directions one could recommend here. The first corresponds to the deductive phase. As mentioned in Exercise 12.5, clever data structures should accelerate the construction of the table with as many obviously different rows as possible. The second line of research corresponds to finding better techniques and heuristics to fill the holes.

Concerning the RPNI algorithm. The complexity of the RPNI algorithm remains loosely studied. The actual computation which is proposed is not convincing, and empirically, those that have consistently used the algorithm certainly do not report a cubic behaviour. A better analysis of the complexity (joined with probably better data structures) is of interest in that it would allow us to capture the parts where most computational effort is spent.

Other related topics. A tricky open question concerns the collusion issues. An alternative learning algorithm could be to find in the sample a string which would be the encoding of a DFA, decode this string and check if the characteristic sample for this automaton is included. This algorithm would then rely on collusion: it needs a teacher to encode the automaton. Collusion is discussed in (Goldman & Mathias, 1996): for the learner-teacher model to be able to resist collusion, an adversary is introduced. This, here, corresponds to the fact that the characteristic sample is to be included in the learning sample for identification to be mandatory. But the fact that one can encode the target into a unique string, which is then correctly labelled and passed to the learner together with a proof that the number of states is at least n, which has been remarked in a number of papers (Castro & Guijarro, 2000, Denis & Gilleron, 1997, Denis, d'Halluin & Gilleron, 1996), remains troubling.


More information

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS

CS 310 (sec 20) - Winter Final Exam (solutions) SOLUTIONS CS 310 (sec 20) - Winter 2003 - Finl Exm (solutions) SOLUTIONS 1. (Logic) Use truth tles to prove the following logicl equivlences: () p q (p p) (q q) () p q (p q) (p q) () p q p q p p q q (q q) (p p)

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

Fundamentals of Computer Science

Fundamentals of Computer Science Fundmentls of Computer Science Chpter 3: NFA nd DFA equivlence Regulr expressions Henrik Björklund Umeå University Jnury 23, 2014 NFA nd DFA equivlence As we shll see, it turns out tht NFA nd DFA re equivlent,

More information

Learning Moore Machines from Input-Output Traces

Learning Moore Machines from Input-Output Traces Lerning Moore Mchines from Input-Output Trces Georgios Gintmidis 1 nd Stvros Tripkis 1,2 1 Alto University, Finlnd 2 UC Berkeley, USA Motivtion: lerning models from blck boxes Inputs? Lerner Forml Model

More information

Lecture 9: LTL and Büchi Automata

Lecture 9: LTL and Büchi Automata Lecture 9: LTL nd Büchi Automt 1 LTL Property Ptterns Quite often the requirements of system follow some simple ptterns. Sometimes we wnt to specify tht property should only hold in certin context, clled

More information

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University

Farey Fractions. Rickard Fernström. U.U.D.M. Project Report 2017:24. Department of Mathematics Uppsala University U.U.D.M. Project Report 07:4 Frey Frctions Rickrd Fernström Exmensrete i mtemtik, 5 hp Hledre: Andres Strömergsson Exmintor: Jörgen Östensson Juni 07 Deprtment of Mthemtics Uppsl University Frey Frctions

More information

Lexical Analysis Finite Automate

Lexical Analysis Finite Automate Lexicl Anlysis Finite Automte CMPSC 470 Lecture 04 Topics: Deterministic Finite Automt (DFA) Nondeterministic Finite Automt (NFA) Regulr Expression NFA DFA A. Finite Automt (FA) FA re grph, like trnsition

More information

Worked out examples Finite Automata

Worked out examples Finite Automata Worked out exmples Finite Automt Exmple Design Finite Stte Automton which reds inry string nd ccepts only those tht end with. Since we re in the topic of Non Deterministic Finite Automt (NFA), we will

More information

Tutorial Automata and formal Languages

Tutorial Automata and formal Languages Tutoril Automt nd forml Lnguges Notes for to the tutoril in the summer term 2017 Sestin Küpper, Christine Mik 8. August 2017 1 Introduction: Nottions nd sic Definitions At the eginning of the tutoril we

More information

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38 Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

Deterministic Finite Automata

Deterministic Finite Automata Finite Automt Deterministic Finite Automt H. Geuvers nd J. Rot Institute for Computing nd Informtion Sciences Version: fll 2016 J. Rot Version: fll 2016 Tlen en Automten 1 / 21 Outline Finite Automt Finite

More information

Some Theory of Computation Exercises Week 1

Some Theory of Computation Exercises Week 1 Some Theory of Computtion Exercises Week 1 Section 1 Deterministic Finite Automt Question 1.3 d d d d u q 1 q 2 q 3 q 4 q 5 d u u u u Question 1.4 Prt c - {w w hs even s nd one or two s} First we sk whether

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed

More information

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions

CS 330 Formal Methods and Models Dana Richards, George Mason University, Spring 2016 Quiz Solutions CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2016 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 9 1. (4pts) ((p q) (q r)) (p r), prove tutology using truth tles. p

More information

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy:

Scanner. Specifying patterns. Specifying patterns. Operations on languages. A scanner must recognize the units of syntax Some parts are easy: Scnner Specifying ptterns source code tokens scnner prser IR A scnner must recognize the units of syntx Some prts re esy: errors mps chrcters into tokens the sic unit of syntx x = x + y; ecomes

More information

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations.

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations. Lecture 3 3 Solving liner equtions In this lecture we will discuss lgorithms for solving systems of liner equtions Multiplictive identity Let us restrict ourselves to considering squre mtrices since one

More information

CSCI 340: Computational Models. Transition Graphs. Department of Computer Science

CSCI 340: Computational Models. Transition Graphs. Department of Computer Science CSCI 340: Computtionl Models Trnsition Grphs Chpter 6 Deprtment of Computer Science Relxing Restrints on Inputs We cn uild n FA tht ccepts only the word! 5 sttes ecuse n FA cn only process one letter t

More information

FABER Formal Languages, Automata and Models of Computation

FABER Formal Languages, Automata and Models of Computation DVA337 FABER Forml Lnguges, Automt nd Models of Computtion Lecture 5 chool of Innovtion, Design nd Engineering Mälrdlen University 2015 1 Recp of lecture 4 y definition suset construction DFA NFA stte

More information

NFAs continued, Closure Properties of Regular Languages

NFAs continued, Closure Properties of Regular Languages Algorithms & Models of Computtion CS/ECE 374, Fll 2017 NFAs continued, Closure Properties of Regulr Lnguges Lecture 5 Tuesdy, Septemer 12, 2017 Sriel Hr-Peled (UIUC) CS374 1 Fll 2017 1 / 31 Regulr Lnguges,

More information

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers 80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES 2.6 Finite Stte Automt With Output: Trnsducers So fr, we hve only considered utomt tht recognize lnguges, i.e., utomt tht do not produce ny output on ny input

More information

ɛ-closure, Kleene s Theorem,

ɛ-closure, Kleene s Theorem, DEGefW5wiGH2XgYMEzUKjEmtCDUsRQ4d 1 A nice pper relevnt to this course is titled The Glory of the Pst 2 NICTA Resercher, Adjunct t the Austrlin Ntionl University nd Griffith University ɛ-closure, Kleene

More information

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30 Tlen en Automten Test 1, Mon 7 th Dec, 2015 15h45 17h30 This test consists of four exercises over 5 pges. Explin your pproch, nd write your nswer to ech exercise on seprte pge. You cn score mximum of 100

More information

Thoery of Automata CS402

Thoery of Automata CS402 Thoery of Automt C402 Theory of Automt Tle of contents: Lecture N0. 1... 4 ummry... 4 Wht does utomt men?... 4 Introduction to lnguges... 4 Alphets... 4 trings... 4 Defining Lnguges... 5 Lecture N0. 2...

More information

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings...

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings... Tle of contents: Lecture N0.... 3 ummry... 3 Wht does utomt men?... 3 Introduction to lnguges... 3 Alphets... 3 trings... 3 Defining Lnguges... 4 Lecture N0. 2... 7 ummry... 7 Kleene tr Closure... 7 Recursive

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-* Regulr Expressions (RE) Regulr Expressions (RE) Empty set F A RE denotes the empty set Opertion Nottion Lnguge UNIX Empty string A RE denotes the set {} Alterntion R +r L(r ) L(r ) r r Symol Alterntion

More information

GNFA GNFA GNFA GNFA GNFA

GNFA GNFA GNFA GNFA GNFA DFA RE NFA DFA -NFA REX GNFA Definition GNFA A generlize noneterministic finite utomton (GNFA) is grph whose eges re lele y regulr expressions, with unique strt stte with in-egree, n unique finl stte with

More information

Homework Solution - Set 5 Due: Friday 10/03/08

Homework Solution - Set 5 Due: Friday 10/03/08 CE 96 Introduction to the Theory of Computtion ll 2008 Homework olution - et 5 Due: ridy 10/0/08 1. Textook, Pge 86, Exercise 1.21. () 1 2 Add new strt stte nd finl stte. Mke originl finl stte non-finl.

More information

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1

Non-Deterministic Finite Automata. Fall 2018 Costas Busch - RPI 1 Non-Deterministic Finite Automt Fll 2018 Costs Busch - RPI 1 Nondeterministic Finite Automton (NFA) Alphbet ={} q q2 1 q 0 q 3 Fll 2018 Costs Busch - RPI 2 Nondeterministic Finite Automton (NFA) Alphbet

More information

Non-deterministic Finite Automata

Non-deterministic Finite Automata Non-deterministic Finite Automt Eliminting non-determinism Rdoud University Nijmegen Non-deterministic Finite Automt H. Geuvers nd T. vn Lrhoven Institute for Computing nd Informtion Sciences Intelligent

More information

Inductive and statistical learning of formal grammars

Inductive and statistical learning of formal grammars Inductive nd sttisticl lerning of forml grmmrs Pierre Dupont Grmmr Induction Mchine Lerning Gol: to give the lerning ility to mchine Design progrms the performnce of which improves over time pdupont@info.ucl.c.e

More information

Lecture 2 : Propositions DRAFT

Lecture 2 : Propositions DRAFT CS/Mth 240: Introduction to Discrete Mthemtics 1/20/2010 Lecture 2 : Propositions Instructor: Dieter vn Melkeeek Scrie: Dlior Zelený DRAFT Lst time we nlyzed vrious mze solving lgorithms in order to illustrte

More information

Improper Integrals. The First Fundamental Theorem of Calculus, as we ve discussed in class, goes as follows:

Improper Integrals. The First Fundamental Theorem of Calculus, as we ve discussed in class, goes as follows: Improper Integrls The First Fundmentl Theorem of Clculus, s we ve discussed in clss, goes s follows: If f is continuous on the intervl [, ] nd F is function for which F t = ft, then ftdt = F F. An integrl

More information

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique? XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk out solving systems of liner equtions. These re prolems tht give couple of equtions with couple of unknowns, like: 6= x + x 7=

More information

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck.

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck. Outline Automt Theory 101 Rlf Huuck Introduction Finite Automt Regulr Expressions ω-automt Session 1 2006 Rlf Huuck 1 Session 1 2006 Rlf Huuck 2 Acknowledgement Some slides re sed on Wolfgng Thoms excellent

More information

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51

Non Deterministic Automata. Linz: Nondeterministic Finite Accepters, page 51 Non Deterministic Automt Linz: Nondeterministic Finite Accepters, pge 51 1 Nondeterministic Finite Accepter (NFA) Alphbet ={} q 1 q2 q 0 q 3 2 Nondeterministic Finite Accepter (NFA) Alphbet ={} Two choices

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

CSE396 Prelim I Answer Key Spring 2017

CSE396 Prelim I Answer Key Spring 2017 Nme nd St.ID#: CSE96 Prelim I Answer Key Spring 2017 (1) (24 pts.) Define A to e the lnguge of strings x {, } such tht x either egins with or ends with, ut not oth. Design DFA M such tht L(M) = A. A node-rc

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 utomt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Prolem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) nton Setzer (Bsed on ook drft y J. V. Tucker nd K. Stephenson)

More information

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA)

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA) Finite Automt (FA or DFA) CHAPTER Regulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, equivlence of NFAs DFAs, closure under regulr

More information

The Evaluation Theorem

The Evaluation Theorem These notes closely follow the presenttion of the mteril given in Jmes Stewrt s textook Clculus, Concepts nd Contexts (2nd edition) These notes re intended primrily for in-clss presenttion nd should not

More information

Closure Properties of Regular Languages

Closure Properties of Regular Languages Closure Properties of Regulr Lnguges Regulr lnguges re closed under mny set opertions. Let L 1 nd L 2 e regulr lnguges. (1) L 1 L 2 (the union) is regulr. (2) L 1 L 2 (the conctention) is regulr. (3) L

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information