Fast Learning of Restricted Regular Expressions and DTDs


Dominik D. Freydenberger
Institut für Informatik, Goethe-Universität Frankfurt a.M.
freydenberger@em.uni-frankfurt.de

Timo Kötzing
Max-Planck-Institute for Informatics, 66123 Saarbrücken
koetzing@mpi-inf.mpg.de

Author note: This work was done while this author was visiting the Max-Planck-Institute for Informatics in Saarbrücken.

ABSTRACT

We study the problem of generalizing from a finite sample to a language taken from a predefined language class. The two language classes we consider are subsets of the regular languages and have significance in the specification of XML documents (the classes corresponding to so-called chain regular expressions, Chares, and to single occurrence regular expressions, Sores). The previous literature gave a number of algorithms for generalizing to Sores providing a trade-off between quality of the solution and speed. Furthermore, a fast but non-optimal algorithm for generalizing to Chares is known. For each of the two language classes we give an efficient algorithm returning a minimal generalization from the given finite sample to an element of the fixed language class; such generalizations are called descriptive. In this sense, both our algorithms are optimal.

Keywords: subregular language learning, single occurrence regular expression, chain regular expression, descriptive generalization

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. EDBT/ICDT '13, March 18-22, 2013, Genoa, Italy. Copyright 2013 ACM .../13/03...$15.00.

1. INTRODUCTION

The present paper follows and refines an approach for XML schema inference from positive examples that was introduced by Bex et al. [3]. The basic problem setting is as follows. Given a set of XML documents, generate a schema that describes these documents, while being compact and preferably human readable. Bex et al. approach this problem by learning deterministic regular expressions from positive examples; i.e., they consider the following problem: Given a finite set S of positive examples from an unknown target language L, find a deterministic regular expression for L. These regular expressions can immediately be used as DTDs (Document Type Definitions), and while XSDs (XML Schema Documents) require additional effort, algorithms that infer regular expressions can also be used as a component of XSD inference algorithms (see [3, 4] for further explanations). In particular, as argued in [3], the results in [] show that XSD inference requires deep insights into regular expression inference; as Bex et al. put it, one cannot hope to successfully infer XSDs without good algorithms for inferring regular expressions.

Using a classical technique from Gold [9], Bex et al. prove in [2] that even the class of deterministic regular expressions is too rich to be learnable from positive data. While, strictly speaking, the learnability criterion of Gold-style learning as defined in [9] (which is also called learning in the limit from positive data or explanatory learning) is different from the setting in [2, 3], its non-learnability results still provide valuable insights into necessary restrictions. In particular, Gold-style learning shows that, when learning from positive data, one has to balance the need for generalization (as in most cases, a regular expression that generates exactly the example is not considered a good hypothesis) with the need to avoid overgeneralization.

While there are numerous papers on restrictions on the class of regular languages that lead to learnability, apart from a few exceptions (e.g. [6]), most of these restrictions prior to [3] have been based on properties of automata. As explained in [3], this is problematic, as even under those restrictions, converting the inferred automaton to a regular expression can lead to an exponential size increase. In order to achieve learnability of concise deterministic regular expressions, Bex et al. propose single occurrence regular expressions (short: Sores), regular expressions where each terminal letter (or element name) occurs at most once.
These Sores are deterministic by definition, and as an additional benefit, this restriction ensures that the length of the inferred expressions is at most linear in the number of different terminal letters. The corresponding Sore-inference algorithm RWR from [3] works as follows. First, it constructs a so-called single occurrence automaton (short: Soa, as introduced by García and Vidal [8]). RWR then attempts to convert the Soa step by step into a Sore. As the class of Sore-languages is a proper subset of the class of Soa-languages, this conversion is not always possible. In these cases, RWR attempts to repair the Soa, and constructs a Sore that generates a generalization of the language of the Soa. In order to generalize as little as possible, [3] suggests different orderings on the set of repair rules, as well as the variant RWR^2_l, which uses additional heuristics and can have an exponential running time. Nonetheless, these variants may still infer Sores that are not inclusion-minimal generalizations of the input sample (within the class of all Sores). In order to deal with insufficient data, Bex et al. propose a further restriction on Sores, the so-called chain regular expressions (short: Chares), and introduce the corresponding inference algorithm CRX. Analogously to RWR, CRX may infer Chares that are not inclusion-minimal generalizations.

(Footnote 1: Gold-style learning uses a growing set of samples and requires that the learner converges toward a correct hypothesis in finite time, while this setting uses only a single finite set for each inference instance.)

The present paper focuses on inferring Sores and Chares that are inclusion-minimal generalizations. This approach to regular expression inference is based on a slightly different angle than Gold-style learning, namely on the learning paradigm of descriptive generalization that was introduced by Freydenberger and Reidenbach [7]. While Gold-style learning assumes that an exact representation of the target language is present in the hypothesis space, and that the learner is provided with sufficient positive information to correctly recognize the target language, descriptive generalization views the hypothesis space and the space of target languages as distinct.

For a class D of language representation mechanisms (e.g., a class of automata, regular expressions, or grammars²), a language representation δ ∈ D is called D-descriptive of a sample S if L(δ) is an inclusion-minimal generalization of S, i.e., S ⊆ L(δ) and there is no γ ∈ D with S ⊆ L(γ) ⊂ L(δ). This concept allows us to define D-descriptive generalization as a natural extension of Gold-style learning: Instead of attempting to learn an exact representation of the target language L from a sample S, the learner has to infer a representation δ ∈ D that is D-descriptive of L. In other words, δ is a generalization of S that is as inclusion-minimal as possible within D. Descriptive generalization explicitly separates the hypothesis space from the class of target languages, while still providing a natural quality criterion for generalization from positive examples.
In the present paper, we consider the class of Sores and the class of Chares as hypothesis spaces D, and examine the problem of inferring D-descriptive generalizations from finite samples. We approach this problem by first computing a Soa-descriptive Soa. As we shall see, this approach has the advantages that the descriptive Soa is uniquely defined, can be computed efficiently, and its language is included in the language of every descriptive Sore or Chare. The main contribution of the present paper are two algorithms, Soa2Sore and Soa2Chare, that can be used to transform any given Soa into a Sore (resp. Chare) that is Sore-descriptive (resp. Chare-descriptive) of the language of that Soa. That is, given a sample S, these algorithms can be used to compute a generalization of S that is inclusion-minimal (or, in the terminology of [3], optimal) within the class of Sores or Chares (respectively). In addition to this, Soa2Chare and Soa2Sore are efficient: Soa2Chare runs in time O(m) (compared to O(m + n^3) for CRX), Soa2Sore in time O(nm) (compared to O(n^5) for RWR), where m is the number of edges and n the number of nodes in the Soa.

² The canonical class D is the class of NE-patterns, where descriptive patterns were introduced by Angluin [] in the context of exact learning from positive data. See [2] for a survey on the influence of pattern languages in this area.

The paper is structured as follows. Section 2 contains some mathematical preliminaries, followed by some informative properties of the language classes considered. Section 3 discusses CRX as well as RWR and its variants in the context of descriptive regular expressions. In particular, we show that for each of these algorithms, there are samples over small alphabets where the algorithm does not compute a descriptive Chare or Sore. Sections 4 and 5 contain the algorithms Soa2Chare and Soa2Sore, respectively, as well as proofs of their correctness and running time. Finally, Section 6 concludes the paper. For space reasons, some of the proofs were omitted.

2. PRELIMINARIES

Let ∅ denote the empty set, let ε denote the empty word. With |x|, we denote the length of x if x is a word, or the number of elements in x if x is a set.
We use ⊆ (and ⊂) to denote the inclusion (respectively proper inclusion) of sets. The difference of two sets A, B is denoted by A \ B and defined as {a ∈ A | a ∉ B}. A word v is a factor of a word x ∈ Σ* if there exist u, w ∈ Σ* such that x = uvw. A 2-gram is a factor of length 2. Let alph(w) denote the set of all letters occurring in a word w, and extend this to languages by defining alph(L) := ∪_{w ∈ L} alph(w).

2.1 Introducing SORE, CHARE, SOA

This section introduces the classes of regular expressions and automata that are used in the present paper. We mostly follow the notations introduced in [3]. In particular, we use the following variant of regular expressions.

Definition 1. Let Σ be a finite alphabet (the set of terminal letters, also called element names). Every letter a ∈ Σ is a regular expression, as are ε and ∅, and L(x) = {x} for x ∈ Σ ∪ {ε}, while L(∅) = ∅. If α is a regular expression, then α+ and α? are regular expressions, where L(α+) = (L(α))+ and L(α?) = L(α) ∪ {ε}. Furthermore, if α and β are regular expressions, then α | β and α · β are also regular expressions, with L(α | β) = L(α) ∪ L(β) and L(α · β) = {uv | u ∈ L(α), v ∈ L(β)}.

For sake of convenience, we sometimes omit the concatenation operator (i.e., we write αβ instead of α · β), and add or omit parentheses. For a regular expression α, we use alph(α) to denote the set of terminal letters that occur in α. We call two regular expressions α, β alphabet-disjoint if alph(α) ∩ alph(β) = ∅. Two regular expressions α and β are equivalent if L(α) = L(β). For any set A ⊆ Σ with A = {a_1, ..., a_n}, we use the notation ALT(A) to denote the regular expression ALT(A) := (a_1 | ... | a_n), with ALT(∅) = ε (ALT stands for alternation). In a strict sense, this definition requires an ordering on the letters to be sound, but for the purpose of this paper, this is of no concern, and we assume that ALT(A) = ALT(B) if A = B.

The full class of regular expressions is too strong both for DTDs (which allow only deterministic regular expressions) and for learning from positive data (which requires language classes that are sufficiently sparse, cf. [9]). As proven in [2], even the class of deterministic regular expressions is still too large to be learnable from positive data. Hence, [3] proposes the following subclasses of deterministic regular expressions.

Definition 2 (Sore/Chare). A single occurrence regular expression (or Sore) is a regular expression in which each terminal letter occurs (at most) once. A chain regular expression (or Chare) is a Sore of the form f_1 ... f_n (n ≥ 0), where each f_i is a chain factor, i.e., a Sore of the form (a_1 | ... | a_k), (a_1 | ... | a_k)?, (a_1 | ... | a_k)+, or (a_1 | ... | a_k)+?, where k ≥ 1, and each a_j is a terminal letter. In other words, a Chare consists of a concatenation of alphabet-disjoint chain factors.

We illustrate these definitions with a few short examples.

Example 3. Consider the regular expressions given as α = ()?( ) +, β = (ab)+, and γ = . Here, α is a Chare (and, hence, also a Sore), as it consists of two alphabet-disjoint chain-factors. On the other hand, β is a Sore (every letter occurs only once), but not a Chare (as it is not composed of chain-factors). One can easily prove that L(β) is not a Chare language: Assume there exists a Chare β' with L(β') = L(β). By definition, β' must contain a and b. If a and b are not in the same chain-factor of β', then at least one of the 2-grams ab or ba cannot occur in any word of L(β'). But if a and b are in the same chain-factor of β', the same line of reasoning implies that this chain-factor must be followed by + or +?. Therefore, there are words in L(β') that contain the 2-grams aa or bb, which contradicts L(β') = L(β). Finally, γ is not a Sore (and therefore not a Chare), and one can prove that L(γ) is not a Sore-language (this is best proven using techniques that shall be introduced right after this example, hence we defer the proof of this claim to Remark 7 further down in this section).

While the focus of this paper is on learning regular expressions, most of our technical reasoning uses the following class of automata.

Definition 4 (Soa). Let Σ be a finite alphabet, and let src, snk be distinct symbols that do not occur in Σ. A single occurrence automaton (short: SOA) over Σ is a finite directed graph A = (V, E) such that (1) {src, snk} ⊆ V, and V ⊆ Σ ∪ {src, snk}, (2) src has only outgoing edges, snk has only incoming edges, and every v ∈ V lies on a path from src to snk. We call alph(A) := V \ {src, snk} the set of terminal letters in A. We define the relation →_A on V by →_A := E, and use →^+_A and →^*_A to denote the transitive and reflexive-transitive hull of →_A.
The language L(A) that is accepted by A is the set of all words w = a_1 ... a_n (n ≥ 0) such that src →_A a_1 →_A ... →_A a_n →_A snk.

As usual, a strongly connected component of a Soa A is a non-empty and inclusion-maximal set C of vertices of A such that for all a, b ∈ C, a →^*_A b and b →^*_A a holds. A strongly connected looped component of a Soa A is a nonempty and inclusion-maximal set C of vertices of A such that for all a, b ∈ C, a →^+_A b and b →^+_A a holds. In other words, every strongly connected looped component contains exactly those vertices that are mutually reachable. Thus, a strongly connected component may be a singleton, while a singleton strongly connected looped component must have a self-loop. By definition, all strongly connected looped components of a Soa are disjoint, and src and snk cannot be part of any strongly connected looped component.

Although their definition is somewhat different, it is easy to see that Soas are a subclass of DFAs. In particular, a Soa can be understood as a DFA where for each a ∈ Σ, there exists a characteristic state q_a such that δ(q, a) ∈ {q_a, q_trap} for all states q ∈ Q (where q_trap is a trap state). This is illustrated by the following example.

Example 5. In the picture below, we have a Soa on the left side, and the corresponding DFA to the right side. Both automata generate the same language as the regular expression α = (( +?)(( +?) ( + )) +?)?. Note that α is not a Sore. In fact, L(α) is not a Sore-language, but proving this using only techniques that have been introduced at this point requires considerable effort. (The most straightforward way to prove this is to use techniques that are introduced in Section 5: Apply the algorithm Soa2Sore to the Soa, which returns the Sore (? +?) +?, which is not equivalent to α. By Theorem 25, this means that L(α) is not a Sore language.)

In this paper, we frequently use Soas to approximate languages. For this, we rely on the following definition.

Definition 6. For every w ∈ Σ*, let first(w) and last(w) denote the first resp. last letter of w, and let gram_2(w) be the set of all 2-grams in w. We extend these functions on words to functions on languages by defining first(L) := {first(w) | w ∈ L}, last(L) := {last(w) | w ∈ L}, and gram_2(L) := ∪_{w ∈ L} gram_2(w).
For every language L ⊆ Σ*, we define the Soa-approximation of L, SOA(L), by SOA(L) := (V_L, E_L), where V_L := alph(L) ∪ {src, snk}, and E_L contains the edges
(src, a) for every a ∈ first(L),
(a, snk) for every a ∈ last(L),
(a, b) for all a, b ∈ Σ with ab ∈ gram_2(L),
(src, snk) if ε ∈ L.

Using this terminology, the approach for Soa-learning presented in [8] can be summarized as follows. Given a finite set S, compute SOA(S). In [3], the resulting algorithm is called 2T-INF. Furthermore, as computing SOA(L) is only as hard as computing first(L), last(L), and gram_2(L), note that SOA(L) can be constructed for languages from classes that are larger than the classes of finite or regular languages, e.g., for context-free languages. It is easy to see from the definition that L(SOA(L)) ⊇ L holds for every language L (in fact, we shall see in Proposition 14 that L(SOA(L)) is always the least general approximation of L that is possible with a Soa). This inclusion can be proper as follows.
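The construction of SOA(S) from a finite sample can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation from [3] or [8]: letters are single characters, the strings "src" and "snk" stand for the two special vertices, and the function names soa and accepts are hypothetical.

```python
SRC, SNK = "src", "snk"  # assumed names for the two special vertices


def soa(sample):
    """Return (vertices, edges) of SOA(S) for a finite sample S of words."""
    vertices = {SRC, SNK}
    edges = set()
    for word in sample:
        letters = list(word)
        vertices.update(letters)
        if not letters:
            edges.add((SRC, SNK))               # the empty word is in S
            continue
        edges.add((SRC, letters[0]))            # first letter
        edges.add((letters[-1], SNK))           # last letter
        edges.update(zip(letters, letters[1:])) # all 2-grams
    return vertices, edges


def accepts(edges, word):
    """True if src -> a_1 -> ... -> a_n -> snk is a path in the SOA."""
    path = [SRC, *word, SNK]
    return all(step in edges for step in zip(path, path[1:]))


vertices, edges = soa({"ab", "ba"})
print(accepts(edges, "aba"))  # in L(SOA(S)) although not in S
print(accepts(edges, "aa"))   # rejected: the 2-gram aa never occurs in S
```

The sample {ab, ba} is a hypothetical example chosen to show that the inclusion L(SOA(S)) ⊇ S can be proper: the word aba uses only edges that the sample already provides.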

Remark 7. Note that even for finite languages L, the equality L(SOA(L)) = L is not necessary; e.g., consider L = {} (from Example 3). Then SOA(L) contains an edge from src to , from to , from to , from to itself, and from to snk. Hence, ∈ L(SOA(L)), while ∉ L. This also proves that L is not a Soa-language (and, as claimed in Example 3, not a Sore-language).

As Example 5 illustrates, there are Soa-languages that are not Sore-languages. On the other hand, we have that every Sore-language is a Soa-language (in other words, the Soa-approximation of a Sore-language is exact).

Lemma 8 ([3], proof of Proposition 9). Given any Sore α, we have L(SOA(L(α))) = L(α).

It is easy to see that SOA(L(α)) can be derived from every Sore α (in fact, even every regular expression α) by deriving the sets of first letters, last letters, and 2-grams in L(α) from the expression α (we already did this in Remark 7). Lemma 8 allows us to define SOA(α) as a notational shorthand for SOA(L(α)). Similarly, we use →_α to denote the relation →_SOA(α). More importantly, we shall use Lemma 8 to develop a handy syntactic characterization of the inclusion for Sores (and Chares), which is based on the inclusion of Soas. We say that a Soa A covers a Soa B if A is a supergraph of B; in other words, alph(A) ⊇ alph(B) holds, and a →_B b implies a →_A b for all a, b ∈ alph(B). This definition leads to the following characterization of Soa-inclusion.

Lemma 9 ([8], Theorem 3.). For every pair A, B of Soas, L(A) ⊆ L(B) holds if and only if A is covered by B.

Although Lemma 9 is stated in [8] without proof (the authors just cite García's PhD thesis), it is easily proven considering the definition of SOA(L). Combining Lemma 9 with Lemma 8, we are able to characterize inclusion of Sores as follows.

Lemma 10. For every pair α, β of Sores, L(α) ⊆ L(β) holds if and only if SOA(α) is covered by SOA(β).

This obviously implies that two Sores (or Chares) are equivalent if their corresponding Soas are equivalent. More importantly, Lemma 10 provides a simple syntactic and characteristic criterion for inclusion.
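The covering criterion reduces language inclusion between Soas to a subset test on edge sets. The following Python sketch illustrates this on a small hypothetical pair of automata; it assumes that a Soa is given by its edge set alone (every vertex lies on a path from src to snk, so the edges determine the vertices), that letters are single characters, and that the names covers and words_up_to are illustrative helpers, not part of the paper.

```python
SRC, SNK = "src", "snk"  # assumed names for the two special vertices


def covers(edges_a, edges_b):
    """True if the Soa with edge set edges_a is a supergraph of edges_b."""
    return edges_b <= edges_a


def words_up_to(edges, max_len):
    """Brute-force set of all accepted words of length at most max_len."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    found, frontier = set(), [(SRC, "")]
    while frontier:
        node, word = frontier.pop()
        for nxt in succ.get(node, ()):
            if nxt == SNK:
                found.add(word)
            elif len(word) < max_len:
                frontier.append((nxt, word + nxt))
    return found


small = {(SRC, "a"), ("a", "b"), ("b", SNK)}  # accepts only ab
big = small | {("b", "a")}                    # additionally allows b -> a

print(covers(big, small))                                # big covers small
print(words_up_to(small, 4) <= words_up_to(big, 4))      # inclusion on short words
```

The brute-force enumeration is only there as a sanity check on short words; the point of the covering criterion is that the subset test on edges suffices.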
While the algorithms in Sections 4 and 5 do not check for inclusion, their correctness proofs make heavy use of the fact that Sore-inclusion depends on the presence of edges in the corresponding Soa.

Before we introduce the other central definition of this paper in Section 2.2, we discuss some concepts which will be useful, although not quite as significant. One can verify with little effort that the classes of Soa-, Sore-, or Chare-languages are not closed under many of the operations that are commonly studied in formal language theory (e.g., concatenation, union, complementation, intersection with regular languages, morphism, inverse morphism). One of the few operations under which each of these classes is closed, and which we shall use, is projection. Let Σ be an alphabet. A projection from Σ to T ⊆ Σ is a morphism π_T : Σ* → T* that is defined by π_T(x) := x for all x ∈ T, and π_T(x) := ε for all x ∈ Σ \ T. We extend this to languages canonically, i.e., π_T(L) := {π_T(w) | w ∈ L}.

Lemma 11. The classes of Sore-, Chare-, and Soa-languages are closed under projection.

The proof was omitted for space reasons.

The main approach in the present paper (as well as in [3]) is converting Soas into Sores or Chares. During this process, it is occasionally convenient to work with a model that can be viewed as an intermediary step between a Soa and a regular expression.

Definition 12. A generalized single occurrence automaton (or generalized Soa) is a finite graph A = (V, E) such that (1) {src, snk} ⊆ V, and all vertices in V \ {src, snk} are pairwise alphabet-disjoint Sores; and (2) the edge relation E is such that src has only outgoing edges; snk has only incoming edges, and every v ∈ V lies on a path from src to snk. The relations →_A, →^*_A, →^+_A on V are defined analogously to (non-generalized) Soas. We extend alph to generalized Soas by defining alph(A) := ∪_{v ∈ V \ {src, snk}} alph(v). The language L(A) is defined to be the set of all w ∈ alph(A)* for which there exist an n ≥ 0, nodes v_1, ..., v_n ∈ V \ {src, snk}, and words w_1, ..., w_n ∈ alph(A)* such that src →_A v_1 →_A ... →_A v_n →_A snk, w = w_1 ... w_n, and w_i ∈ L(v_i) holds for every 1 ≤ i ≤ n. Note that generalized Soas accept the same class of languages as Soas.
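The projection π_T defined earlier in this section simply erases every letter outside T. A minimal Python sketch, assuming words are strings of single-character letters and using the hypothetical helper names project and project_language:

```python
def project(word, keep):
    """pi_T(word): erase every letter that is not in the set keep (= T)."""
    return "".join(letter for letter in word if letter in keep)


def project_language(words, keep):
    """Canonical extension of pi_T to a finite language."""
    return {project(word, keep) for word in words}


print(project("abcab", {"a", "c"}))           # keeps only a and c
print(project_language({"ab", "b"}, {"a"}))   # b is erased from both words
```

Note that a projected word can become empty, which is why the projection of a language may contain ε even if the original language does not.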
2.2 Descriptivity

This section introduces the notion of descriptive expressions and automata, which is one of the central aspects of the present paper.

Definition 13. Let D be a class of regular expressions or finite automata over some alphabet Σ. A δ ∈ D is called D-descriptive of a non-empty language S ⊆ Σ* if L(δ) ⊇ S, and there is no γ ∈ D such that L(δ) ⊃ L(γ) ⊇ S.

In other words, an expression or automaton that is D-descriptive of a language S generates a language that is a generalization of S that is ⊆-minimal within languages described by elements of D. If the class D is clear from the context, we simply write descriptive instead of D-descriptive. As stated in [8] (using quite different terminology), for every finite language S, SOA(S) is Soa-descriptive of S. This extends to infinite languages as well; for Sores and Chares, we can also prove the existence of descriptive regular expressions:

Proposition 14. Let Σ be a finite alphabet. For every language L ⊆ Σ*, SOA(L) is Soa-descriptive of L, and there exist a Sore-descriptive Sore δ_s and a Chare-descriptive Chare δ_c.

The proof was omitted for space reasons. In particular, this means that the algorithm 2T-INF from [3] that was mentioned in the previous section can be used to compute Soa-descriptive Soas for finite sample sets. Moreover, this shows that constructing a descriptive Soa for an arbitrary language L is merely as hard as computing the sets first(L), last(L), and gram_2(L). As we shall see, computing descriptive Sores or Chares is less straightforward. First, note that the first part of the proof of Proposition 14 implies the following observation:

Class | number of languages | max. number of descriptive expressions for a sample | max. number of edges to add for a descriptive expression
Chare | n! 2^{2n} ≤ c(n) ≤ n! 2^{3n} | ≥ n! | Θ(n^2)
Sore | n! 2^{3n − r log n} ≤ s(n) ≤ n! 2^{7n} | ≥ 2^n | Θ(n^2)
Soa | 2^{n^2 + O(n)} | |

Table 1: A summary of the numbers presented in Proposition 16. For each of the classes of languages generated by Chares, Sores, and Soas, the table lists the number of different languages in the class, the maximum number of descriptive expressions or automata for a given sample S ⊆ Σ*, and the maximum number of edges that need to be added to SOA(S) in order to obtain a Soa that corresponds to a descriptive Chare or Sore. In all cases, n denotes the size of Σ.

Corollary 15. Let Σ be a finite alphabet, and let L ⊆ Σ*. For every Sore (or Chare) δ that is Sore-descriptive (resp. Chare-descriptive) of L, L(δ) ⊇ L(SOA(L)) holds.

Hence, if some Sore (or Chare) is descriptive of a language L, it must be descriptive of L(SOA(L)) as well. This allows us to compute descriptive Sores and Chares not from a sample L, but from its Soa-approximation SOA(L). Furthermore, if L(SOA(L)) is not a Sore-language (or not a Chare-language), a Soa for some Sore descriptive of L can be obtained in principle from SOA(L) by adding new edges. As only a finite number of edges need to be added, and Soa-inclusion can be decided easily (cf. Lemma 9), the main question is whether this can be done efficiently. But as it can be necessary to add a substantial number of new edges in order to turn a Soa into a Soa that corresponds to a descriptive expression (see Proposition 16 just below), a brute force approach is probably not advisable.

The next proposition lists these and other numbers about counting and descriptive Sores and Chares. These results are summarized in Table 1. Recall that regular expressions are called equivalent if they accept the same language.

Proposition 16. Let n be the number of alphabet symbols. We have the following, for some constant r.
(1) The number of pairwise non-equivalent Chares is c(n) with n! 2^{2n} ≤ c(n) ≤ n! 2^{3n}.
(2) The number of pairwise non-equivalent Sores is s(n) with n! 2^{3n − r log n} ≤ s(n) ≤ n! 2^{7n}.³
(3) There is a sample S ⊆ Σ* such that S has 2^n pairwise non-equivalent descriptive Sores.
(4) There is a sample S ⊆ Σ* such that S has n! pairwise non-equivalent descriptive Chares.
(5) There is a Soa with Θ(n) edges such that a descriptive Sore with minimal number of edges in the corresponding Soa has Θ(n^2) edges.
(6) There is a Soa with Θ(n) edges such that a descriptive Chare with minimal number of edges in the corresponding Soa has Θ(n^2) edges.

The proof was omitted for space reasons. In particular, note that Proposition 16 also demonstrates that a given sample can have numerous different descriptive Sores (or Chares). Note that the number of different Chare- and Sore-languages can be better approximated using more advanced tools from combinatorics. Finally, if we are only interested in the number of different such languages modulo renaming of the terminal letters, then the same bounds without the factor n! hold.

³ Note that [2, Proof of Theorem 3.] gives that any Sore-language has a Sore of length at most 0n 4, which gives a bound of 2^{O(n log n)}.

3. DESCRIPTIVITY VS. CRX AND RWR

Proposition 16 demonstrates that the number of non-equivalent descriptive Sores (or Chares) for a sample can be exponential in the size of the alphabet. Therefore, the present paper only examines the question how a single descriptive Sore (or Chare) can be found for a sample, instead of looking for an enumeration of all these expressions. As explained in Section 2.2 (in particular, Corollary 15), descriptive Chares and Sores can be obtained from the descriptive Soa, and moreover, for every language L and every Sore α with L(α) ⊇ L, L(α) ⊇ L(SOA(L)) must hold. This observation motivates our inference approach for Sores and Chares: Given a sample S, first compute the Soa-descriptive single-occurrence automaton SOA(S), using 2T-INF. As explained in [8], this can be done in time O(ln), where l := Σ_{s ∈ S} |s|, and n := |alph(S)|. Using the algorithm Soa2Chare (Section 4) or Soa2Sore (Section 5), SOA(S) is then turned into a descriptive Chare or Sore (respectively).

Before we discuss these algorithms and the respective proofs in detail, we observe that the algorithms CRX and RWR and its variants from [3] do not always compute descriptive Chares or Sores.
For the Chare-algorithm CRX, this is quite easy to see: As pointed out in [3] (as a remark after Theorem 35), on the sample S = {, e, e}, the algorithm CRX returns the Chare ????e?, while δ := ( )( e) is a better approximation of S. (In fact, we shall be able to see that δ is not only better, but Chare-descriptive. This can be verified by observing that δ is the output of Soa2Chare on SOA(S), and referring to Theorem 19 further down.) The proofs for the non-descriptivity of the Sore-algorithm RWR and its variants require more effort, and can be found in the following section.

3.1 RWR-Variants and Descriptivity

In this section we give theorems regarding properties of RWR-variants (we refer the reader to [3] for details on all variants). In particular, we show that every variant fails to find a descriptive Sore on some input.

In [3, Algorithm 3] an algorithm RWR ("Rewrite with Repairs") was given to turn a given Soa (derived in the canonical way from an input sample) into a generalizing Sore, by rewriting the Soa step by step. This algorithm was proven in [3] to turn any Soa into an equivalent Sore, if existent; if, at some point in the run of RWR, no rewrite rules are applicable, the algorithm will make a generalization step by applying a repair rule. The four repair rules of RWR are as follows, given the current Soa A. For simplicity, we give a modification of the rules, where fewer edges are added. However, for the cases we use in this section, these rules are equivalent to the original set.

Repair r | s: If there are two nodes r and s of A which share a successor or a predecessor, add edges to A to make all successors of r or s successors of both r and s; similarly with the predecessors.

Repair r · s?: If there are two nodes r and s of A such that r is the only predecessor of s, add edges to A to make all successors of r or s (except s) successors of both r and s.

Repair r? · s: If there are two nodes r and s of A such that s is the only successor of r, add edges to A to make all predecessors of r or s (except r) predecessors of both r and s.

Repair r? · s?: Let r and s be nodes of A such that s is a successor of r; add edges to A to make all successors of r or s successors of both r and s; similarly with the predecessors. Furthermore, for all predecessors u of r and all successors v of s, add an edge from u to v.

The authors of [3] prove that RWR (with the original repair rules) always terminates in O(n^5) steps (where n = |Σ|) and gives a Sore which generalizes the input Soa. They also suggest that these rules are checked for applicability in the given order, but admit that different situations might call for different rules (in particular, they note that the outcome of RWR is not always descriptive). Next, we formally show that RWR does not always return a descriptive Sore.

Theorem 17. For Σ a finite alphabet with |Σ| ≥ 3 and all orderings of the repair rules of RWR, there is a (finite) set of samples S ⊆ Σ* such that RWR on S produces a Sore which is not Sore-descriptive.

Proof. Let a, b, c ∈ Σ be three different symbols from Σ. First, consider the sample {, }. The corresponding Soa does not allow rewrite rules and requires a repair; below this Soa is depicted, along with two possible repairs, corresponding to the two possible repairs Repair and Repair ??.
The Soas resulting from the two repairs accept ( ) + and ( ), respectively, which is not descriptive of {, }, as witnessed by δ := ((?)) + (a Sore which accepts the given sample and , but not, for example, , which is accepted by any of the Soas derived from repair rules above).

Second, consider the sample S = {,, }. The corresponding Soa A is depicted as follows.

[Figure: the Soa A]

A descriptive Sore for S is δ_2 := (( )) +, which we prove as follows. In comparison to A, the Soa that corresponds to δ_2 adds only a single edge, the edge from to . So the only possibility for a Sore-language L(γ) with L(A) ⊆ L(γ) ⊂ L(δ_2) is L(A) itself. However, L(A) is not a Sore-language, which can be seen, just as in Proposition 16, by applying either the Sore-construction algorithm RWR from [3] or our algorithm Soa2Sore from Section 5 (which both compute a Sore equivalent to a given Soa, if existent) and observing a strict generalization. Hence, δ_2 is Sore-descriptive of S. (We note without proof that (?) +? is another Sore that is descriptive of S. Necessarily, its language is incomparable to L(δ_2).)

An application of Repair ? on A and then, after rewriting, of Repair [?]? gives the following.

[Figure: the Soa after both repairs]

This Soa corresponds to the Sore (??) +, and its language is a strict superset of L(δ_2) (for example is accepted by the former and not the latter). Deceiving the rule Repair r? · s is symmetric to deceiving Repair r · s?.

In [3], Bex et al. also propose a variant of RWR that is called RWR^2_l, which uses a natural number l as branching parameter. The algorithm explores the (recursive) outcomes of the best l candidates for a repair rule, choosing the ones that lead to a minimal number of words of length at most 2n (= 2|Σ|) in the language accepted by the resulting Sore.

Theorem 18. For all l > 0 there is a finite alphabet Σ with |Σ| = 3l and a finite set of samples S ⊆ Σ* such that RWR^2_l on S produces a Sore which is not Sore-descriptive.

Proof. We first assume l = 1; consider again the sample {,, } with the following corresponding Soa.

[Figure: the Soa of the sample]

The three applicable repair rules are , ? and ? (plus some rules of the type r? · s?, which explode the number of accepted words). This leads to the following Soas.

RegEx α | |L(α)| up to length 6 | exp. growth basis | recurrence | base cases n ∈ {1, 2, 3}
( ) | | | f_α(n − 2) | 0, 2, 0
( ) | | | f_α(n − 2) + f_α(n − 3) | 0, ,
( ) +? | 5 | (1 + √5)/2 ≈ 1.62 | f_α(n − 1) + f_α(n − 2) | , 3, 5

Table 2: Properties of the languages discussed in the proof of Theorem 18. For each regular expression α, f_α(n) denotes the number of words in L(α) of length n; given in the table are the number of words of length at most 6 accepted by α, the constant c such that f_α grows roughly as c^n, the recurrence relation for the (f_α(n))_{n ∈ N}, as well as f_α(n) for n ∈ {1, 2, 3}.

Table 2 gives an overview of the properties of these three Soas. Thus, we see that the second possibility accepts a minimal number of words of length at most 6 (= 2|Σ|), which means that only this option will be explored, the first and the third will be discarded. After rewriting by RWR, this results in the following Soa.

[Figure: the Soa after rewriting]

The minimal repair for this results in (??) +, which is not descriptive as witnessed by (( )) + as in the proof of Theorem 17.

For l > 1, we use l independent copies of the sample used for l = 1 (i.e., using different alphabet symbols). Thus, RWR^2_l will fail on at least one of these copies.

4. DESCRIPTIVE CHARES

In this section, we give the first main algorithm of this paper, Soa2Chare, which efficiently computes descriptive Chares for given Soas.

4.1 The CHARE algorithm

The algorithm Soa2Chare uses a number of subroutines, which are written with dot-notation similar to some modern object oriented programming languages. For example A.contract(U, l) denotes the application of the subroutine contract to the Soa A with parameters U and l. For a given Soa A, we let A.src and A.snk denote the source and the sink of A, respectively. The following subroutines are used in Soa2Chare.

contract on a Soa A takes a subset U of vertices of A and a label l. The procedure modifies A such that all vertices of U are contracted to a single vertex and labeled l (edges are moved accordingly).

constructLevelOrder on a Soa A = (V, E) assumes that A is acyclic and assigns a level number to every vertex v ∈ V, where the level number of a node v ∈ V is defined to be the length of the longest path from A.src to v.
Hene, A.sr is on level numer 0, n for every other noe v, the level numer is one more thn the highest level numer of the immeite suessors of v. isskiplevel on So A n level numer i returns true if level i is skip level. A level i is skip level if there exist noes u, v V with (respetive) level numers j u < i n j v > i suh tht u A v. In other wors, one n skip level i y trnsitioning from u to v. Algorithm : So2Chre Input: SOA A = (V, E); 2 while A hs yle o 3 Let U e strongly onnete loope omponent of A; 4 A.ontrt(U, ALT (U) + ); 5 A.onstrutLevelOrer(); 6 result ε; 7 for i = to (level numer of A.snk) o 8 B ll verties with level numer i n + ; 9 C ll verties with level numer i n no + ; 0 foreh α B o if A.isSkipLevel(i) or B + C > then result result α?; 2 else result result α; 3 if C > 0 then 4 if A.isSkipLevel(i) or B >0 then 5 result result ALT (C)?; 6 else result result ALT (C); 7 return result; Note tht the use of ontrt n turn the So into generlize So. Intuitively speking, the lgorithm So2Chre works s follows: () Reple eh strongly onnete loope omponent A V with vertex tht is lele with the regulr expression ALT (A) +. This turns A into (possily generlize) So tht is DAG. (2) Every noe in the DAG is ssigne level numer. (3) Every level is turne into one or more hin-ftors. If level ontins more thn one non-letter noe, or if level is skip level,? is ppene to every hin-ftor on tht level. The following theorem sttes tht So2Chre n e use to ompute Chre-esriptive Chres in highly effiient mnner. Theorem 9. For ny given So A, So2Chre fins Chre tht is Chre-esriptive of L(A) in time O(m), where m is the numer of trnsitions of A.

Before we discuss the proof of Theorem 19 in Section 4.2, we illustrate the behavior of Soa2Chare with an example.

Example 20. Let S = {f, ef, f}. The corresponding Soa, SOA(S), is depicted as follows.

[Figure: SOA(S)]

First, Soa2Chare removes all cycles by contracting strongly connected looped components. This leads to the following generalized Soa.

[Figure: the generalized Soa obtained after contraction]

Apart from the levels for A.src and A.snk, this generalized Soa has three levels: the first level with the nodes ( )+ and ()+, the second level with the nodes and e, and the third level with the node f. As there is an edge between ( )+ and f, the second level is a skip level. Thus, the levels lead to the respective Chares ( )+?()+?, (e )?, and f, which are concatenated to ( )+?()+?(e )?f. By Theorem 19, this Chare is Chare-descriptive of S.

4.2 Proof of Theorem 19

Proof. We first prove termination and running time, followed by the proof of correctness. Note that in this section, for simplicity's sake and in contrast to Section 5, we do not distinguish between a node and its label.

Termination and running time. Termination is obvious, as the two loops (in lines 2 and 7) are executed only a bounded number of times. Let n denote the number of vertices and m denote the number of edges in the input Soa. In the while-loop in line 2, the input Soa is transformed into an acyclic generalized Soa. Using Tarjan's algorithm (cf. [5]), this part can be realized in time O(m + n). Computing the level order and annotating, for each level, whether that level is a skip level, can also be done in time O(m + n), analogously to topological sorting. Finally, each node in the generalized Soa is turned into a chain factor. This takes time O(n). Hence, the individual steps sum up to a time of O(m + n), which results in a total time of O(m), as n ≤ m holds by definition.

Correctness. First, it is quite easy to see that Soa2Chare computes a Chare. Note that, in order to prove that this Chare is descriptive of the sample S, we do not need to argue about every Chare γ with L(γ) ⊇ S, but only about those with L(Soa2Chare(SOA(S))) ⊇ L(γ) ⊇ S. Writing δ for Soa2Chare(SOA(S)), this allows us to use Lemma 10 from two directions: on the one hand, every edge (and hence, every path) that is present in SOA(S) must be present in SOA(γ); on the other hand, SOA(γ) must not contain any edges that do not occur in SOA(δ).

Before we consider the main part of the proof, we first develop some technical tools that deal with strongly connected looped components.

Lemma 21. Let α be a Chare. A set A ⊆ alph(α) is a strongly connected looped component in SOA(α) if and only if α contains a chain factor of the form ALT(A)+ or ALT(A)+?.

The proof was omitted for space reasons. As Soa2Chare turns every strongly connected looped component A into a chain factor ALT(A)+, we observe that Soa2Chare does not change these components.

Corollary 22. Let Σ be an alphabet. For every finite and nonempty set S ⊆ Σ*, and every set A ⊆ alph(S), the following holds. A is a strongly connected looped component in SOA(S) if and only if A is a strongly connected looped component in SOA(Soa2Chare(SOA(S))).

Finally, according to Lemma 10, this immediately leads to the following observation:

Corollary 23. Let S ⊆ Σ* be a finite set, and let δ := Soa2Chare(SOA(S)). For every Chare γ with L(δ) ⊇ L(γ) ⊇ S, SOA(γ) must contain exactly the same strongly connected looped components as SOA(S) and SOA(δ).

We now possess all the tools we need to execute the main element of the proof of correctness of Soa2Chare.

Lemma 24. Let Σ be an alphabet, let S ⊆ Σ* be a nonempty set, and let δ := Soa2Chare(SOA(S)). Then L(δ) = L(γ) holds for every Chare γ with L(δ) ⊇ L(γ) ⊇ S.

The proof was omitted for space reasons. Lemma 24 implies that there is no Chare γ such that L(Soa2Chare(SOA(S))) ⊋ L(γ) ⊇ S. As we have, by definition, L(Soa2Chare(SOA(S))) ⊇ S, we get that the result of Soa2Chare on SOA(S) is Chare-descriptive of S, which concludes the proof of correctness.

5. DESCRIPTIVE SORES

In this section, we give the second main algorithm of this paper, which efficiently computes descriptive Sores for given Soas.

5.1 SORE Algorithm

As in Section 4, we use dot-notation to denote the application of subroutines, and for a given Soa A, we let A.src and A.snk denote the source and the sink of A, respectively.
contract on Soa A takes a subset U of vertices of A and a label l. The procedure modifies A such that all vertices of U are contracted to a single vertex and labeled l (edges are moved accordingly). The procedure returns the newly created vertex.

extract on Soa A takes as argument a set of vertices U (of A); it does not modify A, but returns a new Soa with copies of all vertices of U as well as two new vertices for source and sink; all edges between vertices of U are copied, all vertices in U having an incoming edge (in A) from outside of U now have an incoming edge from the new source, and all vertices in U having an outgoing edge (in A) to outside of U now have an outgoing edge to the new sink.

first returns all vertices v such that the only predecessor of v is the source.

addEpsilon on Soa A adds a new vertex labeled ε; all outgoing edges from the source to vertices that have more than one predecessor (vertices that are not in the first-set) are redirected via this new vertex.

exclusive on Soa A on argument v (a vertex of A) returns the set of all vertices u such that, on any path from the source to the sink that visits u, v is necessarily visited previously. Intuitively, the exclusive set of a vertex v is the set of all vertices exclusively reachable from v, not from any other vertex incomparable to v.

Finally, the most difficult subroutine is called bend and is used to prepare the treatment of strongly connected looped components of the input Soa A. First, it computes the set W of all vertices reachable from the source without passing through (or ending with) vertices which are predecessors of the sink. Then it redirects (bends) all transitions directed from an element outside of W to a successor of the source to point to the sink instead. In other words, a transition from an element a to an element b ∈ A.succ(A.src) is redirected to A.snk if and only if every path in A that leads to a contains an element of A.pred(A.snk). In particular, the elements of A.pred(A.snk) do not transition to elements of A.succ(A.src). See Example 26 for an illustration.

Furthermore, we use the following three subroutines for the creation of labels.

plus on label l returns (l)+.

concatenate on labels l and l′ returns l · l′.

or on labels l and l′ returns l | l′.

The algorithm Soa2Sore is given in Algorithm 2. On a more intuitive level, the algorithm performs the following phases.
(1) Recurse on all strongly connected looped components; replace each with a vertex, labeled with the result of the recursion.
(2) After the Soa is a directed acyclic graph (DAG), focus on the set F of all vertices which can be reached from the source directly, but not via other vertices; make sure that there are no vertices which can be reached directly and via other vertices (if necessary, add an auxiliary node labeled ε).
(3) Recurse on the sets of vertices exclusively reachable from a vertex in F and contract these sets to vertices labeled with the result of the recursion.
(4) Combine vertices of F with or, recurse again on what is exclusively reachable from this new vertex.
(5) Once only one item is left in F, split it off and recurse on the remainder.

Algorithm 2: Soa2Sore
 1  Input: Soa A = (V, E);
 2  Output: Sore minimally generalizing L(A);
 3  if |V| = 2 then return ε;
 4  else if A has a cycle then
 5      Let U be a strongly connected looped component of A;
 6      B_0 ← A.extract(U).bend();
 7      A.contract(U, plus(Soa2Sore(B_0)));
 8  else if A.succ(A.src) ≠ A.first() then
 9      A.addEpsilon();
10  else if |A.first()| = 1 then
11      Let v be the only successor of src;
12      U ← V \ {A.src, v, A.snk};
13      l ← v.label();
14      l′ ← Soa2Sore(A.extract(U));
15      return concatenate(l, l′);
16  else if ∃v ∈ A.first(): A.exclusive(v) ≠ {v} then
17      Let v be such that A.exclusive(v) ≠ {v};
18      U ← A.exclusive(v);
19      A.contract(U, Soa2Sore(A.extract(U)));
20  else
21      Let u, v ∈ A.first() with u ≠ v s.t. A.reach(u) ∩ A.reach(v) is ⊆-maximal;
22      A.contract({u, v}, or(u.label(), v.label()));
23  return Soa2Sore(A);

Note that the algorithm introduces ? by way of constructing an or with ε. This can be cleaned up by postprocessing the resulting Sore. The following theorem states the correctness and the running time of the algorithm.

Theorem 25. The algorithm Soa2Sore, given a Soa A as input, finds a descriptive Sore for L(A) in time O(nm), where n is the number of alphabet symbols used in A, and m is the number of transitions in A. Furthermore, this algorithm produces a Sore such that the corresponding Soa has the same strongly connected components as the input Soa, and the same set of successors of the source.

Before we get to the proof of Theorem 25, we give two examples of Soa2Sore. The first example illustrates how strongly connected looped components are treated. The second illustrates the use of exclusive.
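The subroutines first and exclusive can be illustrated with a small Python sketch. It is a naive version under stated assumptions, not the linear-time annotation technique used in the running-time analysis: the Soa is assumed to be acyclic and is given as a successor map, every vertex is assumed to be reachable from the source, and the sink is assumed never to belong to an exclusive set. The helper `predecessors`, the function signatures, and the example graph are hypothetical. The sketch relies on the observation that v is visited before u on every source-to-u path exactly when u becomes unreachable from the source once v is deleted.

```python
def predecessors(succ):
    """Invert a successor map {u: set of v} into a predecessor map."""
    pred = {}
    for u, vs in succ.items():
        pred.setdefault(u, set())
        for v in vs:
            pred.setdefault(v, set()).add(u)
    return pred


def first(succ, src):
    """Vertices whose only predecessor is the source."""
    return {v for v, ps in predecessors(succ).items() if ps == {src}}


def exclusive(succ, src, snk, v):
    """Vertices that cannot be reached from src once v is deleted (v included)."""
    seen = {src}
    stack = [src]
    while stack:
        x = stack.pop()
        for y in succ.get(x, ()):
            if y != v and y not in seen:
                seen.add(y)
                stack.append(y)
    vertices = set(predecessors(succ))
    # Assumption: the sink is never counted as part of an exclusive set.
    return {u for u in vertices if u not in seen and u != snk}


# Hypothetical acyclic Soa: "c" is reachable only through "a".
succ = {
    "src": {"a", "b"},
    "a": {"c", "snk"},
    "b": {"snk"},
    "c": {"snk"},
}
first_set = first(succ, "src")
excl_a = exclusive(succ, "src", "snk", "a")
excl_b = exclusive(succ, "src", "snk", "b")
print(first_set, excl_a, excl_b)
```

For this graph, the exclusive set of "a" contains "a" and "c", so the clause in line 16 of Algorithm 2 would apply with v = "a". Each call to `exclusive` costs one graph traversal, so computing the sets for all vertices of the first-set this way is slower than the single pass described in the proof of Theorem 25.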
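Example 26. Consider the following Soa.

[Figure: the input Soa of Example 26]

The labeled vertices of this Soa consist of a single strongly connected looped component, and an application of bend computes the set W = {, }, which leads to the following Soa.

[Figure: the Soa after applying bend]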

After resolving the strongly connected looped component containing and (all others are not "looped") and contract, we get the following.

[Figure: the Soa after contraction, with a vertex labeled ()+]

We can split off the first node twice now (as line 10 applies twice), recursing finally on the remaining Soa as follows.

[Figure: the remaining Soa, to which addEpsilon applies]

This results in ε, or, equivalently, ?. Going back through the recursions, we get (()+?)+.

Example 27. Consider now the following Soa.

[Figure: the input Soa of Example 27]

For this Soa, line 16 applies and recurses on the upper arc; after contraction, this gives

[Figure: the Soa after contraction]

which results in () as desired (no generalizations were made).

5.2 Proof of Theorem 25

In this section we are concerned with proving Theorem 25. We start with a lemma which is used in its proof.

Lemma 28. There is a function f on Sores such that, for each Sore α, L(f(α)+) = L(α+) \ {ε} and, for all a ∈ α.succ(α.src) and b ∈ α.pred(α.snk), there is no transition from b to a in f(α).

The proof was omitted for space reasons. We are now ready to prove Theorem 25.

Proof. Let a Soa A be given. We proceed by first reasoning about termination and running time. After that, we will inductively show correctness, by assuming all recursive calls to be correct.

Termination and running time. We refer to [5] for standard graph algorithms, such as finding strongly connected (looped) components. As the algorithm never introduces self-loops, it is easy to see that the running time on a Soa A is at most the running time on A with all self-loops removed plus n. Thus, it suffices to show that Soa2Sore has a running time of O(nm) on self-loop free Soas.

We first bound the running time on acyclic Soas. We topologically sort the vertices of A (this takes O(m) time). We will now iteratively construct an annotation of all the vertices of G with subsets of A.first(), corresponding to what vertices they are reachable from. We start by annotating each vertex of G that corresponds to a vertex v ∈ A.first() with {v} and all others with ∅ (in time O(n)). We now iterate through all vertices u from first to last in the topological sort of G and, for each successor w of u, we add to the current annotation of w the annotation of u (assuming unit time for this kind of set operation; overall, this will then take O(m) time).
This results in the desired annotation of A, in a total of O(m) time. Extracting the exclusive sets for all elements of A.first() can now be done in O(m) time. From these annotations we can also find a pair of vertices with ⊆-maximal reach-sets in time O(m). Any two additions of ε-nodes are balanced in between by a splitting off of a starting node, as given in line 10. As for all other operations, the algorithm can make at most n contractions; hence, there can be only O(n) recursive calls. This results in an overall time of O(nm) for acyclic Soas.

We now turn to the general case. Finding strongly connected looped components takes time O(m), using well-known algorithms, for example Tarjan's algorithm. Soa2Sore first recurses on all strongly connected looped components, and then on the directed acyclic graph obtained by contracting all strongly connected looped components. The bend operation on a strongly connected looped component splits this component, as no vertex linked to the sink can now reach any of the elements of the first set. The running time is maximized when the recursions are as unbalanced as possible; this happens when each bend operation splits off only one vertex, and the remaining Soa is still strongly connected. This results in splitting off n times, with a time of O(m) for finding strongly connected looped components each time, plus the final work on acyclic Soas. This shows that the overall running time is O(nm).

Correctness. The statements about strongly connected components and the successors of the source are straightforward. Furthermore, it is clear that the result is a Sore. We show the following statement about Soa2Sore by induction (by termination, we assume the induction hypothesis to hold for all recursive calls). Let a generalized Soa A be given, and let A′ be a copy of the structure of A where all labels are replaced with single distinct symbols. Suppose that the claim holds for all recursive calls that Soa2Sore makes on A. Let δ = Soa2Sore(A) and let γ be a Sore such that L(A) ⊆ L(γ) ⊆ L(δ). We distinguish a number of different cases, depending on which clause was used for Soa2Sore(A). We will show L(δ) = L(γ) in each case.

Case 1: The clause in line 3 was used. This case is trivial.

Case 2: The clause in line 4 was used. Let U be as chosen in line 4. Let B_0 = A.extract(U).bend(); let z be a symbol not in alph(A) and B_1 = A.contract(U, z). Let δ̂_0 = Soa2Sore(B_0) and let δ_0 = δ̂_0+. We let δ_1 be Soa2Sore(B_1).

Let T be the syntax tree of γ. For each vertex x of T, we call x plussed iff inserting + in T at x does not change the language accepted by T.

Claim 1. There is a plussed vertex x in T such that, for the subtree γ_0 rooted at x, we have alph(γ_0) = alph(U).

The proof was omitted for space reasons. Let f be as shown existent in Lemma 28, and let x be the plussed vertex highest up in T such that alph(x) = U. Let γ̂_0 be the subtree of γ rooted at x; let γ_1 be derived from γ by substituting the subtree at x with a leaf labeled z if ε ∉ L(γ̂_0) and with (z | ε) otherwise. Let γ_0 = f(γ̂_0). Clearly, it suffices to show that L(γ_0) = L(δ_0) and L(γ_1) = L(δ_1).

Claim 2. L(B_1) ⊆ L(γ_1) ⊆ L(δ_1).

Proof of Claim 2. In order to avoid unnecessary case distinctions, we first introduce two new and distinct terminal symbols, one of which is used as a word-start symbol and the other as a word-end symbol. To this end, we define γ′_1 as γ_1 enclosed by these two symbols (δ′_1, δ′, and γ′ are defined analogously). In addition to this, we define a Soa B′_1 and a Soa B′ whose languages are those of B_1 and B enclosed in the same way. (This is easily done by inserting new nodes labeled with the word-start or the word-end symbol between the source and its successors, or the sink and its predecessors, respectively.) We first prove L(B′_1) ⊆ L(γ′_1) ⊆ L(δ′_1). After this is established, the claim follows by observing that projection preserves inclusion.

L(B′_1) ⊆ L(γ′_1): Let a, b ∈ alph(B′_1) \ {z} and suppose a →_{B′_1} b. We have a →_{B′} b, and, hence, a →_{γ′} b. From the definition of γ′_1 it is now easy to see that a →_{γ′_1} b. Let a ∈ alph(B′_1) \ {z} and suppose a →_{B′_1} z. Thus, there is b ∈ U such that a →_{B′} b, and, hence, a →_{γ′} b. From the definition of γ′_1 it is now easy to see that a →_{γ′_1} z. Let a ∈ alph(B′_1) \ {z} and suppose z →_{B′_1} a. Thus, there is a b ∈ U such that b →_{B′} a, and, hence, b →_{γ′} a. From the definition of γ′_1 it is now easy to see that z →_{γ′_1} a.

L(γ′_1) ⊆ L(δ′_1): Let a, b ∈ alph(γ′_1) \ {z} and suppose a →_{γ′_1} b. From the definition of γ′_1 it is now easy to see that a →_{γ′} b, and, hence, a →_{δ′} b. Thus, we get a →_{δ′_1} b. Let a ∈ alph(γ′_1) \ {z} and suppose a →_{γ′_1} z. Thus, there is b ∈ U such that a →_{γ′} b, and, hence, a →_{δ′} b. We now have a →_{δ′_1} z. Let a ∈ alph(γ′_1) \ {z} and suppose z →_{γ′_1} a. Thus, there is a b ∈ U such that b →_{γ′} a, and, hence, b →_{δ′} a. We now have z →_{δ′_1} a.

Hence, L(B′_1) ⊆ L(γ′_1) ⊆ L(δ′_1), which is equivalent to the same chain of inclusions for the languages of B_1, γ_1, and δ_1 enclosed by the two new symbols. As inclusion is preserved under projection, this implies π_T(L(B′_1)) ⊆ π_T(L(γ′_1)) ⊆ π_T(L(δ′_1)), which proves the claim (for T being Σ without the two new symbols). (for Claim 2)

Thanks to the claim we can now apply the induction hypothesis to see that L(γ_1) = L(δ_1). Similarly, we now show γ_0 and δ_0 to be equivalent by showing L(B_0) ⊆ L(γ_0) ⊆ L(δ_0). From the induction hypothesis we know that B_0.succ(B_0.src) = δ_0.succ(δ_0.src); this shows that γ_0.succ(γ_0.src) has to coincide with these sets.

Claim 3. We have that B_0.pred(B_0.snk) ⊆ γ_0.pred(γ_0.snk) ⊆ δ_0.pred(δ_0.snk).

The proof was omitted for space reasons. Lastly, we turn to pairs of elements from U.

Claim 4. On alph(U), →_{B_0} is a subrelation of →_{γ_0}, which in turn is a subrelation of →_{δ_0}.

Proof of Claim 4. This is straightforward, using the properties of f taken from Lemma 28. (for Claim 4)

This finishes showing L(B_0) ⊆ L(γ_0) ⊆ L(δ_0); thus, using the induction hypothesis, L(γ_0) = L(δ_0). This finishes the reasoning for this case.

Case 3: The clause in line 8 was used. This case is trivial from the induction hypothesis, as the language is not changed by the addEpsilon() method.

Case 4: The clause in line 10 was used. Let v be the only successor of A.src; let a = v.label(). Note that a is the only successor of γ.src. Let U = alph(δ) \ {a}. As A does not have a strongly connected looped component, neither does SOA(γ); thus, we have L(γ) = a · π_U(L(γ)). Let γ′ equal γ with a replaced by ε and δ′ = Soa2Sore(A.extract(U)). Then we have L(A.extract(U)) ⊆ L(γ′) ⊆ L(δ′) and the claim follows by induction.

Case 5: The clause in line 16 was used. We now know that A is cycle free and, thus, δ does not contain +. Therefore, without loss of generality, γ does not contain + either. Let v be as chosen in line 16 and a = v.label(). Let U = A.exclusive(v). Let B_0 = A.extract(U); let z be a symbol not in alph(A) and B_1 = A.contract(U, z). Let δ_0 = Soa2Sore(B_0) and let δ_1 = Soa2Sore(B_1). By the induction hypothesis, we have that δ_0.first() = {a}. Thus, any word in L(γ) ⊆ L(δ) that contains an element of U has to start with an a.

Claim 5. There is a subtree γ_0 of γ such that alph(γ_0) = U.

The proof was omitted for space reasons. Let γ_0 be a subtree of γ such that alph(γ_0) = U; let γ_1 be derived from γ by substituting γ_0 with a leaf labeled z. Note that ε ∉ L(γ_0) because of A.succ(A.src) = A.first(). We now clearly get L(B_0) ⊆ L(γ_0) ⊆ L(δ_0) and L(B_1) ⊆ L(γ_1) ⊆ L(δ_1). Thus, this case follows from the induction hypothesis, similarly to Case 2.

Case 6: The clause in line 20 was used. In this case we know that |A.first()| > 1, as no other case applies. Furthermore, we will use without mention that A is cycle free. Let u, v be as chosen in line 20. Let z be a symbol not in alph(A). Let B = A.contract({u, v}, z). Let δ_0 be Soa2Sore(B). Let a = u.label() and b = v.label(). From u, v ∈ δ.first() we have that there is a subtree β of γ with or at the root such that a and b are in different child trees.

Claim 6. L(β) is a set of letters.

The proof was omitted for space reasons. From the claim we get, without loss of generality, that (a | b) is a subexpression of γ; thus, β = (a | b). Let γ_0 be derived from γ by substituting β with z. Clearly, we now have L(B) ⊆ L(γ_0) ⊆ L(δ_0). From the induction hypothesis we get L(γ_0) = L(δ_0); thus, L(γ) = L(δ).

6. CONCLUSIONS AND FURTHER WORK

This paper proposes a strategy for inferring descriptive Sores and descriptive Chares: first, use 2T-INF to compute a descriptive Soa, then use Soa2Sore or Soa2Chare to turn this automaton into a Sore or Chare. In [3], Bex et al. state that their schema inference algorithms outperform existing algorithms in accuracy, conciseness, and speed. Considering the results presented in Sections 3 to 5, the authors of the present paper feel confident to suggest that their new strategies outperform the algorithms from [3] at least with respect to both accuracy and speed. An experimental evaluation of the algorithms is planned for the near future. This will also give the opportunity to evaluate the quality of the results of the algorithms, for example with respect to different conciseness measures or how well they describe the target language.

We now discuss possible extensions, and possible directions for further work. In order to overcome the problem that Sores and Chares cannot count (beyond the trivial case of distinguishing between 0 and 1), Bex et al. [3] propose extending those models with numerical predicates, which can be obtained by postprocessing. It is easy to see that this extension can also be adapted to the approaches in the present paper.

If one is willing to fix a probability distribution on the sample, the learning algorithms could be adapted to feature a variant of stochastic finite learning (introduced by Rossmanith and Zeugmann [13]). This could lead to inference algorithms that do not need to process the whole input, which might be interesting for very large datasets.

From the authors' point of view, the following problem is probably the most interesting: In [2], Bex et al. examine the inference of k-occurrence regular expressions (short k-Ores); regular expressions where each terminal letter occurs at most k times. (Hence, Sores are 1-Ores.) Is it possible to extend Soa2Sore to deterministic k-Ores for some k ≥ 2, or Soa2Chare to the corresponding extension of Chares (where letters are allowed to occur up to k times)? It seems that one would need to develop not only a good generalization of Soas, but also a good inclusion criterion, preferably syntactic.
This conjecture is based on the following observation: While the results in the present paper make no direct use of the results and techniques that Freydenberger and Reidenbach [7] developed for descriptive generalization of pattern languages, both papers rely heavily on the fact that the inclusion problem for the respective language classes has a syntactic criterion for inclusion. The proofs on descriptive generalization of pattern languages in [7] rely on the fact that inclusion for terminal-free E-pattern languages is characterized by the existence of a morphism which maps the pattern that generates the superlanguage to the pattern that generates the sublanguage. This criterion is a versatile tool to prove the nonexistence of a (pattern) language between the target language and the language of a descriptive pattern.

While the proofs of the present paper cannot make any direct use of the proofs from [7], the approaches are similar conceptually. In particular, the line of reasoning in which the correctness proofs of Soa2Chare and Soa2Sore use the fact that the inclusion problem for Sores (and Chares) is characterized by the covering of the respective Soas is structurally similar to the proofs for pattern languages. Moreover, although deciding whether such a pattern morphism exists is NP-complete, the techniques in [7] are not affected by the computational hardness. Hence, the hardness results on the decidability of the k-Ore-inclusion problem presented by Martens et al. [10] do not exclude the existence of such a criterion. This leaves room for hope that Soa2Sore can be extended to k-Ores with k ≥ 2.

Acknowledgements

The authors wish to thank the anonymous referees for their helpful remarks.

7. REFERENCES

[1] D. Angluin. Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21(1):46-62, 1980.
[2] G. J. Bex, W. Gelade, F. Neven, and S. Vansummeren. Learning deterministic regular expressions for the inference of schemas from XML data. ACM Transactions on the Web, 4(4):14:1-14:32, 2010.
[3] G. J. Bex, F. Neven, T. Schwentick, and S. Vansummeren. Inference of concise regular expressions and DTDs. ACM Transactions on Database Systems, 35(2):11:1-11:47, 2010.
[4] G. J. Bex, F. Neven, and S. Vansummeren. Inferring XML schema definitions from XML data. In Proc. VLDB 2007, 2007.
[5] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. McGraw Hill, 2nd edition, 2001.
[6] H. Fernau. Algorithms for learning regular expressions from positive data. Information and Computation, 207(4):521-541, 2009.
[7] D. D. Freydenberger and D. Reidenbach. Inferring descriptive generalisations of formal languages. In Proc. COLT 2010, 2010.
[8] P. García and E. Vidal. Inference of k-testable languages in the strict sense and application to syntactic pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(9), 1990.
[9] E. M. Gold. Language identification in the limit. Information and Control, 10(5), 1967.
[10] W. Martens, F. Neven, and T. Schwentick. Complexity of decision problems for XML schemas and chain regular expressions. SIAM Journal on Computing, 39(4), 2009.
[11] W. Martens, F. Neven, T. Schwentick, and G. J. Bex. Expressiveness and complexity of XML schema. ACM Transactions on Database Systems, 31(3):770-813, 2006.
[12] Y. K. Ng and T. Shinohara. Developments from enquiries into the learnability of the pattern languages from positive data. Theoretical Computer Science, 397(1-3):150-165, 2008.
[13] P. Rossmanith and T. Zeugmann. Stochastic finite learning of the pattern languages. Machine Learning, 44(1-2):67-91, 2001.

CS 491G Combinatorial Optimization Lecture Notes

CS 491G Combinatorial Optimization Lecture Notes CS 491G Comintoril Optimiztion Leture Notes Dvi Owen July 30, August 1 1 Mthings Figure 1: two possile mthings in simple grph. Definition 1 Given grph G = V, E, mthing is olletion of eges M suh tht e i,

More information

2.4 Theoretical Foundations

2.4 Theoretical Foundations 2 Progrmming Lnguge Syntx 2.4 Theoretil Fountions As note in the min text, snners n prsers re se on the finite utomt n pushown utomt tht form the ottom two levels of the Chomsky lnguge hierrhy. At eh level

More information

Counting Paths Between Vertices. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs

Counting Paths Between Vertices. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs. Isomorphism of Graphs Isomorphism of Grphs Definition The simple grphs G 1 = (V 1, E 1 ) n G = (V, E ) re isomorphi if there is ijetion (n oneto-one n onto funtion) f from V 1 to V with the property tht n re jent in G 1 if

More information

Lecture 6: Coding theory

Lecture 6: Coding theory Leture 6: Coing theory Biology 429 Crl Bergstrom Ferury 4, 2008 Soures: This leture loosely follows Cover n Thoms Chpter 5 n Yeung Chpter 3. As usul, some of the text n equtions re tken iretly from those

More information

CS311 Computational Structures Regular Languages and Regular Grammars. Lecture 6

CS311 Computational Structures Regular Languages and Regular Grammars. Lecture 6 CS311 Computtionl Strutures Regulr Lnguges nd Regulr Grmmrs Leture 6 1 Wht we know so fr: RLs re losed under produt, union nd * Every RL n e written s RE, nd every RE represents RL Every RL n e reognized

More information

NON-DETERMINISTIC FSA

NON-DETERMINISTIC FSA Tw o types of non-determinism: NON-DETERMINISTIC FS () Multiple strt-sttes; strt-sttes S Q. The lnguge L(M) ={x:x tkes M from some strt-stte to some finl-stte nd ll of x is proessed}. The string x = is

More information

Compression of Palindromes and Regularity.

Compression of Palindromes and Regularity. Compression of Plinromes n Regulrity. Kyoko Shikishim-Tsuji Center for Lierl Arts Eution n Reserh Tenri University 1 Introution In [1], property of likstrem t t view of tse is isusse n it is shown tht

More information

Solutions for HW9. Bipartite: put the red vertices in V 1 and the black in V 2. Not bipartite!

Solutions for HW9. Bipartite: put the red vertices in V 1 and the black in V 2. Not bipartite! Solutions for HW9 Exerise 28. () Drw C 6, W 6 K 6, n K 5,3. C 6 : W 6 : K 6 : K 5,3 : () Whih of the following re iprtite? Justify your nswer. Biprtite: put the re verties in V 1 n the lk in V 2. Biprtite:

More information

CIT 596 Theory of Computation 1. Graphs and Digraphs

CIT 596 Theory of Computation 1. Graphs and Digraphs CIT 596 Theory of Computtion 1 A grph G = (V (G), E(G)) onsists of two finite sets: V (G), the vertex set of the grph, often enote y just V, whih is nonempty set of elements lle verties, n E(G), the ege

More information

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution

Technische Universität München Winter term 2009/10 I7 Prof. J. Esparza / J. Křetínský / M. Luttenberger 11. Februar Solution Tehnishe Universität Münhen Winter term 29/ I7 Prof. J. Esprz / J. Křetínský / M. Luttenerger. Ferur 2 Solution Automt nd Forml Lnguges Homework 2 Due 5..29. Exerise 2. Let A e the following finite utomton:

More information

Mid-Term Examination - Spring 2014 Mathematical Programming with Applications to Economics Total Score: 45; Time: 3 hours

Mid-Term Examination - Spring 2014 Mathematical Programming with Applications to Economics Total Score: 45; Time: 3 hours Mi-Term Exmintion - Spring 0 Mthemtil Progrmming with Applitions to Eonomis Totl Sore: 5; Time: hours. Let G = (N, E) e irete grph. Define the inegree of vertex i N s the numer of eges tht re oming into

More information

CS 573 Automata Theory and Formal Languages

CS 573 Automata Theory and Formal Languages Non-determinism Automt Theory nd Forml Lnguges Professor Leslie Lnder Leture # 3 Septemer 6, 2 To hieve our gol, we need the onept of Non-deterministi Finite Automton with -moves (NFA) An NFA is tuple

More information

22: Union Find. CS 473u - Algorithms - Spring April 14, We want to maintain a collection of sets, under the operations of:

22: Union Find. CS 473u - Algorithms - Spring April 14, We want to maintain a collection of sets, under the operations of: 22: Union Fin CS 473u - Algorithms - Spring 2005 April 14, 2005 1 Union-Fin We wnt to mintin olletion of sets, uner the opertions of: 1. MkeSet(x) - rete set tht ontins the single element x. 2. Fin(x)

More information

Nondeterministic Automata vs Deterministic Automata

Nondeterministic Automata vs Deterministic Automata Nondeterministi Automt vs Deterministi Automt We lerned tht NFA is onvenient model for showing the reltionships mong regulr grmmrs, FA, nd regulr expressions, nd designing them. However, we know tht n

More information

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER MACHINES AND THEIR LANGUAGES ANSWERS

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER MACHINES AND THEIR LANGUAGES ANSWERS The University of ottinghm SCHOOL OF COMPUTR SCIC A LVL 2 MODUL, SPRIG SMSTR 2015 2016 MACHIS AD THIR LAGUAGS ASWRS Time llowed TWO hours Cndidtes my omplete the front over of their nswer ook nd sign their

More information

CSE 332. Sorting. Data Abstractions. CSE 332: Data Abstractions. QuickSort Cutoff 1. Where We Are 2. Bounding The MAXIMUM Problem 4

CSE 332. Sorting. Data Abstractions. CSE 332: Data Abstractions. QuickSort Cutoff 1. Where We Are 2. Bounding The MAXIMUM Problem 4 Am Blnk Leture 13 Winter 2016 CSE 332 CSE 332: Dt Astrtions Sorting Dt Astrtions QuikSort Cutoff 1 Where We Are 2 For smll n, the reursion is wste. The onstnts on quik/merge sort re higher thn the ones

More information

6.5 Improper integrals

6.5 Improper integrals Eerpt from "Clulus" 3 AoPS In. www.rtofprolemsolving.om 6.5. IMPROPER INTEGRALS 6.5 Improper integrls As we ve seen, we use the definite integrl R f to ompute the re of the region under the grph of y =

More information

Nondeterministic Finite Automata

Nondeterministic Finite Automata Nondeterministi Finite utomt The Power of Guessing Tuesdy, Otoer 4, 2 Reding: Sipser.2 (first prt); Stoughton 3.3 3.5 S235 Lnguges nd utomt eprtment of omputer Siene Wellesley ollege Finite utomton (F)

More information

Project 6: Minigoals Towards Simplifying and Rewriting Expressions

Project 6: Minigoals Towards Simplifying and Rewriting Expressions MAT 51 Wldis Projet 6: Minigols Towrds Simplifying nd Rewriting Expressions The distriutive property nd like terms You hve proly lerned in previous lsses out dding like terms ut one prolem with the wy

More information

A Disambiguation Algorithm for Finite Automata and Functional Transducers

A Disambiguation Algorithm for Finite Automata and Functional Transducers A Dismigution Algorithm for Finite Automt n Funtionl Trnsuers Mehryr Mohri Cournt Institute of Mthemtil Sienes n Google Reserh 51 Merer Street, New York, NY 1001, USA Astrt. We present new ismigution lgorithm

More information

Automata and Regular Languages
