Symbolic Automata for Representing Big Code

Size: px
Start display at page:

Download "Symbolic Automata for Representing Big Code"

Transcription

1 Nonme mnuscript No. (will e inserted y the editor) Symolic Automt for Representing Big Code Hil Peleg Shron Shohm Ern Yhv Hongseok Yng Received: dte / Accepted: dte Astrct Anlysis of mssive codeses ( ig code ) presents n opportunity for drwing insights out progrmming prctice nd enling code reuse. One of the min chllenges in nlyzing ig code is finding representtion tht cptures sufficient semntic informtion, cn e constructed efficiently, nd is menle to meningful comprison opertions. We present forml frmework for representing code in lrge codeses. In our frmework, the semntic descriptor for ech code snippet is prtil temporl specifiction tht cptures the sequences of method invoctions on n API. The min ide is to represent prtil temporl specifictions s symolic utomt utomt where trnsitions my e leled y vriles, nd vrile cn e sustituted y letter, word, or regulr lnguge. Using symolic utomt, we construct n strct domin for sttic nlysis of ig code, cpturing oth the prtilness of specifiction nd the precision of specifiction. We show interesting reltionships etween lttice opertions of this domin nd common opertors for mnipulting prtil temporl specifictions, such s uilding more informtive specifiction y consolidting two prtil specifictions, nd compring prtil temporl specifictions. A preliminry version of this pper ppered in [0]. H. Peleg Tel Aviv University, Isrel, E-mil: hil.peleg@gmil.com S. Shohm Tel Aviv-Yffo Acdemic College, Isrel E-mil: shron.shohm@gmil.com E. Yhv Technion, Isrel E-mil: yhve@gmil.com H. Yng University of Oxford, UK E-mil: hongseok00@gmil.com

2 Hil Peleg et l. 1 Introduction Anlysis of mssive codeses ( ig code ) presents n opportunity for drwing insights out progrmming prctice nd enling code reuse. One of the min chllenges in nlyzing ig code is finding representtion tht cptures sufficient semntic informtion, cn e constructed efficiently, nd is menle to meningful comprison opertions. In this pper, we present forml frmework for representing ig code lrge numer of prtil progrms. The min ide is to extrct temporl specifictions from code. To nlyze ig code, it is criticl to llow nlysis of code snippets, i.e., code frgments with unknown prts. A nturl pproch for cpturing gps in the snippets is to use gps in the specifiction. For exmple, when the code contins n invoction of n unknown method, this pproch reflects this fct in the extrcted specifiction s well (we elorte on this point lter). These prtil specifictions serve s sis for semntic index of the code snippets. Such prtil specifictions cn e used to construct proilistic model of the oserved ehviors (similr to the PFSA used in [,16] or the lnguge model of []). Techniclly, we represent prtil temporl specifictions s symolic utomt, where trnsitions my e leled y vriles representing unknown informtion. Using symolic utomt, we present n strct domin for representing ig code, nd show interesting reltionships etween the prtilness nd the precision of specifiction. In this pper, we focus on representtion, nd opertions over symolic utomt, nd do not ddress the proilistic spects of the prolem. Representing Prtil Specifictions using Symolic Automt We focus on generlized typestte specifictions [, 16]. Such specifictions cpture legl sequences of method invoctions on given API, nd re usully expressed s finite-stte utomt where stte represents n internl stte of the underlying API, nd trnsitions correspond to API method invoctions. Our symolic utomton is conceived in order to represent prtil informtion in specifictions. It is finite-stte mchine where trnsitions my e leled y vriles nd vrile cn e sustituted y letter, word, or regulr lnguge in context sensitive mnner when vrile ppers in multiple strings ccepted y the stte mchine, it cn e replced y different words in ll these strings. An Astrct Domin for Representing Prtil Specifictions One chllenge for forming n strct domin with symolic utomt is to find pproprite opertions tht cpture the sutle interply etween the prtilness nd the precision of specifiction. Let us explin this chllenge using preorder over symolic utomt. When considering non-symolic utomt, we typiclly use the notion of lnguge inclusion to model precision we cn sy tht n utomton A 1 overpproximtes n utomton A when its lnguge includes tht of A. However, this stndrd pproch is not sufficient for symolic utomt, ecuse the use of vriles introduces prtilness s nother dimension for relting the (symolic) utomt. Intuitively, in preorder over symolic utomt, we would like to cpture the notion of symolic utomton A 1 eing more complete thn symolic utomton A when A 1 hs fewer vriles tht represent unknown informtion. In Section, we descrie

3 Symolic Automt for Representing Big Code n interesting interply etween precision nd prtilness, nd define preorder etween symolic utomt, which we lter use s sis for n strct domin of symolic utomt. Consolidting Prtil Specifictions After mining lrge numer of prtil specifictions from code snippets, it is desirle to comine consistent prtil informtion to yield consolidted temporl specifictions. This leds to the question of comining consistent symolic utomt. In Section 7, we show how the join opertion of our strct domin leds to n opertor for consolidting prtil specifictions. Completion of Prtil Specifictions Hving constructed consolidted specifictions, we cn use symolic utomt s queries for code completion nd serch. Treting one symolic utomton s query eing mtched ginst dtse (index) of consolidted specifictions, we show how to use simultion over symolic utomt to find utomt tht mtch the query (Section 5), nd how to use unknown elimintion to find completions of the query utomton (Section 6). Min Contriutions The contriutions of this pper re s follows: We present forml frmework for the nlysis nd representtion of lrge scle codeses. The ide is to provide representtion tht cptures rich semntic informtion ut cn e still constructed efficiently nd enle comprison, indexing, nd other common opertions (e.g., completion) of code. A centrl spect of nlyzing ig code is the prtilness of the code snippets eing nlyzed. To cpture this spect, we formlly define the notion of prtil temporl specifiction sed on new notion of symolic utomt. We explore reltionships etween prtil specifictions long two dimensions: (i) precision of symolic utomt, notion tht roughly corresponds to continment of non-symolic utomt; nd (ii) prtilness of symolic utomt, notion tht roughly corresponds to n utomt hving fewer vriles, which represent unknown informtion. We present n strct domin of symolic utomt where opertions of the domin correspond to key opertors for mnipulting prtil temporl specifictions. We define the opertions required for lgorithms for consolidting two prtil specifictions expressed in terms of our symolic utomt, for compring them, nd for completing certin prtil prts of such specifictions. 1.1 Relted Work Semntic Representtions of Code There hs een lot of work on semntic representtions of code for detecting similrities [11, 5, 1] nd for semntic code serch [16]. Using progrm dependence grphs ws shown effective in compring procedures [1]. Our pproch insted focuses on API clls nd on hndling prtil informtion. Recently, Dvid nd Yhv presented technique for mesuring similrity etween inries using trcelets [8]. Their trcelets re (prtil) sequences of instructions, ut cn e extended to e trcelets of function clls, similr to the liner cse of our prtil utomt. Rychev et l. [] present n elegnt nd efficient semntic represen-

4 Hil Peleg et l. ttion sed on sequences of clls including cll prmeters. Their representtion is prcticl nd efficient, nd generlizes our symolic utomt y including method prmeters. However, it uses explicit priori ounds on the numer nd length of trcked sequences, nd loses the soundness gurntee s result. They lso present code completion technique tht is sed on sttisticl lnguge models. Other semntic notions of code similrity focus on integer progrms [18, 19]. In contrst, we focus on progrms using lirry APIs. Specifiction Mining Sttic specifiction mining techniques (e.g., [, 16,, 7]) hve emerged s wy to otin succinct description of usge scenrios when working with lirry. However, lthough they demonstrted gret prcticl vlue, these techniques do not ddress mny interesting nd chllenging technicl questions. In prticulr, they do not ddress the question of nlyzing lrge numer of code snippets, where ech snippet is (potentilly) prtil progrm. Mishne et l. [16] present prcticl frmework for sttic specifiction mining nd query mtching sed on utomt. Their frmework imposes restrictions on the structure of utomt nd could e viewed s specil cse of the forml frmework introduced in this pper. In contrst to their informl tretment, this pper presents the notion of symolic utomt with n pproprite strct domin. Shohm et l. [] use whole-progrm nlysis to stticlly nlyze clients using lirry. Their pproch is not pplicle in the setting of prtil progrms nd prtil specifiction since they rely on the ility to nlyze the complete progrm for complete lis nlysis nd for type informtion. Mny sttic works on specifiction mining lern specifictions consisting of pirs of events,, where nd re method clls, nd do not consider lrger utomt; Weimer nd Necul [6] use lightweight sttic nlysis to infer such simple specifictions from given codese. Wsylkowski et l. [5] use n intrprocedurl sttic nlysis to utomticlly mine oject usge ptterns nd identify usge nomlies. Their pproch is sed on identifying usge ptterns, in the restricted form of pirs of events, reflecting the order in which events should e used. Grusk et l. [10] considers limited specifictions tht re only pirs of events. The work [] lso mines pirs of events in n ttempt to mine prtil order etween events. Monperrus et l. [17] ttempt to identify missing method clls when using n API y mining codese. Their pproch dels with identicl histories minus k method clls, where histories re modeled s (unordered) sets of method clls. Unlike our pproch, it cnnot hndle incomplete progrms, method cll sequences where the order is importnt, nd histories tht exhiit rnching ehvior Mny dynmic specifiction mining pproches lso extrct vrious forms of temporl specifictions (e.g. [6,,1,7,9,7,15]). Unlike sttic nlysis of prtil code snippets, dynmic pproches do not need to hndle unknown sequences of events, which re key ingredient in our pproch. On the other hnd, ecuse our focus on this pper is on nlysis of code snippets, employing dynmic nlysis would e extremely chllenging. Word Equtions Plndowski [1] solves word equtions, which define equlity conditions on strings with vrile portions. Gnesh et l. [9] extend this work with quntifiers nd dditionl predictes (such s length constrints on the ssignment

5 Symolic Automt for Representing Big Code 5 1 void process(file f) { f.open(); dosomething(f); f.close(); 5 } open X close () () Fig. 1 () Simple code snippet using File. The methods open nd close re API methods, nd the method dosomething is unknown. () Symolic utomton mined from this snippet. The trnsition corresponding to dosomething is represented using the vrile X. Trnsitions corresponding to API methods re leled with method nme. size). Solving such equtions cn e thought of s performing unknown elimintion on symolic words. In comprison to our symolic utomt, word equtions use vriles to represent unknown prts of words rther thn utomt. Unlike our utomt, they cnnot cpture rnching ehvior. In ddition, while word equtions llow ll predicte rguments to hve symolic components, the eqution is solved y completely concrete ssignment, disllowing the concept of ssigning symolic lnguge. More importntly, the ssignments considered re context-free. They mp vriles to words, mking the set of solutions define reltion over the set of words. In contrst, in our work ssignments re context-sensitive. Nmely, the sme vrile cn e ssigned different words (or lnguges) depending on the word tht it ppers in. As result, in our cse set of ssignments (solutions) does not represent simple reltion on words. Insted, it defines set of lnguges, ech of which results from one ssignment. Overview We strt with n informl overview of our pproch y using simple File exmple..1 Illustrtive Exmple Consider the exmple snippet of Fig. 1(). We would like to extrct temporl specifiction tht descries how this snippet uses the File component. The snippet invokes open nd then n unknown method dosomething(f) the code of which is not ville s prt of the snippet. Finlly, it clls close on the component. Anlyzing this snippet using our pproch yields the prtil specifiction of Fig. 1(). Techniclly, this is symolic utomton, where trnsitions corresponding to API methods re leled with method nme, nd the trnsition corresponding to the unknown method dosomething is leled with vrile X. The use of vrile indictes tht some opertions my hve een invoked on the File component etween open nd close, ut tht this opertion or sequence of opertions is unknown. Now consider the specifictions of Fig., otined s the result of nlyzing similr frgments using the File API. Both of these specifictions re more complete thn the specifiction of Fig. 1(). In fct, oth of these utomt do not contin vriles, nd they represent non-prtil temporl specifictions. These three seprte

6 6 Hil Peleg et l. open cnred red close open write close () () Fig. Automt mined from progrms using File to () red fter cnred check; () write. open X cnred write 7 close red close 5 8 close 6 open cnred write 5 red close 6 close () () Fig. () Automton resulting from comining ll known specifictions of the File API, nd () the File API specifictions fter prtil pths hve een susumed y more concrete ones. X red Y X open,cnred Y close () () Fig. () Symolic utomton representing the query for the ehvior round the method red nd () the ssignment to its symolic trnsitions which nswers the query. specifictions come from three pieces of code, ut ll contriute to our knowledge of how the File API is used. As such, we would like to e le to compre them to ech other nd to comine them, nd in the process to eliminte s mny of the unknowns s possile using other, more complete exmples. Our first step is to consolidte the specifictions into more comprehensive specifiction, descriing s much of the API s possile, while losing no ehvior represented y the originl specifictions. Next, we would like to eliminte unknown opertions sed on the extr informtion tht the new temporl specifiction now contins with regrd to the full API. For instnce, where in Fig. 1 we hd no knowledge of wht might hppen etween open nd close, the specifiction in Fig. () suggests it might e either cnred nd red, or write. Thus, the symolic plceholder for the unknown opertion is now no longer needed, nd the pth leding through X ecomes redundnt (s shown in Fig. ()). We my now note tht ll three originl specifictions re still included in the specifiction in Fig. (), even fter the unknown opertion ws removed; the concrete pths re fully there, wheres the pth with the unknown opertion is represented y oth the remining pths. The ility to find the inclusion of one specifiction with unknowns within nother is useful for performing queries. A user my wish to use the File oject in order to red, ut e unfmilir with it. He cn query the specifiction, mrking ny portion he does not know s n unknown opertion, s in Fig. (). As this very prtil specifiction is included in the API s extrcted specifiction (Fig. ()), there will e mtch. Furthermore, we cn deduce wht should replce

7 Symolic Automt for Representing Big Code 7 X red write Y Z 5 (*,X,red) open,cnred (*,X,write) open Y close Z close () () Fig. 5 () Symolic utomton representing the query for the ehvior round red nd write methods nd () the ssignment with contexts to its symolic trnsitions which nswers the query. the symolic portions of the query. This mens the user cn get the reply to his query tht X should e replced y open,cnred nd Y y close. Fig. 5 shows more complex query nd its ssignment. The ssignment to the vrile X is mde up of two different ssignments for the different contexts surrounding X: when followed y write, X is ssigned open, nd when followed y red, X is ssigned the word open,cnred. Even though the rnching point in Fig. () is not identicl to the one in the query, the query cn still return correct result using contexts.. An Astrct Domin of Symolic Automt To provide forml ckground for the opertions we demonstrted here informlly, we define n strct domin sed on symolic utomt. Opertions in the domin correspond to nturl opertors required for effective specifiction mining nd nswering code serch queries. Our strct domin serves dul gol: (i) it is used to represent prtil temporl specifiction during the nlysis of ech individul code snippet; (ii) it is used for consolidtion nd nswering code serch queries cross multiple snippets. In its first role used in the nlysis of single snippet the strct domin cn further employ quotient strction to gurntee tht symolic utomt do not grow without ound due to loops or recursion []. In Section., we show how to otin lttice sed on symolic utomt. In the second role used for consolidtion nd nswering code-serch queries query mtching cn e understood in terms of unknown elimintion in symolic utomt (explined in Section 6), nd consolidtion cn e understood in terms of join in the strct domin, followed y minimiztion (explined in Section 7). Symolic Automt We represent prtil typestte specifictions using symolic utomt defined elow. Definition 1 A deterministic symolic utomton (DSA) is tuple Σ, Q, δ, ι, F, Vrs where: Σ is finite lphet,, c,...; Q is finite set of sttes q, q,...;

8 8 Hil Peleg et l. δ is prtil function from Q (Σ Vrs) to Q, representing trnsition reltion; ι Q is n initil stte; F Q is set of finl sttes; Vrs is finite set of vriles x, y, z,... Our definition mostly follows the stndrd notion of deterministic finite utomt. Two differences re tht trnsitions cn e leled not just y lphet symols ut y vriles, nd tht they re prtil functions, insted of totl ones. Hence, n utomton might get stuck t letter in stte, ecuse the trnsition for the letter t the stte is not defined. We write (q, l, q ) δ for trnsition δ(q, l) = q where q, q Q nd l Σ Vrs. If l Vrs, the trnsition is clled symolic. We extend δ to words over Σ Vrs in the usul wy. Note tht this extension of δ over words is prtil function, ecuse of the prtility of the originl δ. When we write δ(q, s) Q 0 for such words s nd stte set Q 0 in the rest of the pper, we men tht δ(q, s) is defined nd elongs to Q 0. A pth π of DSA is sequence of sttes q 0,..., q n such tht q i Q for every 0 i n, nd for every 0 i n 1, there exists l i Σ Vrs such tht δ(q i, l i ) = q i+1. We then sy tht the symolic word l 0... l n 1 lels the pth π. If, in ddition, q 0 = ι nd q n F, we sy tht π is ccepting. From now on, we fix Σ nd Vrs nd omit them from the nottion of DSA..1 Semntics For DSA A, we define its symolic lnguge, denoted SL(A), to e the set of ll words over Σ Vrs ccepted y A, i.e., SL(A) = {s (Σ Vrs) δ(ι, s) F }. Words over Σ Vrs re clled symolic words, wheres words over Σ re clled concrete words. Similrly, lnguges over Σ Vrs re symolic, wheres lnguges over Σ re concrete. The symolic lnguge of DSA cn e interpreted in different wys, depending on the semntics of vriles: (i) vrile represents sequence of letters from Σ; (ii) vrile represents regulr lnguge over Σ; (iii) vrile represents different sequences of letters from Σ under different contexts. All ove interprettions of vriles, except for the lst, ssign some vlue to vrile while ignoring the context in which the vrile lies. This is not lwys desirle. For exmple, consider the DSA in Fig. 6(). We wnt to e le to interpret x s d when it is followed y, nd to interpret it s e when it is followed y c (Fig. 6()). Motivted y this exmple, we focus here on the lst possiility of interpreting vriles, which lso considers their context. Formlly, we consider the following definitions. Definition A context-sensitive ssignment, or in short ssignment, σ is function from (Σ Vrs) Vrs (Σ Vrs) to the set of nonempty regulr lnguges on (Σ Vrs).

9 Symolic Automt for Representing Big Code 9 x c d e 5 c () () Fig. 6 DSAs () nd (). x d c g c d x 5 6 () () Fig. 7 DSA efore nd fter ssignment When σ mps (s 1, x, s ) to SL, we refer to (s 1, s ) s the context of x. The mening is tht n occurrence of x in the context (s 1, s ) is to e replced y SL (i.e., y ny word from SL). Thus, it is possile to ssign multiple words to the sme vrile in different contexts. The context used in n ssignment is the full context preceding nd following x. In prticulr, it is not restricted in length nd it cn e symolic, i.e., it cn contin vriles. Note tht these ssignments consider liner context of vrile. A more generl definition would consider the rnching context of vrile (or symolic trnsition). Formlly, pplying σ to symolic word ehves s follows. For symolic word s = l 1 l... l n, where l i Σ Vrs for every 1 i n, σ(s) = SL 1 SL... SL n where (i) SL i = {l i } if l i Σ; nd (ii) SL i = SL if l i Vrs is vrile x nd σ(l 1...l i 1, x, l i+1...l n ) = SL. Accordingly, for symolic lnguge SL, σ(sl) = {σ(s) s SL}. Definition An ssignment σ is concrete if its imge consists of concrete lnguges only. Otherwise, it is symolic. If σ is concrete, then σ(sl) is concrete lnguge; wheres if σ is symolic, then σ(sl) cn still e symolic. In the sequel, when σ mps some x to the sme lnguge in severl contexts, we sometimes write σ(c 1, x, C ) = SL s n revition for σ(s 1, x, s ) = SL for every (s 1, s ) C 1 C. We lso write s n revition for (Σ Vrs). Exmple 1 Consider the DSA A from Fig. 6(). Its symolic lnguge is {x, xc}. Now consider the concrete ssignment σ : (, x, ) d, (, x, c ) e. Then σ(x) = {d} nd σ(xc) = {ec}, which mens tht σ(sl(a)) = {d, ec}. If we consider σ : (, x, ) d, (, x, c ) (e ), then σ(x) = d nd σ(xc) = (e ) c, which mens tht σ(sl(a)) = (d ) ((e ) c).

10 10 Hil Peleg et l. Exmple Consider the DSA A depicted in Fig. 7() nd consider the symolic ssignment σ which mps (, x, ) to g, nd mps x in ny other context to x. The ssignment is symolic since in ny incoming context other thn, x is ssigned x. Then Fig. 7() presents DSA for σ(sl(a)). Completions of DSA Ech concrete ssignment σ to DSA A results in some completion of SL(A) into lnguge over Σ (c.f. Exmple 1). We define the semntics of DSA A, denoted A, s the set of ll lnguges over Σ otined y concrete ssignments: A = {σ(sl(a)) σ is concrete ssignment}. We cll A the set of completions of A. For exmple, for the DSA from Fig. 6(), {d, ec} A (see Exmple 1). Note tht if DSA A hs no symolic trnsition, i.e. SL(A) Σ, then A = {SL(A)}. Remrk 1 The interested reder might wonder out the expressive power of the lnguges ccepted y symolic utomt, nd out their closure properties. We note, however, tht when tlking out symolic utomt, our min interest is not their lnguges ut their sets of completions. The lnguge of DSA is simply regulr lnguge (over n lphet extended y the set of vriles). The set of completions of DSA, on the other hnd, is not single lnguge, ut set of lnguges. In this sense, the discussion of expressive power nd closure properties in their trditionl formultion is not nturl. An Astrct Domin for Specifiction Mining In this section we ly the ground for defining common opertions on DSAs y defining preorder on DSAs. In lter sections, we use this preorder to define n lgorithm for query mtching (Section 5), completion of prtil specifiction (Section 6), nd consolidtion of multiple prtil specifiction (Section 7). The definition of preorder over DSAs is motivted y two concepts. The first is precision. We re interested in cpturing tht one DSA is n overpproximtion of nother, in the sense of descriing more ehviors (sequences) of n API. When DFAs re considered, lnguge inclusion is suitle for cpturing precision (strction) reltion etween utomt. The second is prtilness. We would like to cpture tht DSA is more complete thn nother in the sense of hving less vriles tht stnd for unknown informtion..1 Preorder on DSAs Our preorder comines precision nd prtilness. Since the notion of prtilness is less stndrd, we first explin how it is cptured for symolic words. The considertion of symolic words rther thn DSAs llows us to ignore the dimension of precision nd focus on prtilness, efore we comine the two.

11 Symolic Automt for Representing Big Code 11 x 0 1 x c () () x d e (c) (d) Fig. 8 Dimensions of the preorder on DSAs.1.1 Preorder on Symolic Words Definition Let s 1, s e symolic words. s 1 s if for every concrete ssignment σ to s, there is concrete ssignment σ 1 to s 1 such tht σ 1 (s 1 ) = σ (s ). This definition cptures the notion of symolic word eing more concrete or more complete thn nother: Intuitively, the property tht no mtter how we fill in the unknown informtion in s (using concrete ssignment), the sme completion cn lso e otined y filling in the unknowns of s 1, ensures tht every unknown of s is lso unknown in s 1 (which cn e filled in the sme wy), ut s 1 cn hve dditionl unknowns. Thus, s hs no more unknowns thn s 1. In prticulr, {σ(s 1 ) σ is concrete ssignment} {σ(s ) σ is concrete ssignment}. Note tht when considering two concrete words w 1, w Σ (i.e., without ny vrile), w 1 w iff w 1 = w. In this sense, the definition of over symolic words is relxtion of equlity over words. For exmple, xcd yd ccording to our definition. Intuitively, this reltionship holds ecuse xcd is more complete (crries more informtion) thn yd..1. Symolic Inclusion of DSAs We now define the preorder over DSAs tht comines precision with prtilness. On the one hnd, we sy tht DSA A is igger thn A 1, if A descries more possile ehviors of the API, s cptured y stndrd utomt inclusion. For exmple, see the DSAs () nd () in Fig. 8. On the other hnd, we sy tht DSA A is igger thn A 1, if A descries more complete ehviors, in terms of hving less unknowns. For exmple, see the DSAs (c) nd (d) in Fig. 8. However, these exmples re simple in the sense of seprting the precision nd the prtilness dimensions. Ech of these exmples demonstrtes one dimension only. We re lso interested in hndling cses tht comine the two, such s cses where A 1 nd A represent more thn one word, thus the notion of completeness of symolic words lone is not pplicle, nd in ddition the lnguge of A 1 is not included in the lnguge of A per se, e.g., since some of the words in A 1 re less complete thn those of A. This leds us to the following definition.

12 1 Hil Peleg et l. Definition 5 (symolic-inclusion) A DSA A 1 is symoliclly-included in DSA A, denoted y A 1 A, if for every concrete ssignment σ of A there exists concrete ssignment σ 1 of A 1, such tht σ 1 (SL(A 1 )) σ (SL(A )). The ove definition ensures tht for ech concrete lnguge L tht is completion of A, A 1 cn e ssigned in wy tht will result in its lnguge eing included in L. This mens tht the concrete prts of A 1 nd A dmit the inclusion reltion, nd A is more concrete thn A 1. Note tht A 1 is symoliclly-included in A iff for every L A there exists L 1 A 1 such tht L 1 L. Exmple The DSA depicted in Fig. 6() is symoliclly-included in the one depicted in Fig. 6(), since for ny ssignment σ to (), the ssignment σ 1 to () tht will yield lnguge included in the lnguge of () is σ : (, x, ) d, (, x, c ) e. Note tht if we hd considered ssignments to vrile without context, the sme would not hold: If we ssign to x the sequence d, the word dc from the ssigned () will remin unmtched. If we ssign e to x, the word e will remin unmtched. If we ssign to x the lnguge d e, then oth of the ove words will remin unmtched. Therefore, when considering context-free ssignments, there is no suitle ssignment σ 1. Theorem 1 is reflexive nd trnsitive. Proof Reflexivity: Let A e DSA. Then A A since for every concrete ssignment σ to A, σ will fulfill σ(sl(a)) σ(sl(a)). Trnsitivity: Let A 1, A, A e DSAs such tht A 1 A nd A A. We show tht A 1 is symoliclly-included in A. Let σ e concrete ssignment to A. Then there exists concrete ssignment σ to A such tht σ (SL(A )) σ (SL(A )) (since A A ). Similrly, since A 1 A, then in prticulr for σ there exists concrete ssignment σ 1 to A 1 such tht σ 1 (SL(A 1 )) σ (SL(A )). By trnsitivity of lnguge inclusion, we get tht σ 1 (SL(A 1 )) σ (SL(A ))..1. Structurl Inclusion As sis for n lgorithm for checking if symolic-inclusion holds etween two DSAs, we note tht provided tht ny lphet Σ cn e used in ssignments, the following definition is equivlent to Definition 5. Definition 6 A 1 is structurlly-included in A if there exists symolic ssignment σ to A 1 such tht σ(sl(a 1 )) SL(A ). We sy tht σ witnesses the structurl inclusion of A 1 in A. Theorem Let A 1, A e DSAs. Then A 1 in A. A iff A 1 is structurlly-included Before we turn to prove Theorem, we note tht if there is witnessing ssignment for structurl inclusion, then in prticulr there exists one tht ssigns single sequence to ech vrile in every context, s formlized y the following lemm. It will e used in the proof of Theorem.

13 Symolic Automt for Representing Big Code 1 Lemm 1 A 1 is structurlly-included in A iff there exists symolic ssignment σ to A 1, such tht σ(sl(a 1 )) SL(A ), nd the imge of σ consists of singleton lnguges only. Proof Suppose there exists symolic ssignment σ to A 1, such tht σ(sl(a 1 )) SL(A ). We show tht there is lso such n ssignment σ whose imge consists of singleton lnguges only. We define σ y ritrrily keeping one sequence from ech lnguge prticipting in the imge of σ. Clerly, σ (SL(A 1 )) σ(sl(a 1 )) SL(A ). Proof (Theorem ) We first show tht structurl inclusion implies symolic-inclusion. Suppose A 1 is structurlly included in A. Then there exists symolic ssignment σ such tht σ(sl(a 1 )) SL(A ), i.e., for every s SL(A 1 ), σ(s) SL(A ). Without loss of generlity (y Lemm 1), σ ssigns to ech vrile single sequence under ech context. To show tht A 1 A, consider some ssignment σ of A. To find corresponding ssignment σ 1 to A 1 such tht σ 1 (SL(A 1 )) σ (SL(A )), we consider the composition of σ on σ, defined s follows. Let 1,,..., n, 1,,..., m Σ Vrs nd x Vrs. Then σ 1 ( 1... n, x, 1... m ) = {w 1 }L 1 {w }... {w k }L k {w k+1 } where w 1, w,..., w k+1 Σ, L 1,..., L k Σ, nd there exist x 1,..., x k Vrs nd s 1,..., s n, s 1,... s m (Σ Vrs) such tht: σ( 1... n, x, 1... m ) = {w 1 x 1 w... w k x k w k+1 }. For every 1 i n such tht i Vrs: σ( 1... i 1, i, i+1... n x 1... m ) = {s i }. For every 1 i n such tht i Σ: s i = i. For every 1 i m such tht i Vrs: σ( 1... n x 1... i 1, i, i+1... m ) = {s i }. For every 1 i m such tht i Σ: s i = i. For every 1 i k: σ (s 1... s n w 1 x 1... w i, x i, w i+1... x k w k+1 s 1... s m) = L i. In other words, if σ( 1... n x 1... m ) = {s 1... s n w 1 x 1 w... w k x k w k+1 s 1... s m} σ (s 1... s n w 1 x 1 w... w k x k w k+1 s 1... s m) = L {w 1 }L 1 {w }... {w k }L k {w k+1 }L, then σ 1 ( 1... n x 1... m ) is lso equl to L {w 1 }L 1 {w }... {w k }L k {w k+1 }L. Therefore, the definition of σ 1 ensures for every symolic word s SL(A 1 ), σ 1 (s) = σ (σ(s)). Since σ(s) SL(A ) (y the choice of σ), we conclude tht σ 1 (s) = σ (σ(s)) σ (SL(A )), nd hence σ 1 (SL(A 1 )) σ (SL(A )). We now show tht symolic-inclusion implies structurl inclusion. Suppose A 1 A. For ech vrile x Vrs, we introduce new letter x Σ. We now define n

14 1 Hil Peleg et l. x x () () Fig. 9 Exmple for cse where symolic-inclusion does not imply structurl inclusion f y 9 10 x c g f g y 9 10 d e 5 c f g 7 8 () () d e h z h c d e c 6 c 10 f g f g d 8 17,c 18 g f 19 0 (c) Fig. 10 Exmple for cse where there is no ssignment to either () or () to show () () or () (), nd where there is such n ssignment for () so tht () (c). ssignment σ to A such tht for every x Vrs, nd for every s 1, s (Σ Vrs), σ : (s 1, x, s ) x. Since A 1 A, there exists n ssignment σ 1 such tht σ 1 (SL(A 1 )) σ (SL(A )). To otin symolic ssignment σ to A 1 such tht σ(sl(a 1 )) SL(A ), we replce every occurrence of x in σ 1 y x. To understnd why we need to consider ssignments over n extended lphet Σ, consider the following exmple. Exmple Consider the DSAs in Fig. 9. Structurl inclusion does not hold etween () nd (). However, when considering ssignments over Σ = {} only, () is symoliclly-included in (). Note tht if we lso consider ssignments over, sy {, }, then, symolic-inclusion ceses to hold s well, regining the reltion etween structurl inclusion nd symolic-inclusion. The corollry elow provides nother sufficient condition for symolic-inclusion: Corollry 1 If SL(A 1 ) SL(A ), then A 1 A.

15 Symolic Automt for Representing Big Code 15 d d e 5 c e x 5 6 c 7 () () Fig. 11 Equivlent DSAs w.r.t. symolic-inclusion Exmple 5 The DSA depicted in Fig. 10() is not symoliclly-included in the one depicted in Fig. 10() since no symolic ssignment to () will sustitute the symolic word xg y (symolic) word (or set of words) in (). This is ecuse ssignments cnnot drop ny of the contexts of vrile (e.g., the outgoing g context of x). Such ssignments re undesirle since removl of contexts mounts to removl of oserved ehviors. On the other hnd, the DSA depicted in Fig. 10() is symoliclly-included in the one depicted in Fig. 10(c), since there is witnessing ssignment tht mintins ll the contexts of x, nmely, σ : (, x, ) d, (, x, cf ) h, (, x, cg ) eh e, (y, x, ) d, (, y, ) zd. Assigning σ to () results in DSA whose symolic lnguge is strictly included in the symolic lnguge of (c). Note tht symolic-inclusion holds despite of the fct tht in (c) there is no longer stte with n incoming c event nd oth n outgoing f nd n outgoing g events while eing rechle from the stte 1. This exmple demonstrtes our interest in liner ehviors, rther thn in rnching ehvior. Note tht in this exmple, symolic-inclusion would not hold if we did not llow to refer to contexts of ny length (nd in prticulr length > 1).. A Lttice for Specifiction Mining As stted in Theorem 1, is reflexive nd trnsitive, nd therefore preorder. However, it is not ntisymmetric. This is not surprising, since for DFAs collpses into stndrd utomt inclusion, which is lso not ntisymmetric (due to the existence of different DFAs with the sme lnguge). In the cse of DSAs, symolic trnsitions re n dditionl source of prolem, s demonstrted y the following exmple. Exmple 6 The DSAs in Fig. 11 stisfy in oth directions even though their symolic lnguges re different. DSA () is trivilly symoliclly-included in () since the symolic lnguge of () is suset of the symolic lnguge of () (see Corollry 1). For the other direction, symolic inclusion holds y Lemm 1 when ssigning d to x in () in ny context. Exmining the exmple closely shows tht the reson tht symolic-inclusion holds in oth directions even though the DSAs hve different symolic lnguges is tht the symolic lnguge of DSA () contins the symolic word x, s well s the concrete word d, which is completion of x. In this sense, x is susumed y the rest of the DSA, which mounts to DSA ().

16 16 Hil Peleg et l. In order to otin prtil order, we follow stndrd construction of turning preordered set to prtilly ordered set. We first define the following equivlence reltion sed on : Definition 7 DSAs A 1 nd A re symoliclly-equivlent, denoted y A 1 A, iff A 1 A nd A A 1. Theorem is n equivlence reltion over the set DSA of ll DSAs. We now lift the discussion to the quotient set DSA/, which consists of the equivlence clsses of DSA w.r.t. the equivlence reltion. Definition 8 Let [A 1 ], [A ] DSA/. Then [A 1 ] [A ] if A 1 A. Theorem is prtil order over DSA/. Proof Reflexivity nd trnsitivity follow immeditely from the properties of over DSAs. We prove tht unlike the ltter, over DSA/ is lso ntisymmetric. Suppose [A 1 ] [A ] nd [A ] [A 1 ]. Then A 1 A nd vice vers, hence A 1 A, nd thus [A 1 ] = [A ]. Definition 9 For DSAs A 1 nd A, we use union(a 1, A ) to denote union DSA for A 1 nd A, which is defined similrly to the definition of union of DFAs. Tht is, union(a 1, A ) is DSA such tht SL(union(A 1, A )) = SL(A 1 ) SL(A ). Theorem 5 Let [A 1 ], [A ] DSA/ nd let union(a 1, A ) e union DSA for A 1 nd A. Then [union(a 1, A )] is the lest upper ound of [A 1 ] nd [A ] w.r.t.. Proof We first show tht [union(a 1, A )] [A i ] for every i {1, }. This follows since SL(union(A 1, A )) = SL(A 1 ) SL(A ) SL(A i ) nd hence y Corollry 1, A i union(a 1, A ). We now show tht if [A] [A i ] for every i {1, }, then [A] [union(a 1, A )]. It suffices to show tht union(a 1, A ) A. Since [A] [A i ], we conclude tht A i A. Consider concrete ssignment σ to A. Since [A] [A i ], A i A, thus there exists n ssignment σ i such tht σ i (SL(A i )) σ(sl(a)). This mens tht σ 1 (SL(A 1 )) σ (SL(A )) σ(sl(a)). Consider the ssignment σ to union(a 1, A ) otined y σ 1 σ, where σ 1 is identicl to σ 1, except tht it is undefined for symolic words in SL(A ), nd σ is identicl to σ, except tht it is defined only for symolic words in SL(A ). This ensures tht the ssignment is well defined. In ddition, σ (SL(union(A 1, A )) = σ (SL(A 1 ) SL(A )) We conclude tht union(a 1, A ) A. = σ (SL(A 1 )) σ (SL(A )) Corollry (DSA/, ) is join semi-lttice. = σ 1 (SL(A 1 ) \ SL(A )) σ (SL(A )) σ 1 (SL(A 1 )) σ (SL(A )) σ(sl(a)). The element in the lttice is the equivlence clss of DSA for. The element is the equivlence clss of DSA for Σ.

17 Symolic Automt for Representing Big Code 17 5 Query Mtching Using Symolic Simultion Given query in the form of DSA, nd dtse of other DSAs, query mtching ttempts to find DSAs in the dtse tht symoliclly include the query DSA. In this section, we descrie notion of simultion for DSAs, which precisely cptures the preorder on DSAs nd serves sis of core lgorithms for mnipulting symolic utomt. In prticulr, in Section 5., we provide n lgorithm for computing symolic simultion tht cn e directly used to determine when symolic inclusion holds. 5.1 Symolic Simultion Let A 1 nd A e DSAs Q 1, δ 1, ι 1, F 1 nd Q, δ, ι, F, respectively. Definition 10 A reltion H Q 1 ( Q \ { }) is symolic simultion from A 1 to A if it stisfies the following conditions: () (ι 1, {ι }) H; () for every (q, B) H, if q is finl stte, some stte in B is finl; (c) for every (q, B) H nd q Q 1, if q = δ 1 (q, ) for some Σ, B s.t. (q, B ) H B {q q B s.t. q = δ (q, )}; (d) for every (q, B) H nd q Q 1, if q = δ 1 (q, x) for x Vrs, B s.t. (q, B ) H B {q q B s.t. q is rechle from q }. We sy tht (q, B ) in the third or fourth item ove is witness for ((q, B), l), or n l-witness for (q, B) for l Σ Vrs. Finlly, A 1 is symoliclly simulted y A if there exists symolic simultion H from A 1 to A. In this definition, stte q of A 1 is simulted y nonempty set B of sttes from A, with the mening tht their union overpproximtes ll of its outgoing ehviors. In other words, the role of q in A 1 is split mong the sttes of B in A. A split rises from symolic trnsitions, ut the split of the trget of symolic trnsition cn e propgted forwrd for ny numer of steps, thus llowing sttes to e simulted y sets of sttes even if they re not the trget of symolic trnsition. This ccounts for splitting tht is performed y n ssignment with context longer thn one. Note tht since we consider deterministic symolic utomt, the sizes of the sets used in the simultion re monotoniclly decresing, except for when trget of symolic trnsition is considered, in which cse the set increses in size. Note tht stte q 1 of A 1 cn prticipte in more thn one simultion pir in the computed simultion, s demonstrted y the following exmple. Exmple 7 Consider the DSAs in Fig. 10() nd (c). In this cse, the simultion will e H = { (0, {0}), (1, {1}), (, {, 6, 9}), (, {}), (, {, 10}), (5, {7}), (6, {1}) (7, {11}), (8, {8}), (9, {1}), (10, {15}), (1, {16}), (, {17}), (, {18}), (7, {0}), (8, {19}), (, {18}), (5, {0}), (6, {19}) }.

18 18 Hil Peleg et l. One cn see tht stte in (), which is the trget of the trnsition leled x, is split etween sttes, 6 nd 9 of (c). In the next step, fter seeing from stte in (), the trget stte reched (stte ) is simulted y singleton set. On the other hnd, fter seeing c from stte in (), the trget stte reched (stte ), is still split, however this time to only two sttes: nd 10 in (c). In the next step, no more splitting occurs. Note tht the stte 1 in () is simulted oth y {1} nd y {16}. Intuitively, ech of these sets simultes the stte 1 in nother incoming context ( nd, respectively). Theorem 6 (Soundness) For ll DSAs A 1 nd A, if there is symolic simultion H from A 1 to A, then A 1 A. Our proof of this theorem uses Theorem nd constructs desired symolic ssignment σ tht witnesses structurl inclusion of A 1 in A explicitly from H. This construction shows, for ny symolic word in SL(A 1 ), the ssignment (completion) to ll vriles in it (in the corresponding context). Tken together with our next completeness theorem (Theorem 7), this construction supports view tht symolic simultion serves s finite representtion of symolic ssignment in the preorder. We develop this further in Section 6. Proof Let H e symolic simultion from A 1 to A. We show tht there exists symolic ssignment σ such tht σ(sl(a 1 )) SL(A ). Recll tht σ(sl(a 1 )) = {σ(s) s SL(A1 )}. Hence, it suffices to find σ such tht for every s SL(A 1 ), σ(s) SL(A ). Let s = l 1... l n SL(A 1 ) where ech l i Σ Vrs. First, we define sequence of simultion pirs h = h 0... h n, where h 0 = (ι 1, {ι }), nd for every 1 i n, h i H is n l i -witness for h i 1. The sequence is well defined, since y the definition of H, corresponding witness lwys exists. Note tht if severl witnesses exist, one of them is chosen ritrrily. The ide is to trck symolic word in A tht mtches s, up to vriles, y following the simultion pirs in h. For this purpose, we first need to minimize the simultion pirs to include only sttes tht re relevnt to the simultion of the prticulr word s. Once this is done, ny pth in A through h defines symolic word tht mtches s up to vriles, nd ccordingly defines possile ssignment for the vriles in s. We therefore first pply the following minimiztion lgorithm on h: The lgorithm updtes the second component of the h i s from i = n 1 to i = 0. For h n = (q n, B n ), we remove ll non-finl sttes from B n, nd set B n to e the set of the remining ones in B n. Suppose h i = (q i, B i ) nd h i+1 = (q i+1, B i+1 ), where h i+1 is n l i+1 -witness for h i. Let B i+1 denote the updted B i+1. Then we remove from B i ll the sttes tht contriuted no sttes to B i+1. More specificlly: If l i+1 Σ, we let B i = {q B i q B i+1 s.t. δ (q, l i+1 ) = q }. If l i+1 Vrs, we let B i = {q B i q B i+1 s.t. q is rechle from q }. Importntly, no set ecomes empty s result of the updte, since B i+1 B i+1, which ensures tht every stte in B i+1 is contriuted y t lest one stte in B i. Once h is minimized s ove, we greedily choose sttes q 0 = ι B 0, q 1 B 1,..., q n B n

19 Symolic Automt for Representing Big Code 19 such tht if l i Σ, then δ (qi 1, l i) = qi, nd otherwise, q i is rechle from qi 1. This choice of sttes defines symolic word s tht mtches s up to vriles. Moreover, given this choice, for l i Vrs, ech symolic word s i such tht δ (qi 1, s i) = qi is possile mtch for l i. We denote such s i y mtch(l i ). Now we re redy to define the desired symolic ssignment σ. Suppose tht s = w 1 x 1 w... w k x k w k+1 where x 1,..., x k Vrs nd w 1,..., w k+1 Σ. We let s j nd s j e the following prefix nd suffix of s: s j = w 1 x 1 w... w j 1 x j 1 w j, s j = w j+1 x j+1 w j+... w k x k w k+1. Then for every 1 j n, σ(s j, x j, s j) = {mtch(x j )}. We now get tht σ(s) = {s } SL(A ), s required. Theorem 7 (Completeness) For ll DSAs A 1 nd A, if A 1 A, then there is symolic simultion H from A 1 to A. Proof Let σ e symolic ssignment such tht σ(sl(a 1 )) SL(A ). Without loss of generlity, we cn ssume tht σ mps ll vriles nd contexts to singleton lnguges. This is ecuse otherwise, we cn reduce σ such tht the resulting ssignment stisfy this singleton-lnguge requirement. Since σ stisfies this singleton-lnguge requirement, we hve tht for every s SL(A 1 ), σ(s) = {s } for some s SL(A ). We define symolic simultion H from A 1 to A sed on σ. We strt y removing from A 1 nd A ll sttes tht do not lie on ny pth from n initil to n ccepting stte. In the rest of the proof, y A 1 nd A, we refer to the pruned DSAs. Clerly, SL(A 1 ) nd SL(A ) remin unchnged, thus σ(sl(a 1 )) SL(A ) still holds. The ide is to consider for ech stte q 1 Q 1, ll the symolic words tht led to it in A 1, nd for ech of these words s, let q 1 e simulted y the set of ll sttes in A reched when trversing the symolic words resulting from ssigning σ to s when considering s s prefix of s, for every s SL(A 1 ) tht extends s. The considertion of s is needed since the result of pplying σ on vrile depends on the (full) context. Techniclly, for symolic word s = l 1 l... l n, nd for 0 k n we define σ k (s ), which descries the result of ssigning σ to the prefix of s of length k while tking into considertion the full context sed on s. The symolic word σ k (s ) is defined s follows. σ k (l 1 l... l n ) = L 1 L... L k where L i = {l i } if l i Σ, nd L i = L if l i is vrile x nd σ(l 1...l i 1, x, l i+1...l n ) = L. In prticulr, for k = 0, σ 0 (l 1 l... l n ) = {ɛ}, nd for k = n, σ n (l 1 l... l n ) = σ(l 1 l... l n ). Since we consider n ssignment σ whose imge consists of singleton lnguges, we use the nottion nd write σ(s) or σ k (s) s shorthnd for the single word in the set. For symolic word s, we define ext(s) = {s SL(A 1 ) s is prefix of s },

20 0 Hil Peleg et l. which denotes the set of extensions of s in SL(A 1 ). Also, if s is of length k, we define B(s) = {δ (ι, σ k (s )) Q s ext(s)} to e the set of sttes reched in A when following the symolic word otined y pplying σ to some extension s of s up to the k th symol. Let H = {(δ 1 (ι 1, s), B(s)) s (Σ Vrs) δ 1 (ι 1, s) is defined}. In the rest of the proof, we will show tht H is symolic simultion from A 1 to A. The first requirement to check is tht H Q 1 ( Q \{ }). Pick s (Σ Vrs) such tht δ 1 (ι 1, s) is defined. Since we keep only those sttes in A 1 tht lie in pth from the initil stte to n ccepting stte, there exists t lest one s ext(s) SL(A 1 ), which in turn implies tht σ(s ) SL(A ). Then, δ (ι, σ k (s )) is defined, nd elongs to B(s). Hence, B(s). The next requirement is tht (ι 1, {ι }) H. This holds ecuse δ 1 (ι 1, ɛ) = ι 1 B(ɛ) = {δ (ι, σ 0 (s )) Q s ext(ɛ)} = {ι }. To prove the remining requirements, consider (q 1, B ) H. We show tht ll the remining requirements of symolic simultion hold. Let s (Σ Vrs) e symolic word such tht q 1 = δ 1 (ι 1, s), nd B = B(s). Also, let k e the length of s. If q 1 is finl stte, the word s is in SL(A 1 ). Hence, in this cse, σ(s) SL(A ), so δ(ι, σ(s)) is finl stte. But, δ(ι, σ(s)) B(s). It mens tht B(s) contins finl stte, s required. Suppose δ 1 (q 1, ) = q 1 for some Σ. Then (δ 1 (ι 1, s), B(s)) H is n - witness for (q 1, B ). First, δ 1 (ι 1, s) = δ 1 (δ 1 (ι 1, s), ) = δ 1 (q 1, ) = q 1, nd hence it is defined. It remins to show tht B(s) {δ (q, ) q B } = {δ (q, ) q B(s)}. Let q B(s). We need to show tht q {δ (q, ) q B(s)}, i.e. tht there exists q B(s) such tht q = δ (q, ). Since q B(s), there exists s ext(s) such tht q = δ (ι, σ k+1 (s )) = δ (ι, σ k (s )) = δ (δ (ι, σ k (s )), ). Moreover, s ext(s) since ext(s) ext(s). So, δ (ι, σ k (s )) B(s). Thus for q = δ (ι, σ k (s )) B(s), we hve tht q = δ (q, ). Suppose δ 1 (q 1, x) = q 1 for some x Vrs. Then (δ 1 (ι 1, sx), B(sx)) H is n x-witness for (q 1, B ). First, δ 1 (ι 1, sx) = δ 1 (δ 1 (ι 1, s), x) = δ 1 (q 1, x) = q 1. Hence it is defined. It remins to show tht every stte q B(sx) is rechle in A from some q B(sx). Let q B(sx). Since q B(sx), there exist s ext(sx) nd s x such tht q = δ (ι, σ k+1 (s )) = δ (ι, σ k (s )s x ) = δ (δ (ι, σ k (s )), s x ).

21 Symolic Automt for Representing Big Code 1 Moreover, s ext(s) since ext(sx) ext(s). So, δ (ι, σ k (s )) B(s). Thus for q = δ (ι, σ k (s )) B(s), we hve tht q = δ (q, s x ), which mens q is rechle from q. 5. Algorithm for Checking Simultion A mximl symolic simultion reltion cn e computed using gretest fixpoint lgorithm (similrly to the stndrd simultion). A nive implementtion would consider ll sets in Q, mking it exponentil. More efficiently, we otin symolic simultion reltion H y n lgorithm tht trverses oth DSAs simultneously, strting from (ι 1, {ι }), similrly to computtion of product utomton (see Algorithm 1). For ech pir (q 1, B ) tht we explore, we mke sure tht if q 1 F 1, then B F. If this is not the cse, the pir cnnot e prt of the simultion. Otherwise, we trverse ll the outgoing trnsitions of q 1, nd for ech one, we look for witness in the form of nother simultion pir, s required y Definition 10 (see elow). If witness is found, it is dded to the list of simultion pirs tht need to e explored. If no witness is found, the pir (q 1, B ) cnnot prticipte in the simultion. Consider cndidte simultion pir (q 1, B ). When looking for witness for some trnsition of q 1, crucil oservtion is tht if some set B Q simultes stte q 1 Q 1, then ny superset of B lso simultes q 1. Therefore, s witness we dd the mximl set tht fulfills the requirement: if we fil to prove tht q 1 is simulted y the mximl cndidte for B, then we will lso fil with ny other cndidte, mking it unnecessry to check. Specificlly, for n -trnsition, where Σ, from q 1 to q 1, the witness is (q 1, B ) where B = {q q B s.t. q = δ (q, )}. If B =, then no witness exists. For symolic trnsition from q 1 to some q 1, the witness is (q 1, B ) where B is the set of ll sttes rechle from the sttes in B (note tht B s it contins t lest the sttes of B ). If it turns out tht some explored pir (q 1, B ) cnnot prticipte in the simultion (either due to the finl sttes or ecuse no witness ws found for some trnsition), we conclude tht no simultion exists. This is ecuse the potentil witnesses tht we consider in ech step re mximl. As result, when it turns out tht witness cnnot e in the simultion, we conclude tht no witness exists. This goes ck ll the wy to the initil pir (ι 1, {ι }). If no more pirs re to e explored, the lgorithm concludes tht there is symolic simultion, nd it is returned. As n optimiztion, if simultion pir (q 1, B ) is to e dded s witness for some pir where B B nd (q 1, B ) H, we cn utomticlly conclude tht (q 1, B ) will lso e verified. 1 We therefore do not need to explore it. The witnesses of (q 1, B ) will lso serve s witnesses for (q 1, B ). Note tht in this cse, the otined witnesses re not mximl. Alterntively, it is possile to simply use (q 1, B ) insted of (q 1, B ). Since this optimiztion dmges the mximlity of the witnesses, it is not used when mximl witnesses re desired (e.g., when looking for ll possile 1 Similr optimiztions hve een used in the ntichin-sed lgorithms for checking utomt inclusion [8,1].

Convert the NFA into DFA

Convert the NFA into DFA Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:

More information

Minimal DFA. minimal DFA for L starting from any other

Minimal DFA. minimal DFA for L starting from any other Miniml DFA Among the mny DFAs ccepting the sme regulr lnguge L, there is exctly one (up to renming of sttes) which hs the smllest possile numer of sttes. Moreover, it is possile to otin tht miniml DFA

More information

1 Nondeterministic Finite Automata

1 Nondeterministic Finite Automata 1 Nondeterministic Finite Automt Suppose in life, whenever you hd choice, you could try oth possiilities nd live your life. At the end, you would go ck nd choose the one tht worked out the est. Then you

More information

Formal Languages and Automata

Formal Languages and Automata Moile Computing nd Softwre Engineering p. 1/5 Forml Lnguges nd Automt Chpter 2 Finite Automt Chun-Ming Liu cmliu@csie.ntut.edu.tw Deprtment of Computer Science nd Informtion Engineering Ntionl Tipei University

More information

Lecture 08: Feb. 08, 2019

Lecture 08: Feb. 08, 2019 4CS4-6:Theory of Computtion(Closure on Reg. Lngs., regex to NDFA, DFA to regex) Prof. K.R. Chowdhry Lecture 08: Fe. 08, 2019 : Professor of CS Disclimer: These notes hve not een sujected to the usul scrutiny

More information

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

Coalgebra, Lecture 15: Equations for Deterministic Automata

Coalgebra, Lecture 15: Equations for Deterministic Automata Colger, Lecture 15: Equtions for Deterministic Automt Julin Slmnc (nd Jurrin Rot) Decemer 19, 2016 In this lecture, we will study the concept of equtions for deterministic utomt. The notes re self contined

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Lexical Analysis and. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Lexicl Anlysis nd These slides re sed on slides copyrighted y Keith Cooper, Ken Kennedy & Lind Torczon t Rice University First Progrmming Project Instruction Scheduling Project hs een posted

More information

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Fall Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz University of Southern Cliforni Computer Science Deprtment Compiler Design Fll Lexicl Anlysis Smple Exercises nd Solutions Prof. Pedro C. Diniz USC / Informtion Sciences Institute 4676 Admirlty Wy, Suite

More information

Nondeterminism and Nodeterministic Automata

Nondeterminism and Nodeterministic Automata Nondeterminism nd Nodeterministic Automt 61 Nondeterminism nd Nondeterministic Automt The computtionl mchine models tht we lerned in the clss re deterministic in the sense tht the next move is uniquely

More information

Chapter 2 Finite Automata

Chapter 2 Finite Automata Chpter 2 Finite Automt 28 2.1 Introduction Finite utomt: first model of the notion of effective procedure. (They lso hve mny other pplictions). The concept of finite utomton cn e derived y exmining wht

More information

Model Reduction of Finite State Machines by Contraction

Model Reduction of Finite State Machines by Contraction Model Reduction of Finite Stte Mchines y Contrction Alessndro Giu Dip. di Ingegneri Elettric ed Elettronic, Università di Cgliri, Pizz d Armi, 09123 Cgliri, Itly Phone: +39-070-675-5892 Fx: +39-070-675-5900

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 CMSC 330 1 Types of Finite Automt Deterministic Finite Automt (DFA) Exctly one sequence of steps for ech string All exmples so fr Nondeterministic

More information

Converting Regular Expressions to Discrete Finite Automata: A Tutorial

Converting Regular Expressions to Discrete Finite Automata: A Tutorial Converting Regulr Expressions to Discrete Finite Automt: A Tutoril Dvid Christinsen 2013-01-03 This is tutoril on how to convert regulr expressions to nondeterministic finite utomt (NFA) nd how to convert

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton 25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q

More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. Comparing DFAs and NFAs (cont.) Finite Automata 2 CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2016 CS125 Lecture 12 Fll 2016 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb.

Types of Finite Automata. CMSC 330: Organization of Programming Languages. Comparing DFAs and NFAs. NFA for (a b)*abb. CMSC 330: Orgniztion of Progrmming Lnguges Finite Automt 2 Types of Finite Automt Deterministic Finite Automt () Exctly one sequence of steps for ech string All exmples so fr Nondeterministic Finite Automt

More information

3 Regular expressions

3 Regular expressions 3 Regulr expressions Given n lphet Σ lnguge is set of words L Σ. So fr we were le to descrie lnguges either y using set theory (i.e. enumertion or comprehension) or y n utomton. In this section we shll

More information

More on automata. Michael George. March 24 April 7, 2014

More on automata. Michael George. March 24 April 7, 2014 More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose

More information

CHAPTER 1 Regular Languages. Contents

CHAPTER 1 Regular Languages. Contents Finite Automt (FA or DFA) CHAPTE 1 egulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, euivlence of NFAs nd DFAs, closure under regulr

More information

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata

CS103B Handout 18 Winter 2007 February 28, 2007 Finite Automata CS103B ndout 18 Winter 2007 Ferury 28, 2007 Finite Automt Initil text y Mggie Johnson. Introduction Severl childrens gmes fit the following description: Pieces re set up on plying ord; dice re thrown or

More information

Finite Automata-cont d

Finite Automata-cont d Automt Theory nd Forml Lnguges Professor Leslie Lnder Lecture # 6 Finite Automt-cont d The Pumping Lemm WEB SITE: http://ingwe.inghmton.edu/ ~lnder/cs573.html Septemer 18, 2000 Exmple 1 Consider L = {ww

More information

Homework 3 Solutions

Homework 3 Solutions CS 341: Foundtions of Computer Science II Prof. Mrvin Nkym Homework 3 Solutions 1. Give NFAs with the specified numer of sttes recognizing ech of the following lnguges. In ll cses, the lphet is Σ = {,1}.

More information

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.)

CS 373, Spring Solutions to Mock midterm 1 (Based on first midterm in CS 273, Fall 2008.) CS 373, Spring 29. Solutions to Mock midterm (sed on first midterm in CS 273, Fll 28.) Prolem : Short nswer (8 points) The nswers to these prolems should e short nd not complicted. () If n NF M ccepts

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true.

1. For each of the following theorems, give a two or three sentence sketch of how the proof goes or why it is not true. York University CSE 2 Unit 3. DFA Clsses Converting etween DFA, NFA, Regulr Expressions, nd Extended Regulr Expressions Instructor: Jeff Edmonds Don t chet y looking t these nswers premturely.. For ech

More information

Parse trees, ambiguity, and Chomsky normal form

Parse trees, ambiguity, and Chomsky normal form Prse trees, miguity, nd Chomsky norml form In this lecture we will discuss few importnt notions connected with contextfree grmmrs, including prse trees, miguity, nd specil form for context-free grmmrs

More information

The size of subsequence automaton

The size of subsequence automaton Theoreticl Computer Science 4 (005) 79 84 www.elsevier.com/locte/tcs Note The size of susequence utomton Zdeněk Troníček,, Ayumi Shinohr,c Deprtment of Computer Science nd Engineering, FEE CTU in Prgue,

More information

Java II Finite Automata I

Java II Finite Automata I Jv II Finite Automt I Bernd Kiefer Bernd.Kiefer@dfki.de Deutsches Forschungszentrum für künstliche Intelligenz Finite Automt I p.1/13 Processing Regulr Expressions We lredy lerned out Jv s regulr expression

More information

Regular expressions, Finite Automata, transition graphs are all the same!!

Regular expressions, Finite Automata, transition graphs are all the same!! CSI 3104 /Winter 2011: Introduction to Forml Lnguges Chpter 7: Kleene s Theorem Chpter 7: Kleene s Theorem Regulr expressions, Finite Automt, trnsition grphs re ll the sme!! Dr. Neji Zgui CSI3104-W11 1

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

Lecture 09: Myhill-Nerode Theorem

Lecture 09: Myhill-Nerode Theorem CS 373: Theory of Computtion Mdhusudn Prthsrthy Lecture 09: Myhill-Nerode Theorem 16 Ferury 2010 In this lecture, we will see tht every lnguge hs unique miniml DFA We will see this fct from two perspectives

More information

State Minimization for DFAs

State Minimization for DFAs Stte Minimiztion for DFAs Red K & S 2.7 Do Homework 10. Consider: Stte Minimiztion 4 5 Is this miniml mchine? Step (1): Get rid of unrechle sttes. Stte Minimiztion 6, Stte is unrechle. Step (2): Get rid

More information

1 From NFA to regular expression

1 From NFA to regular expression Note 1: How to convert DFA/NFA to regulr expression Version: 1.0 S/EE 374, Fll 2017 Septemer 11, 2017 In this note, we show tht ny DFA cn e converted into regulr expression. Our construction would work

More information

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont.

NFA DFA Example 3 CMSC 330: Organization of Programming Languages. Equivalence of DFAs and NFAs. Equivalence of DFAs and NFAs (cont. NFA DFA Exmple 3 CMSC 330: Orgniztion of Progrmming Lnguges NFA {B,D,E {A,E {C,D {E Finite Automt, con't. R = { {A,E, {B,D,E, {C,D, {E 2 Equivlence of DFAs nd NFAs Any string from {A to either {D or {CD

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh Lnguges nd Automt Finite Automt Informtics 2A: Lecture 3 John Longley School of Informtics University of Edinburgh jrl@inf.ed.c.uk 22 September 2017 1 / 30 Lnguges nd Automt 1 Lnguges nd Automt Wht is

More information

CM10196 Topic 4: Functions and Relations

CM10196 Topic 4: Functions and Relations CM096 Topic 4: Functions nd Reltions Guy McCusker W. Functions nd reltions Perhps the most widely used notion in ll of mthemtics is tht of function. Informlly, function is n opertion which tkes n input

More information

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers

80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES. 2.6 Finite State Automata With Output: Transducers 80 CHAPTER 2. DFA S, NFA S, REGULAR LANGUAGES 2.6 Finite Stte Automt With Output: Trnsducers So fr, we hve only considered utomt tht recognize lnguges, i.e., utomt tht do not produce ny output on ny input

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

Bases for Vector Spaces

Bases for Vector Spaces Bses for Vector Spces 2-26-25 A set is independent if, roughly speking, there is no redundncy in the set: You cn t uild ny vector in the set s liner comintion of the others A set spns if you cn uild everything

More information

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3

I1 = I2 I1 = I2 + I3 I1 + I2 = I3 + I4 I 3 2 The Prllel Circuit Electric Circuits: Figure 2- elow show ttery nd multiple resistors rrnged in prllel. Ech resistor receives portion of the current from the ttery sed on its resistnce. The split is

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed

More information

Lecture 3: Equivalence Relations

Lecture 3: Equivalence Relations Mthcmp Crsh Course Instructor: Pdric Brtlett Lecture 3: Equivlence Reltions Week 1 Mthcmp 2014 In our lst three tlks of this clss, we shift the focus of our tlks from proof techniques to proof concepts

More information

DFA minimisation using the Myhill-Nerode theorem

DFA minimisation using the Myhill-Nerode theorem DFA minimistion using the Myhill-Nerode theorem Johnn Högerg Lrs Lrsson Astrct The Myhill-Nerode theorem is n importnt chrcteristion of regulr lnguges, nd it lso hs mny prcticl implictions. In this chpter,

More information

GNFA GNFA GNFA GNFA GNFA

GNFA GNFA GNFA GNFA GNFA DFA RE NFA DFA -NFA REX GNFA Definition GNFA A generlize noneterministic finite utomton (GNFA) is grph whose eges re lele y regulr expressions, with unique strt stte with in-egree, n unique finl stte with

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automt Theory nd Forml Lnguges TMV027/DIT321 LP4 2018 Lecture 10 An Bove April 23rd 2018 Recp: Regulr Lnguges We cn convert between FA nd RE; Hence both FA nd RE ccept/generte regulr lnguges; More

More information

Lecture 9: LTL and Büchi Automata

Lecture 9: LTL and Büchi Automata Lecture 9: LTL nd Büchi Automt 1 LTL Property Ptterns Quite often the requirements of system follow some simple ptterns. Sometimes we wnt to specify tht property should only hold in certin context, clled

More information

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018

CS 301. Lecture 04 Regular Expressions. Stephen Checkoway. January 29, 2018 CS 301 Lecture 04 Regulr Expressions Stephen Checkowy Jnury 29, 2018 1 / 35 Review from lst time NFA N = (Q, Σ, δ, q 0, F ) where δ Q Σ P (Q) mps stte nd n lphet symol (or ) to set of sttes We run n NFA

More information

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science

CSCI 340: Computational Models. Kleene s Theorem. Department of Computer Science CSCI 340: Computtionl Models Kleene s Theorem Chpter 7 Deprtment of Computer Science Unifiction In 1954, Kleene presented (nd proved) theorem which (in our version) sttes tht if lnguge cn e defined y ny

More information

Centrum voor Wiskunde en Informatica REPORTRAPPORT. Supervisory control for nondeterministic systems

Centrum voor Wiskunde en Informatica REPORTRAPPORT. Supervisory control for nondeterministic systems Centrum voor Wiskunde en Informtic REPORTRAPPORT Supervisory control for nondeterministic systems A. Overkmp Deprtment of Opertions Reserch, Sttistics, nd System Theory BS-R9411 1994 Supervisory Control

More information

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute Victor Admchik Dnny Sletor Gret Theoreticl Ides In Computer Science CS 5-25 Spring 2 Lecture 2 Mr 3, 2 Crnegie Mellon University Deterministic Finite Automt Finite Automt A mchine so simple tht you cn

More information

Tutorial Automata and formal Languages

Tutorial Automata and formal Languages Tutoril Automt nd forml Lnguges Notes for to the tutoril in the summer term 2017 Sestin Küpper, Christine Mik 8. August 2017 1 Introduction: Nottions nd sic Definitions At the eginning of the tutoril we

More information

Homework Solution - Set 5 Due: Friday 10/03/08

Homework Solution - Set 5 Due: Friday 10/03/08 CE 96 Introduction to the Theory of Computtion ll 2008 Homework olution - et 5 Due: ridy 10/0/08 1. Textook, Pge 86, Exercise 1.21. () 1 2 Add new strt stte nd finl stte. Mke originl finl stte non-finl.

More information

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38 Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control

More information

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages Deprtment of Computer Science, Austrlin Ntionl University COMP2600 Forml Methods for Softwre Engineering Semester 2, 206 Assignment Automt, Lnguges, nd Computility Smple Solutions Finite Stte Automt nd

More information

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages

Grammar. Languages. Content 5/10/16. Automata and Languages. Regular Languages. Regular Languages 5//6 Grmmr Automt nd Lnguges Regulr Grmmr Context-free Grmmr Context-sensitive Grmmr Prof. Mohmed Hmd Softwre Engineering L. The University of Aizu Jpn Regulr Lnguges Context Free Lnguges Context Sensitive

More information

CSE396 Prelim I Answer Key Spring 2017

CSE396 Prelim I Answer Key Spring 2017 Nme nd St.ID#: CSE96 Prelim I Answer Key Spring 2017 (1) (24 pts.) Define A to e the lnguge of strings x {, } such tht x either egins with or ends with, ut not oth. Design DFA M such tht L(M) = A. A node-rc

More information

CISC 4090 Theory of Computation

CISC 4090 Theory of Computation 9/6/28 Stereotypicl computer CISC 49 Theory of Computtion Finite stte mchines & Regulr lnguges Professor Dniel Leeds dleeds@fordhm.edu JMH 332 Centrl processing unit (CPU) performs ll the instructions

More information

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck.

Automata Theory 101. Introduction. Outline. Introduction Finite Automata Regular Expressions ω-automata. Ralf Huuck. Outline Automt Theory 101 Rlf Huuck Introduction Finite Automt Regulr Expressions ω-automt Session 1 2006 Rlf Huuck 1 Session 1 2006 Rlf Huuck 2 Acknowledgement Some slides re sed on Wolfgng Thoms excellent

More information

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh Finite Automt Informtics 2A: Lecture 3 Mry Cryn School of Informtics University of Edinburgh mcryn@inf.ed.c.uk 21 September 2018 1 / 30 Lnguges nd Automt Wht is lnguge? Finite utomt: recp Some forml definitions

More information

Worked out examples Finite Automata

Worked out examples Finite Automata Worked out exmples Finite Automt Exmple Design Finite Stte Automton which reds inry string nd ccepts only those tht end with. Since we re in the topic of Non Deterministic Finite Automt (NFA), we will

More information

Name Ima Sample ASU ID

Name Ima Sample ASU ID Nme Im Smple ASU ID 2468024680 CSE 355 Test 1, Fll 2016 30 Septemer 2016, 8:35-9:25.m., LSA 191 Regrding of Midterms If you elieve tht your grde hs not een dded up correctly, return the entire pper to

More information

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30 Tlen en Automten Test 1, Mon 7 th Dec, 2015 15h45 17h30 This test consists of four exercises over 5 pges. Explin your pproch, nd write your nswer to ech exercise on seprte pge. You cn score mximum of 100

More information

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun: CMPU 240 Lnguge Theory nd Computtion Spring 2019 NFAs nd Regulr Expressions Lst clss: Introduced nondeterministic finite utomt with -trnsitions Tody: Prove n NFA- is no more powerful thn n NFA Introduce

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

Closure Properties of Regular Languages

Closure Properties of Regular Languages Closure Properties of Regulr Lnguges Regulr lnguges re closed under mny set opertions. Let L 1 nd L 2 e regulr lnguges. (1) L 1 L 2 (the union) is regulr. (2) L 1 L 2 (the conctention) is regulr. (3) L

More information

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-*

Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Regular Expressions (RE) Kleene-* Regulr Expressions (RE) Regulr Expressions (RE) Empty set F A RE denotes the empty set Opertion Nottion Lnguge UNIX Empty string A RE denotes the set {} Alterntion R +r L(r ) L(r ) r r Symol Alterntion

More information

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER LANGUAGES AND COMPUTATION ANSWERS

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER LANGUAGES AND COMPUTATION ANSWERS The University of Nottinghm SCHOOL OF COMPUTER SCIENCE LEVEL 2 MODULE, SPRING SEMESTER 2016 2017 LNGUGES ND COMPUTTION NSWERS Time llowed TWO hours Cndidtes my complete the front cover of their nswer ook

More information

2.4 Linear Inequalities and Interval Notation

2.4 Linear Inequalities and Interval Notation .4 Liner Inequlities nd Intervl Nottion We wnt to solve equtions tht hve n inequlity symol insted of n equl sign. There re four inequlity symols tht we will look t: Less thn , Less thn or

More information

Formal Languages and Automata Theory. D. Goswami and K. V. Krishna

Formal Languages and Automata Theory. D. Goswami and K. V. Krishna Forml Lnguges nd Automt Theory D. Goswmi nd K. V. Krishn Novemer 5, 2010 Contents 1 Mthemticl Preliminries 3 2 Forml Lnguges 4 2.1 Strings............................... 5 2.2 Lnguges.............................

More information

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.

Duality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below. Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we

More information

Exercises with (Some) Solutions

Exercises with (Some) Solutions Exercises with (Some) Solutions Techer: Luc Tesei Mster of Science in Computer Science - University of Cmerino Contents 1 Strong Bisimultion nd HML 2 2 Wek Bisimultion 31 3 Complete Lttices nd Fix Points

More information

How Deterministic are Good-For-Games Automata?

How Deterministic are Good-For-Games Automata? How Deterministic re Good-For-Gmes Automt? Udi Boker 1, Orn Kupfermn 2, nd Mich l Skrzypczk 3 1 Interdisciplinry Center, Herzliy, Isrel 2 The Herew University, Isrel 3 University of Wrsw, Polnd Astrct

More information

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010

CS 311 Homework 3 due 16:30, Thursday, 14 th October 2010 CS 311 Homework 3 due 16:30, Thursdy, 14 th Octoer 2010 Homework must e sumitted on pper, in clss. Question 1. [15 pts.; 5 pts. ech] Drw stte digrms for NFAs recognizing the following lnguges:. L = {w

More information

CS 330 Formal Methods and Models

CS 330 Formal Methods and Models CS 330 Forml Methods nd Models Dn Richrds, George Mson University, Spring 2017 Quiz Solutions Quiz 1, Propositionl Logic Dte: Ferury 2 1. Prove ((( p q) q) p) is tutology () (3pts) y truth tle. p q p q

More information

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation

Strong Bisimulation. Overview. References. Actions Labeled transition system Transition semantics Simulation Bisimulation Strong Bisimultion Overview Actions Lbeled trnsition system Trnsition semntics Simultion Bisimultion References Robin Milner, Communiction nd Concurrency Robin Milner, Communicting nd Mobil Systems 32

More information

Lecture 2: January 27

Lecture 2: January 27 CS 684: Algorithmic Gme Theory Spring 217 Lecturer: Év Trdos Lecture 2: Jnury 27 Scrie: Alert Julius Liu 2.1 Logistics Scrie notes must e sumitted within 24 hours of the corresponding lecture for full

More information

Thoery of Automata CS402

Thoery of Automata CS402 Thoery of Automt C402 Theory of Automt Tle of contents: Lecture N0. 1... 4 ummry... 4 Wht does utomt men?... 4 Introduction to lnguges... 4 Alphets... 4 trings... 4 Defining Lnguges... 5 Lecture N0. 2...

More information

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA)

CHAPTER 1 Regular Languages. Contents. definitions, examples, designing, regular operations. Non-deterministic Finite Automata (NFA) Finite Automt (FA or DFA) CHAPTER Regulr Lnguges Contents definitions, exmples, designing, regulr opertions Non-deterministic Finite Automt (NFA) definitions, equivlence of NFAs DFAs, closure under regulr

More information

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings...

Table of contents: Lecture N Summary... 3 What does automata mean?... 3 Introduction to languages... 3 Alphabets... 3 Strings... Tle of contents: Lecture N0.... 3 ummry... 3 Wht does utomt men?... 3 Introduction to lnguges... 3 Alphets... 3 trings... 3 Defining Lnguges... 4 Lecture N0. 2... 7 ummry... 7 Kleene tr Closure... 7 Recursive

More information

Improper Integrals. The First Fundamental Theorem of Calculus, as we ve discussed in class, goes as follows:

Improper Integrals. The First Fundamental Theorem of Calculus, as we ve discussed in class, goes as follows: Improper Integrls The First Fundmentl Theorem of Clculus, s we ve discussed in clss, goes s follows: If f is continuous on the intervl [, ] nd F is function for which F t = ft, then ftdt = F F. An integrl

More information

1B40 Practical Skills

1B40 Practical Skills B40 Prcticl Skills Comining uncertinties from severl quntities error propgtion We usully encounter situtions where the result of n experiment is given in terms of two (or more) quntities. We then need

More information

Designing Information Devices and Systems I Spring 2018 Homework 7

Designing Information Devices and Systems I Spring 2018 Homework 7 EECS 16A Designing Informtion Devices nd Systems I Spring 2018 omework 7 This homework is due Mrch 12, 2018, t 23:59. Self-grdes re due Mrch 15, 2018, t 23:59. Sumission Formt Your homework sumission should

More information

Chapter 4: Techniques of Circuit Analysis. Chapter 4: Techniques of Circuit Analysis

Chapter 4: Techniques of Circuit Analysis. Chapter 4: Techniques of Circuit Analysis Chpter 4: Techniques of Circuit Anlysis Terminology Node-Voltge Method Introduction Dependent Sources Specil Cses Mesh-Current Method Introduction Dependent Sources Specil Cses Comprison of Methods Source

More information

First Midterm Examination

First Midterm Examination Çnky University Deprtment of Computer Engineering 203-204 Fll Semester First Midterm Exmintion ) Design DFA for ll strings over the lphet Σ = {,, c} in which there is no, no nd no cc. 2) Wht lnguge does

More information

FABER Formal Languages, Automata and Models of Computation

FABER Formal Languages, Automata and Models of Computation DVA337 FABER Forml Lnguges, Automt nd Models of Computtion Lecture 5 chool of Innovtion, Design nd Engineering Mälrdlen University 2015 1 Recp of lecture 4 y definition suset construction DFA NFA stte

More information

First Midterm Examination

First Midterm Examination 24-25 Fll Semester First Midterm Exmintion ) Give the stte digrm of DFA tht recognizes the lnguge A over lphet Σ = {, } where A = {w w contins or } 2) The following DFA recognizes the lnguge B over lphet

More information

Deterministic Finite Automata

Deterministic Finite Automata Finite Automt Deterministic Finite Automt H. Geuvers nd J. Rot Institute for Computing nd Informtion Sciences Version: fll 2016 J. Rot Version: fll 2016 Tlen en Automten 1 / 21 Outline Finite Automt Finite

More information

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique?

How do we solve these things, especially when they get complicated? How do we know when a system has a solution, and when is it unique? XII. LINEAR ALGEBRA: SOLVING SYSTEMS OF EQUATIONS Tody we re going to tlk out solving systems of liner equtions. These re prolems tht give couple of equtions with couple of unknowns, like: 6= x + x 7=

More information

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations.

Lecture 3. In this lecture, we will discuss algorithms for solving systems of linear equations. Lecture 3 3 Solving liner equtions In this lecture we will discuss lgorithms for solving systems of liner equtions Multiplictive identity Let us restrict ourselves to considering squre mtrices since one

More information

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014

12.1 Nondeterminism Nondeterministic Finite Automata. a a b ε. CS125 Lecture 12 Fall 2014 CS125 Lecture 12 Fll 2014 12.1 Nondeterminism The ide of nondeterministic computtions is to llow our lgorithms to mke guesses, nd only require tht they ccept when the guesses re correct. For exmple, simple

More information

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers

Speech Recognition Lecture 2: Finite Automata and Finite-State Transducers Speech Recognition Lecture 2: Finite Automt nd Finite-Stte Trnsducers Eugene Weinstein Google, NYU Cournt Institute eugenew@cs.nyu.edu Slide Credit: Mehryr Mohri Preliminries Finite lphet, empty string.

More information

CSCI 340: Computational Models. Transition Graphs. Department of Computer Science

CSCI 340: Computational Models. Transition Graphs. Department of Computer Science CSCI 340: Computtionl Models Trnsition Grphs Chpter 6 Deprtment of Computer Science Relxing Restrints on Inputs We cn uild n FA tht ccepts only the word! 5 sttes ecuse n FA cn only process one letter t

More information

Handout: Natural deduction for first order logic

Handout: Natural deduction for first order logic MATH 457 Introduction to Mthemticl Logic Spring 2016 Dr Json Rute Hndout: Nturl deduction for first order logic We will extend our nturl deduction rules for sententil logic to first order logic These notes

More information