Machines and their languages (G51MAL) Lecture notes Spring 2003

Thorsten Altenkirch, April 23

Contents

Introduction
    Examples on syntax
    What is this course about?
    Applications
    History
        The Chomsky Hierarchy
        Turing machines
    Languages
Finite Automata
    Deterministic finite automata
        What is a DFA?
        The language of a DFA
    Nondeterministic Finite Automata
        What is an NFA?
        The language accepted by an NFA
        The subset construction
Regular expressions
    What are regular expressions?
    The meaning of regular expressions
    Translating regular expressions to NFAs
    Summing up
Showing that a language is not regular
    The pumping lemma
    Applying the pumping lemma
Context free grammars
    What is a context-free grammar?
    The language of a grammar
    More examples
    Parse trees
    Ambiguity
Pushdown Automata
    What is a Pushdown Automaton?
    How does a PDA work?
    The language of a PDA
    Deterministic PDAs
    Context free grammars and push-down-automata
How to implement a recursive descent parser
    What is an LL(1) grammar?
    How to calculate First and Follow
    Constructing an LL(1) grammar
    How to implement the parser
    Beyond LL(1) - use LR(1) generators
Turing machines and the rest
    What is a Turing machine?
    Grammars and context-sensitivity
    The halting problem
    Back to Chomsky

1 Introduction

Most references refer to the course text [HMU01]. Please note that the 2nd edition is quite different from the first one, which appeared in 1979 and is a classical reference on the subject. I have also been using [Sch99] for these notes, but note that this book is written in German. The online version of this text contains some hyperlinks to webpages which contain additional information.

1.1 Examples on syntax

In PR1 and PR2 you are learning the language JAVA. Which of the following programs are syntactically correct, i.e. will be accepted by the JAVA compiler without error messages?

Hello-World.java

    public class Hello-World {
      public static void main(string argc[3]) {
        System:out.println("Hello World");
      }
    }

A.java

    class A { class B { void C () { {} ; {{}} } } }

I hope that you are able to spot all the errors in the first program. It may be surprising, but the 2nd (strange looking) program is actually correct. How do we know whether a program is syntactically correct? We would hope that this doesn't depend on the compiler we are using.

1.2 What is this course about?

1. Mathematical models of computation, such as: finite automata, pushdown automata, Turing machines.
2. How to specify formal languages? Regular expressions, context free grammars, context sensitive grammars.
3. The relation between 1. and 2.

1.3 Applications

Regular expressions. Regular expressions are convenient ways to express patterns (e.g. for search). There are a number of tools which use regular expressions:

    grep    pattern matching program (UNIX)
    sed     stream editor (UNIX)
    lex     a generator for lexical analyzers (UNIX)

Grammars for programming languages. The appendix of [GJSB00] contains a context free grammar specifying the syntax of JAVA. YACC is a tool which can be used to generate a C program (a parser) from a grammar. Parser generators now also exist for other languages, like Java CUP for Java.

Specifying protocols. Section 2.1 of [HMU01] (pp. 38-45) contains a simple example of how a protocol for electronic cash can be specified and analyzed using finite automata.

1.4 History

The Chomsky Hierarchy

    All languages
      Type 0 or recursively enumerable languages     (Turing machines)
        Decidable languages                          (Turing machines)
          Type 1 or context sensitive languages
            Type 2 or context free languages         (pushdown automata)
              Type 3 or regular languages            (finite automata)

Noam Chomsky introduced the Chomsky hierarchy, which classifies grammars and languages. This hierarchy can be amended by different types of machines (or automata) which can recognize the appropriate class of languages. Chomsky is also well known for his unusual views on society and politics.

Alan Turing introduced an abstract model of computation, which we call Turing machines, to give a precise definition of which problems can be solved by a computer. All the machines we are introducing can be viewed as restricted versions of Turing machines. I recommend Andrew Hodges' biography Alan Turing: the Enigma.

1.5 Languages

In this course we will use the terms language and word differently than in everyday language:

    A language is a set of words.
    A word is a sequence of symbols.

This leaves us with the question: what is a symbol? The answer is: anything, but it has to come from an alphabet $\Sigma$ which is a finite set. A common (and important) instance is $\Sigma = \{0, 1\}$.

More mathematically we say: given an alphabet $\Sigma$ we define the set $\Sigma^*$ as the set of words (or sequences) over $\Sigma$:

    the empty word $\epsilon \in \Sigma^*$, and
    given a symbol $x \in \Sigma$ and a word $w \in \Sigma^*$ we can form a new word $xw \in \Sigma^*$.

These are all the ways elements of $\Sigma^*$ can be constructed (this is called an inductive definition). E.g. in the example $\Sigma = \{0, 1\}$, typical elements of $\Sigma^*$ are $0010$ and $\epsilon$. Note that we only write $\epsilon$ if it appears on its own: instead of $00\epsilon$ we just write $00$. It is also important to realize that although there are infinitely many words, each word has a finite length.

An important operation on $\Sigma^*$ is concatenation. Confusingly, this is denoted by an invisible operator: given $w, v \in \Sigma^*$ we can construct a new word $wv \in \Sigma^*$ simply by concatenating the two words. We can define this operation by primitive recursion:

    $\epsilon v = v$
    $(xw)v = x(wv)$

A language $L$ is a set of words, hence $L \subseteq \Sigma^*$ or equivalently $L \in \mathcal{P}(\Sigma^*)$. Here are some informal examples of languages:

    The set $\{0010, \epsilon\}$ is a language over $\Sigma = \{0, 1\}$. This is an example of a finite language.
    The set of words with odd length over $\Sigma = \{1\}$.
    The set of words which contain the same number of 0s and 1s is a language over $\Sigma = \{0, 1\}$.
    The set of words which contain the same number of 0s and 1s modulo 2 (i.e. both are even or both are odd) is a language over $\Sigma = \{0, 1\}$.
    The set of palindromes using the English alphabet, i.e. words which read the same forwards and backwards, like abba. This is a language over $\{a, b, \dots, z\}$.
    The set of correct Java programs. This is a language over the set of UNICODE characters (each of which corresponds to a number).
    The set of programs which, if executed on a Windows machine, will print the text "Hello World!" in a window. This is a language over $\Sigma = \{0, 1\}$.

2 Finite Automata

Finite automata correspond to a computer with a fixed finite amount of memory. We will introduce deterministic finite automata (DFA) first and then move to nondeterministic finite automata (NFA). An automaton will accept certain words (sequences of symbols of a given alphabet $\Sigma$) and reject others. The set of accepted words is called the language of the automaton. We will show that the class of languages which are accepted by DFAs and NFAs is the same.

2.1 Deterministic finite automata

2.1.1 What is a DFA?

A deterministic finite automaton (DFA) $A = (Q, \Sigma, \delta, q_0, F)$ is given by:

1. A finite set of states $Q$,
2. A finite set of input symbols $\Sigma$,
3. A transition function $\delta \in Q \times \Sigma \to Q$,
4. An initial state $q_0 \in Q$,
5. A set of accepting states $F \subseteq Q$.

As an example consider the following automaton

    $D = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta_D, q_0, \{q_2\})$

where

    $\delta_D = \{((q_0, 0), q_1), ((q_0, 1), q_0), ((q_1, 0), q_1), ((q_1, 1), q_2), ((q_2, 0), q_2), ((q_2, 1), q_2)\}$

The DFA may be more conveniently represented by a transition table:

              0      1
    -> q0     q1     q0
       q1     q1     q2
     * q2     q2     q2

The table represents the function $\delta$, i.e. to find the value of $\delta(q, x)$ we have to look at the row labelled $q$ and the column labelled $x$. The initial state is marked by an arrow (->) and all final states are marked by a star (*).

Yet another, optically more inspiring, alternative are transition diagrams:

    [Transition diagram of D: q0 has a loop labelled 1 and an arrow labelled 0 to q1; q1 has a loop labelled 0 and an arrow labelled 1 to q2; q2 has a loop labelled 0,1.]

There is an arrow into the initial state and all final states are marked by double rings. If $\delta(q, x) = q'$ then there is an arrow from state $q$ to $q'$ which is labelled $x$.

We write $\Sigma^*$ for the set of words (i.e. sequences) over the alphabet $\Sigma$. This includes the empty word, which is written $\epsilon$. I.e.

    $\{0, 1\}^* = \{\epsilon, 0, 1, 00, 01, 10, 11, 000, \dots\}$
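The transition table of D can be written down directly as a lookup table. The following is only a minimal sketch in Python: the names DELTA, INITIAL, ACCEPTING, run and accepts are illustrative and do not come from the notes, and accepts anticipates the definition of the language of a DFA given in section 2.1.2.

```python
# Transition function of the example DFA D as a lookup table:
# DELTA[(q, x)] plays the role of delta_D(q, x) (row q, column x of the table).
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
INITIAL = "q0"
ACCEPTING = {"q2"}


def run(word, state=INITIAL):
    """Follow one arrow per input symbol and return the state reached."""
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state


def accepts(word):
    """A word is accepted if reading it ends in an accepting state."""
    return run(word) in ACCEPTING
```

For instance, run("101") passes through q0, q0, q1 and ends in q2, so accepts("101") holds, while accepts("110") does not.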

2.1.2 The language of a DFA

To each DFA $A$ we associate a language $L(A) \subseteq \Sigma^*$. To see whether a word $w \in L(A)$, we put a marker in the initial state and, when reading a symbol, forward the marker along the edge marked with this symbol. When we are in an accepting state at the end of the word then $w \in L(A)$, otherwise $w \notin L(A)$.

In the example above we have that $0 \notin L(D)$, $101 \in L(D)$ and $110 \notin L(D)$. Indeed, we have

    $L(D) = \{w \mid w \text{ contains the substring } 01\}$

To be more precise we give a formal definition of $L(A)$. First we define the extended transition function $\hat\delta \in Q \times \Sigma^* \to Q$. Intuitively, $\hat\delta(q, w) = q'$ if, starting from state $q$, we end up in state $q'$ when reading the word $w$. Formally, $\hat\delta$ is defined by primitive recursion:

    $\hat\delta(q, \epsilon) = q$                                   (1)
    $\hat\delta(q, xw) = \hat\delta(\delta(q, x), w)$               (2)

Here $xw$ stands for a non-empty word whose first symbol is $x$ and the rest is $w$. E.g. if we are told that $xw = 010$ then this entails that $x = 0$ and $w = 10$. $w$ may be empty, i.e. $xw = 0$ entails $x = 0$ and $w = \epsilon$.

As an example we calculate $\hat\delta_D(q_0, 101) = q_2$:

$$\begin{aligned}
\hat\delta_D(q_0, 101) &= \hat\delta_D(\delta_D(q_0, 1), 01) && \text{by (2)}\\
&= \hat\delta_D(q_0, 01) && \text{because } \delta_D(q_0, 1) = q_0\\
&= \hat\delta_D(\delta_D(q_0, 0), 1) && \text{by (2)}\\
&= \hat\delta_D(q_1, 1) && \text{because } \delta_D(q_0, 0) = q_1\\
&= \hat\delta_D(\delta_D(q_1, 1), \epsilon) && \text{by (2)}\\
&= \hat\delta_D(q_2, \epsilon) && \text{because } \delta_D(q_1, 1) = q_2\\
&= q_2 && \text{by (1)}
\end{aligned}$$

Using $\hat\delta$ we may now define formally:

    $L(A) = \{w \mid \hat\delta(q_0, w) \in F\}$

Hence we have that $101 \in L(D)$ because $\hat\delta_D(q_0, 101) = q_2$ and $q_2 \in F_D$.

2.2 Nondeterministic Finite Automata

2.2.1 What is an NFA?

Nondeterministic finite automata (NFA) have transition functions which may assign several or no states to a given state and an input symbol. They accept a word if there is any possible sequence of transitions from one of the initial states to one of the final states. It is important to note that although NFAs have a nondeterministic transition function, it can always be determined whether or not a word belongs to the language ($w \in L(A)$). Indeed, we shall see that every NFA can be translated into a DFA which accepts the same language.

Here is an example of an NFA $C$ which accepts all words over $\Sigma = \{0, 1\}$ s.t. the symbol before the last is 1.
    [Transition diagram of C: q0 has a loop labelled 0,1 and an arrow labelled 1 to q1; q1 has an arrow labelled 0,1 to q2; q2 is accepting.]

A nondeterministic finite automaton (NFA) $A = (Q, \Sigma, \delta, S, F)$ is given by:

1. A finite set of states $Q$,
2. A finite set of input symbols $\Sigma$,
3. A transition function $\delta \in Q \times \Sigma \to \mathcal{P}(Q)$,
4. A set of initial states $S \subseteq Q$,
5. A set of accepting states $F \subseteq Q$.

The differences to DFAs are that we have a set of start states instead of a single one, and the type of the transition function. As an example we have that

    $C = (\{q_0, q_1, q_2\}, \{0, 1\}, \delta_C, \{q_0\}, \{q_2\})$

where $\delta_C$ is given by

    delta_C    0        1
    q0         {q0}     {q0, q1}
    q1         {q2}     {q2}
    q2         {}       {}

Note that we diverge here slightly from the definition in the book, which uses a single initial state instead of a set of initial states. Doing so means that we can avoid introducing $\epsilon$-NFAs (see [HMU01], section 2.5).

2.2.2 The language accepted by an NFA

To see whether a word $w \in \Sigma^*$ is accepted by an NFA ($w \in L(A)$) we may have to use several markers. Initially we put one marker on each initial state. Then each time we read a symbol we look at all the markers: we remove the old markers and put markers at all the states which are reachable via an arrow marked with the current input symbol (this may include a state which was marked previously). Thus we may have to use several markers, but it may also happen that all markers disappear (if no appropriate arrows exist). In this case the word is not accepted. If at the end of the word any of the final states has a marker on it then the word is accepted.

E.g. consider the word 100 (which is not accepted by $C$):

    Initially we have a marker on q0.
    After reading 1 we have to use two markers, because there are two arrows from q0 which are labelled 1: the markers are on q0 and q1.
    After reading 0 the automaton has still got two markers, one of them in an accepting state: the markers are on q0 and q2.
    After reading the 2nd 0 the second marker disappears, because there is no edge leaving q2: the only marker is on q0.

This is not accepting because no marker is in the accepting state.

To specify the extended transition function for NFAs we use a generalisation of the union operation $\cup$ on sets. We define $\bigcup$ to be the union of a (finite) set of sets:

    $\bigcup \{A_1, A_2, \dots, A_n\} = A_1 \cup A_2 \cup \dots \cup A_n$

In the special cases of the empty set of sets and a one-element set of sets we define:

    $\bigcup \{\} = \{\}$        $\bigcup \{A\} = A$

As an example

    $\bigcup \{\{1\}, \{2, 3\}, \{1, 3\}\} = \{1\} \cup \{2, 3\} \cup \{1, 3\} = \{1, 2, 3\}$

Actually, we may define $\bigcup$ by comprehension, which also extends the operation to infinite sets of sets (although we don't need this here):

    $\bigcup B = \{x \mid \exists A \in B.\ x \in A\}$

We define $\hat\delta \in \mathcal{P}(Q) \times \Sigma^* \to \mathcal{P}(Q)$ with the intention that $\hat\delta(S, w)$ is the set of states which is marked after having read $w$, starting with the initial markers given by $S$:

    $\hat\delta(S, \epsilon) = S$                                                         (3)
    $\hat\delta(S, xw) = \hat\delta(\bigcup \{\delta(q, x) \mid q \in S\}, w)$            (4)

As an example we calculate $\hat\delta_C(\{q_0\}, 100)$, which is $\{q_0\}$ as we already know from playing with markers:

$$\begin{aligned}
\hat\delta_C(\{q_0\}, 100) &= \hat\delta_C(\textstyle\bigcup \{\delta_C(q, 1) \mid q \in \{q_0\}\}, 00) && \text{by (4)}\\
&= \hat\delta_C(\delta_C(q_0, 1), 00)\\
&= \hat\delta_C(\{q_0, q_1\}, 00)\\
&= \hat\delta_C(\textstyle\bigcup \{\delta_C(q, 0) \mid q \in \{q_0, q_1\}\}, 0) && \text{by (4)}\\
&= \hat\delta_C(\delta_C(q_0, 0) \cup \delta_C(q_1, 0), 0)\\
&= \hat\delta_C(\{q_0\} \cup \{q_2\}, 0)\\
&= \hat\delta_C(\{q_0, q_2\}, 0)\\
&= \hat\delta_C(\textstyle\bigcup \{\delta_C(q, 0) \mid q \in \{q_0, q_2\}\}, \epsilon) && \text{by (4)}\\
&= \hat\delta_C(\delta_C(q_0, 0) \cup \delta_C(q_2, 0), \epsilon)\\
&= \hat\delta_C(\{q_0\} \cup \{\}, \epsilon)\\
&= \{q_0\} && \text{by (3)}
\end{aligned}$$

Using the extended transition function we define the language of an NFA as

    $L(A) = \{w \mid \hat\delta(S, w) \cap F \neq \{\}\}$

This shows that $100 \notin L(C)$ because

    $\hat\delta_C(\{q_0\}, 100) \cap \{q_2\} = \{q_0\} \cap \{q_2\} = \{\}$

2.2.3 The subset construction

DFAs can be viewed as a special case of NFAs, i.e. those for which there is precisely one start state, $S = \{q_0\}$, and the transition function always returns one-element sets (i.e. $\delta(q, x) = \{q'\}$ for all $q \in Q$ and $x \in \Sigma$).

Below we show that for every NFA we can construct a DFA which accepts the same language. This shows that NFAs aren't more powerful than DFAs. However, in some cases NFAs need a lot fewer states than the corresponding DFA and they are easier to construct.

Given an NFA $A = (Q, \Sigma, \delta, S, F)$ we construct the DFA

    $D(A) = (\mathcal{P}(Q), \Sigma, \delta_{D(A)}, S, F_{D(A)})$

where

    $\delta_{D(A)}(S, x) = \bigcup \{\delta(q, x) \mid q \in S\}$
    $F_{D(A)} = \{S \subseteq Q \mid S \cap F \neq \{\}\}$

The basic idea of this construction (the subset construction) is to define a DFA whose states are sets of states of the NFA. A final state of the DFA is a set which contains at least one final state of the NFA. The transitions just follow the active set of markers, i.e. a state $S \in \mathcal{P}(Q)$ corresponds to having markers on all $q \in S$, and when we follow the arrow labelled $x$ we get the set of states which are marked after reading $x$.

As an example let us consider the NFA $C$ above. We construct a DFA

    $D(C) = (\mathcal{P}(\{q_0, q_1, q_2\}), \{0, 1\}, \delta_{D(C)}, \{q_0\}, F_{D(C)})$

with $\delta_{D(C)}$ given by

    delta_D(C)          0             1
       {}               {}            {}
    -> {q0}             {q0}          {q0, q1}
       {q1}             {q2}          {q2}
     * {q2}             {}            {}
       {q0, q1}         {q0, q2}      {q0, q1, q2}
     * {q0, q2}         {q0}          {q0, q1}
     * {q1, q2}         {q2}          {q2}
     * {q0, q1, q2}     {q0, q2}      {q0, q1, q2}

and $F_{D(C)}$ is the set of all the states marked with * above, i.e.

    $F_{D(C)} = \{\{q_2\}, \{q_0, q_2\}, \{q_1, q_2\}, \{q_0, q_1, q_2\}\}$

Looking at the transition diagram:

    [Transition diagram of D(C) with all eight states, as given by the table above.]

we note that some of the states ($\{\}$, $\{q_1\}$, $\{q_2\}$, $\{q_1, q_2\}$) cannot be reached from the initial state, which means that they can be omitted without changing the language. Hence we obtain the following automaton:

    [Transition diagram of the reduced automaton with the states {q0}, {q0, q1}, {q0, q2} and {q0, q1, q2}.]

We still have to convince ourselves that the DFA $D(A)$ accepts the same language as the NFA $A$, i.e. we have to show that $L(A) = L(D(A))$. As a lemma we show that the extended transition functions coincide:

Lemma 2.1 $\hat\delta_{D(A)}(S, w) = \hat\delta_A(S, w)$

The results of both functions are sets of states of the NFA $A$: for the right hand side because the extended transition function on NFAs returns sets of states, and for the left hand side because the states of $D(A)$ are sets of states of $A$.

Proof: We show this by induction over the length of the word $w$; let's write $|w|$ for the length of a word.

$|w| = 0$: Then $w = \epsilon$ and we have

$$\begin{aligned}
\hat\delta_{D(A)}(S, \epsilon) &= S && \text{by (1)}\\
&= \hat\delta_A(S, \epsilon) && \text{by (3)}
\end{aligned}$$

$|w| = n + 1$: Then $w = xv$ with $|v| = n$.

$$\begin{aligned}
\hat\delta_{D(A)}(S, xv) &= \hat\delta_{D(A)}(\delta_{D(A)}(S, x), v) && \text{by (2)}\\
&= \hat\delta_A(\delta_{D(A)}(S, x), v) && \text{ind. hyp.}\\
&= \hat\delta_A(\textstyle\bigcup \{\delta_A(q, x) \mid q \in S\}, v) && \text{definition of } \delta_{D(A)}\\
&= \hat\delta_A(S, xv) && \text{by (4)}
\end{aligned}$$

We can now use the lemma to show

Theorem 2.2 $L(A) = L(D(A))$

Proof:

$$\begin{aligned}
& w \in L(A)\\
\iff{} & \hat\delta_A(S, w) \cap F_A \neq \{\} && \text{definition of } L(A) \text{ for NFAs}\\
\iff{} & \hat\delta_{D(A)}(S, w) \cap F_A \neq \{\} && \text{Lemma 2.1}\\
\iff{} & \hat\delta_{D(A)}(S, w) \in F_{D(A)} && \text{definition of } F_{D(A)}\\
\iff{} & w \in L(D(A)) && \text{definition of } L(A) \text{ for DFAs}
\end{aligned}$$

Corollary 2.3 NFAs and DFAs recognize the same class of languages.

Proof: We have noticed that DFAs are just a special case of NFAs. On the other hand the subset construction introduced above shows that for every NFA we can find a DFA which recognizes the same language.
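The marker game and the subset construction of this section can be tried out on the example NFA C. The following Python sketch is an illustration only: all names (DELTA_C, step, delta_hat, subset_construction and so on) are made up for this example, and the construction only builds the states reachable from the initial set rather than all of P(Q).

```python
# The example NFA C: DELTA_C[(q, x)] is the set delta_C(q, x).
DELTA_C = {
    ("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"}, ("q1", "1"): {"q2"},
    ("q2", "0"): set(), ("q2", "1"): set(),
}
ALPHABET = ["0", "1"]
INITIAL_C = frozenset({"q0"})
ACCEPTING_C = {"q2"}


def step(states, symbol):
    """Union of delta_C(q, symbol) over all marked states q, as in (4)."""
    result = set()
    for q in states:
        result |= DELTA_C[(q, symbol)]
    return frozenset(result)


def delta_hat(states, word):
    """Extended transition function on sets of states, equations (3) and (4)."""
    for symbol in word:
        states = step(states, symbol)
    return states


def nfa_accepts(word):
    """Accepted if some marker sits on an accepting state at the end."""
    return bool(delta_hat(INITIAL_C, word) & ACCEPTING_C)


def subset_construction():
    """Transition table and accepting states of the reachable part of D(C)."""
    table = {}
    todo = [INITIAL_C]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {x: step(current, x) for x in ALPHABET}
        todo.extend(table[current].values())
    accepting = {s for s in table if s & ACCEPTING_C}
    return table, accepting


def dfa_accepts(word, table, accepting):
    """Run the constructed DFA, whose states are sets of NFA states."""
    state = INITIAL_C
    for symbol in word:
        state = table[state][symbol]
    return state in accepting
```

Building only the reachable states gives the four-state automaton obtained above by omitting the unreachable states; by Theorem 2.2, nfa_accepts and dfa_accepts agree on every word.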

3 Regular expressions

Given an alphabet $\Sigma$ a language is a set of words $L \subseteq \Sigma^*$. So far we were able to describe languages either by using set theory (i.e. enumeration or comprehension) or by an automaton. In this section we shall introduce regular expressions as an elegant and concise way to describe languages. We shall see that the languages definable by regular expressions are precisely the same as those accepted by deterministic or nondeterministic finite automata. These languages are called regular languages or (according to the Chomsky hierarchy) Type 3 languages.

As already mentioned in the introduction, regular expressions are used to define patterns in programs such as grep. grep gets as an argument a regular expression and then filters out all those lines from a file which match the regular expression, where matching means that the line contains a substring which is in the language assigned to the regular expression. It is interesting to note that even in the case when we search for a specific word (this is a special case of a regular expression) programs like grep are more efficient than a naive implementation of word search.

To find out more about grep have a look at the UNIX manual page and play around with grep. Note that the syntax grep uses is slightly different from the one we use here. grep also uses some convenient shorthands which are not relevant for a theoretical analysis of regular expressions because they do not extend the class of languages.

3.1 What are regular expressions?

We assume as given an alphabet $\Sigma$ (e.g. $\Sigma = \{a, b, c, \dots, z\}$) and define the syntax of regular expressions (over $\Sigma$):

1. $\emptyset$ is a regular expression.
2. $\epsilon$ is a regular expression.
3. For each $x \in \Sigma$, $\mathbf{x}$ is a regular expression. E.g. in the example all small letters are regular expressions. We use boldface to emphasize the difference between the symbol $a$ and the regular expression $\mathbf{a}$.
4. If $E$ and $F$ are regular expressions then $E + F$ is a regular expression.
5. If $E$ and $F$ are regular expressions then $EF$ (i.e. just one after the other) is a regular expression.
6. If $E$ is a regular expression then $E^*$ is a regular expression.
7. If $E$ is a regular expression then $(E)$ is a regular expression.

These are all regular expressions.

Here are some examples of regular expressions:

    $\epsilon$
    $\mathbf{hallo}$
    $\mathbf{hallo} + \mathbf{hello}$
    $\mathbf{h}(\mathbf{a} + \mathbf{e})\mathbf{llo}$
    $\mathbf{a}^*\mathbf{b}^*$
    $(\epsilon + \mathbf{b})(\mathbf{ab})^*(\epsilon + \mathbf{a})$

As in arithmetic there are some conventions on how to read regular expressions:

    $^*$ binds stronger than sequence and $+$. E.g. we read $\mathbf{ab}^*$ as $\mathbf{a}(\mathbf{b}^*)$. We have to use parentheses to enforce the other reading $(\mathbf{ab})^*$.
    Sequencing binds stronger than $+$. E.g. we read $\mathbf{ab} + \mathbf{cd}$ as $(\mathbf{ab}) + (\mathbf{cd})$. To enforce another reading we have to use parentheses as in $\mathbf{a}(\mathbf{b} + \mathbf{c})\mathbf{d}$.

3.2 The meaning of regular expressions

We now know what regular expressions are, but what do they mean? For this purpose, we shall first define an operation on languages called the Kleene star. Given a language $L \subseteq \Sigma^*$ we define

    $L^* = \{w_0 w_1 \dots w_{n-1} \mid n \in \mathbb{N} \wedge \forall i < n.\ w_i \in L\}$

Intuitively, $L^*$ contains all the words which can be formed by concatenating an arbitrary number of words in $L$. This includes the empty word since the number may be 0.

As an example consider $L = \{a, ab\} \subseteq \{a, b\}^*$:

    $L^* = \{\epsilon, a, ab, aa, aab, aba, \dots\}$

You should notice that we use the same symbol as in $\Sigma^*$ but there is a subtle difference: $\Sigma$ is a set of symbols but $L \subseteq \Sigma^*$ is a set of words.

Alternatively (and more abstractly) one may describe $L^*$ as the least language (wrt $\subseteq$) which contains $L$ and the empty word and is closed under concatenation:

    $w \in L^* \wedge v \in L^* \implies wv \in L^*$

We now define the semantics of regular expressions: to each regular expression $E$ over $\Sigma$ we assign a language $L(E) \subseteq \Sigma^*$. We do this by induction over the definition of the syntax:

1. $L(\emptyset) = \emptyset$
2. $L(\epsilon) = \{\epsilon\}$
3. $L(\mathbf{x}) = \{x\}$ where $x \in \Sigma$.
4. $L(E + F) = L(E) \cup L(F)$
5. $L(EF) = \{wv \mid w \in L(E) \wedge v \in L(F)\}$
6. $L(E^*) = L(E)^*$
7. $L((E)) = L(E)$
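The seven clauses above can be turned into a small evaluator. Since $L(E)$ is in general infinite, the Python sketch below only computes the words of $L(E)$ up to a given length n. The tuple encoding of regular expressions ("empty", "eps", "sym", "plus", "seq", "star") and the function name lang are assumptions made for this illustration; parentheses (clause 7) disappear because the nesting of the tuples already fixes the reading.

```python
def lang(e, n):
    """Words of L(e) of length at most n, following clauses 1 to 6."""
    tag = e[0]
    if tag == "empty":                      # clause 1
        return set()
    if tag == "eps":                        # clause 2
        return {""}
    if tag == "sym":                        # clause 3
        return {e[1]} if n >= 1 else set()
    if tag == "plus":                       # clause 4: union
        return lang(e[1], n) | lang(e[2], n)
    if tag == "seq":                        # clause 5: concatenation
        return {w + v for w in lang(e[1], n) for v in lang(e[2], n)
                if len(w + v) <= n}
    if tag == "star":                       # clause 6: Kleene star
        base = lang(e[1], n)
        result = {""}
        frontier = {""}
        while frontier:
            # extend the newly found words by one more word of the base language
            frontier = {w + v for w in frontier for v in base
                        if len(w + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression: {e!r}")


# The expression a*b* in the tuple encoding.
A_STAR_B_STAR = ("seq", ("star", ("sym", "a")), ("star", ("sym", "b")))
```

Restricting to a length bound is what makes the star case terminate: the loop stops as soon as no new word of length at most n can be formed.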

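Subtle points: in 1. the symbol $\emptyset$ may be used as a regular expression (as in $L(\emptyset)$) or as the empty set ($\emptyset = \{\}$). Similarly, $\epsilon$ in 2. may be a regular expression or a word, and in 6. $^*$ may be used to construct regular expressions or it is an operation on languages. Which alternative we mean becomes clear only from the context; there is no generally agreed mathematical notation to make this difference explicit (see footnote 1).

Footnote 1: This is different in programming, e.g. in JAVA we use "..." to signal that we mean things literally.

Let us now calculate what the examples of regular expressions from the previous section mean, i.e. what are the languages they define:

$\epsilon$

    By 2: $L(\epsilon) = \{\epsilon\}$

$\mathbf{hallo}$

    Let's just look at $L(\mathbf{ha})$. We know from 3:
        $L(\mathbf{h}) = \{h\}$
        $L(\mathbf{a}) = \{a\}$
    Hence by 5:
        $L(\mathbf{ha}) = \{wv \mid w \in \{h\} \wedge v \in \{a\}\} = \{ha\}$
    Continuing the same reasoning we obtain:
        $L(\mathbf{hallo}) = \{hallo\}$

$\mathbf{hallo} + \mathbf{hello}$

    From the previous point we know that:
        $L(\mathbf{hallo}) = \{hallo\}$
        $L(\mathbf{hello}) = \{hello\}$
    Hence by using 4 we get:
        $L(\mathbf{hallo} + \mathbf{hello}) = \{hallo\} \cup \{hello\} = \{hallo, hello\}$

$\mathbf{h}(\mathbf{a} + \mathbf{e})\mathbf{llo}$

    Using 3 and 4 we know
        $L(\mathbf{a} + \mathbf{e}) = \{a, e\}$
    Hence using 5 we obtain:
        $L(\mathbf{h}(\mathbf{a} + \mathbf{e})\mathbf{llo}) = \{uvw \mid u \in L(\mathbf{h}) \wedge v \in L(\mathbf{a} + \mathbf{e}) \wedge w \in L(\mathbf{llo})\}$
        $= \{uvw \mid u \in \{h\} \wedge v \in \{a, e\} \wedge w \in \{llo\}\}$
        $= \{hallo, hello\}$

$\mathbf{a}^*\mathbf{b}^*$

    Let us introduce the following notation:
        $w^i = \underbrace{ww \dots w}_{i \text{ times}}$
    Now using 6 we know that
        $L(\mathbf{a}^*) = L(\mathbf{a})^*$
        $= \{w_0 w_1 \dots w_{n-1} \mid n \in \mathbb{N} \wedge \forall i < n.\ w_i \in L(\mathbf{a})\}$
        $= \{w_0 w_1 \dots w_{n-1} \mid n \in \mathbb{N} \wedge \forall i < n.\ w_i \in \{a\}\}$
        $= \{a^n \mid n \in \mathbb{N}\}$
    and hence using 5 we conclude
        $L(\mathbf{a}^*\mathbf{b}^*) = \{uv \mid u \in L(\mathbf{a}^*) \wedge v \in L(\mathbf{b}^*)\}$
        $= \{uv \mid u \in \{a^n \mid n \in \mathbb{N}\} \wedge v \in \{b^m \mid m \in \mathbb{N}\}\}$
        $= \{a^n b^m \mid m, n \in \mathbb{N}\}$
    I.e. $L(\mathbf{a}^*\mathbf{b}^*)$ is the set of all words which start with a (possibly empty) sequence of as followed by a (possibly empty) sequence of bs.

$(\epsilon + \mathbf{b})(\mathbf{ab})^*(\epsilon + \mathbf{a})$

    Let's analyze the parts:
        $L(\epsilon + \mathbf{b}) = \{\epsilon, b\}$
        $L((\mathbf{ab})^*) = \{(ab)^i \mid i \in \mathbb{N}\}$
        $L(\epsilon + \mathbf{a}) = \{\epsilon, a\}$
    Hence, we have
        $L((\epsilon + \mathbf{b})(\mathbf{ab})^*(\epsilon + \mathbf{a})) = \{u(ab)^i v \mid u \in \{\epsilon, b\} \wedge i \in \mathbb{N} \wedge v \in \{\epsilon, a\}\}$
    In English: $L((\epsilon + \mathbf{b})(\mathbf{ab})^*(\epsilon + \mathbf{a}))$ is the set of (possibly empty) sequences of alternating as and bs.

3.3 Translating regular expressions to NFAs

Theorem 3.1 For each regular expression $E$ we can construct an NFA $N(E)$ s.t. $L(N(E)) = L(E)$, i.e. the automaton accepts the language described by the regular expression.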
Proof: We do this again by induction on the syntax of regular expressions:

1. $N(\emptyset)$:

    [Diagram: N(∅)]

which will reject everything (it has got no final states) and hence

    $L(N(\emptyset)) = \emptyset = L(\emptyset)$

2. $N(\epsilon)$:

    [Diagram: N(ε)]

This automaton accepts the empty word but rejects everything else, hence:

    $L(N(\epsilon)) = \{\epsilon\} = L(\epsilon)$

3. $N(\mathbf{x})$:

    [Diagram: N(x), with a single arrow labelled x]

This automaton only accepts the word $x$, hence:

    $L(N(\mathbf{x})) = \{x\} = L(\mathbf{x})$

4. $N(E + F)$:

We merge the diagrams for $N(E)$ and $N(F)$ into one:

    [Diagram: N(E+F), consisting of N(E) and N(F) side by side]

I.e. given

    $N(E) = (Q_E, \Sigma, \delta_E, S_E, F_E)$
    $N(F) = (Q_F, \Sigma, \delta_F, S_F, F_F)$

we now use the disjoint union operation $+$ on sets (see the MCS lecture notes [Alt01], section 4.1):

    $Q_{E+F} = Q_E + Q_F$
    $\delta_{E+F}((0, q), x) = \{(0, q') \mid q' \in \delta_E(q, x)\}$
    $\delta_{E+F}((1, q), x) = \{(1, q') \mid q' \in \delta_F(q, x)\}$
    $S_{E+F} = S_E + S_F$
    $F_{E+F} = F_E + F_F$
    $N(E + F) = (Q_{E+F}, \Sigma, \delta_{E+F}, S_{E+F}, F_{E+F})$

The disjoint union just signals that we are not going to identify states, even if they accidentally happen to have the same name.

Just thinking of the game with markers you should be able to convince yourself that

    $L(N(E + F)) = L(N(E)) \cup L(N(F))$

Moreover, to show that

    $L(N(E + F)) = L(E + F)$

we are allowed to assume that

    $L(N(E)) = L(E)$
    $L(N(F)) = L(F)$

That's what is meant by induction over the syntax of regular expressions. Now putting everything together:

    $L(N(E + F)) = L(N(E)) \cup L(N(F)) = L(E) \cup L(F) = L(E + F)$

5. $N(EF)$:

We want to put the two automata $N(E)$ and $N(F)$ in series. We do this by connecting the final states of $N(E)$ with the initial states of $N(F)$ in a way explained below.

    [Diagram: N(EF), with N(E) followed by N(F)]

In this diagram I only depicted one initial and one final state of each of the automata although there may be several of them.

Here is how we construct $N(EF)$ from $N(E)$ and $N(F)$:

    $N(E) = (Q_E, \Sigma, \delta_E, S_E, F_E)$
    $N(F) = (Q_F, \Sigma, \delta_F, S_F, F_F)$

The states of $N(EF)$ are the disjoint union of the states of $N(E)$ and $N(F)$:

    $Q_{EF} = Q_E + Q_F$

The transition function of $N(EF)$ contains all the transitions of $N(E)$ and $N(F)$ (as for $N(E + F)$), and for each state $q$ of $N(E)$ which has a transition to a final state of $N(E)$ we add a transition with the same label to all the initial states of $N(F)$:

    $\delta_{EF}((0, q), x) = \{(0, q') \mid q' \in \delta_E(q, x)\} \cup \{(1, q'') \mid \exists q'.\ q' \in \delta_E(q, x) \wedge q' \in F_E \wedge q'' \in S_F\}$
    $\delta_{EF}((1, q), x) = \{(1, q') \mid q' \in \delta_F(q, x)\}$

The initial states of $N(EF)$ are the initial states of $N(E)$, and the initial states of $N(F)$ if there is an initial state of $N(E)$ which is also a final state:

    $S_{EF} = \{(0, q) \mid q \in S_E\} \cup \{(1, q) \mid q \in S_F \wedge S_E \cap F_E \neq \{\}\}$

The final states of $N(EF)$ are the final states of $N(F)$:

    $F_{EF} = \{(1, q) \mid q \in F_F\}$

We now set

    $N(EF) = (Q_{EF}, \Sigma, \delta_{EF}, S_{EF}, F_{EF})$

I hope that you are able to convince yourself that

    $L(N(EF)) = \{uv \mid u \in L(N(E)) \wedge v \in L(N(F))\}$

and hence we can reason

    $L(N(EF)) = \{uv \mid u \in L(N(E)) \wedge v \in L(N(F))\}$
    $= \{uv \mid u \in L(E) \wedge v \in L(F)\}$
    $= L(EF)$

6. $N(E^*)$:

We construct $N(E^*)$ from $N(E)$ by merging initial and final states of $N(E)$ in a way similar to the previous construction, and we add a new state $\ast$ which is initial and final.

    [Diagram: N(E*), built from N(E) and the new state]

Given

    $N(E) = (Q_E, \Sigma, \delta_E, S_E, F_E)$

we construct $N(E^*)$. We add one extra state $\ast$:

    $Q_{E^*} = Q_E + \{\ast\}$

$N(E^*)$ inherits all transitions from $N(E)$, and for each state which has an arrow labelled $x$ to a final state we also add an arrow labelled $x$ to all the initial states. The new state has no outgoing arrows:

    $\delta_{E^*}((0, q), x) = \{(0, q') \mid q' \in \delta_E(q, x)\} \cup \{(0, q'') \mid \delta_E(q, x) \cap F_E \neq \{\} \wedge q'' \in S_E\}$
    $\delta_{E^*}((1, \ast), x) = \{\}$

The initial states of $N(E^*)$ are the initial states of $N(E)$ and $\ast$:

    $S_{E^*} = \{(0, q) \mid q \in S_E\} \cup \{(1, \ast)\}$

The final states of $N(E^*)$ are the final states of $N(E)$ and $\ast$:

    $F_{E^*} = \{(0, q) \mid q \in F_E\} \cup \{(1, \ast)\}$

We define

    $N(E^*) = (Q_{E^*}, \Sigma, \delta_{E^*}, S_{E^*}, F_{E^*})$

We claim that

    $L(N(E^*)) = \{w_0 w_1 \dots w_{n-1} \mid n \in \mathbb{N} \wedge \forall i < n.\ w_i \in L(N(E))\}$

since we can run through the automaton an arbitrary number of times. The new state allows us also to accept the empty sequence. Hence:

    $L(N(E^*)) = \{w_0 w_1 \dots w_{n-1} \mid n \in \mathbb{N} \wedge \forall i < n.\ w_i \in L(N(E))\}$
    $= L(N(E))^*$
    $= L(E)^*$
    $= L(E^*)$

7. $N((E)) = N(E)$

I.e. using brackets does not change anything.

As an example we construct $N(\mathbf{a}^*\mathbf{b}^*)$. First we construct $N(\mathbf{a})$:

    [Diagram: N(a)]

Now we have to apply the $^*$-construction and we obtain:

    [Diagram: N(a*)]

$N(\mathbf{b}^*)$ is just the same and we get

    [Diagram: N(a*) and N(b*)]

and now we have to serialize the two automata and we get:

    [Diagram: N(a*b*)]

Now, you may observe that this automaton, though correct, is unnecessarily complicated, since we could have just used

    [Diagram: a smaller automaton for a*b*]

However, we shall not be concerned with minimality at the moment.

3.4 Summing up...

From the previous section we know that a language given by a regular expression is also recognized by an NFA. What about the other way: can a language recognized by a finite automaton (DFA or NFA) also be described by a regular expression? The answer is yes:

Theorem 3.2 (Theorem 3.4, page 91 in [HMU01]) Given a DFA $A$ there is a regular expression $R(A)$ which describes the same language: $L(A) = L(R(A))$.

We omit the proof (which can be found in [HMU01] on pp. 91-93). However, we conclude:

Corollary 3.3 Given a language $L \subseteq \Sigma^*$ the following are equivalent:

1. $L$ is given by a regular expression.
2. $L$ is the language accepted by an NFA.
3. $L$ is the language accepted by a DFA.

Proof: We have that 1. $\implies$ 2. by Theorem 3.1. We know that 2. $\implies$ 3. by Theorem 2.2 and 3. $\implies$ 1. by Theorem 3.2.

As indicated in the introduction: the languages which are characterized by any of the three equivalent conditions are called regular languages or type-3-languages.
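The constructions from the proof of Theorem 3.1 can be sketched in Python. This is an illustration under stated assumptions, not the definitive construction: an NFA is represented as a tuple (states, delta, initial, final) with delta a dictionary from (state, symbol) to a set of states, the disjoint union is realised by tagging states with 0 or 1, and all function names are made up here. The shapes chosen for N(∅), N(ε) and N(x) are one possible reading, since the notes show them only as diagrams.

```python
def n_empty():
    return ({0}, {}, {0}, set())            # no final states: rejects everything


def n_eps():
    return ({0}, {}, {0}, {0})              # accepts only the empty word


def n_sym(x):
    return ({0, 1}, {(0, x): {1}}, {0}, {1})


def tag(t, nfa):
    """Rename every state q to (t, q), as in the disjoint union."""
    states, delta, init, final = nfa
    return ({(t, q) for q in states},
            {((t, q), x): {(t, r) for r in rs} for (q, x), rs in delta.items()},
            {(t, q) for q in init},
            {(t, q) for q in final})


def n_plus(e, f):
    (qe, de, se, fe), (qf, df, sf, ff) = tag(0, e), tag(1, f)
    return (qe | qf, {**de, **df}, se | sf, fe | ff)


def n_seq(e, f):
    (qe, de, se, fe), (qf, df, sf, ff) = tag(0, e), tag(1, f)
    delta = {**de, **df}
    for key, targets in de.items():
        if targets & fe:                    # arrow into a final state of N(E)
            delta[key] = targets | sf       # ... also leads to the start of N(F)
    init = se | (sf if se & fe else set())
    return (qe | qf, delta, init, ff)


def n_star(e):
    qe, de, se, fe = tag(0, e)
    new = (1, "new")                        # extra state, initial and final
    delta = {key: (targets | se if targets & fe else targets)
             for key, targets in de.items()}
    return (qe | {new}, delta, se | {new}, fe | {new})


def accepts(nfa, word):
    """The marker game: track the set of marked states."""
    _, delta, current, final = nfa
    for x in word:
        current = set().union(*(delta.get((q, x), set()) for q in current))
    return bool(current & final)
```

For example, n_seq(n_star(n_sym("a")), n_star(n_sym("b"))) builds an automaton for a*b*; like the one constructed by hand above, it is correct but larger than necessary.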

4 Showing that a language is not regular

Regular languages are languages which can be recognized by a computer with finite (i.e. fixed) memory. Such a computer corresponds to a DFA. However, there are many languages which cannot be recognized using only finite memory. A simple example is the language

    $L = \{0^n 1^n \mid n \in \mathbb{N}\}$

i.e. the language of words which start with a number of 0s followed by the same number of 1s. Note that this is different from $L(\mathbf{0}^*\mathbf{1}^*)$, which is the language of words consisting of a sequence of 0s followed by a sequence of 1s where the numbers need not be identical (and which we know to be regular because it is given by a regular expression).

Why can $L$ not be recognized by a computer with fixed finite memory? Assume we have 32 Megabytes of memory, that is a fixed number of bits, say $m$. Such a computer corresponds to an enormous DFA with $2^m$ states (imagine you have to draw the transition diagram). However, the computer can only count up to $2^m$; if we feed it any more 0s in the beginning it will get confused! Hence, you need an unbounded amount of memory to recognize $L$.

We shall now show a general theorem called the pumping lemma which allows us to prove that a certain language is not regular.

4.1 The pumping lemma

Theorem 4.1 Given a regular language $L$, there is a number $n \in \mathbb{N}$ such that all words $w \in L$ which are at least as long as $n$ ($|w| \geq n$) can be split into three words $w = xyz$ s.t.

1. $y \neq \epsilon$
2. $|xy| \leq n$
3. for all $k \in \mathbb{N}$ we have $xy^k z \in L$.

Proof: For a regular language $L$ there exists a DFA $A$ s.t. $L = L(A)$. Let us assume that $A$ has got $n$ states. Now if $A$ accepts a word $w$ with $|w| \geq n$ it must have visited a state $q$ twice:

    [Diagram: x leads from the initial state to q, y is a cycle from q back to q, z leads from q to an accepting state.]

We choose $q$ s.t. it is the first cycle, hence $|xy| \leq n$. We also know that $y$ is non-empty (otherwise there is no cycle).

Now, consider what happens if we feed a word of the form $xy^i z$ to the automaton, i.e. instead of $y$ it contains an arbitrary number of repetitions of $y$, including the case $i = 0$, i.e. $y$ is just left out. The automaton has to accept all such words and hence $xy^i z \in L$.

4.2 Applying the pumping lemma

Theorem 4.2 The language $L = \{0^n 1^n \mid n \in \mathbb{N}\}$ is not regular.

Proof: Assume $L$ would be regular. We will show that this leads to a contradiction using the pumping lemma.

Now by the pumping lemma there is an $n$ such that we can split each word which is at least as long as $n$ such that the properties given by the pumping lemma hold. Consider $0^n 1^n \in L$; this is certainly long enough. We have that $xyz = 0^n 1^n$ and we know that $|xy| \leq n$, hence $y$ can only contain 0s, and since $y \neq \epsilon$ it must contain at least one 0. Now according to the pumping lemma $xy^0 z = xz \in L$, but this cannot be the case because it contains at least one 0 less but the same number of 1s as $0^n 1^n$.

Hence, our assumption that $L$ is regular must have been wrong.

It is easy to see that the language

    $\{1^n \mid n \text{ is even}\}$

is regular (just construct the appropriate DFA or use a regular expression). However what about

    $\{1^n \mid n \text{ is a square}\}$

where by saying $n$ is a square we mean that there is a $k \in \mathbb{N}$ s.t. $n = k^2$. We may try as we like, there is no way to find out whether we have got a square number of 1s by only using finite memory. And indeed:

Theorem 4.3 The language $L = \{1^n \mid n \text{ is a square}\}$ is not regular.

Proof: We apply the same strategy as above. Assume $L$ is regular; then there is a number $n$ such that we can split all longer words according to the pumping lemma. Let's take $w = 1^{n^2}$; this is certainly long enough. By the pumping lemma we know that we can split $w = xyz$ s.t. the conditions of the pumping lemma hold. In particular we know that

    $1 \leq |y| \leq |xy| \leq n$

Using the 3rd condition we know that

    $xyyz \in L$

that is, $|xyyz|$ is a square. However we know that

$$\begin{aligned}
n^2 &= |w|\\
&= |xyz|\\
&< |xyyz| && \text{since } 1 \leq |y|\\
&= |xyz| + |y|\\
&\leq n^2 + n && \text{since } |y| \leq n\\
&< n^2 + 2n + 1\\
&= (n + 1)^2
\end{aligned}$$

To summarize we have

    $n^2 < |xyyz| < (n + 1)^2$

That is, $|xyyz|$ lies between two subsequent squares. But then it cannot be a square itself, and hence we have a contradiction to $xyyz \in L$. We conclude $L$ is not regular.

Given a word $w \in \Sigma^*$ we write $w^R$ for the word read backwards. I.e. $abc^R = cba$. Formally this can be defined as

    $\epsilon^R = \epsilon$
    $(xw)^R = w^R x$

We use this to define the language of even length palindromes

    $L_{pali} = \{ww^R \mid w \in \Sigma^*\}$

I.e. for $\Sigma = \{a, b\}$ we have $abba \in L_{pali}$. Using the intuition that finite automata can only use finite memory it should be clear that this language is not regular, because one has to remember the first half of the word to check whether the 2nd half is the same word read backwards. Indeed, we can show:

Theorem 4.4 Given $\Sigma = \{a, b\}$ we have that $L_{pali}$ is not regular.

Proof: We use the pumping lemma: we assume that $L_{pali}$ is regular. Now given a pumping number $n$ we construct $w = a^n b b a^n \in L_{pali}$; this word is certainly longer than $n$. From the pumping lemma we know that there is a splitting of the word $w = xyz$ s.t. $|xy| \leq n$, and hence $y$ may only contain as, and since $y \neq \epsilon$ at least one. We conclude that $xz \in L_{pali}$ where $xz = a^m b b a^n$ with $m < n$. However, this word cannot be an even length palindrome, since the two bs are not in the middle of the word. Hence our assumption that $L_{pali}$ is regular must be wrong.

The proof works for any alphabet with at least 2 different symbols. However, if $\Sigma$ contains only one symbol, as in $\Sigma = \{1\}$, then $L_{pali}$ is the language of an even number of 1s and this is regular: $L_{pali} = L((\mathbf{11})^*)$.

5 Context free grammars

We will now introduce context free grammars (CFGs) as a formalism to define languages. CFGs are more general than regular expressions, i.e. there are more languages definable by CFGs (called type-2-languages). We will define the corresponding notion of automata, the push down automata (PDA), later.

5.1 What is a context-free grammar?

A context-free grammar $G = (V, \Sigma, S, P)$ is given by

    A finite set $V$ of variables or nonterminal symbols.
    A finite set $\Sigma$ of symbols or terminal symbols. We assume that the sets $V$ and $\Sigma$ are disjoint.
    A start symbol $S \in V$.
    A finite set $P \subseteq V \times (V \cup \Sigma)^*$ of productions.
A production (A, α), where A ∈ V and α ∈ (V ∪ Σ)* is a sequence of terminals and variables, is written as A → α.

As an example we define a grammar for the language of arithmetical expressions over a (using only + and *), i.e. elements of this language are a + (a * a) or (a + a) * (a + a). However, words like a + + a or )(a are not in the language. We define G = ({E, T, F}, {(, ), a, +, *}, E, P) where P is given by:

    P = {E → T
         E → E + T
         T → F
         T → T * F
         F → a
         F → (E)}

To save space we may combine all the rules with the same left hand side, i.e. we write

    P = {E → T | E + T
         T → F | T * F
         F → a | (E)}

5.2 The language of a grammar

How do we check whether a word w ∈ Σ* is in the language of the grammar? We start with the start symbol E and repeatedly use a production A → α to replace a nonterminal symbol A by α until we have no nonterminal symbols left. This is called a derivation. I.e. in the example G:

    E ⇒_G E + T
      ⇒_G T + T
      ⇒_G F + T
      ⇒_G a + T
      ⇒_G a + F
      ⇒_G a + (E)
      ⇒_G a + (T)
      ⇒_G a + (T * F)
      ⇒_G a + (F * F)
      ⇒_G a + (a * F)
      ⇒_G a + (a * a)

Note that ⇒_G here stands for the relation "derives in one step" and has nothing to do with implication. In the example we have always replaced the leftmost nonterminal symbol (hence it is called a leftmost derivation) but this is not necessary.

Given any grammar G = (V, Σ, S, P) we define the relation "derives in one step":

    ⇒_G ⊆ (V ∪ Σ)* × (V ∪ Σ)*
    αAγ ⇒_G αβγ  :⟺  A → β ∈ P

The relation "derives" is defined as

    ⇒*_G ⊆ (V ∪ Σ)* × (V ∪ Σ)*
    α0 ⇒*_G αn  :⟺  α0 ⇒_G α1 ⇒_G ... αn−1 ⇒_G αn

This includes the case α ⇒*_G α because n can be 0. (In other words, ⇒*_G is the transitive-reflexive closure of ⇒_G.)

We now say that the language of a grammar, L(G) ⊆ Σ*, is given by all words (over Σ) which can be derived from the start symbol in any number of steps, i.e.

    L(G) = {w ∈ Σ* | S ⇒*_G w}

A language which can be given by a context-free grammar is called a context-free language (CFL).

5.3 More examples

Some of the languages which we have shown not to be regular are actually context-free. The language {0^n 1^n | n ∈ N} is given by the following grammar:

    G = ({S}, {0, 1}, S, {S → ɛ | 0S1})

Also the language of palindromes {ww^R | w ∈ {a, b}*} turns out to be context-free:

    G = ({S}, {a, b}, S, {S → ɛ | aSa | bSb})

We also note that

Theorem 5.1 All regular languages are context-free.

We don't give a proof here. The idea is that regular expressions can be translated into (special) context-free grammars, e.g. a*b* can be translated into

    G = ({A, B}, {a, b}, A, {A → aA | B, B → bB | ɛ})

(Extensions of) context free grammars are used in computational linguistics to describe real languages. As an example consider

    Σ = {the, dog, cat, which, bites, barks, catches}

We use the grammar G = ({S, N, NP, VI, VT, VP}, Σ, S, P) where P is given by

    S  → NP VP
    N  → cat | dog
    NP → the N | NP which VP
    VI → barks | bites
    VT → bites | catches
    VP → VI | VT NP

which allows us to derive interesting sentences like

    the dog which catches the cat which bites barks

An important example for context-free languages is the syntax of programming languages. We have already mentioned the appendix of [GJSB00], which uses a formalism slightly different from the one introduced here. Another example is the toy language used in the compilers course ([Bac02], see Mini-Triangle Concrete and Abstract Syntax).

However, note that not all syntactic aspects of programming languages are captured by the context free grammar, e.g. the fact that a variable has to be declared before it is used and the type correctness of expressions are not captured.
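The two one-variable grammars of section 5.3, S → ɛ | 0S1 and S → ɛ | aSa | bSb, are simple enough that a membership test can follow the productions directly. The following Java sketch is only an illustration; the class name TinyGrammars and its method names are made up for this example. Each method either applies S → ɛ (the word is empty) or strips the outer symbols required by the other productions and recurses on the middle part.

```java
class TinyGrammars {
    // S -> epsilon | 0 S 1 : the language { 0^n 1^n | n in N }
    static boolean zerosThenOnes(String w) {
        if (w.isEmpty()) return true;                          // S -> epsilon
        return w.length() >= 2
            && w.charAt(0) == '0'
            && w.charAt(w.length() - 1) == '1'
            && zerosThenOnes(w.substring(1, w.length() - 1)); // S -> 0 S 1
    }

    // S -> epsilon | a S a | b S b : even length palindromes over {a, b}
    static boolean evenPalindrome(String w) {
        if (w.isEmpty()) return true;                          // S -> epsilon
        if (w.length() < 2) return false;
        char first = w.charAt(0);
        char last = w.charAt(w.length() - 1);
        return first == last
            && (first == 'a' || first == 'b')
            && evenPalindrome(w.substring(1, w.length() - 1)); // S -> a S a | b S b
    }

    public static void main(String[] args) {
        System.out.println(zerosThenOnes("0011"));
        System.out.println(evenPalindrome("abba"));
    }
}
```

This direct translation only works because in both grammars the outermost symbols of a word determine which production must have been used; for grammars like the one for arithmetical expressions more work is needed.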

5.4 Parse trees

With each derivation we also associate a derivation tree, which shows the structure of the derivation. As an example consider the tree associated with the derivation of a + (a * a) given before:

            E
          / | \
         E  +  T
         |     |
         T     F
         |   / | \
         F  (  E  )
         |     |
         a     T
             / | \
            T  *  F
            |     |
            F     a
            |
            a

The top of the tree (called its root) is labelled with the start symbol, the other inner nodes are labelled with nonterminal symbols and the leaves are labelled by terminal symbols. The word which is derived can be read off from the leaves of the tree. The important property of a parse tree is that each internal node together with its children corresponds to a production in the grammar.

5.5 Ambiguity

We say that a grammar G is ambiguous if there is a word w ∈ L(G) for which there is more than one parse tree. This is usually a bad thing because it entails that there is more than one way to interpret a word (i.e. it leads to semantical ambiguity). As an example consider the following alternative grammar for arithmetical expressions: We define G' = ({E}, {(, ), a, +, *}, E, P') where P' is given by:

    P' = {E → E + E | E * E | a | (E)}

This grammar is shorter and requires only one variable instead of 3. Moreover it generates the same language, i.e. we have L(G) = L(G'). But it is ambiguous: Consider a + a * a, for which we have the following parse trees:

               E                     E
             / | \                 / | \
            E  *  E               E  +  E
          / | \   |               |   / | \
         E  +  E  a               a  E  *  E
         |     |                     |     |
         a     a                     a     a

Each parse tree corresponds to a different way to read the expression, i.e. the first one corresponds to (a + a) * a and the second one to a + (a * a). Depending on which one is chosen an expression like 2 + 2 * 3 may evaluate to 12 or to 8. Informally, we agree that * binds more than + and hence the 2nd reading is the intended one. This is actually achieved by the first grammar, which only allows the 2nd reading:

         E
       / | \
      E  +  T
      |   / | \
      T  T  *  F
      |  |     |
      F  F     a
      |  |
      a  a
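The effect of the two readings can be checked by building both parse trees as data and evaluating them. The sketch below is an illustration, not part of the course code: the class Readings and its helper methods are invented here, and the numbers 2, 2 and 3 are chosen as the concrete instance of a + a * a because they give the two values 12 and 8 mentioned in the text.

```java
class Readings {
    // An expression tree that can be evaluated to a number.
    interface Expr { int eval(); }

    static Expr num(int n) { return () -> n; }
    static Expr add(Expr l, Expr r) { return () -> l.eval() + r.eval(); }
    static Expr mul(Expr l, Expr r) { return () -> l.eval() * r.eval(); }

    // First parse tree: the root is a product, i.e. (2 + 2) * 3.
    static Expr firstReading() { return mul(add(num(2), num(2)), num(3)); }

    // Second parse tree: the root is a sum, i.e. 2 + (2 * 3).
    static Expr secondReading() { return add(num(2), mul(num(2), num(3))); }

    public static void main(String[] args) {
        System.out.println(firstReading().eval());   // prints 12
        System.out.println(secondReading().eval());  // prints 8
    }
}
```

The same word thus gets two different meanings, which is exactly why an ambiguous grammar is a poor basis for an evaluator.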

6 Pushdown Automata

We will now consider a new notion of automata: Pushdown Automata (PDA). PDAs are finite automata with a stack, i.e. a data structure which can be used to store an arbitrary number of symbols (hence a PDA, counting the stack contents, has infinitely many possible configurations) but which can only be accessed in a last-in-first-out (LIFO) fashion. The languages which can be recognized by PDAs are precisely the context free languages.

6.1 What is a Pushdown Automaton?

A Pushdown Automaton P = (Q, Σ, Γ, δ, q0, Z0, F) is given by the following data:

- A finite set Q of states,
- A finite set Σ of symbols (the alphabet),
- A finite set Γ of stack symbols,
- A transition function δ ∈ Q × (Σ ∪ {ɛ}) × Γ → P_fin(Q × Γ*). Here P_fin(A) are the finite subsets of a set A, i.e. this can be defined as P_fin(A) = {X | X ⊆ A ∧ X is finite},
- An initial state q0 ∈ Q,
- An initial stack symbol Z0 ∈ Γ,
- A set of final states F ⊆ Q.

As an example we consider a PDA P0 which recognizes the language of even length palindromes over Σ = {0, 1}: L = {ww^R | w ∈ {0, 1}*}. Intuitively, this PDA pushes the input symbols on the stack until it guesses that it is in the middle, and then it compares the input with what is on the stack, popping off symbols from the stack as it goes. If it reaches the end of the input precisely at the time when only the initial stack symbol # is left on the stack, it removes # and accepts.

    P0 = ({q0, q1, q2}, {0, 1}, {0, 1, #}, δ, q0, #, {q2})

where δ is given by the following equations:

    δ(q0, 0, #) = {(q0, 0#)}
    δ(q0, 1, #) = {(q0, 1#)}
    δ(q0, 0, 0) = {(q0, 00)}
    δ(q0, 1, 0) = {(q0, 10)}
    δ(q0, 0, 1) = {(q0, 01)}
    δ(q0, 1, 1) = {(q0, 11)}
    δ(q0, ɛ, #) = {(q1, #)}
    δ(q0, ɛ, 0) = {(q1, 0)}
    δ(q0, ɛ, 1) = {(q1, 1)}
    δ(q1, 0, 0) = {(q1, ɛ)}
    δ(q1, 1, 1) = {(q1, ɛ)}
    δ(q1, ɛ, #) = {(q2, ɛ)}
    δ(q, w, z)  = {}   everywhere else

To save space we may abbreviate this by writing:

    δ(q0, x, z) = {(q0, xz)}
    δ(q0, ɛ, z) = {(q1, z)}
    δ(q1, x, x) = {(q1, ɛ)}
    δ(q1, ɛ, #) = {(q2, ɛ)}
    δ(q, x, z)  = {}   everywhere else

where q ∈ Q, x ∈ Σ, z ∈ Γ. We obtain the previous table by expanding all the possibilities for q, x, z.
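The expansion of the abbreviated equations into the full table can be done mechanically. The following Java sketch does this for P0; the class name DeltaTable and the textual format of the rows are choices made for this illustration, and "eps" is written for ɛ.

```java
import java.util.ArrayList;
import java.util.List;

class DeltaTable {
    // Expands the schematic equations for P0 into the list of its non-empty entries.
    static List<String> expand() {
        char[] sigma = {'0', '1'};        // input alphabet
        char[] gamma = {'0', '1', '#'};   // stack alphabet
        List<String> table = new ArrayList<>();
        // delta(q0, x, z) = {(q0, xz)} for all x in Sigma, z in Gamma
        for (char z : gamma)
            for (char x : sigma)
                table.add("delta(q0, " + x + ", " + z + ") = {(q0, " + x + z + ")}");
        // delta(q0, eps, z) = {(q1, z)} for all z in Gamma
        for (char z : gamma)
            table.add("delta(q0, eps, " + z + ") = {(q1, " + z + ")}");
        // delta(q1, x, x) = {(q1, eps)} for all x in Sigma
        for (char x : sigma)
            table.add("delta(q1, " + x + ", " + x + ") = {(q1, eps)}");
        // delta(q1, eps, #) = {(q2, eps)}
        table.add("delta(q1, eps, #) = {(q2, eps)}");
        return table;
    }

    public static void main(String[] args) {
        for (String row : expand()) System.out.println(row);
    }
}
```

The list has 6 + 3 + 2 + 1 = 12 rows, one for each non-empty entry of the explicit table.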
We draw the transition diagram of P0 by labelling each transition with a triple x, Z, γ with x ∈ Σ ∪ {ɛ}, Z ∈ Γ, γ ∈ Γ*:

[Transition diagram of P0: a loop on q0 labelled x,z,xz; an edge from q0 to q1 labelled ɛ,z,z; a loop on q1 labelled x,x,ɛ; an edge from q1 to the final state q2 labelled ɛ,#,ɛ.]

6.2 How does a PDA work?

At any time the state of the computation of a PDA P = (Q, Σ, Γ, δ, q0, Z0, F) is given by:

- the state q ∈ Q the PDA is in,
- the input string w ∈ Σ* which still has to be processed,
- the contents of the stack γ ∈ Γ*.

Such a triple (q, w, γ) ∈ Q × Σ* × Γ* is called an Instantaneous Description (ID). We define a relation ⊢_P ⊆ ID × ID between IDs which describes how the PDA can change from one ID to the next one. Since PDAs in general are nondeterministic this is a relation (not a function), i.e. there may be more than one possibility. There are two possibilities for ⊢_P:

1. (q, xw, zγ) ⊢_P (q', w, αγ) if (q', α) ∈ δ(q, x, z)
2. (q, w, zγ) ⊢_P (q', w, αγ) if (q', α) ∈ δ(q, ɛ, z)

In the first case the PDA reads an input symbol x and consults the transition function δ to calculate a possible new state q' and a sequence of stack symbols α which replaces the current symbol z on the top of the stack. In the second case the PDA ignores the input and silently moves into a new state and modifies the stack as above. The input is unchanged.

Consider the word 0110. What are possible sequences of IDs for P0 starting with (q0, 0110, #)?

    (q0, 0110, #) ⊢_P0 (q0, 110, 0#)    1. with (q0, 0#) ∈ δ(q0, 0, #)
                  ⊢_P0 (q0, 10, 10#)    1. with (q0, 10) ∈ δ(q0, 1, 0)
                  ⊢_P0 (q1, 10, 10#)    2. with (q1, 1) ∈ δ(q0, ɛ, 1)
                  ⊢_P0 (q1, 0, 0#)      1. with (q1, ɛ) ∈ δ(q1, 1, 1)
                  ⊢_P0 (q1, ɛ, #)       1. with (q1, ɛ) ∈ δ(q1, 0, 0)
                  ⊢_P0 (q2, ɛ, ɛ)       2. with (q2, ɛ) ∈ δ(q1, ɛ, #)

We write (q, w, γ) ⊢*_P (q', w', γ') if the PDA can move from (q, w, γ) to (q', w', γ') in a (possibly empty) sequence of moves. Above we have shown that

    (q0, 0110, #) ⊢*_P0 (q2, ɛ, ɛ).

However, this is not the only possible sequence of IDs for this input. E.g. the PDA may just guess the middle wrong:

    (q0, 0110, #) ⊢_P0 (q0, 110, 0#)    1. with (q0, 0#) ∈ δ(q0, 0, #)
                  ⊢_P0 (q1, 110, 0#)    2. with (q1, 0) ∈ δ(q0, ɛ, 0)

We have shown (q0, 0110, #) ⊢*_P0 (q1, 110, 0#). Here the PDA gets stuck: there is no ID which follows (q1, 110, 0#).

If we start with a word which is not in the language L (like 0011) then the automaton will always get stuck before it has both consumed the whole input and reached a final state.

6.3 The language of a PDA

There are two ways to define the language L(P) ⊆ Σ* of a PDA P = (Q, Σ, Γ, δ, q0, Z0, F), because there are two notions of acceptance:

Acceptance by final state

    L(P) = {w | (q0, w, Z0) ⊢*_P (q, ɛ, γ) ∧ q ∈ F}

That is, the PDA accepts the word w if there is any sequence of IDs starting from (q0, w, Z0) and leading to (q, ɛ, γ), where q ∈ F is one of the final states. Here it doesn't play a role what the contents of the stack are at the end.

In our example the PDA P0 would accept 0110 because (q0, 0110, #) ⊢*_P0 (q2, ɛ, ɛ) and q2 ∈ F. Hence we conclude 0110 ∈ L(P0). On the other hand, since there is no successful sequence of IDs starting with (q0, 0011, #) we know that 0011 ∉ L(P0).

Acceptance by empty stack

    L(P) = {w | (q0, w, Z0) ⊢*_P (q, ɛ, ɛ)}

That is, the PDA accepts the word w if there is any sequence of IDs starting from (q0, w, Z0) and leading to (q, ɛ, ɛ); in this case the final state plays no role. If we specify a PDA for acceptance by empty stack we will leave out the set of final states F and just use P = (Q, Σ, Γ, δ, q0, Z0).

Our example automaton P0 also works if we leave out F and use acceptance by empty stack.

We can always turn a PDA which uses one acceptance method into one which uses the other. Hence, both acceptance criteria specify the same class of languages.
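Because acceptance asks whether there is any successful sequence of IDs, a simulation of a nondeterministic PDA has to explore all alternatives. The Java sketch below does this for the example automaton P0 by a search over the IDs reachable from (q0, w, #); it supports both notions of acceptance. The class PalindromePda, the encoding of states as the numbers 0, 1, 2 and the representation of the stack as a string with the top at the front are assumptions of this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PalindromePda {
    // An instantaneous description (q, w, gamma); the top of the stack is the first character.
    static final class Id {
        final int state;
        final String input;
        final String stack;
        Id(int state, String input, String stack) {
            this.state = state; this.input = input; this.stack = stack;
        }
        String key() { return state + "|" + input + "|" + stack; }
    }

    // All IDs which P0 can reach from id in one move.
    static List<Id> step(Id id) {
        List<Id> next = new ArrayList<>();
        if (id.stack.isEmpty()) return next;              // no top symbol, hence no move
        char z = id.stack.charAt(0);
        String rest = id.stack.substring(1);
        boolean hasInput = !id.input.isEmpty();
        char x = hasInput ? id.input.charAt(0) : ' ';
        boolean bit = hasInput && (x == '0' || x == '1');
        if (id.state == 0) {
            next.add(new Id(1, id.input, id.stack));                             // delta(q0, eps, z)
            if (bit) next.add(new Id(0, id.input.substring(1), x + id.stack));   // delta(q0, x, z)
        } else if (id.state == 1) {
            if (bit && x == z) next.add(new Id(1, id.input.substring(1), rest)); // delta(q1, x, x)
            if (z == '#') next.add(new Id(2, id.input, rest));                   // delta(q1, eps, #)
        }
        return next;
    }

    // Searches all sequences of IDs starting from (q0, w, #).
    static boolean accepts(String w, boolean byEmptyStack) {
        Deque<Id> todo = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        todo.push(new Id(0, w, "#"));
        while (!todo.isEmpty()) {
            Id id = todo.pop();
            if (!seen.add(id.key())) continue;            // this ID has been explored already
            boolean good = byEmptyStack ? id.stack.isEmpty() : id.state == 2;
            if (id.input.isEmpty() && good) return true;
            for (Id n : step(id)) todo.push(n);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("0110", false));
        System.out.println(accepts("0011", false));
    }
}
```

The search terminates because P0 only pushes a symbol when it reads one, so only finitely many IDs are reachable from a given start ID. For P0 both acceptance modes give the same answers, in line with the remark above.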
6.4 Deterministic PDAs

We have introduced PDAs as nondeterministic machines which may have several alternatives how to continue. We now define Deterministic Pushdown Automata (DPDA) as those which never have a choice.

To be precise, we say that a PDA P = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic (is a DPDA) iff for all q ∈ Q, x ∈ Σ, z ∈ Γ:

    |δ(q, x, z)| + |δ(q, ɛ, z)| ≤ 1

Remember that |X| stands for the number of elements in a finite set X.

That is: a DPDA may get stuck but it never has any choice.

In our example the automaton P0 is not deterministic, e.g. we have δ(q0, 0, #) = {(q0, 0#)} and δ(q0, ɛ, #) = {(q1, #)} and hence |δ(q0, 0, #)| + |δ(q0, ɛ, #)| = 2.

Unlike the situation for finite automata, there is in general no way to translate a nondeterministic PDA into a deterministic one. Indeed, there is no DPDA which recognizes the language L! Nondeterministic PDAs are more powerful than deterministic PDAs.

However, we can define a similar language L' over Σ = {0, 1, $} which can be recognized by a deterministic PDA:

    L' = {w$w^R | w ∈ {0, 1}*}

That is, L' contains palindromes with a marker $ in the middle, e.g. 01$10 ∈ L'. We define a DPDA P1 for L':

    P1 = ({q0, q1, q2}, {0, 1, $}, {0, 1, #}, δ', q0, #, {q2})

where δ' is given by:

    δ'(q0, x, z) = {(q0, xz)}   for x ∈ {0, 1}
    δ'(q0, $, z) = {(q1, z)}
    δ'(q1, x, x) = {(q1, ɛ)}
    δ'(q1, ɛ, #) = {(q2, ɛ)}
    δ'(q, x, z)  = {}   everywhere else

Here is its transition graph:

[Transition graph of P1: a loop on q0 labelled x,z,xz; an edge from q0 to q1 labelled $,z,z; a loop on q1 labelled x,x,ɛ; an edge from q1 to the final state q2 labelled ɛ,#,ɛ.]

We can check that this automaton is deterministic. In particular the 3rd and 4th line cannot overlap because # is not an input symbol.

In contrast to PDAs in general, the two acceptance methods are not equivalent for DPDAs: acceptance by final state makes it possible to define a bigger class of languages. Hence, we shall always use acceptance by final state for DPDAs.
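Since a DPDA never has a choice, it can be run by a simple loop with an explicit stack, without any search. The following Java sketch simulates P1 in this way; the class name MarkedPalindromeDpda and the encoding of the states as 0, 1, 2 are choices of this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MarkedPalindromeDpda {
    // Runs P1 on w and reports acceptance by final state.
    static boolean accepts(String w) {
        Deque<Character> stack = new ArrayDeque<>();
        stack.push('#');                       // initial stack symbol
        int state = 0;                         // q0
        for (int i = 0; i < w.length(); i++) {
            char x = w.charAt(i);
            if (state == 0 && (x == '0' || x == '1')) {
                stack.push(x);                 // delta'(q0, x, z) = {(q0, xz)}
            } else if (state == 0 && x == '$') {
                state = 1;                     // delta'(q0, $, z) = {(q1, z)}
            } else if (state == 1 && (x == '0' || x == '1') && stack.peek() == x) {
                stack.pop();                   // delta'(q1, x, x) = {(q1, eps)}
            } else {
                return false;                  // no transition applies: the DPDA is stuck
            }
        }
        if (state == 1 && stack.peek() == '#') {
            stack.pop();                       // delta'(q1, eps, #) = {(q2, eps)}
            state = 2;
        }
        return state == 2;                     // q2 is the only final state
    }

    public static void main(String[] args) {
        System.out.println(accepts("01$10"));
        System.out.println(accepts("01$01"));
    }
}
```

The marker $ tells the machine where the middle is, which is exactly the information P0 had to guess.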

6.5 Context free grammars and push-down-automata

Theorem 6.1 For a language L ⊆ Σ* the following is equivalent:

1. L is given by a CFG G, L = L(G).
2. L is the language of a PDA P, L = L(P).

To summarize: Context Free Languages (CFLs) can be described by a Context Free Grammar (CFG) and can be processed by a pushdown automaton.

We will here only show how to construct a PDA from a grammar; the other direction is shown in [HMU01] (6.3.2, pp. 241).

Given a CFG G = (V, Σ, S, P), we define a PDA

    P(G) = ({q0}, Σ, V ∪ Σ, δ, q0, S)

where δ is defined as follows:

    δ(q0, ɛ, A) = {(q0, α) | A → α ∈ P}   for all A ∈ V.
    δ(q0, a, a) = {(q0, ɛ)}               for all a ∈ Σ.

We haven't given a set of final states because we use acceptance by empty stack. Yes, we use only one state!

Take as an example

    G = ({E, T, F}, {(, ), a, +, *}, E, P)

where

    P = {E → T | E + T
         T → F | T * F
         F → a | (E)}

we define

    P(G) = ({q0}, {(, ), a, +, *}, {E, T, F, (, ), a, +, *}, δ, q0, E)

with

    δ(q0, ɛ, E) = {(q0, T), (q0, E+T)}
    δ(q0, ɛ, T) = {(q0, F), (q0, T*F)}
    δ(q0, ɛ, F) = {(q0, a), (q0, (E))}
    δ(q0, (, () = {(q0, ɛ)}
    δ(q0, ), )) = {(q0, ɛ)}
    δ(q0, a, a) = {(q0, ɛ)}
    δ(q0, +, +) = {(q0, ɛ)}
    δ(q0, *, *) = {(q0, ɛ)}

How does P(G) accept a+(a*a)?

    (q0, a+(a*a), E) ⊢ (q0, a+(a*a), E+T)
                     ⊢ (q0, a+(a*a), T+T)
                     ⊢ (q0, a+(a*a), F+T)
                     ⊢ (q0, a+(a*a), a+T)
                     ⊢ (q0, +(a*a), +T)
                     ⊢ (q0, (a*a), T)
                     ⊢ (q0, (a*a), F)
                     ⊢ (q0, (a*a), (E))
                     ⊢ (q0, a*a), E))
                     ⊢ (q0, a*a), T))
                     ⊢ (q0, a*a), T*F))
                     ⊢ (q0, a*a), F*F))
                     ⊢ (q0, a*a), a*F))
                     ⊢ (q0, *a), *F))
                     ⊢ (q0, a), F))
                     ⊢ (q0, a), a))
                     ⊢ (q0, ), ))
                     ⊢ (q0, ɛ, ɛ)

Hence a+(a*a) ∈ L(P(G)).

This example hopefully already illustrates the general idea:

    w ∈ L(G)  ⟺  S ⇒ ... ⇒ w
              ⟺  (q0, w, S) ⊢* (q0, ɛ, ɛ)
              ⟺  w ∈ L(P(G))

The automaton we have constructed is very non-deterministic: Whenever we have a choice between different rules the automaton may silently choose one of the alternatives.
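The one-state automaton P(G) can be simulated by a search over its IDs (remaining input, stack). The Java sketch below does this for the expression grammar with productions E → T | E+T, T → F | T*F, F → a | (E). It is an illustration under one extra assumption that is not part of the construction: since this grammar has no ɛ-productions, every stack symbol must still produce at least one input symbol, so IDs whose stack is longer than the remaining input are discarded. Without such a bound the left-recursive rule E → E+T would let the stack grow forever. The class name GrammarPda is made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class GrammarPda {
    // The productions of G; the keys of the map are the nonterminal symbols.
    static final Map<Character, String[]> P = new HashMap<>();
    static {
        P.put('E', new String[] {"T", "E+T"});
        P.put('T', new String[] {"F", "T*F"});
        P.put('F', new String[] {"a", "(E)"});
    }

    // Is there a sequence of IDs from (q0, w, E) to (q0, eps, eps)?
    static boolean accepts(String w) {
        Deque<String[]> todo = new ArrayDeque<>();   // IDs as {remaining input, stack}
        Set<String> seen = new HashSet<>();
        todo.push(new String[] {w, "E"});
        while (!todo.isEmpty()) {
            String[] id = todo.pop();
            String input = id[0];
            String stack = id[1];
            if (input.isEmpty() && stack.isEmpty()) return true;   // acceptance by empty stack
            // Pruning (assumption of this sketch): each stack symbol needs at least one input symbol.
            if (stack.isEmpty() || stack.length() > input.length()) continue;
            if (!seen.add(input + "|" + stack)) continue;           // explored before
            char top = stack.charAt(0);
            String rest = stack.substring(1);
            if (P.containsKey(top)) {
                // delta(q0, eps, A) = {(q0, alpha) | A -> alpha in P}
                for (String alpha : P.get(top)) todo.push(new String[] {input, alpha + rest});
            } else if (input.charAt(0) == top) {
                // delta(q0, a, a) = {(q0, eps)}
                todo.push(new String[] {input.substring(1), rest});
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(accepts("a+(a*a)"));
        System.out.println(accepts("a++a"));
    }
}
```

The need to try all alternatives blindly is what makes this translation unattractive as a basis for a practical parser.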

7 How to implement a recursive descent parser

A parser is a program which processes input defined by a context-free grammar. The translation given in the previous section is not very useful in the design of such a program because of the non-determinism. Here I show how for a certain class of grammars this non-determinism can be eliminated, and using the example of arithmetical expressions I will show how a JAVA-program can be constructed which parses and evaluates expressions.

7.1 What is an LL(1) grammar?

The basic idea of a recursive descent parser is to use the current input symbol to decide which alternative to choose. Grammars which have the property that it is possible to do this are called LL(1) grammars.

First we introduce an end marker $: for a given G = (V, Σ, S, P) we define the augmented grammar G$ = (V', Σ', S', P') where

- V' = V ∪ {S'} where S' is chosen s.t. S' ∉ V ∪ Σ,
- Σ' = Σ ∪ {$} where $ is chosen s.t. $ ∉ V ∪ Σ,
- P' = P ∪ {S' → S$}

The idea is that

    L(G$) = {w$ | w ∈ L(G)}

Now for each symbol A ∈ V' ∪ Σ' we define

    First(A)  = {a | a ∈ Σ' ∧ A ⇒* aβ}
    Follow(A) = {a | a ∈ Σ' ∧ S' ⇒* αAaβ}

i.e. First(A) is the set of terminal symbols with which a word derived from A may start and Follow(A) is the set of symbols which may occur directly after A. We use the augmented grammar to have a marker for the end of the word.

For each production A → α ∈ P we define the set Lookahead(A → α), which is the set of symbols which indicate that we are in this alternative:

    Lookahead(A → B1 B2 ... Bn) = ⋃ {First(Bi) | ∀ 1 ≤ k < i. Bk ⇒* ɛ}
                                  ∪ (Follow(A) if B1 B2 ... Bn ⇒* ɛ, ∅ otherwise)

We now say a grammar G is LL(1) iff for each pair A → α, A → β ∈ P with α ≠ β it is the case that

    Lookahead(A → α) ∩ Lookahead(A → β) = ∅

7.2 How to calculate First and Follow

We have to determine whether A ⇒* ɛ. If there are no ɛ-productions we know that the answer is always negative, otherwise:

- If A → ɛ ∈ P we know that A ⇒* ɛ.
- If A → B1 B2 ... Bn where all Bi are nonterminal symbols and for all 1 ≤ i ≤ n: Bi ⇒* ɛ, then we also know A ⇒* ɛ.

We calculate First and Follow in a similar fashion:

- First(a) = {a} if a ∈ Σ.
- If A → B1 B2 ... Bn and there is an i ≤ n s.t. ∀ 1 ≤ k < i. Bk ⇒* ɛ, then we add First(Bi) to First(A).
And for Follow:

- $ ∈ Follow(S) where S is the original start symbol.
- If there is a production A → αBβ then everything in First(β) is in Follow(B).
- If there is a production A → αBβ with β ⇒* ɛ then everything in Follow(A) is also in Follow(B).

7.3 Constructing an LL(1) grammar

Let's have a look at the grammar G for arithmetical expressions again. G = ({E, T, F}, {(, ), a, +, *}, E, P) where

    P = {E → T | E + T
         T → F | T * F
         F → a | (E)}

We don't need the Follow-sets at the moment because the empty word doesn't occur in the grammar. For the nonterminal symbols we have

    First(F) = {a, (}
    First(T) = {a, (}
    First(E) = {a, (}

and now it is easy to see that most of the Lookahead-sets agree, e.g.

    Lookahead(E → T)     = {a, (}
    Lookahead(E → E + T) = {a, (}
    Lookahead(T → F)     = {a, (}
    Lookahead(T → T * F) = {a, (}
    Lookahead(F → a)     = {a}
    Lookahead(F → (E))   = {(}

Hence the grammar G is not LL(1). However, luckily there is an alternative grammar G' which defines the same language: G' = ({E, E', T, T', F}, {(, ), a, +, *}, E, P') where

    P' = {E  → T E'
          E' → +T E' | ɛ
          T  → F T'
          T' → *F T' | ɛ
          F  → a | (E)}
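The rules of section 7.2 can be applied mechanically to G' by repeating them until no set changes any more. The following Java sketch does this; it is one possible way to organise the calculation, not the method prescribed by the notes. The class name FirstFollow, the encoding of productions as string arrays and the names E1 and T1 for E' and T' are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FirstFollow {
    // Productions of G' as {left-hand side, right-hand side symbols ...};
    // a row with only a left-hand side is an epsilon-production.
    static final String[][] P = {
        {"E", "T", "E1"}, {"E1", "+", "T", "E1"}, {"E1"},
        {"T", "F", "T1"}, {"T1", "*", "F", "T1"}, {"T1"},
        {"F", "a"}, {"F", "(", "E", ")"}
    };
    static final Set<String> V = new HashSet<>(Arrays.asList("E", "E1", "T", "T1", "F"));
    static final Set<String> nullable = new HashSet<>();   // the A with A =>* epsilon
    static final Map<String, Set<String>> first = new HashMap<>();
    static final Map<String, Set<String>> follow = new HashMap<>();

    // First of a single symbol; First(a) = {a} for a terminal a.
    static Set<String> firstOf(String x) {
        if (V.contains(x)) return new TreeSet<>(first.get(x));   // copy of the current set
        return Collections.singleton(x);
    }

    static void compute() {
        for (String a : V) {
            first.put(a, new TreeSet<>());
            follow.put(a, new TreeSet<>());
        }
        follow.get("E").add("$");                   // $ is in Follow(S)
        boolean changed = true;
        while (changed) {                           // repeat until nothing new is added
            changed = false;
            for (String[] p : P) {
                String a = p[0];
                boolean prefixNullable = true;      // B1 ... B(i-1) all derive epsilon
                for (int i = 1; i < p.length && prefixNullable; i++) {
                    changed |= first.get(a).addAll(firstOf(p[i]));
                    prefixNullable = nullable.contains(p[i]);
                }
                if (prefixNullable) changed |= nullable.add(a);
                for (int i = 1; i < p.length; i++) {
                    if (!V.contains(p[i])) continue;    // Follow is only needed for nonterminals
                    boolean restNullable = true;        // the part after p[i] derives epsilon
                    for (int j = i + 1; j < p.length && restNullable; j++) {
                        changed |= follow.get(p[i]).addAll(firstOf(p[j]));
                        restNullable = nullable.contains(p[j]);
                    }
                    if (restNullable) {
                        changed |= follow.get(p[i]).addAll(new ArrayList<>(follow.get(a)));
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        compute();
        System.out.println("First:  " + first);
        System.out.println("Follow: " + follow);
    }
}
```

After compute() has run, first and follow hold the sets for the five nonterminal symbols of G', and nullable contains exactly E1 and T1, the two symbols with an ɛ-production.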

Since we have ɛ-productions we do need the Follow-sets.

    First(E) = First(T) = First(F) = {a, (}
    First(E') = {+}
    First(T') = {*}
    Follow(E) = Follow(E') = {), $}
    Follow(T) = Follow(T') = {+, ), $}
    Follow(F) = {+, *, ), $}

Now we calculate the Lookahead-sets:

    Lookahead(E → T E')   = {a, (}
    Lookahead(E' → +T E') = {+}
    Lookahead(E' → ɛ)     = Follow(E') = {), $}
    Lookahead(T → F T')   = {a, (}
    Lookahead(T' → *F T') = {*}
    Lookahead(T' → ɛ)     = Follow(T') = {+, ), $}
    Lookahead(F → a)      = {a}
    Lookahead(F → (E))    = {(}

Hence the grammar G' is LL(1).

7.4 How to implement the parser

We can now implement a parser. One way would be to construct a deterministic PDA. However, using JAVA we can implement the parser using recursion; here the internal JAVA stack plays the role of the stack of the PDA.

First of all we have to separate the input into tokens, which are the terminal symbols of our grammar. To keep things simple I assume that tokens are separated by blanks, i.e. one has to type ( a + a ) * a for (a+a)*a. This has the advantage that we can use java.util.StringTokenizer. In a real implementation tokenizing is usually done by using finite automata.

I don't want to get lost in java details. In the main program I read a line and produce a tokenizer:

    String line = in.readLine();
    st = new StringTokenizer(line + " $");   // append the end marker $

The tokenizer st and the current token are static variables. I implement the convenience method next which assigns the next token to curr.

    static StringTokenizer st;
    static String curr;

    static void next() {
        try {
            // intern() is what makes the comparisons with == below valid
            curr = st.nextToken().intern();
        } catch (NoSuchElementException e) {
            curr = null;
        }
    }

We also implement a convenience method error(String) to report an error and terminate the program.

Now we can translate all productions into methods, using the Lookahead sets to determine which alternative to choose. E.g. we translate

    E' → +T E' | ɛ

into (using E1 for E' to follow JAVA rules):

    static void parseE1() {
        if (curr == "+") {                          // Lookahead(E' → +T E') = {+}
            next();
            parseT();
            parseE1();
        } else if (curr == ")" || curr == "$") {    // Lookahead(E' → ɛ) = {), $}
            // E' → ɛ: nothing to read
        } else {
            error("Unexpected :" + curr);
        }
    }

The basic idea is to

- Translate each occurrence of a terminal symbol into a test that this symbol has been read and a call of next().
- Translate each nonterminal symbol into a call of the method with the same name.
- If you have to decide between different productions use the lookahead sets to determine which one to use.
- If you find that there is no way to continue call error().

We initiate the parsing process by calling next() to read the first symbol and then call parseE(). If after processing parseE() we are at the end marker, then the parsing has been successful.

    next();
    parseE();
    if (curr == "$") {
        System.out.println("OK ");
    } else {
        error("End expected");
    }

The complete parser can be found at [link].

Actually, we can be a bit more realistic and turn the parser into a simple evaluator by

3 Regular expressions

3 Regular expressions 3 Regulr expressions Given n lphet Σ lnguge is set of words L Σ. So fr we were le to descrie lnguges either y using set theory (i.e. enumertion or comprehension) or y n utomton. In this section we shll

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.6.: Push Down Automt Remrk: This mteril is no longer tught nd not directly exm relevnt Anton Setzer (Bsed

More information

1.4 Nonregular Languages

1.4 Nonregular Languages 74 1.4 Nonregulr Lnguges The number of forml lnguges over ny lphbet (= decision/recognition problems) is uncountble On the other hnd, the number of regulr expressions (= strings) is countble Hence, ll

More information

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. John Longley. 22 September School of Informatics University of Edinburgh Lnguges nd Automt Finite Automt Informtics 2A: Lecture 3 John Longley School of Informtics University of Edinburgh jrl@inf.ed.c.uk 22 September 2017 1 / 30 Lnguges nd Automt 1 Lnguges nd Automt Wht is

More information

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38

Theory of Computation Regular Languages. (NTU EE) Regular Languages Fall / 38 Theory of Computtion Regulr Lnguges (NTU EE) Regulr Lnguges Fll 2017 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of Finite Automt A finite utomton hs finite set of control

More information

Theory of Computation Regular Languages

Theory of Computation Regular Languages Theory of Computtion Regulr Lnguges Bow-Yw Wng Acdemi Sinic Spring 2012 Bow-Yw Wng (Acdemi Sinic) Regulr Lnguges Spring 2012 1 / 38 Schemtic of Finite Automt control 0 0 1 0 1 1 1 0 Figure: Schemtic of

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automt Theory nd Forml Lnguges TMV027/DIT321 LP4 2018 Lecture 10 An Bove April 23rd 2018 Recp: Regulr Lnguges We cn convert between FA nd RE; Hence both FA nd RE ccept/generte regulr lnguges; More

More information

1.3 Regular Expressions

1.3 Regular Expressions 56 1.3 Regulr xpressions These hve n importnt role in describing ptterns in serching for strings in mny pplictions (e.g. wk, grep, Perl,...) All regulr expressions of lphbet re 1.Ønd re regulr expressions,

More information

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute

Anatomy of a Deterministic Finite Automaton. Deterministic Finite Automata. A machine so simple that you can understand it in less than one minute Victor Admchik Dnny Sletor Gret Theoreticl Ides In Computer Science CS 5-25 Spring 2 Lecture 2 Mr 3, 2 Crnegie Mellon University Deterministic Finite Automt Finite Automt A mchine so simple tht you cn

More information

Formal languages, automata, and theory of computation

Formal languages, automata, and theory of computation Mälrdlen University TEN1 DVA337 2015 School of Innovtion, Design nd Engineering Forml lnguges, utomt, nd theory of computtion Thursdy, Novemer 5, 14:10-18:30 Techer: Dniel Hedin, phone 021-107052 The exm

More information

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh

Finite Automata. Informatics 2A: Lecture 3. Mary Cryan. 21 September School of Informatics University of Edinburgh Finite Automt Informtics 2A: Lecture 3 Mry Cryn School of Informtics University of Edinburgh mcryn@inf.ed.c.uk 21 September 2018 1 / 30 Lnguges nd Automt Wht is lnguge? Finite utomt: recp Some forml definitions

More information

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton

AUTOMATA AND LANGUAGES. Definition 1.5: Finite Automaton 25. Finite Automt AUTOMATA AND LANGUAGES A system of computtion tht only hs finite numer of possile sttes cn e modeled using finite utomton A finite utomton is often illustrted s stte digrm d d d. d q

More information

5.1 Definitions and Examples 5.2 Deterministic Pushdown Automata

5.1 Definitions and Examples 5.2 Deterministic Pushdown Automata CSC4510 AUTOMATA 5.1 Definitions nd Exmples 5.2 Deterministic Pushdown Automt Definitions nd Exmples A lnguge cn be generted by CFG if nd only if it cn be ccepted by pushdown utomton. A pushdown utomton

More information

CS 275 Automata and Formal Language Theory

CS 275 Automata and Formal Language Theory CS 275 Automt nd Forml Lnguge Theory Course Notes Prt II: The Recognition Problem (II) Chpter II.5.: Properties of Context Free Grmmrs (14) Anton Setzer (Bsed on book drft by J. V. Tucker nd K. Stephenson)

More information

FABER Formal Languages, Automata and Models of Computation

FABER Formal Languages, Automata and Models of Computation DVA337 FABER Forml Lnguges, Automt nd Models of Computtion Lecture 5 chool of Innovtion, Design nd Engineering Mälrdlen University 2015 1 Recp of lecture 4 y definition suset construction DFA NFA stte

More information

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun:

NFAs and Regular Expressions. NFA-ε, continued. Recall. Last class: Today: Fun: CMPU 240 Lnguge Theory nd Computtion Spring 2019 NFAs nd Regulr Expressions Lst clss: Introduced nondeterministic finite utomt with -trnsitions Tody: Prove n NFA- is no more powerful thn n NFA Introduce

More information

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014

CMPSCI 250: Introduction to Computation. Lecture #31: What DFA s Can and Can t Do David Mix Barrington 9 April 2014 CMPSCI 250: Introduction to Computtion Lecture #31: Wht DFA s Cn nd Cn t Do Dvid Mix Brrington 9 April 2014 Wht DFA s Cn nd Cn t Do Deterministic Finite Automt Forml Definition of DFA s Exmples of DFA

More information

Machines and their languages (G51MAL) Lecture notes Spring 2005

Machines and their languages (G51MAL) Lecture notes Spring 2005 Machines and their languages (G51MAL) Lecture notes Spring 2005 Thorsten Altenkirch (revised by Henrik Nilsson) May 11, 2005 Contents 1 Introduction 2 1.1 Examples on syntax......................... 2

More information

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30

Talen en Automaten Test 1, Mon 7 th Dec, h45 17h30 Tlen en Automten Test 1, Mon 7 th Dec, 2015 15h45 17h30 This test consists of four exercises over 5 pges. Explin your pproch, nd write your nswer to ech exercise on seprte pge. You cn score mximum of 100

More information

More on automata. Michael George. March 24 April 7, 2014

More on automata. Michael George. March 24 April 7, 2014 More on utomt Michel George Mrch 24 April 7, 2014 1 Automt constructions Now tht we hve forml model of mchine, it is useful to mke some generl constructions. 1.1 DFA Union / Product construction Suppose

More information

First Midterm Examination

First Midterm Examination Çnky University Deprtment of Computer Engineering 203-204 Fll Semester First Midterm Exmintion ) Design DFA for ll strings over the lphet Σ = {,, c} in which there is no, no nd no cc. 2) Wht lnguge does

More information

Designing finite automata II

Designing finite automata II Designing finite utomt II Prolem: Design DFA A such tht L(A) consists of ll strings of nd which re of length 3n, for n = 0, 1, 2, (1) Determine wht to rememer out the input string Assign stte to ech of

More information

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4

Intermediate Math Circles Wednesday, November 14, 2018 Finite Automata II. Nickolas Rollick a b b. a b 4 Intermedite Mth Circles Wednesdy, Novemer 14, 2018 Finite Automt II Nickols Rollick nrollick@uwterloo.c Regulr Lnguges Lst time, we were introduced to the ide of DFA (deterministic finite utomton), one

More information

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1

Chapter Five: Nondeterministic Finite Automata. Formal Language, chapter 5, slide 1 Chpter Five: Nondeterministic Finite Automt Forml Lnguge, chpter 5, slide 1 1 A DFA hs exctly one trnsition from every stte on every symol in the lphet. By relxing this requirement we get relted ut more

More information

This lecture covers Chapter 8 of HMU: Properties of CFLs

This lecture covers Chapter 8 of HMU: Properties of CFLs This lecture covers Chpter 8 of HMU: Properties of CFLs Turing Mchine Extensions of Turing Mchines Restrictions of Turing Mchines Additionl Reding: Chpter 8 of HMU. Turing Mchine: Informl Definition B

More information

Harvard University Computer Science 121 Midterm October 23, 2012

Harvard University Computer Science 121 Midterm October 23, 2012 Hrvrd University Computer Science 121 Midterm Octoer 23, 2012 This is closed-ook exmintion. You my use ny result from lecture, Sipser, prolem sets, or section, s long s you quote it clerly. The lphet is

More information

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages

Assignment 1 Automata, Languages, and Computability. 1 Finite State Automata and Regular Languages Deprtment of Computer Science, Austrlin Ntionl University COMP2600 Forml Methods for Softwre Engineering Semester 2, 206 Assignment Automt, Lnguges, nd Computility Smple Solutions Finite Stte Automt nd

More information

Closure Properties of Regular Languages

Closure Properties of Regular Languages Closure Properties of Regulr Lnguges Regulr lnguges re closed under mny set opertions. Let L 1 nd L 2 e regulr lnguges. (1) L 1 L 2 (the union) is regulr. (2) L 1 L 2 (the conctention) is regulr. (3) L

More information

Finite Automata-cont d

Finite Automata-cont d Automt Theory nd Forml Lnguges Professor Leslie Lnder Lecture # 6 Finite Automt-cont d The Pumping Lemm WEB SITE: http://ingwe.inghmton.edu/ ~lnder/cs573.html Septemer 18, 2000 Exmple 1 Consider L = {ww

More information

Convert the NFA into DFA

Convert the NFA into DFA Convert the NF into F For ech NF we cn find F ccepting the sme lnguge. The numer of sttes of the F could e exponentil in the numer of sttes of the NF, ut in prctice this worst cse occurs rrely. lgorithm:

More information

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER LANGUAGES AND COMPUTATION ANSWERS

The University of Nottingham SCHOOL OF COMPUTER SCIENCE A LEVEL 2 MODULE, SPRING SEMESTER LANGUAGES AND COMPUTATION ANSWERS The University of Nottinghm SCHOOL OF COMPUTER SCIENCE LEVEL 2 MODULE, SPRING SEMESTER 2016 2017 LNGUGES ND COMPUTTION NSWERS Time llowed TWO hours Cndidtes my complete the front cover of their nswer ook

More information

Agenda. Agenda. Regular Expressions. Examples of Regular Expressions. Regular Expressions (crash course) Computational Linguistics 1

Agenda. Agenda. Regular Expressions. Examples of Regular Expressions. Regular Expressions (crash course) Computational Linguistics 1 Agend CMSC/LING 723, LBSC 744 Kristy Hollingshed Seitz Institute for Advnced Computer Studies University of Mrylnd HW0 questions? Due Thursdy before clss! When in doubt, keep it simple... Lecture 2: 6

More information

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9.

5. (±±) Λ = fw j w is string of even lengthg [ 00 = f11,00g 7. (11 [ 00)± Λ = fw j w egins with either 11 or 00g 8. (0 [ ffl)1 Λ = 01 Λ [ 1 Λ 9. Regulr Expressions, Pumping Lemm, Right Liner Grmmrs Ling 106 Mrch 25, 2002 1 Regulr Expressions A regulr expression descries or genertes lnguge: it is kind of shorthnd for listing the memers of lnguge.

More information
