CS 275 Automata and Formal Language Theory

Course Notes
Part II: The Recognition Problem (II)
Chapter II.6.: Push Down Automata
Remark: This material is no longer taught and not directly exam relevant.
Anton Setzer (based on a book draft by J. V. Tucker and K. Stephenson)
Dept. of Computer Science, Swansea University
http://www.cs.swan.ac.uk/~csetzer/lectures/automataformallanguage/current/index.html
March 9, 2018

II.6.1. Push Down Automata
II.6.2. Equivalence of Final-State and Empty-Stack-PDAs
II.6.3. Equivalence of CFG and PDA

II.6.1. Push Down Automata

Motivation

There are context free languages which are not regular, so they cannot be recognised by an NFA. The problem is that when parsing a language such as $L = \{ww^R \mid w \in \{a, b\}^*\}$, when checking the second half of the word, one needs to know the first half of the word in reverse order. An NFA has only finitely many states, and therefore only finite memory. We can repair the problem by adding a stack to the memory, which allows us to record, in the case of $L$, the first half of the word. The resulting machine will be called a push-down automaton (PDA).

Architecture of a PDA

A PDA consists of:
- An input tape, containing the input word. The input tape will be read only from left to right.
- A finite state q.
- A stack.

Picture

[Figure: a PDA with a finite state q, an input tape that is read from left to right, and a stack whose top is marked.]

Instructions of a PDA

A PDA has two kinds of instruction:
- An empty move instruction, which, depending on the state and the top symbol of the stack, and without looking at the next symbol on the input tape, chooses a new state and a possibly empty sequence of stack symbols to replace the top symbol on the stack. It keeps the pointer on the input tape where it is.
- An ordinary instruction, which, depending on the state, the top symbol of the stack, and the next input symbol on the tape, chooses a new state and a possibly empty sequence of stack symbols to replace the top symbol on the stack. It moves the pointer on the input tape forward.

Execution of a PDA

The automaton starts with the head on the leftmost symbol of the input tape, in the initial state, and with the start symbol on the stack. It then proceeds non-deterministically as follows:
- If the stack is empty, it gets stuck.
- Otherwise, depending on the state s and the top stack symbol x, it does one of the following:
  - If s and x match an empty move transition, it replaces the top stack symbol by the new symbols to be put on the stack, switches to the new state, and keeps the position on the input tape as before.
  - If the next symbol on the input tape is y and s, x, y match an ordinary transition, it replaces the top stack symbol by the new symbols to be put on the stack, switches to the new state, and moves the position on the input tape one symbol forward.

- If none of the above instructions matches, the PDA gets stuck.

The PDA stops if it gets stuck (because of an empty stack or because no instruction applies) while still reading the word, or if it wants to read a letter but has reached the end of the word.

There are two kinds of languages defined from a PDA:
- The language accepted by final state. A word is accepted by final state if the PDA reads the complete word and reaches a state which is final (accepting).
- The language accepted by empty stack. A word is accepted by empty stack if the PDA reads the complete word and then obtains an empty stack.

Acceptance by Final State vs Empty Stack

- Clearly, acceptance by final state is a natural generalisation of NFA.
- The standard LL(k), LR(k), LALR parsing algorithms use deterministic PDAs accepting by final state.
- CFGs correspond in a natural way to non-deterministic PDAs which accept by empty stack.
- We will see that the languages accepted by PDAs by empty stack and the languages accepted by PDAs by final state are the same.
- Therefore PDAs accepting by empty stack serve as intermediate machines for proving the equivalence of context free languages and languages accepted by final state by a PDA.

- Deterministic PDAs accepting by empty stack only accept languages $L$ which have the prefix property, i.e. if $w$ is a proper prefix of a word $w' \in L$, then $w \notin L$. The reason is that a PDA with an empty stack cannot move any further, and a deterministic PDA has no alternative run on the same input.
- Since parsing should ideally be done by a deterministic machine, deterministic PDAs accepting by final state form a more natural model.
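The prefix property can be checked directly on a finite sample of words. The following is a minimal sketch; the function name `has_prefix_property` is an illustrative choice and not part of the course material. It also shows why $\{ww^R \mid w \in \{a, b\}^*\}$ fails the property: both the empty word and `aa` belong to the language, and the empty word is a proper prefix of `aa`.

```python
def has_prefix_property(language):
    """Return True if no word in the finite set `language`
    is a proper prefix of another word in it."""
    return not any(
        u != v and v.startswith(u)
        for u in language
        for v in language
    )


# A finite sample of {a^n b^n | n >= 1}: no word is a proper prefix of another.
print(has_prefix_property({"ab", "aabb", "aaabbb"}))  # True

# A finite sample of {w w^R}: "" is a proper prefix of "aa".
print(has_prefix_property({"", "aa", "abba"}))  # False
```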

Convention Regarding Stacks

When writing $w$ for the stack, the top element will be the leftmost element of $w$.

Definition of PDA

Definition. A push down automaton (PDA) $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ consists of
- an input alphabet $T$;
- a finite set of states $Q$;
- a stack alphabet $\Gamma$;
- a start state $q_0 \in Q$;
- a start stack symbol $Z_0 \in \Gamma$;
- a set of accepting or final states $F \subseteq Q$;
- a transition relation $\delta \subseteq Q \times \Gamma \times (T \cup \{\epsilon\}) \times Q \times \Gamma^*$, where we write $(q, Z) \xrightarrow{a} (q', w)$ for $(q, Z, a, q', w) \in \delta$.

Interpretation of the Components

- The input alphabet $T$ is the alphabet on the tape, so the PDA looks at words over $T$ and checks whether it accepts them or not.
- The stack consists of elements of the stack alphabet $\Gamma$.
- The PDA starts in the start state $q_0$ with a stack consisting of the start stack symbol $Z_0$.
- $F$ is the set of final states in case we consider the language accepted by final state.
- $(q, Z) \xrightarrow{a} (q', w)$ with $a \in T$ means that the PDA can, when in state $q$ with top stack symbol $Z$ and next tape symbol $a$, move on the tape once to the right, change to state $q'$, and replace the top stack symbol $Z$ by $w$.
- $(q, Z) \xrightarrow{\epsilon} (q', w)$ means that the PDA can, when in state $q$ and having top symbol $Z$ on the stack, make a move without looking at the next letter, switch to state $q'$, and replace the top symbol on the stack by $w$.

Example

We define a PDA which accepts the language $L := \{ww^R \mid w \in \{a, b\}^*\}$.

We will use the stack in order to record the first half of the word, so we need the stack symbols $a, b$. In addition we need the bottom symbol of the stack $Z_0$. After having finished parsing the word, if we read this symbol the PDA will move to an accepting state.

There are 3 states:
- The initial state $q_0$. In this state the PDA reads the first half of the word and pushes it onto the stack.
- An intermediate state $q_1$. When reading the second half of the word, the PDA is in this state and pops symbols from the stack.
- A final accepting state $q_2$.

So our PDA will be
\[ (\{a, b\}, \{q_0, q_1, q_2\}, \{Z_0, a, b\}, q_0, Z_0, \{q_2\}, \delta). \]

Transitions

We have the following transitions in $\delta$:

Initially the PDA is in state $q_0$, sees stack symbol $Z_0$, and then reads the first symbol on the tape and pushes it on the stack:
\[ (q_0, Z_0) \xrightarrow{a} (q_0, aZ_0) \qquad (q_0, Z_0) \xrightarrow{b} (q_0, bZ_0) \]

When reading further letters, the PDA will, when in state $q_0$, push them on the stack:
\[ (q_0, a) \xrightarrow{a} (q_0, aa) \qquad (q_0, a) \xrightarrow{b} (q_0, ba) \]
\[ (q_0, b) \xrightarrow{a} (q_0, ab) \qquad (q_0, b) \xrightarrow{b} (q_0, bb) \]

The PDA will guess when it has reached the middle of the word, and then switch silently (making an ε-transition, i.e. a transition without reading the next symbol) to state $q_1$ without modifying the stack. This can happen at the beginning (without having read a symbol, so the stack is $Z_0$), or when it has already pushed some symbol on the stack, so the top stack symbol can be any of $a, b, Z_0$:
\[ (q_0, Z_0) \xrightarrow{\epsilon} (q_1, Z_0) \qquad (q_0, a) \xrightarrow{\epsilon} (q_1, a) \qquad (q_0, b) \xrightarrow{\epsilon} (q_1, b) \]

When in state $q_1$, the PDA checks whether the letters it reads are identical to those it read in the first part, i.e. whether the letter it reads is identical to the letter on top of the stack. If yes, it pops that letter from the stack:
\[ (q_1, a) \xrightarrow{a} (q_1, \epsilon) \qquad (q_1, b) \xrightarrow{b} (q_1, \epsilon) \]

When the stack has been emptied down to $Z_0$ while in state $q_1$, the PDA can move to the accepting state $q_2$. In doing so it also pops $Z_0$, so the stack becomes empty. If there are more letters to be read, the PDA gets stuck in $q_2$, because there are no transitions from that state.
\[ (q_1, Z_0) \xrightarrow{\epsilon} (q_2, \epsilon) \]

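Presentation of the PDA by Table

PDA              P
terminals        $a, b$
states           $q_0, q_1, q_2$
stack alphabet   $Z_0, a, b$
start state      $q_0$
start stack      $Z_0$
final            $q_2$
transitions      $(q_0, Z_0) \xrightarrow{a} (q_0, aZ_0)$      $(q_0, Z_0) \xrightarrow{b} (q_0, bZ_0)$
                 $(q_0, a) \xrightarrow{a} (q_0, aa)$          $(q_0, a) \xrightarrow{b} (q_0, ba)$
                 $(q_0, b) \xrightarrow{a} (q_0, ab)$          $(q_0, b) \xrightarrow{b} (q_0, bb)$
                 $(q_0, Z_0) \xrightarrow{\epsilon} (q_1, Z_0)$
                 $(q_0, a) \xrightarrow{\epsilon} (q_1, a)$    $(q_0, b) \xrightarrow{\epsilon} (q_1, b)$
                 $(q_1, a) \xrightarrow{a} (q_1, \epsilon)$    $(q_1, b) \xrightarrow{b} (q_1, \epsilon)$
                 $(q_1, Z_0) \xrightarrow{\epsilon} (q_2, \epsilon)$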
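The table can be turned into a small simulator. The following is a minimal sketch, not part of the course material: the encoding of transitions as 5-tuples, the use of strings as stacks (leftmost character on top, `"Z"` standing for $Z_0$), and the names `DELTA`, `step` and `accepts_by_final_state` are all assumptions made for illustration. Non-determinism is handled by a breadth-first search over all reachable configurations. The search terminates for this PDA because the stack only grows when a letter is read; a PDA whose ε-moves can grow the stack forever would need an extra bound.

```python
from collections import deque

EPS = ""  # label of an empty move (no input letter is read)

# Each tuple (q, Z, a, q2, w) encodes the transition (q, Z) --a--> (q2, w).
# Stacks are strings whose leftmost character is the top; "Z" stands for Z_0.
DELTA = {
    ("q0", "Z", "a", "q0", "aZ"), ("q0", "Z", "b", "q0", "bZ"),
    ("q0", "a", "a", "q0", "aa"), ("q0", "a", "b", "q0", "ba"),
    ("q0", "b", "a", "q0", "ab"), ("q0", "b", "b", "q0", "bb"),
    ("q0", "Z", EPS, "q1", "Z"), ("q0", "a", EPS, "q1", "a"),
    ("q0", "b", EPS, "q1", "b"),
    ("q1", "a", "a", "q1", ""), ("q1", "b", "b", "q1", ""),
    ("q1", "Z", EPS, "q2", ""),
}


def step(delta, config, word):
    """Yield every configuration reachable from `config` in one move.

    A configuration here is (state, input position, stack)."""
    q, i, stack = config
    if not stack:
        return  # empty stack: the PDA is stuck
    top, rest = stack[0], stack[1:]
    for (p, Z, a, q2, w) in delta:
        if p != q or Z != top:
            continue
        if a == EPS:
            yield (q2, i, w + rest)        # empty move: input position kept
        elif i < len(word) and word[i] == a:
            yield (q2, i + 1, w + rest)    # ordinary move: one letter consumed


def accepts_by_final_state(delta, start, z0, finals, word):
    """Search all runs; accept if some run reads the whole word
    and is then in a final state."""
    seen = set()
    queue = deque([(start, 0, z0)])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        q, i, _ = config
        if i == len(word) and q in finals:
            return True
        queue.extend(step(delta, config, word))
    return False


for w in ["", "aa", "abba", "ab", "abab"]:
    print(repr(w), accepts_by_final_state(DELTA, "q0", "Z", {"q2"}, w))
```

The first three words are of the form $ww^R$ and are accepted; `ab` and `abab` are rejected.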

Graphical Presentation

We can also present a PDA graphically by a transition diagram with
- states, start state and final states represented as before,
- transitions labelled with an expression $a, Z/w$ where $a \in T \cup \{\epsilon\}$, $Z \in \Gamma$, $w \in \Gamma^*$,
- where a transition from state $q$ to $q'$ labelled by $a, Z/w$ stands for $(q, Z) \xrightarrow{a} (q', w)$.

The alphabet $T$ is implicitly given by the elements of the alphabet appearing in the transitions.

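Example in Graphical Notation

[Figure: transition diagram with states $q_0$ (start), $q_1$ and $q_2$ (final), and the following labelled edges.]
Loop on $q_0$: $a, Z_0/aZ_0$; $b, Z_0/bZ_0$; $a, a/aa$; $a, b/ab$; $b, a/ba$; $b, b/bb$
Edge from $q_0$ to $q_1$: $\epsilon, Z_0/Z_0$; $\epsilon, a/a$; $\epsilon, b/b$
Loop on $q_1$: $a, a/\epsilon$; $b, b/\epsilon$
Edge from $q_1$ to $q_2$: $\epsilon, Z_0/\epsilon$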

Configuration of a PDA

The complete state of a PDA is given by its stack and an element of $Q$. We call this the configuration of a PDA.

Definition. Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ be a PDA. A configuration of $P$ is given by an element $(q, z)$ where $q \in Q$, $z \in \Gamma^*$.

Transition of Configurations

We extend the relation $(q, Z) \xrightarrow{a} (q', z)$ for $q, q' \in Q$, $Z \in \Gamma$, $z \in \Gamma^*$, $a \in T \cup \{\epsilon\}$ to a one step relation
\[ (q, z) \xrightarrow{a}_1 (q', z') \]
between configurations $(q, z)$, $(q', z')$ and $a \in T \cup \{\epsilon\}$, expressing that if the PDA has configuration $(q, z)$, then it can make a one step movement using $a$ to configuration $(q', z')$.

Furthermore, for $w \in T^*$ we define an n-step transition relation
\[ (q, z) \xrightarrow{w}_n (q', z') \]
and a transition relation expressing that we can move from one configuration to another in arbitrarily many steps:
\[ (q, z) \xrightarrow{w}_* (q', z') \]

One Step Transition of Configuration

Definition. Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ be a PDA. For $q, q' \in Q$, $Z \in \Gamma$, $z \in \Gamma^*$, define the one step transition relation between configurations $(q, z)$ and $(q', z')$ and $a \in T \cup \{\epsilon\}$, written as
\[ (q, z) \xrightarrow{a}_1 (q', z'), \]
by: if $(q, Z) \xrightarrow{a} (q', z)$, then for all $z' \in \Gamma^*$
\[ (q, Zz') \xrightarrow{a}_1 (q', zz'). \]

Since $\xrightarrow{a}_1$ extends $\xrightarrow{a}$, we often write $(q, z) \xrightarrow{a} (q', z')$ instead of $(q, z) \xrightarrow{a}_1 (q', z')$.
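The one step relation can be sketched as a function that replaces the top stack symbol and leaves the rest of the stack untouched. This is an illustration under assumed encodings that do not come from the slides: a transition $(q, Z) \xrightarrow{a} (q', w)$ is a tuple `(q, Z, a, q2, w)`, a stack is a string with its top at the left, ε is the empty string, and `one_step` and `DELTA_EXAMPLE` are hypothetical names.

```python
EPS = ""  # label of an empty move

# A few transitions of the example PDA; "Z" stands for Z_0.
DELTA_EXAMPLE = {
    ("q0", "Z", "a", "q0", "aZ"),
    ("q0", "Z", EPS, "q1", "Z"),
    ("q1", "a", "a", "q1", ""),
}


def one_step(delta, config):
    """Return all pairs (a, (q2, stack2)) such that
    config --a-->_1 (q2, stack2)."""
    q, stack = config
    if not stack:
        return set()  # no move is possible on an empty stack
    top, rest = stack[0], stack[1:]
    # (q, Z z') --a-->_1 (q2, w z') whenever (q, Z) --a--> (q2, w)
    return {
        (a, (q2, w + rest))
        for (p, Z, a, q2, w) in delta
        if p == q and Z == top
    }


print(one_step(DELTA_EXAMPLE, ("q0", "Z")))
print(one_step(DELTA_EXAMPLE, ("q1", "aZ")))  # {('a', ('q1', 'Z'))}
```

The first call returns two successors, one for reading `a` and one for the ε-move, which is how non-determinism shows up at the level of configurations.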

n-step Transition of Configuration

Definition. Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ be a PDA. We define for $n \in \mathbb{N}$ the n-step transition relation between configurations $(q, z)$, $(q', z')$ and $w \in T^*$, written as
\[ (q, z) \xrightarrow{w}_n (q', z'), \]
as follows:
- $(q, z) \xrightarrow{\epsilon}_0 (q, z)$.
- If $(q, z) \xrightarrow{a}_1 (q', z')$ and $(q', z') \xrightarrow{w}_n (q'', z'')$, then $(q, z) \xrightarrow{aw}_{n+1} (q'', z'')$.

Transition of Configuration

Definition. Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ be a PDA. We define the transition relation between configurations $(q, z)$, $(q', z')$ and $w \in T^*$, written as
\[ (q, z) \xrightarrow{w}_* (q', z'), \]
as follows:
\[ (q, z) \xrightarrow{w}_* (q', z') \iff \exists n \in \mathbb{N}.\ (q, z) \xrightarrow{w}_n (q', z') \]

Language Accepted by a PDA

Definition. Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$ be a PDA. The language accepted by final state of $P$, denoted by $L_{\mathrm{final}}(P)$, is defined as
\[ L_{\mathrm{final}}(P) := \{w \in T^* \mid (q_0, Z_0) \xrightarrow{w}_* (q, z) \text{ for some } q \in F,\ z \in \Gamma^*\}. \]
A final-state PDA is a PDA $P$ for which we define its language to be $L(P) = L_{\mathrm{final}}(P)$.

Definition (Cont). The language accepted by empty stack of $P$, denoted by $L_{\mathrm{empty}}(P)$, is defined as
\[ L_{\mathrm{empty}}(P) := \{w \in T^* \mid (q_0, Z_0) \xrightarrow{w}_* (q, \epsilon) \text{ for some } q \in Q\}. \]
An empty-stack PDA is a PDA $P$ for which we define its language to be $L(P) = L_{\mathrm{empty}}(P)$.
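As a worked instance of these definitions, here is one run of the example PDA for $\{ww^R \mid w \in \{a, b\}^*\}$ on the word $abba$. The choice of word is only for illustration; each step uses one of the transitions listed for that PDA.

```latex
\begin{align*}
(q_0, Z_0) &\xrightarrow{a}_1 (q_0, aZ_0)
            \xrightarrow{b}_1 (q_0, baZ_0)
            \xrightarrow{\epsilon}_1 (q_1, baZ_0) \\
           &\xrightarrow{b}_1 (q_1, aZ_0)
            \xrightarrow{a}_1 (q_1, Z_0)
            \xrightarrow{\epsilon}_1 (q_2, \epsilon)
\end{align*}
```

The labels concatenate to $ab\epsilon ba\epsilon = abba$ over six steps, so $(q_0, Z_0) \xrightarrow{abba}_6 (q_2, \epsilon)$ and hence $(q_0, Z_0) \xrightarrow{abba}_* (q_2, \epsilon)$. Since $q_2 \in F$ and the final stack is $\epsilon$, this single run witnesses both $abba \in L_{\mathrm{final}}(P)$ and $abba \in L_{\mathrm{empty}}(P)$.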

II.6.2. Equivalence of Final-State and Empty-Stack-PDAs

Equivalence of Acceptance by Final State and Empty Stack

Theorem. Let $L$ be a language. The following are equivalent:
(1) $L = L(P)$ for a final state PDA $P$.
(2) $L = L(P)$ for an empty stack PDA $P$.

Proof Idea of (1) ⇒ (2)

Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$. The idea for constructing an empty stack PDA $P'$ is as follows:
- $P'$ operates essentially as $P$.
- If $P$ reaches an accepting state, then $P'$ can switch to a special state.
- In that state it empties the stack using ε-transitions and therefore accepts the string.

Full details can be found in the Additional Material.
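The proof idea can be sketched in code. This is an illustration of the usual textbook construction, not necessarily the one in the Additional Material. In particular, the fresh bottom symbol is an added refinement that the slide does not mention: it prevents $P'$ from accepting when $P$ happens to empty its own stack in a non-final state. Everything else is an assumed encoding: transitions are tuples `(q, Z, a, q2, w)`, stacks are strings with the top at the left, and the names `final_to_empty`, `p0`, `p_drain` and `X` are hypothetical and must not occur in the given PDA.

```python
EPS = ""  # label of an empty move


def final_to_empty(delta, start, z0, finals,
                   new_start="p0", drain="p_drain", bottom="X"):
    """Return (delta2, start2, z02) of an empty-stack PDA built from a
    final-state PDA. new_start, drain and bottom are assumed to be fresh."""
    gamma = {z0} | {t[1] for t in delta} | {c for t in delta for c in t[4]}
    assert bottom not in gamma, "bottom symbol must be fresh"
    delta2 = set(delta)
    # First move: put the old start symbol on top of the fresh bottom symbol.
    delta2.add((new_start, bottom, EPS, start, z0 + bottom))
    for Y in gamma | {bottom}:
        for f in finals:
            delta2.add((f, Y, EPS, drain, ""))   # leave an accepting state
        delta2.add((drain, Y, EPS, drain, ""))   # keep popping until empty
    return delta2, new_start, bottom


def accepts_by_empty_stack(delta, start, z0, word):
    """Search all runs; accept if some run reads the whole word
    and ends with an empty stack."""
    seen, todo = set(), [(start, 0, z0)]
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        q, i, stack = config
        if not stack:
            if i == len(word):
                return True
            continue
        for (p, Z, a, q2, w) in delta:
            if p != q or Z != stack[0]:
                continue
            if a == EPS:
                todo.append((q2, i, w + stack[1:]))
            elif i < len(word) and word[i] == a:
                todo.append((q2, i + 1, w + stack[1:]))
    return False


# Hypothetical final-state PDA for {a^n b^n | n >= 1}; it accepts in q2
# with Z still on the stack, so it does not accept by empty stack itself.
FINAL_DELTA = {
    ("q0", "Z", "a", "q0", "AZ"), ("q0", "A", "a", "q0", "AA"),
    ("q0", "A", "b", "q1", ""), ("q1", "A", "b", "q1", ""),
    ("q1", "Z", EPS, "q2", "Z"),
}

delta2, start2, z02 = final_to_empty(FINAL_DELTA, "q0", "Z", {"q2"})
for w in ["ab", "aabb", "", "aab", "abb"]:
    print(repr(w), accepts_by_empty_stack(delta2, start2, z02, w))
```

The converted PDA accepts `ab` and `aabb` by empty stack and rejects the other three words, matching the final-state language of the original.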

Proof of (2) ⇒ (1)

Let $P = (T, Q, \Gamma, q_0, Z_0, F, \delta)$. The idea for constructing a final state PDA $P'$ is as follows:
- $P'$ keeps a special stack symbol $Z'_0$ at the bottom of the stack.
- It operates as $P$, until it observes that the top stack symbol is $Z'_0$.
- This indicates that $P$ would have reached an empty stack and therefore accepted the string.
- Therefore $P'$ moves into a special accepting state $q_{\mathrm{final}}$ and terminates.

Full details can be found in the Additional Material.
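The construction above can be sketched as follows. This is an illustration under assumed encodings, not the proof from the Additional Material: transitions are tuples `(q, Z, a, q2, w)`, stacks are strings with the top at the left, `X` plays the role of $Z'_0$ and `p_final` the role of $q_{\mathrm{final}}$. The names `empty_to_final`, `p0`, `p_final` and `X` are hypothetical and assumed to be fresh.

```python
EPS = ""  # label of an empty move


def empty_to_final(delta, start, z0,
                   new_start="p0", final="p_final", bottom="X"):
    """Return (delta2, start2, z02, finals2) of a final-state PDA built
    from an empty-stack PDA."""
    states = {start} | {t[0] for t in delta} | {t[3] for t in delta}
    gamma = {z0} | {t[1] for t in delta} | {c for t in delta for c in t[4]}
    assert bottom not in gamma
    assert new_start not in states and final not in states
    delta2 = set(delta)
    # First move: put the old start symbol on top of the special bottom symbol.
    delta2.add((new_start, bottom, EPS, start, z0 + bottom))
    # Seeing the bottom symbol on top means the old PDA has emptied its stack.
    for q in states:
        delta2.add((q, bottom, EPS, final, ""))
    return delta2, new_start, bottom, {final}


def accepts_by_final_state(delta, start, z0, finals, word):
    """Search all runs; accept if some run reads the whole word
    and is then in a final state."""
    seen, todo = set(), [(start, 0, z0)]
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        q, i, stack = config
        if i == len(word) and q in finals:
            return True
        if not stack:
            continue
        for (p, Z, a, q2, w) in delta:
            if p != q or Z != stack[0]:
                continue
            if a == EPS:
                todo.append((q2, i, w + stack[1:]))
            elif i < len(word) and word[i] == a:
                todo.append((q2, i + 1, w + stack[1:]))
    return False


# Hypothetical empty-stack PDA for {a^n b^n | n >= 1}.
EMPTY_DELTA = {
    ("q0", "Z", "a", "q0", "A"), ("q0", "A", "a", "q0", "AA"),
    ("q0", "A", "b", "q1", ""), ("q1", "A", "b", "q1", ""),
}

delta2, start2, z02, finals2 = empty_to_final(EMPTY_DELTA, "q0", "Z")
for w in ["ab", "aabb", "", "aab", "abab"]:
    print(repr(w), accepts_by_final_state(delta2, start2, z02, finals2, w))
```

Only `ab` and `aabb` are accepted, which are exactly the sample words on which the original PDA empties its stack.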

II.6.3. Equivalence of CFG and PDA

We are going to show that the context free languages are exactly the languages which can be recognised by a (non-deterministic) empty stack PDA.

Since empty stack PDAs and final state PDAs recognise the same languages, the context free languages are exactly the languages which can be recognised by a PDA.

Equivalence of Empty Stack PDA and CFG

Theorem. Let $L$ be a language. The following are equivalent:
(1) $L = L(G)$ for a CFG $G$.
(2) $L = L(P)$ for an empty stack PDA $P$.

Corollary

Because empty stack and final state PDAs are equivalent, the following corollary follows immediately from the theorem:

Corollary. Let $L$ be a language. The following are equivalent:
(1) $L = L(G)$ for a CFG $G$.
(2) $L = L(P)$ for an empty stack PDA $P$.
(3) $L = L(P)$ for a final state PDA $P$.

Proof of the Theorem

We will only show (1) ⇒ (2); the direction (2) ⇒ (1) is quite sophisticated.

Proof of (1) ⇒ (2)

Assume $G = (T, N, S, P)$ is a CFG. We need to construct a PDA which simulates $G$.

There are two ways of constructing such a PDA: one follows the LL-parsing method, the other uses the LR-parsing method.
- LL parsing results in a PDA with a single state. Its configuration can be given by its stack alone.
- We introduce LL-parsing in the main slides and LR-parsing in the Additional Material.
- Deterministic versions of them are the basis for many parsers.

In the Additional Material you will find the full proof of (1) ⇒ (2), showing that the PDA based on the LL parser is equivalent to the CFG.

PDA Based on LL-Parsing

LL-parsing stands for left-to-right parsing based on a leftmost derivation. It constructs a leftmost derivation top down, and is therefore an example of a top down parser.

We take as example the grammar with the rules
\[ S \to AC \qquad A \to aAb \qquad A \to ab \qquad C \to cCd \qquad C \to cd \]

We consider the leftmost derivation
\[ S \Rightarrow AC \Rightarrow aAbC \Rightarrow aabbC \Rightarrow aabbcCd \Rightarrow aabbccdd \]

Example of LL Parsing

- We start our PDA with a stack consisting of the start symbol. So we have stack $S$.
- We guess that the rule is $S \to AC$, and replace the top symbol $S$ on the stack by $AC$. So we have stack $AC$ (the top symbol of the stack is the leftmost one).
- We guess that the rule for the top symbol is $A \to aAb$, and replace the top symbol $A$ on the stack by $aAb$. So we have stack $aAbC$.

- We now have stack $aAbC$. We can accept the letter $a$ and remove this symbol from the stack. So we have stack $AbC$.
- We guess that the rule is $A \to ab$, and replace the top symbol $A$ on the stack by $ab$. So we have stack $abbC$.
- Now we can accept a letter 3 times, namely $a$, $b$, $b$, and remove them from the stack. So we have stack $C$.

- We now have stack $C$. We use the rule $C \to cCd$ and replace $C$ by $cCd$. So we have stack $cCd$.
- We consume the next letter $c$, accept it, and remove $c$ from the stack. So we have stack $Cd$.
- We expand $C$ using $C \to cd$ to $cd$. So we have stack $cdd$.
- Now we consume and accept the letters $c$, $d$, $d$, clear them from the stack, and then accept the word because we have reached the empty stack.

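[Figure: the successive stacks of the run (top symbol leftmost), each step labelled with the letter read or with ε, ending in acceptance.]
\[ S \xrightarrow{\epsilon} AC \xrightarrow{\epsilon} aAbC \xrightarrow{a} AbC \xrightarrow{\epsilon} abbC \xrightarrow{a} bbC \xrightarrow{b} bC \xrightarrow{b} C \xrightarrow{\epsilon} cCd \xrightarrow{c} Cd \xrightarrow{\epsilon} cdd \xrightarrow{c} dd \xrightarrow{d} d \xrightarrow{d} \epsilon \quad \text{(accept)} \]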

LL(k) Parsing

We had to guess which production to use. LL(k) is an algorithm which decides, using a lookahead of the next k symbols, which production to use.

For instance, when the stack was $AC$, knowing that the next two letters are $a, a$ would have told us that we need to use the production $A \to aAb$ and not $A \to ab$, since the latter would give $ab$ as the next two letters.

Usually only a lookahead of 1 symbol is used, but most standard languages (e.g. Java) have no LL(1) grammar.
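For the example grammar $S \to AC$, $A \to aAb \mid ab$, $C \to cCd \mid cd$, a lookahead of two letters is enough to remove the guessing. The following is a minimal sketch, assuming a prediction table written by hand for this one grammar; it is not produced by a general LL(k) table construction, and the names `TABLE` and `parse_ll2` are illustrative.

```python
# Hand-written prediction table for the example grammar (an assumption of
# this sketch): (nonterminal on top of stack, next two letters) -> right-hand side.
TABLE = {
    ("A", "aa"): "aAb",
    ("A", "ab"): "ab",
    ("C", "cc"): "cCd",
    ("C", "cd"): "cd",
}


def parse_ll2(word):
    """Deterministic single-state PDA run with 2 letters of lookahead.
    The stack is a string whose leftmost character is the top."""
    stack, i = "S", 0
    while stack:
        top, rest = stack[0], stack[1:]
        if top == "S":
            stack = "AC" + rest                    # only one rule for S
        elif top in ("A", "C"):
            rhs = TABLE.get((top, word[i:i + 2]))  # choose rule by lookahead
            if rhs is None:
                return False                       # no rule fits: reject
            stack = rhs + rest
        elif i < len(word) and word[i] == top:
            stack, i = rest, i + 1                 # match terminal and pop it
        else:
            return False
    return i == len(word)  # empty stack and whole word read


print(parse_ll2("aabbccdd"))  # True
print(parse_ll2("aabcd"))     # False
```

With the lookahead `aa` the parser picks $A \to aAb$, and with `ab` it picks $A \to ab$, which is exactly the decision described above.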

Invariant

The invariant kept above was that when having consumed a string $w$ and obtained a stack $v$, we have $S \Rightarrow^* wv$.
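The invariant can be checked against the LL run on $aabbccdd$ for the example grammar. The table below lists a few points of that run; the selection of rows is only illustrative.

```latex
\begin{array}{lll}
\text{consumed } w & \text{stack } v & \text{derivation witnessing } S \Rightarrow^* wv \\ \hline
\epsilon & S     & S \\
\epsilon & aAbC  & S \Rightarrow AC \Rightarrow aAbC \\
a        & AbC   & S \Rightarrow^* aAbC \\
aa       & bbC   & S \Rightarrow^* aAbC \Rightarrow aabbC \\
aabb     & C     & S \Rightarrow^* aabbC \\
aabbc    & Cd    & S \Rightarrow^* aabbC \Rightarrow aabbcCd \\
aabbccdd & \epsilon & S \Rightarrow^* aabbccdd
\end{array}
```

Consuming a letter moves it from the front of $v$ to the end of $w$ and leaves $wv$ unchanged, while an ε-step rewrites the leftmost non-terminal of $wv$ by a rule of the grammar. In the last row the stack is empty, so the invariant gives $S \Rightarrow^* w$, i.e. the accepted word is derivable.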

Resulting PDA

We obtain a PDA with a single state. Therefore we omit the state, ignore the start state, the set of states, the final states and the state in configurations, and have transitions
\[ Z \xrightarrow{a} z \]
for $Z \in \Gamma$, $a \in T \cup \{\epsilon\}$, $z \in \Gamma^*$.

The stack symbols were the alphabet used in strings occurring in the derivations, i.e. $T \cup N$. The start stack symbol was $S$.

The PDA has the following transitions:
- If the top symbol was a non-terminal $A$, and there was a rule $A \to w$, we could replace $A$ by $w$ and make an ε-transition. So we have in this case a transition
\[ A \xrightarrow{\epsilon} w \]
- If the top symbol was a terminal $a$, then we could accept the letter $a$ and pop the symbol from the stack. So we have transitions
\[ a \xrightarrow{a} \epsilon \]

Resulting Empty Stack Single State PDA

The empty stack PDA derived from $G = (T, N, S, P)$ is as follows:

PDA              P
terminals        $T$
states           single state
stack alphabet   $T \cup N$
start stack      $S$
transitions      $A \xrightarrow{\epsilon} w$   if $A \to w$ is a production of $G$
                 $a \xrightarrow{a} \epsilon$   if $a \in T$

In the Additional Material there is a proof that $L(P) = L(G)$.
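The table translates almost literally into code. The following is a minimal sketch using the example grammar from the LL-parsing slides; the encoding is an assumption of the sketch: every grammar symbol is a single character, a transition $Z \xrightarrow{a} z$ is a triple `(Z, a, z)`, and `cfg_to_pda` and `accepts_by_empty_stack` are illustrative names. The search explores all guesses. It terminates for this grammar because every expansion except $S \to AC$ puts a terminal on top of the stack, but it may loop for left-recursive grammars.

```python
EPS = ""  # label of an empty move


def cfg_to_pda(terminals, productions):
    """Build the transitions of the single-state empty-stack PDA."""
    delta = {(A, EPS, w) for (A, w) in productions}  # A --eps--> w
    delta |= {(a, a, "") for a in terminals}         # a --a--> eps
    return delta


def accepts_by_empty_stack(delta, start_symbol, word):
    """Search all runs; a configuration is (input position, stack)."""
    seen, todo = set(), [(0, start_symbol)]
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        i, stack = config
        if not stack:
            if i == len(word):
                return True   # whole word read and stack empty
            continue          # stuck with input left over
        for (Z, a, z) in delta:
            if Z != stack[0]:
                continue
            if a == EPS:
                todo.append((i, z + stack[1:]))      # expand a non-terminal
            elif i < len(word) and word[i] == a:
                todo.append((i + 1, z + stack[1:]))  # match a terminal
    return False


PRODUCTIONS = [("S", "AC"), ("A", "aAb"), ("A", "ab"),
               ("C", "cCd"), ("C", "cd")]
DELTA = cfg_to_pda("abcd", PRODUCTIONS)

for w in ["aabbccdd", "abcd", "aabcd", "abc", ""]:
    print(repr(w), accepts_by_empty_stack(DELTA, "S", w))
```

The words `aabbccdd` and `abcd` are derivable from $S$ and are accepted; the remaining three are rejected.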

LR Parsing and Proof of Equivalence

This can be found in the Additional Material.