Equivalence of CFG s and PDA s

Similar documents
Miscellaneous. Closure Properties Decision Properties

Pushdown Automata (2015/11/23)

UNIT-VI PUSHDOWN AUTOMATA

Automata and Formal Languages. Push Down Automata. Sipser pages Lecture 13. Tim Sheard 1

Accept or reject. Stack

Pushdown Automata (PDA) The structure and the content of the lecture is based on

CSE 211. Pushdown Automata. CSE 211 (Theory of Computation) Atif Hasan Rahman

Pushdown Automata: Introduction (2)

MTH401A Theory of Computation. Lecture 17

Introduction to Formal Languages, Automata and Computability p.1/42

CSE 105 THEORY OF COMPUTATION

(pp ) PDAs and CFGs (Sec. 2.2)

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

CS Pushdown Automata

(pp ) PDAs and CFGs (Sec. 2.2)

CSE 105 THEORY OF COMPUTATION

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

CPS 220 Theory of Computation Pushdown Automata (PDA)

Nondeterminism and Epsilon Transitions

Pushdown Automata. Reading: Chapter 6

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Chapter Five: Nondeterministic Finite Automata

Regular Expressions and Language Properties

CFGs and PDAs are Equivalent. We provide algorithms to convert a CFG to a PDA and vice versa.

Solution Scoring: SD Reg exp.: a(a

Nondeterministic Finite Automata

CFG and PDA accept the same languages

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Fundamentele Informatica II

5 Context-Free Languages

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

CISC4090: Theory of Computation

Einführung in die Computerlinguistik

The Idea of a Pushdown Automaton

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Nondeterministic Finite Automata. Nondeterminism Subset Construction

Push-down Automata = FA + Stack

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Theory of Computation (IV) Yijia Chen Fudan University

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Computational Models - Lecture 5 1

Homework. Context Free Languages. Announcements. Before We Start. Languages. Plan for today. Final Exam Dates have been announced.

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

Fooling Sets and. Lecture 5

Lecture 13: Turing Machine

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

This lecture covers Chapter 6 of HMU: Pushdown Automata

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Context-free Languages and Pushdown Automata

Theory Bridge Exam Example Questions

Pushdown Automata. Chapter 12

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Properties of Context-Free Languages. Closure Properties Decision Properties

Lecture 17: Language Recognition

Computability and Complexity

Context-Free Languages

Theory of Computation (II) Yijia Chen Fudan University

Automata Theory, Computability and Complexity

CSE 105 THEORY OF COMPUTATION

MA/CSSE 474 Theory of Computation

CSC236 Week 10. Larry Zhang

Turing machines and linear bounded automata

CPS 220 Theory of Computation

Introduction to Theory of Computing

Foundations of Informatics: a Bridging Course

CDM Parsing and Decidability

Languages, regular languages, finite automata

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Pushdown Automata (Pre Lecture)

Undecidable Problems and Reducibility

Part I: Definitions and Properties

Theory of Computation

CSE 105 THEORY OF COMPUTATION

Computational Models: Class 5

Decision, Computation and Language

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CSE 105 THEORY OF COMPUTATION

Pushdown Automata. Pushdown Automata. Pushdown Automata. Pushdown Automata. Pushdown Automata. Pushdown Automata. The stack

Nondeterministic finite automata

Theory of Computation

PUSHDOWN AUTOMATA (PDA)

Turing Machines Part III

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

Finite Automata. Mahesh Viswanathan

The Pumping Lemma and Closure Properties

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Midterm Exam 2 CS 341: Foundations of Computer Science II Fall 2018, face-to-face day section Prof. Marvin K. Nakayama

SCHEME FOR INTERNAL ASSESSMENT TEST 3

Intro to Theory of Computation

CS481F01 Prelim 2 Solutions

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Homework 8. a b b a b a b. two-way, read/write

Transcription:

Equivalence of CFG s and PDA s Mridul Aanjaneya Stanford University July 24, 2012 Mridul Aanjaneya Automata Theory 1/ 53

Recap: Pushdown Automata A PDA is an automaton equivalent to the CFG in language-defining power. Only the nonterministic PDA s define all possible CFL s. But the deterministic version models parsers. Most programming languages have deterministic PDA s. Mridul Aanjaneya Automata Theory 2/ 53

Recap: Intuition Think of an ε-nfa with the additional power that it can manipulate a stack. Its moves are determined by: 1 The current state (of its NFA). 2 The current input symbol (or ε), and 3 The current symbol on top of its stack. Mridul Aanjaneya Automata Theory 3/ 53

Recap: Intuition Being nondeterministic, the PDA can have a choice of next moves. In each choice, the PDA can: 1 Change state, and also 2 Replace the top symbol on the stack by a sequence of zero or more symbols. Zero symbols = pop. Many symbols = sequence of pushes. Mridul Aanjaneya Automata Theory 4/ 53

Recap: PDA Formalism A PDA is described by: 1 A finite set of states (Q, typically). 2 An input alphabet (Σ, typically). 3 A stack alphabet (Γ, typically). 4 A transition function (δ, typically). 5 A start state (q 0, in Q, typically). 6 A start symbol (Z 0, in Γ, typically). 7 A set of final states (F Q, typically). Mridul Aanjaneya Automata Theory 5/ 53

Recap: The Transition Function Takes three arguments: 1 A state in Q. 2 An input which is either a symbol in Σ or ε. 3 A stack symbol in Γ. δ(q,a,z) is a set of zero or more actions of the form (p,α). p is a state, α is a string of stack symbols. Mridul Aanjaneya Automata Theory 6/ 53

Recap: Actions of the PDA If δ(q,a,z) contains (p,α) among its actions, then one thing the PDA can do in state q, with a at the front of the input, and Z on top of the stack is: 1 Change the state to p. 2 Remove a from the front of the input (but a may be ε). 3 Replace Z on the top of the stack by α. Mridul Aanjaneya Automata Theory 7/ 53

Example: PDA Design a PDA to accept {0 n 1 n n 1}. The states: q = start state. We are in state q if we have seen only 0 s so far. p = we ve seen at least one 1 and may now proceed only if the inputs are 1 s. f = final state; accept. Mridul Aanjaneya Automata Theory 8/ 53

Example: PDA The stack symbols: Z 0 = start symbol. Also marks the bottom of the stack, so we know we have counted the same number of 1 s as 0 s. X = marker, used to count the number of 0 s seen on the input. Mridul Aanjaneya Automata Theory 9/ 53

Example: PDA The transitions: δ(q,0,z 0 ) = {(q,xz 0 )}. δ(q,0,x) = {(q,xx)}. These two rules cause one X to be pushed onto the stack for each 0 read from the input. δ(q,1,x) = {(p,ε)}. When we see a 1, go to state p and pop one X. δ(p,1,x) = {(p,ε)}. Pop one X per 1. δ(p,ε,z 0 ) = {(f,z 0 )}. Accept at bottom. Mridul Aanjaneya Automata Theory 10/ 53

Actions of the Example PDA 0 0 0 111 q Z 0 Mridul Aanjaneya Automata Theory 11/ 53

Actions of the Example PDA 0 0 1 11 q X Z 0 Mridul Aanjaneya Automata Theory 12/ 53

Actions of the Example PDA 0 1 1 1 q X X Z 0 Mridul Aanjaneya Automata Theory 13/ 53

Actions of the Example PDA 1 1 1 q X X X Z 0 Mridul Aanjaneya Automata Theory 14/ 53

Actions of the Example PDA 1 1 p X X Z 0 Mridul Aanjaneya Automata Theory 15/ 53

Actions of the Example PDA 1 p X Z 0 Mridul Aanjaneya Automata Theory 16/ 53

Actions of the Example PDA f Z 0 Mridul Aanjaneya Automata Theory 17/ 53

Instantaneous Descriptions We can formalize the pictures just seen with an instantaneous description (ID). An ID is a triple (q,w,α), where: 1 q is the current state. 2 w is the remaining input. 3 α is the stack contents, top at the left. Mridul Aanjaneya Automata Theory 18/ 53

The Goes-To Relation To say that ID I can becomes ID J in one move of the PDA, we can write I J. Formally, (q,aw,xα) (p,w,βα) for any w and α, if δ(q,a,x) contains (p,β). Extend to, meaning zero or more moves, by: Basis: I I. Induction: If I J and J K, then I K. Mridul Aanjaneya Automata Theory 19/ 53

Example: Goes-To Using the previous example PDA, we can describe the sequence of moves by (q, 000111, Z 0 ) (q, 00111, XZ 0 ) (q, 0111, XXZ 0 ) (q, 111, XXXZ 0 ) (p, 11, XXZ 0 ) (p, 1, XZ 0 ) (p, ε, Z 0 ) (f, ε, Z 0 ) Thus, (q,000111,z 0 ) (f,ε,z 0 ). Mridul Aanjaneya Automata Theory 20/ 53

Example: Goes-To Using the previous example PDA, we can describe the sequence of moves by (q, 000111, Z 0 ) (q, 00111, XZ 0 ) (q, 0111, XXZ 0 ) (q, 111, XXXZ 0 ) (p, 11, XXZ 0 ) (p, 1, XZ 0 ) (p, ε, Z 0 ) (f, ε, Z 0 ) Thus, (q,000111,z 0 ) (f,ε,z 0 ). Question What would happen on the input 0001111? Mridul Aanjaneya Automata Theory 21/ 53

Answer (q, 0001111, Z 0 ) (q, 001111, XZ 0 ) (q, 01111, XXZ 0 ) (q, 1111, XXXZ 0 ) (p, 111, XXZ 0 ) (p, 11, XZ 0 ) (p, 1, Z 0 ) (f, 1, Z 0 ) Note: The last action is legal because a PDA can use ε input even if input remains. The last ID has no move. 0001111 is not accepted, because the input is not completely consumed. Mridul Aanjaneya Automata Theory 22/ 53

Aside: FA and PDA Notations We represented moves of a FA by an extended δ, which did not mention the input yet to be read. We could have chosen a similar notation for PDA s, where the FA state is replaced by a state-stack combination. Mridul Aanjaneya Automata Theory 23/ 53

FA and PDA Notations Similarly, we could have chosen a FA notation with ID s. Just drop the stack notation. Why the difference? FA tend to models thinks like protocols with infinitely long inputs. PDA model parsers, which are given a fixed program to process. Mridul Aanjaneya Automata Theory 24/ 53

Language of a PDA The common way to define the language of a PDA is by final state. If P is a PDA, then L(P) is the set of strings w such that (q 0,w,Z 0 ) (f,ε,α) for final state f and any α. Mridul Aanjaneya Automata Theory 25/ 53

Language of a PDA Another language defined by the same PDA is by empty stack. If P is a PDA, then N(P) is the set of strings w such that (q 0,w,Z 0 ) (q,ε,ε) for any state q. Mridul Aanjaneya Automata Theory 26/ 53

Equivalence of Language Definitions 1 If L = L(P), then there is another PDA P such that L = N(P ) 2 If L = N(P), then there is another PDA P such that L = L(P ). Mridul Aanjaneya Automata Theory 27/ 53

Proof: L(P) N(P ) Intuition P will simulate P. If P accepts, P will empty its stack. P has to avoid accidentally emptying its stack, so it uses a special bottom marker to catch the case where P empties its stack without accepting. Mridul Aanjaneya Automata Theory 28/ 53

Proof: L(P) N(P ) P has all the states, symbols, and moves of P, plus: 1 Stack symbol X 0, used to guard the stack bottom against accidental emptying. 2 New start state s and erase state e. 3 δ(s,ε,x 0 ) = {(q 0,Z 0 X 0 )}. Get P started. 4 δ(f,ε,x) = δ(e,ε,x) = {(e,ε)} for any final state f of P and any stack symbol X. Mridul Aanjaneya Automata Theory 29/ 53

Proof: N(P) L(P ) Intuition P simulates P. P has a special bottom-marker to catch the situation where P empties its stack. If so, P accepts. Mridul Aanjaneya Automata Theory 30/ 53

Proof: N(P) L(P ) P has all the states, symbols, and moves of P, plus: 1 Stack symbol X 0, used to guard the stack bottom. 2 New start state s and final state f. 3 δ(s,ε,x 0 ) = {(q 0,Z 0 X 0 )}. Get P started. 4 δ(q,ε,x 0 ) = {(f,ε)} for any state q of P. Mridul Aanjaneya Automata Theory 31/ 53

Deterministic PDA s To be deterministic, there must be at most one choice of move for any state q, input symbol a, and stack symbol X. In addition, there must not be a choice between using input ε or real input. Formally, δ(q,a,x) and δ(q,ε,x) cannot both be nonempty. Mridul Aanjaneya Automata Theory 32/ 53

Equivalence of PDA s and CFG s: Overview When we talked about closure properties of regular languages, it was useful to be able to jump between RE and DFA representations. Similarly, CFG s and PDA s are both useful to deal with properties of CFL s. Mridul Aanjaneya Automata Theory 33/ 53

Equivalence of PDA s and CFG s: Overview Also, PDA s, being algorithmic, are often easier to use when arguing that a language is a CFL. Example: It is easy to see how a PDA can recognize balanced parentheses, not so easy as a grammar. But all depends on knowing that CFG s and PDA s both define the CFL s. Mridul Aanjaneya Automata Theory 34/ 53

Converting a CFG to a PDA Let L = L(G). Construct PDA P such that N(P) = L. P has: One state q. Input symbols = terminals of G. Stack symbols = all symbols of G. Start symbol = start symbol of G. Mridul Aanjaneya Automata Theory 35/ 53

Intuition about P Given input w, P will step through a leftmost derivation of w from the start symbol S. Since P can t know what this derivation is, or even what the end of w is, it uses nondeterminism to guess the production to use at each step. Mridul Aanjaneya Automata Theory 36/ 53

Intuition about P At each step, P represents some left-sentential form (step of a leftmost derivation). If the stack of P is α, and P has so far consumed x from its input, then P represents left-sentential form xα. At empty stack, the input consumed is a string in L(G). Mridul Aanjaneya Automata Theory 37/ 53

Transition Function of P 1 δ(q,a,a) = (q,ε). (Type 1 rules) This step does not change the LSF represented, but moves responsibility for a from the stack to the consumed input. 2 If A α is a production of G, then δ(q,ε,a) contains (q,α). (Type 2 rules) Guess a production for A, and represent the next LSF in the derivation. Mridul Aanjaneya Automata Theory 38/ 53

Proof that N(P) = L(G) We need to show that (q,wx,s) (q,x,α) for any x if and only if S lm wα. Part 1: only if is an induction on the number of steps made by P. Basis: 0 steps. Then α = S, w = ε, and S lm S is surely true. Mridul Aanjaneya Automata Theory 39/ 53

Induction for Part 1 Consider n moves of P: (q,wx,s) (q,x,α) and assume the IH for sequences of n-1 moves. There are two cases, depending on whether the last move uses a Type 1 or Type 2 rule. Mridul Aanjaneya Automata Theory 40/ 53

Use of a Type 1 Rule The move sequence must be of the form (q,yax,s) (q,ax,aα) (q,x,α), where ya = w. By the IH applied to the first n-1 steps, S lm yaα. But ya = w, so S lm wα. Mridul Aanjaneya Automata Theory 41/ 53

Use of a Type 2 Rule The move sequence must be of the form (q,wx,s) (q,x,aβ) (q,x,γβ), where A γ is a production and α = γβ. By the IH applied to the first n-1 steps, S lm waβ. Thus, S lm wγβ = wα. Mridul Aanjaneya Automata Theory 42/ 53

Proof of Part 2 ( if ) We also must prove that if S lm wα, then (q,wx,s) (q,x,α) for any x. Induction on number of steps in the leftmost derivation. Ideas are similar. Mridul Aanjaneya Automata Theory 43/ 53

Proof - Completion We now have (q,wx,s) (q,x,α) for any x if and only if S lm wα. In particular, let x = α = ε. Then (q,w,s) (q,ε,ε) if and only if S lm w. That is, w N(P) if and only if w L(G). Mridul Aanjaneya Automata Theory 44/ 53

From a PDA to a CFG Now assume L=N(P). We ll construct a CFG G such that L = L(G). Intuition: G will have variables generating exactly the inputs that cause P to have the net effect of popping a stack symbol X while going from state p to state q. P never gets below this X while doing so. Mridul Aanjaneya Automata Theory 45/ 53

Variables of G G s variables are of the form [pxq]. This variable generates all and only the strings w such that (p, w, X ) (q, ε, ε) Also a start symbol S we ll talk about later. Mridul Aanjaneya Automata Theory 46/ 53

Productions of G Each production for [pxq] comes from a move of P in state p with stack symbol X. Simplest case: δ(p,a,x) contains (q,ε). Then the production is [pxq] a. Note that a can be an input symbol or ε. Here, [pxq] generates a, because reading a is one way to pop X and go from p to q. Mridul Aanjaneya Automata Theory 47/ 53

Productions of G Next simplest case: δ(p,a,x) contains (r,y) for some state r and symbol Y. G has production [pxq] a[ryq]. We can erase X and go from p to q by reading a (entering state r and replacing the X by Y) and then reading some w that gets P from r to q while erasing the Y. Note: [pxq] aw whenever [ryq] w. Mridul Aanjaneya Automata Theory 48/ 53

Productions of G Third simplest case: δ(p,a,x) contains (r,yz) for some state r and symbols Y and Z. Now, P has replaced X by YZ. To have the net effect of erasing X, P must erase Y, going from state r to some state s, and then erase Z, going from s to q. Mridul Aanjaneya Automata Theory 49/ 53

Action of P a wx wx x p r s q X Y Z Z Mridul Aanjaneya Automata Theory 50/ 53

Productions of G Since we do not know state s, we must generate a family of productions: [pxq] a[rys][szq] It follows [pxq] awx whenever [rys] w and [szq] x. Mridul Aanjaneya Automata Theory 51/ 53

Productions of G: General Case Suppose δ(p,a,x) contains (r,y 1,...,Y k ) for some state r and k 3. Generate family of productions [pxq] a[ry 1 s 1 ][s 1 Y 2 s 2 ]... [s k 2 Y k 1 s k 1 ][s k 1 Y k q] Mridul Aanjaneya Automata Theory 52/ 53

Completion of the Construction We can prove that (q 0,w,Z 0 ) (p,ε,ε) iff [q 0 Z 0 p] w. Proof is two easy inductions. Left as exercises. But state p can be anything. Thus, add to G another variable S, the start symbol, and add productions S [q 0 Z 0 p] for each state p. Mridul Aanjaneya Automata Theory 53/ 53