Computational Models - Lecture 4

Similar documents
Computational Models - Lecture 3

Computational Models - Lecture 4 1

What we have done so far

The View Over The Horizon

Computational Models - Lecture 5 1

Computational Models - Lecture 4 1

CISC4090: Theory of Computation

Computational Models: Class 3

Computational Models: Class 5

Computational Models - Lecture 3 1

Computational Models - Lecture 5 1

Introduction to Theory of Computing

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CISC 4090 Theory of Computation

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

The Pumping Lemma for Context Free Grammars

Theory of Computation (II) Yijia Chen Fudan University

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

Computational Models Lecture 5

Computational Models - Lecture 3 1

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

Theory of Computation

Final exam study sheet for CS3719 Turing machines and decidability.

Lecture 17: Language Recognition

CISC 4090 Theory of Computation

2.1 Solution. E T F a. E E + T T + T F + T a + T a + F a + a

Lecture III CONTEXT FREE LANGUAGES

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL, DPDA PDA)

Context-Free and Noncontext-Free Languages

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Theory of Computation (IV) Yijia Chen Fudan University

DM17. Beregnelighed. Jacob Aae Mikkelsen

CS311 Computational Structures More about PDAs & Context-Free Languages. Lecture 9. Andrew P. Black Andrew Tolmach

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

3515ICT: Theory of Computation. Regular languages

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Foundations of Informatics: a Bridging Course

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Pumping Lemma for CFLs

Cliff s notes for equivalence of CFLs and L(PDAs) LisaCFL L = L(M) for some PDA M L=L(M)forsomePDAM L = L(G) for some CFG G

Context-Free Languages

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

CPS 220 Theory of Computation

PUSHDOWN AUTOMATA (PDA)

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

Context-Free Languages (Pre Lecture)

MA/CSSE 474 Theory of Computation

Properties of Context-Free Languages. Closure Properties Decision Properties

CSE 105 THEORY OF COMPUTATION

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

Computational Models - Lecture 1 1

Critical CS Questions

CPSC 421: Tutorial #1

MA/CSSE 474 Theory of Computation

Einführung in die Computerlinguistik

Automata Theory. CS F-10 Non-Context-Free Langauges Closure Properties of Context-Free Languages. David Galles

CS500 Homework #2 Solutions

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Context-Free Grammars and Languages. Reading: Chapter 5

SCHEME FOR INTERNAL ASSESSMENT TEST 3

Finite Automata and Regular languages

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

CSE 355 Test 2, Fall 2016

Languages, regular languages, finite automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

CpSc 421 Final Exam December 15, 2006

Computational Models Lecture 2 1

CSE 105 THEORY OF COMPUTATION

Context-Free Languages

1 More finite deterministic automata

Context Free Grammars

Functions on languages:

5 Context-Free Languages

Computational Models Lecture 2 1

CS 154. Finite Automata, Nondeterminism, Regular Expressions

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Sri vidya college of engineering and technology

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Context Free Languages: Decidability of a CFL

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

CSE 105 THEORY OF COMPUTATION

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Pushdown Automata: Introduction (2)

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Fall 1999 Formal Language Theory Dr. R. Boyer. Theorem. For any context free grammar G; if there is a derivation of w 2 from the

DD2371 Automata Theory

Properties of Context-free Languages. Reading: Chapter 7

Pushdown Automata (2015/11/23)

Non-context-Free Languages. CS215, Lecture 5 c

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 5 : DFA minimization

Pushdown Automata. Reading: Chapter 6

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Transcription:

Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push Down Automata (PDA) Hopcroft and Ullman, 3.4 Sipser, 2.1, 2.2 & 2.3 Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 1

The Equivalence Relation L Let L Σ be a language. Define an equivalence relation L on pairs of strings: Let x,y Σ. We say that x L y if for every string z Σ, xz L if and only if yz L. It is easy to see that L is indeed an equivalence relation (reflexive, symmetric, transitive) on Σ. In addition, if x L y then for every string z Σ, xz L yz as well (this is called right invariance). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 2

The Equivalence Relation L Like every equivalence relation, L partitions Σ to (disjoint) equivalence classes. For every string x, let [x] Σ denote its equivalence class w.r.t. L (if x L y then [x] = [y] equality of sets). Question is, how many equivalence classes does L induce? In particular, is the number of equivalence classes of L finite or infinite? Well, it could be either finite or infinite. This depends on the language L. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 3

Example: L = (ab ba) L has the following four equivalence classes: [ǫ] = L [a] = La [b] = Lb [aa] = L(aa bb)σ Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 4

Classes of L : More Examples Let L 1 {0, 1} contain all strings where the number of 1s is divisible by 4. Then L1 has finitely many equivalence classes. Let L 2 = L(0 10 ). Then L2 has finitely many equivalence classes. Let L 3 {0, 1} contain all strings of the form 0 n 1 n. Then L3 has infinitely many equivalence classes. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 5

Myhill-Nerode Theorem Theorem: Let L Σ be a language. Then L is regular L has finitely many equivalence classes. Three specific consequences: L 1 {0, 1} contains all strings where the number of 1s is divisible by 4. Then L 1 is regular. L 2 = L(0 10 ) is regular. L 3 {0, 1} contains all strings of the form 0 n 1 n. Then L 2 is not regular. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 6

Myhill-Nerode Theorem: Proof = Suppose L is regular. Let M = (Q, Σ,δ,q 0,F) be a DFA accepting it. For every x Σ, let δ(q 0,x) Q be the state where the computation of M on input x ends. The relation M on pairs of strings is defined as follows: x M y if δ(q 0,x) = δ(q 0,y). Clearly, M is an equivalence relation. Furthermore, if x M y, then for every z Σ, also xz M yz. Therefore, xz L if and only if yz L. This means that x M y = x L y. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 7

Myhill-Nerode Theorem: Proof (cont.) = The equivalence relation M has finitely many equivalence classes (at most the number of states in M). We saw that x M y = x L y. This means that the equivalence classes of M refine those of L. Thus, the number of equivalence classes of M is greater or equal than the number of equivalence classes of L. Therefore, L has finitely many equivalence classes, as desired. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 8

Myhill-Nerode Theorem: Proof (cont.) = Suppose L has finitely many equivalence classes. We ll construct a DFA M that accepts L. Let x 1,...,x n Σ be representatives for the finitely many equivalence classes of L. The states of M are the equivalence classes [x 1 ],...,[x n ]. The transition function δ is defined as follows: For all a Σ, δ([x i ],a) = [x i a] (the equivalence class of x i a). Why is δ well-defined? (Hint: right invariance: if x L y, then xz L yz). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 9

Myhill-Nerode Theorem: Proof (conc.) = The initial state is [ε]. The accept states are F = {[x i ] x i L}. Easy: On input x Σ, M ends at state [x] (why?). Therefore M accepts x iff x L (why?). So L is accepted by DFA, hence L is regular. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 10

Context Switch Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 11

Algorithmic Questions for NFAs Q.: Given an NFA, N, and a string s, is s L(N)? Answer: Construct the DFA equivalent to N and run it on w. Q.: Is L(N) =? Answer: This is a reachability question in graphs: Is there a path in the states graph of N from the start state to some accepting state. There are simple, efficient algorithms for this task. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 12

More Algorithmic Questions for NFAs Q.: Is L(N) = Σ? Answer: Check if L(N) =. Q.: Given N 1 and N 2, is L(N 1 ) L(N 2 )? Answer: Check if L(N 2 ) L(N 1 ) =. Q.: Given N 1 and N 2, is L(N 1 ) = L(N 2 )? Answer: Check if L(N 1 ) L(N 2 ) and L(N 2 ) L(N 1 ). In the future, we will see that for stronger models of computations, many of these problems cannot be solved by any algorithm. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 13

Another, More Radical Context Switch So far we saw finite automata, regular languages, regular expressions, Myhill-Nerode theorem and pumping lemma for regular languages. We now introduce stronger machines and languages with more expressive power: pushdown automata, context-free languages, context-free grammars, pumping lemma for context-free languages. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 14

Context-Free Grammars An example of a context free grammar, G 1 : A 0A1 A B B # Terminology: Each line is a substitution rule or production. Each rule has the form: symbol string. The left-hand symbol is a variable (usually upper-case). A string consists of variables and terminals. One variable is the start variable. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 15

Rules for Generating Strings Write down the start variable (lhs of top rule). Pick a variable written down in current string and a derivation that starts with that variable. Replace that variable with right-hand side of that derivation. Repeat until no variables remain. Return final string (concatenation of terminals). Process is inherently non deterministic. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 16

Example Grammar G 1 : A 0A1 A B B # Derivation with G 1 : A 0A1 00A11 000A111 000B111 000#111 Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 17

A Parse Tree A A A B 0 0 0 # 1 1 1 Question: What strings can be generated in this way from the grammar G 1? Answer: Exactly those of the form 0 n #1 n (n 0). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 18

Context-Free Languages The language generated in this way is called the language of the grammar. For example, L(G 1 ) is {0 n #1 n n 0}. Any language generated by a context-free grammar is called a context-free language. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 19

A Useful Abbreviation Rules with same variable on left hand side A A 0A1 B are written as: A 0A1 B Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 20

English-like Sentences A grammar G 2 to describe a few English sentences: < SENTENCE > < NOUN-PHRASE >< VERB > < NOUN-PHRASE > < ARTICLE >< NOUN > < NOUN > boy girl flower < ARTICLE > a the < VERB > touches likes sees Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 21

Deriving English-like Sentences A specific derivation in G 2 : < SENTENCE > < NOUN-PHRASE >< VERB > < ARTICLE >< NOUN >< VERB > a < NOUN >< VERB > a boy < VERB > a boy sees More strings generated by G 2 : a flower sees the girl touches Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 22

Derivation and Parse Tree < SENTENCE > < NOUN-PHRASE >< VERB > < ARTICLE >< NOUN >< VERB > a < NOUN >< VERB > a boy < VERB > a boy sees SENTENCE NOUN-PHRASE VERB ARTICLE NOUN a boy sees Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 23

Formal Definitions A context-free grammar is a 4-tuple (V, Σ,R,S) where V is a finite set of variables, Σ is a finite set of terminals, R is a finite set of rules: each rule is a variable and a finite string of variables and terminals. S is the start symbol. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 24

Formal Definitions If u and v are strings of variables and terminals, and A w is a rule of the grammar, then we say uav yields uwv, written uav uwv. We write u v if u = v or u u 1... u k v. for some sequence u 1,u 2,...,u k. { Definition: The language } of the grammar is w Σ S w. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 25

Example Consider G 3 = (V, {(, )},R,S). R (Rules): S (S) SS ε. Some words in the language: (()), (()()). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 26

Arithmetic Example Consider G 4 = (V, Σ,R,E) where V = {E,T,F } Σ = {a, +,, (, )} Rules: E E + T T T T F F F (E) a Strings generated by the grammar: a + a a and (a + a) a. What is the language of this grammar? Hint: arithmetic expressions. E = expression, T = term, F = factor. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 27

Parse Tree for a + a a E T F E + T T T F F (E) a E E T T T F F F a + a X a Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 28

Parse Tree for (a + a) a E T E T F E + T T T F F (E) a T F E E T T F F F ( a + a ) X a Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 29

Designing Context-Free Grammars - 1 If CFG is the union of several CFGs, rename variables (not terminals) so they are disjoint, and add new rule S S 1 S 2... S i. For example, to get a grammar for the language L = {0 n 1 n n 0} {1 n 0 n n 0} we construct the grammars: S 1 0 S 1 1 ǫ and then add the rule: S 2 1 S 2 0 ǫ S S 1 S 2 Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 30

Designing Context-Free Grammars - 2 To construct CFG for a regular language, follow a DFA for the language. For initial state q 0, make R 0 the start variable. For state transition δ(q i,a) = q j add rule R i ar j to grammar. For each final state q f, add rule R f ε to grammar. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 31

Designing Context-Free Grammars - 3 For languages with linked substrings, a rule of form R urv may be helpful, forcing desired relation between substrings. For L = {0 n 1 n n 0}: For L = {1 n ##0 n n 0}: S 0 S 1 ǫ S 1 S 0 ## Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 32

Closure Properties Regular languages are closed under union concatenation star Context-Free Languages are closed under union : S S 1 S 2 concatenation S S 1 S 2 star S ε SS Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 33

More Closure Properties Regular languages are also closed under complement (reverse accept/non-accept states of DFA) ) intersection (L 1 L 2 = L 1 L 2. What about complement and intersection of context-free languages? Not clear... Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 34

Ambiguity Grammar: E E+E E E (E) a a X a + a E E E E E a X a + a E E E E E Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 35

Ambiguity We say that a string w is derived ambiguously from grammar G if w has two or more parse trees that generate it from G. Ambiguity is usually not only a syntactic notion but also a semantic one, implying multiple meanings for the same string. It is sometime possible to eliminate ambiguity by finding a different context free grammar generating the same language. This is true for the grammar above, which can be replaced by unambiguous grammar from slide 50. Some languages (e.g. {1 i 2 j 3 k i = j or j = k} are inherentrly ambigous. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 36

Chomsky Normal Form A simplified, canonical form of context free grammars. Every rule has the form A A S BC a ε where S is the start symbol, A, B and C are any variable, except B and C not the start symbol, and A can be the start symbol. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 37

Theorem Theorem: Any context-free language is generated by a context-free grammar in Chomsky normal form. Basic idea: Add new start symbol S 0. Eliminate all ε rules of the form A ε. Eliminate all unit rules of the form A B. Patch up rules so that grammar generates the same language. Convert remaining long rules to proper form. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 38

Proof Idea Add new start symbol S 0 and rule S 0 S. Guarantees that new start symbol does not appear on right-hand-side of a rule. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 39

Proof Eliminating ε rules. Repeat: remove some A ε. for each R uav, add rule R uv. and so on: for R uavaw add R uvaw, R uavw, and R uvw. for R A add R ε, except if R ε has already been removed. until all ε-rules not involving the original start variable have been removed. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 40

Proof Eliminate unit rules. Repeat: remove some A B. for each B u, add rule A u, unless this is previously removed unit rule. (u is a string of variables and terminals.) until all unit rules have been removed. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 41

Proof Finally, convert long rules. To replace each A u 1 u 2...u k (for k 3), introduce new non-terminals N 1,N 2,...,N k 1 and rules A u 1 N 1 N 1 u 2 N 2. N k 3 u k 2 N k 2 N k 2 u k 1 u k Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 42

Conversion Example Initial Grammar: S A B ASA ab B S b ε (1) Add new start state: S 0 S S ASA ab A B S B b ε Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 43

Conversion Example (2) S 0 S S ASA ab A B S B b ε (2) Remove ε-rule B ε: S 0 S S ASA ab a A B S ε B b ε Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 44

Conversion Example (3) S 0 S S ASA ab a A B S ε B b (3) Remove ε-rule A ε: S 0 S S ASA ab a AS SA S A B S ε B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 45

Conversion Example (4) S 0 S S ASA ab a AS SA S A B S B b (4) Remove unit rule S S S 0 S S ASA ab a AS SA S A B S B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 46

Conversion Example (5) S 0 S S ASA ab a AS SA A B S B b (5) Remove unit rule S 0 S: S 0 S ASA ab a AS SA S ASA ab a AS SA A B S B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 47

Conversion Example (6) S 0 ASA ab a AS SA S ASA ab a AS SA A B S B b (6) Remove unit rule A B: S 0 ASA ab a AS SA S ASA ab a AS SA A B S b B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 48

Conversion Example (7) S 0 ASA ab a AS SA S ASA ab a AS SA A S b B b Remove unit rule A S: S 0 ASA ab a AS SA S ASA ab a AS SA A S b ASA ab a AS SA B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 49

Conversion Example (8) S 0 ASA ab a AS SA S ASA ab a AS SA A b ASA ab a AS SA B b (8) Final simplification treat long rules: S 0 AA 1 UB a SA AS S AA 1 UB a SA AS A b AA 1 UB a SA AS A 1 SA U a B b Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 50

Non-Context-Free Languages The pumping lemma for finite automata and Myhill-Nerode theorem are our tools for showing that languages are not regular. We do not have a simple Myhill-Nerode type theorem. However, will now show a similar pumping lemma for context-free languages. It is slightly more complicated... Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 51

Pumping Lemma for CFL Also known as the uvxyz Theorem. Theorem: If A is a CFL, there is an l (critical length), such that if s A and s l, then s = uvxyz where for every i 0, uv i xy i z A vy > 0, (non-triviality) vxy l. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 52

Basic Intuition Let A be a CFL and G its CFG. Let s be a very long string in A. Then s must have a tall parse tree. And some root-to-leaf path must repeat a symbol. ahmmm,... why is that so? T R R u v x y z Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 53

Proof let G be a CFG for CFL A. let b be the max number of symbols in right-hand-side of any rule. no node in parse tree has > b children. at depth d, can have at most b d leaves. let V be the number of variables in G. set l = b V +2. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 54

Proof (2) let s be a string where s l let T be parse tree for s with fewest nodes T has height V + 2 some path in T has length V + 2 that path repeats a variable R Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 55

Proof (3) Split s = uvxyz T R R u v x y z each occurrence of R produces a string upper produces string vxy lower produces string x Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 56

Proof(4) Replacing smaller by larger yields uv i xy i z, for i > 0. T R R u v Rx y z Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 57

Proof (5) Replacing larger by smaller yields uxz. T R x u z Together, they establish: for all i 0, uv i xy i z A Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 58

Proof (6) Next condition is: vy > 0 (out of uvxyz) T R x If v and y are both ε, then u z is a parse tree for s with fewer nodes than T, contradiction. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 59

Proof (7) Final condition is vxy l: T R R V +2 u v x y z b V +2 the upper occurrence of R generates vxy. can choose symbols such that both occurrences of R lie in bottom V + 1 variables on path. subtree where R generates vxy is V + 2 high. strings by subtree at most b V +2 = l long. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 60

Non CFL Example Theorem: B = {a n b n c n } is not a CFL. Proof: By contradiction. Let l be the critical length. Consider s = a l b l c l. If s = uvxyz, neither v nor y can contain both a s and b s, or both b s and c s, because otherwise uv 2 xy 2 z would have out-of-order symbols. But if v and y contain only one letter, then uv 2 xy 2 z has an imbalance! Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 61

Non CFL Example (2) The language C = { a i b j c k 0 i j k } is not context-free. Let l be the critical length, and s = a l b l c l. Let s = uvxyz Neither v nor y contains two distinct symbols, because uv 2 xy 2 z would then have out-of-order symbols. Then one of the symbols a,b,c does not appear neither in v nor in y. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 62

Non CFL Example (2) - cont. The language C = { a i b j c k 0 i j k } is not context-free. Let l be the critical length, and s = a l b l c l. Let s = uvxyz One of the symbols a,b,c does not appear neither in v nor in y: If a s do not appear, then the string uv 0 xy 0 z C, as it has the same number of a s as s, but has fewer b s or fewer c s. If b s do not appear, then either a s or c s must appear in v or y (as they cannot be both empty). If a s appear, then the string uv 2 xy 2 z C, as it contains more a s than b s. If c s appear, the string uv 0 xy 0 z C, as it contains more b s If c s do not appear, then the string uv 2 xy 2 z C, as it contains more a s or more b s than c s. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 63

Non CFL Example (3) The language D = { ww w {0, 1} } is not context-free. Let s = 0 l 1 l 0 l 1 l As before, suppose s = uvxyz. Recall that vxy l. if vxy is in the first half of s, uv 2 xy 2 z moves a 1 into the first position in second half. if vxy is in the second half, uv 2 xy 2 z moves a 0 into the last position in first half. if vxy straddles the midpoint, then pumping down to uxz yields 0 l 1 i 0 j 1 l where i and j cannot both be l. Interestingly, we will shortly see that D = { ww R w {0, 1} } is a CFL. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 64

String Generators and String Acceptors Regular expressions are string generators they tell us how to generate all strings in a language L Finite Automata (DFA, NFA) are string acceptors they tell us if a specific string w is in L CFGs are string generators Are there string acceptors for Context-Free languages? YES! Push-down automata Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 65

A Finite Automaton Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 66

A PushDown Automaton Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 67

A PushDown Automaton We add a memory device with restricted access: A stack (last in, first out; push/pop). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 68

An Example Recall that the language {0 n 1 n n 0} is not regular. Consider the following PDA: Read input symbols, one by one For each 0, push it on the stack As soon as a 1 is seen, pop a 0 for each 1 read Accept if stack is empty when last symbol read Reject if Stack is non-empty when end of input symbol read Stack is empty but input symbol(s) still exist, 0 is read after 1. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 69

PDA Configuration A Configuration of a Push-Down Automata is a triplet (<state>, <remaining input string>, <stack>). A configuration is like a snapshot of PDA progress. A PDA computation is a sequence of successive configurations, starting from start configuration. In the string describing the stack, we put the top of the stack on the left (technical item). Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 70

Comparing PDA and Finite Automata Our PDAs have a special end of input symbol, $. This symbol is not part of input alphabet Σ. (This technical point differs from book. It is for convenience, and can be handled differently too.) PDA may be deterministic or non-deterministic. Unlike finite automata, non-determinism adds power: There are some languages accepted only by non-deterministic PDAs. Transition function δ looks different than DFA or NFA cases, reflecting stack functionality. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 71

The Transition Function Denote input alphabet by Σ and stack alphabet by Γ. the domain of the transition function δ is current state: Q next input, if any: Σ ε,$ (=Σ {ε} {$}) stack symbol popped, if any: Γ ε (=Γ {ε}) and its range is new state: Q stack symbol pushed, if any: Γ ε non-determinism: P(of two components above) δ : Q Σ ε,$ Γ ε P(Q Γ ε ) Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 72

Formal Definitions A pushdown automaton (PDA) is a 6-tuple (Q, Σ, Γ,δ,q 0,F), where Q is a finite set called the states, Σ is a finite set called the input alphabet, Γ is a finite set called the stack alphabet, δ : Q Σ ε,$ Γ ε P(Q Γ ε ) is the transition function, q 0 Q is the start state, and F Q is the set of accept states. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 73

Conventions It will be convenient to be able to know when the stack is empty, but there is no built-in mechanism to do that. Solution: Start by pushing $ onto stack. When you see it again, stack is empty. Question: When is input string exhausted? Answer: When $ is read. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 74

Semi Formal Definitions A pushdown automaton (PDA) M accepts a string x if there is a computation of M on x (a sequence of state and stack transitions according to M s transition function and corresponding to x) that leads to an accepting state and empty stack. The language accepted by M is the set of all strings x accepted by M. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 75

Notation Transition a,b c means if read a from input and pop b from stack then push c onto stack Meaning of ε transitions: if a = ε, don t read inputs if b = ε, don t pop any symbols if c = ε, don t push any symbols Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 76

Example The PDA accepts {0 n 1 n n 1}. Does it also accept the empty string? Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 77

Another Example A PDA that accepts { } a i b j c k i,j,k > 0 and i = j or i = k Informally: read and push a s either pop and match with b s or else ignore b s. Pop and match with c s a non-deterministic choice! Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 78

Another Example (cont.) This PDA accepts { } a i b j c k i,j,k > 0 and i = j or i = k Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 79

Another Example (cont.) A PDA that accepts { a i b j c k i,j,k > 0 and i = j or i = k } Note: non-determinism is essential here! Unlike finite automata, non-determinism does add power. Let us try to think how a proof that nondeterminism is indeed essential could go,. and realize it does not seem trivial or immediate. We will later give a proof that the language L = {x n y n n 0} { x n y 2n n 0 } is accepted by a non-deterministic PDA but not by a deterministic one. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 80

PDA Languages The Push-Down Automata Languages, L PDA, is the set of all languages that can be described by some PDA: L PDA = {L : PDA M L[M] = L} We already know L PDA L DFA, since every DFA is just a PDA that ignores the stack. L CFG L PDA? L PDA L CFG? Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 81

Equivalence Theorem Theorem: A language is context free if and only if some pushdown automaton accepts it. This time, both the if part and the only if part are non-trivial. We will present a high level view of the proof (not all details) later. Based on slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University. p. 82