Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Similar documents
Finite Automata and Formal Languages TMV026/DIT321 LP Useful, Useless, Generating and Reachable Symbols

Properties of Context-Free Languages

NPDA, CFG equivalence

Non-context-Free Languages. CS215, Lecture 5 c

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Notes on Pumping Lemma

Properties of context-free Languages

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

Chap. 7 Properties of Context-free Languages

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Foundations of Informatics: a Bridging Course

Properties of Context-Free Languages. Closure Properties Decision Properties

Ogden s Lemma for CFLs

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CPS 220 Theory of Computation

Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Computational Models - Lecture 4 1

Context-Free Languages (Pre Lecture)

Lecture 12 Simplification of Context-Free Grammars and Normal Forms

The View Over The Horizon

This lecture covers Chapter 7 of HMU: Properties of CFLs

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

6.1 The Pumping Lemma for CFLs 6.2 Intersections and Complements of CFLs

CS375: Logic and Theory of Computing

CS20a: summary (Oct 24, 2002)

Notes for Comp 497 (454) Week 10

Computational Models - Lecture 4

Computational Models - Lecture 5 1

Finite Automata and Formal Languages

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

Theory of Computation

The Pumping Lemma for Context Free Grammars

Introduction to Theory of Computing

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Properties of Context-free Languages. Reading: Chapter 7

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

VTU QUESTION BANK. Unit 1. Introduction to Finite Automata. 1. Obtain DFAs to accept strings of a s and b s having exactly one a.

Einführung in die Computerlinguistik

Context-Free Grammars (and Languages) Lecture 7

MA/CSSE 474 Theory of Computation

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Computational Models - Lecture 4 1

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Computational Models - Lecture 5 1

Theory of Computation - Module 3

HW6 Solutions. Micha l Dereziński. March 20, 2015

CSCI Compiler Construction

Question Bank UNIT I

Concordia University Department of Computer Science & Software Engineering

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

Computational Models: Class 5

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Computational Models - Lecture 3

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Section 1 (closed-book) Total points 30

Automata Theory CS F-08 Context-Free Grammars

CSE 355 Test 2, Fall 2016

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Context Free Languages and Grammars

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

FLAC Context-Free Grammars

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2017

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Automata Theory Final Exam Solution 08:10-10:00 am Friday, June 13, 2008

UNIT II REGULAR LANGUAGES

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY

10. The GNFA method is used to show that

Finite Automata and Formal Languages TMV026/DIT321 LP Testing Equivalence of Regular Languages

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CYK Algorithm for Parsing General Context-Free Grammars

Computability Theory

Theory of Computation (Classroom Practice Booklet Solutions)

CDM Parsing and Decidability

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL, DPDA PDA)

Final exam study sheet for CS3719 Turing machines and decidability.

Plan for 2 nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0,

CISC4090: Theory of Computation

Theory Bridge Exam Example Questions

Formal Languages, Automata and Models of Computation

6.8 The Post Correspondence Problem

Functions on languages:

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

CISC 4090 Theory of Computation

DD2371 Automata Theory

St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

Transcription:

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 14 Ana Bove May 14th 2018 Recap: Context-free Grammars Simplification of grammars: Elimination of ǫ-productions; Elimination of unit productions; Elimination of useless symbols: Elimination of non-generating symbols; Elimination of non-reachable symbols; Chomsky normal forms: rules of the form A a or A BC. May 14th 2018, Lecture 14 TMV027/DIT321 1/29

Overview of Today s Lecture Regular grammars; Chomsky hierarchy; Pumping lemma for CFL; Closure properties of CFL; Decision properties of CFL; Contributes to the following learning outcome: Explain and manipulate the diff. concepts in automata theory and formal lang; Understand the power and the limitations of regular lang and context-free lang; Design automata, regular expressions and context-free grammars accepting or generating a certain language; Describe the language accepted by an automata or generated by a regular expression or a context-free grammar; Determine if a certain word belongs to a language; Differentiate and manipulate formal descriptions of lang, automata and grammars. May 14th 2018, Lecture 14 TMV027/DIT321 2/29 Regular Grammars Definition: A grammar where all rules are of the form A ab or A ǫ is called left regular. Definition: A grammar where all rules are of the form A Ba or A ǫ is called right regular. Note: We will see that regular grammars generate the regular languages. May 14th 2018, Lecture 14 TMV027/DIT321 3/29

Example: Regular Grammars A DFA that generates the language over {0, 1} with an even number of 0 s: 1 0 q 0 q 1 1 0 Exercise: What could the left regular grammar be for this language? Let q 0 be the start variable. q 0 ǫ 0q 1 1q 0 q 1 0q 0 1q 1 May 14th 2018, Lecture 14 TMV027/DIT321 4/29 Example: Regular Grammars Consider the following DFA over {0, 1}: 1 q 0 0 q 1 q 2 0 1 0 1 Exercise: What could the left regular grammar be for this language? Let q 0 be the start variable. q 0 0q 1 1q 0 q 1 0q 1 1q 2 q 2 ǫ 0q 1 1q 2 q 0 1q 0 10q 1 100q 1 1001q 2 10010q 1 100101q 2 100101 Exercise: What could the right regular grammar be for this language? Let q 2 be the start variable. q 0 ǫ q 0 1 q 1 q 0 0 q 1 0 q 2 0 q 2 q 1 1 q 2 1 q 2 q 1 1 q 2 01 q 1 101 q 1 0101 q 0 00101 q 0 100101 100101 May 14th 2018, Lecture 14 TMV027/DIT321 5/29

Regular Languages and Context-Free Languages Theorem: If L is a regular language then L is context-free. Proof: If L is a regular language then L = L(D) for a DFA D. Let D = (Q, Σ, δ, q 0, F ). We define a CFG G = (Q, Σ, R, q 0 ) where R is the set of productions: p aq if δ(p, a) = q p ǫ if p F We must prove that p wq iff ˆδ(p, w) = q and p w iff ˆδ(p, w) F. Then, in particular w L(G) iff w L(D). May 14th 2018, Lecture 14 TMV027/DIT321 6/29 Regular Languages and Context-Free Languages We prove by mathematical induction on w that p, q. p wq iff ˆδ(p, w) = q and p. p w iff ˆδ(p, w) F. Base case: If w = 0 then w = ǫ. Given the rules in the grammar, p q only when p = q and p ǫ only when p ǫ. We have ˆδ(p, ǫ) = p by definition of ˆδ and p F by the way we defined the grammar. Inductive step: Suppose w = n + 1, then w = av. Then ˆδ(p, av) = ˆδ(δ(p, a), v) with v = n. By IH δ(p, a) vq iff ˆδ(δ(p, a), v) = q. By construction we have a rule p aδ(p, a). Then p aδ(p, a) avq iff ˆδ(p, av) = ˆδ(δ(p, a), v) = q. By IH δ(p, a) v iff ˆδ(δ(p, a), v) F. Now p aδ(p, a) av iff ˆδ(p, av) = ˆδ(δ(p, a), v) F. May 14th 2018, Lecture 14 TMV027/DIT321 7/29

Chomsky Hierarchy This hierarchy of grammars was described by Noam Chomsky in 1956: Type 0: Unrestricted grammars Rules are of the form α β, α must be non-empty. They generate exactly all languages that can be recognised by a Turing machine; Type 1: Context-sensitive grammars Rules are of the form αaβ αγβ. α and β may be empty, but γ must be non-empty; Type 2: Context-free grammars Rules are of the form A α, α can be empty. Used to produce the syntax of most programming languages; Type 3: Regular grammars Rules are of the form A Ba, A ab or A ǫ. We have that Type 3 Type 2 Type 1 Type 0. May 14th 2018, Lecture 14 TMV027/DIT321 8/29 Pumping Lemma for Left Regular Languages Let G = (V, T, R, S) be a left regular grammar and let n = V. If a 1 a 2... a m L(G) for m > n, then any derivation S a 1 A 1 a 1 a 2 A 2... a 1... a i A... a 1... a j A... a 1... a m has length m and there is at least one variable A which is used twice. (Pigeon-hole principle) If x = a 1... a i, y = a i+1... a j and z = a j+1... a m, we have xy n and xy k z L(G) for all k. May 14th 2018, Lecture 14 TMV027/DIT321 9/29

Pumping Lemma for Context-Free Languages Theorem: Let L be a context-free language. Then, there exists a constant n which depends on L such that for every w L with w n, it is possible to break w into 5 strings x, u, y, v and z such that w = xuyvz and 1 uyv n; 2 uv ǫ, that is, either u or v is not empty; 3 k 0. xu k yv k z L. Proof: (Sketch) We can assume that the language is presented by a grammar in Chomsky Normal Form, working with L {ǫ}. Observe that parse trees for grammars in CNF have at most 2 children. Note: If m + 1 is the height of a parse tree for w, then w 2 m. (Prove this as an exercise!) May 14th 2018, Lecture 14 TMV027/DIT321 10/29 Proof Sketch: Pumping Lemma for Context-Free Languages Let V = m > 0. Take n = 2 m and w such that w 2 m. Any parse tree for w has a path from root to leave of length at least m + 1. Let A 0, A 1...., A k be the variables in the path. We have k m. Then at least 2 of the last m + 1 variables should be the same, say A i and A j. Observe figures 7.6 and 7.7 in pages 282 283. See Theorem 7.18 in the book for the complete proof. May 14th 2018, Lecture 14 TMV027/DIT321 11/29

Example: Pumping Lemma for Context-Free Languages Lemma: The language L = {a m b m c m m > 0} is not context-free. Proof: Let us assume L is context-free. Then the Pumping lemma must apply. Let n be the constant stated by the Pumping lemma. Let w = a n b n c n L; we have that w n. By the lemma we know that w = xuyvz such that uyv n uv ǫ k 0. xu k yv k z L Since uyv n there is one letter d {a, b, c} that does not occur in uyv. Since uv ǫ there is another letter e {a, b, c}, e d that does occur in uv. Then e has more occurrences than d in xu 2 yv 2 z and this contradicts the fact that xu 2 yv 2 z L. Hence L cannot be a context-free language. May 14th 2018, Lecture 14 TMV027/DIT321 12/29 Closure under Union Theorem: Let G 1 = (V 1, T, R 1, S 1 ) and G 2 = (V 2, T, R 2, S 2 ) be CFG. Then L(G 1 ) L(G 2 ) is a context-free language. Proof: Let us assume V 1 V 2 = (easy to get via renaming). Let S be a fresh variable. We construct G = (V 1 V 2 {S}, T, R 1 R 2 {S S 1 S 2 }, S). It is now easy to see that L(G) = L(G 1 ) L(G 2 ) since a derivation will have the form S S 1 w if w L(G 1 ) or S S 2 w if w L(G 2 ) May 14th 2018, Lecture 14 TMV027/DIT321 13/29

Closure under Concatenation Theorem: Let G 1 = (V 1, T, R 1, S 1 ) and G 2 = (V 2, T, R 2, S 2 ) be CFG. Then L(G 1 )L(G 2 ) is a context-free language. Proof: Again, let us assume V 1 V 2 =. Let S be a fresh variable. We construct G = (V 1 V 2 {S}, T, R 1 R 2 {S S 1 S 2 }, S). It is now easy to see that L(G) = L(G 1 )L(G 2 ) since a derivation will have the form S S 1 S 2 uv with S 1 u and S 2 v for u L(G 1 ) and v L(G 2 ). May 14th 2018, Lecture 14 TMV027/DIT321 14/29 Closure under Closure Theorem: Let G = (V, T, R, S) be a CFG. Then L(G) + and L(G) are context-free languages. Proof: Let S be a fresh variable. We construct G+ = (V {S }, T, R {S S SS }, S ) and G = (V {S }, T, R {S ǫ SS }, S ). It is easy to see that S ǫ in G. Also that S S w if w L(G) is a valid derivation both in G+ and in G. In addition, if w 1,..., w k L(G), it is easy to see that the derivation S SS w 1 S w 1 SS w 1 w 2 S... w 1 w 2... w k 1 S w 1 w 2... w k 1 S w 1 w 2... w k 1 w k is a valid derivation both in G+ and in G. May 14th 2018, Lecture 14 TMV027/DIT321 15/29

Non Closure under Intersection Example: Consider the following languages over {a, b, c}: L 1 = {a k b k c m k, m > 0} L 2 = {a m b k c k k, m > 0} It is easy to give CFG generating both L 1 and L 2, hence L 1 and L 2 are CFL. However L 1 L 2 = {a k b k c k k > 0} is not a CFL (see slide 12). May 14th 2018, Lecture 14 TMV027/DIT321 16/29 Closure under Intersection with Regular Language Theorem: If L is a CFL and P is a RL then L P is a CFL. Proof: See Theorem 7.27 in the book. (It uses push-down automata which we have not seen.) Example: Consider the following language over Σ = {0, 1}: L = {ww w Σ } Is L a regular language? Consider L = L L(0 1 0 1 ) = {0 n 1 m 0 n 1 m n, m 0}. L is not a CFL (see additional exercise 4 in exercises for CFL). Hence L cannot be a CFL since L(0 1 0 1 ) is a RL. May 14th 2018, Lecture 14 TMV027/DIT321 17/29

Non Closure under Complement Theorem: CFL are not closed under complement. Proof: Notice that L 1 L 2 = L 1 L 2 If CFL are closed under complement then they should be closed under intersection (since they are closed under union). Then CFL are in general not closed under complement. May 14th 2018, Lecture 14 TMV027/DIT321 18/29 Closure under Difference? Theorem: CFL are not closed under difference. Proof: Let L be a CFL over Σ. It is easy to give a CFG that generates Σ. Observe that L = Σ L. Then if CFL are closed under difference they would also be closed under complement. Theorem: If L is a CFL and P is a RL then L P is a CFL. Proof: Observe that P is a RL and L P = L P. May 14th 2018, Lecture 14 TMV027/DIT321 19/29

Closure under Reversal and Prefix Theorem: If L is a CFL then so is L r = {rev(w) w L}. Proof: Given a CFG G = (V, T, R, S) for L we construct the grammar G r = (V, T, R r, S) where R r is such that, for each rule A α in R, then A rev(α) is in R r. One should show by induction on the length of the derivations in G and G r that L(G r ) = L r. Theorem: If L is a CFL then so is Prefix(L). Proof: For closure under prefix see exercise 7.3.1 part a) in the book. May 14th 2018, Lecture 14 TMV027/DIT321 20/29 Decision Properties of Context-Free Languages Very little can be answered when it comes to CFL. The major tests we can answer are whether: The language is empty; (See the algorithm that tests for generating symbols in slide 4 lecture 13: if L is a CFL given by a grammar with start variable S, then L is empty if S is not generating.) A certain string belongs to the language. May 14th 2018, Lecture 14 TMV027/DIT321 21/29

Testing Membership in a Context-Free Language Checking if w L(G), where w = n, by trying all productions may be exponential on n. An efficient way to check for membership in a CFL is based on the idea of dynamic programming. (Method for solving complex problems by breaking them down into simpler problems, applicable mainly to problems where many of their subproblems are really the same; not to be confused with the divide and conquer strategy.) The algorithm is called the CYK algorithm after the 3 people who independently discovered the idea: Cock, Younger and Kasami. It is a O(n 3 ) algorithm. May 14th 2018, Lecture 14 TMV027/DIT321 22/29 Example: CYK Algorithm Consider the following grammar in CNF given by the rules S AB BA A AS a B BS b and starting symbol S. Does abba belong to the language generated by the grammar? We fill the corresponding table: {S} abba abb {B} bba {S} ab bb {S} ba {A} a {B} b {B} b {A} a a b b a Then S abba. May 14th 2018, Lecture 14 TMV027/DIT321 23/29

The CYK Algorithm Let G = (V, T, R, S) be a CFG in CNF and w = a 1 a 2... a n T. Does w L(G)? In the CYK algorithm we fill a table V 1n V 1(n 1) V 2n.. V 12 V 23 V 34... V (n 1)n V 11 V 22 V 33... V (n 1)(n 1) V nn a 1 a 2 a 3... a n 1 a n where V ij V is the set of A s such that A a i a i+1... a j. We want to know if S V 1n, hence S a 1 a 2... a n. May 14th 2018, Lecture 14 TMV027/DIT321 24/29 CYK Algorithm: Observations Each row corresponds to the substrings of a certain length: bottom row is length 1, second from bottom is length 2,... top row is length n; We work row by row upwards and compute the V ij s; In the bottom row we have i = j, that is, ways of generating a i ; V ij is the set of variables generating a i a i+1... a j of length j i + 1 (hence, V ij is in row j i + 1); In the rows below that of V ij we have all ways to generate shorter strings, including all prefixes and suffixes of a i a i+1... a j. May 14th 2018, Lecture 14 TMV027/DIT321 25/29

CYK Algorithm: Table Filling We compute V ij as follows (remember we work with a CFG in CNF): Base case: First row in the table. Here i = j. Then V ii = {A A a i R}. Recursive step: To compute V ij for i < j we have all V pq s in rows below. The length of the string is at least 2, so A a i a i+1... a j starts with A BC such that B a i a i+1... a k and C a k+1... a j for some k. So A V ij if k, i k < j such that B V ik and C V (k+1)j ; A BC R. We need to look at (V ii, V (i+1)j ), (V i(i+1), V (i+2)j ),..., (V i(j 1), V jj ). May 14th 2018, Lecture 14 TMV027/DIT321 26/29 Example: CYK Algorithm Consider the grammar given by the rules S XY Y AY a X XA a b A a and starting symbol S. Does babaa belong to the language generated by the grammar? We fill the corresponding table: babaa baba abaa bab aba {S, X } baa {S, X } ba ab {S, X } ba {S, X, Y } aa {X } b {A, X, Y } a {X } b {A, X, Y } a {A, X, Y } a b a b a a S / V 15 then S babaa. May 14th 2018, Lecture 14 TMV027/DIT321 27/29

Undecidable Problems for Context-Free Grammars/Languages Definition: An undecidable problem is a decision problem for which it is impossible to construct a single algorithm that always leads to a correct yes-or-no answer. Example: Halting problem: does this program terminate? The following problems are undecidable: Is the CFG G ambiguous? Is the CFL L inherently ambiguous? If L(G 1 ) and L(G 2 ) are CFL, is L(G 1 ) L(G 2 ) =? If L(G 1 ) and L(G 2 ) are CFL, is L(G 1 ) = L(G 2 )? is L(G 1 ) L(G 2 )? If L(G) is a CFL and P a RL, is P = L(G)? is P L(G)? If L(G) is a CFL over Σ, is L(G) = Σ? May 14th 2018, Lecture 14 TMV027/DIT321 28/29 Overview of Next Lecture Sections 6, 8 (just a bit of both): Push-down automata; Turing machines. May 14th 2018, Lecture 14 TMV027/DIT321 29/29