CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Similar documents
CISC4090: Theory of Computation

Computational Models - Lecture 4

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Computational Models - Lecture 3

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Introduction to Theory of Computing

Computational Models: Class 5

Context Free Languages and Grammars

CISC 4090 Theory of Computation

CPS 220 Theory of Computation

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

Context Free Grammars

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

Chomsky and Greibach Normal Forms

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Computational Models - Lecture 4 1

The Pumping Lemma for Context Free Grammars

Automata Theory CS F-08 Context-Free Grammars

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

The View Over The Horizon

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

Chomsky Normal Form and TURING MACHINES. TUESDAY Feb 4

Computational Models - Lecture 4 1

Foundations of Informatics: a Bridging Course

Theory of Computation (IV) Yijia Chen Fudan University

CISC 4090 Theory of Computation

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

CPS 220 Theory of Computation Pushdown Automata (PDA)

Theory of Computation (II) Yijia Chen Fudan University

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

Context-Free Grammars and Languages

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Pushdown Automata (Pre Lecture)

CSE 105 THEORY OF COMPUTATION

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

CS481F01 Prelim 2 Solutions

MTH401A Theory of Computation. Lecture 17

Context-Free Languages

MA/CSSE 474 Theory of Computation

Formal Languages, Grammars and Automata Lecture 5

Computational Models - Lecture 5 1

Pushdown Automata (2015/11/23)

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Computational Models - Lecture 5 1

5 Context-Free Languages

CSE 105 THEORY OF COMPUTATION

Chapter 5: Context-Free Languages

Intro to Theory of Computation

Properties of Context-Free Languages

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Grammars and Context Free Languages

CSCI 2670 Introduction to Theory of Computing

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Notes for Comp 497 (Comp 454) Week 10 4/5/05

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Theory of Computation - Module 3

CS375: Logic and Theory of Computing

Pushdown Automata (PDA) The structure and the content of the lecture is based on

CS5371 Theory of Computation. Lecture 12: Computability III (Decidable Languages relating to DFA, NFA, and CFG)

Grammars and Context Free Languages

Non-context-Free Languages. CS215, Lecture 5 c

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

Properties of Context-free Languages. Reading: Chapter 7

Context-Free Grammars (and Languages) Lecture 7

Einführung in die Computerlinguistik

Context-Free Grammars. 2IT70 Finite Automata and Process Theory

This lecture covers Chapter 7 of HMU: Properties of CFLs

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

UNIT-VI PUSHDOWN AUTOMATA

Lecture 11 Context-Free Languages

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL, DPDA PDA)

Notes for Comp 497 (454) Week 10

Introduction and Motivation. Introduction and Motivation. Introduction to Computability. Introduction and Motivation. Theory. Lecture5: Context Free

Pushdown Automata. Reading: Chapter 6

Lecture 17: Language Recognition

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

CS Pushdown Automata

Automata and Computability. Solutions to Exercises

Homework. Context Free Languages. Announcements. Before We Start. Languages. Plan for today. Final Exam Dates have been announced.

CS311 Computational Structures More about PDAs & Context-Free Languages. Lecture 9. Andrew P. Black Andrew Tolmach

CS20a: summary (Oct 24, 2002)

DD2371 Automata Theory

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

Finite Automata and Formal Languages TMV026/DIT321 LP Useful, Useless, Generating and Reachable Symbols

Accept or reject. Stack

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

I have read and understand all of the instructions below, and I will obey the University Code on Academic Integrity.

Context-Free Grammars and Languages. Reading: Chapter 5

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0,

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Introduction to Formal Languages, Automata and Computability p.1/42

Automata and Computability. Solutions to Exercises

Transcription:

CS5371 Theory of Computation Lecture 7: Automata Theory V (CFG, CFL, CNF)

Announcement Homework 2 will be given soon (before Tue) Due date: Oct 31 (Tue), before class Midterm: Nov 3, (Fri), first hour of the class 1 hour Around 3 long questions, and some True and False Questions Scope: Automata Theory (DFA, NFA, CFG, PDA, ) + Some topics covered in the next two weeks

Objectives Introduce Context-free Grammar (CFG) and Context-free Language (CFL) Show that Regular Language can be described by CFG Terminology related to CFG Leftmost Derivation, Ambiguity, Chomsky Normal Form (CNF) Converting a CFG into CNF

Context-free Grammar (Example) Substitution Rules A 0A1 A B B # Variables A, B Terminals 0,1,# Start Variable Important: Substitution Rule in CFG has a special form: Exactly one variable (and nothing else) on the left side of the arrow A

How does CFG generate strings? A 0A1 A B B # Write down the start symbol Find a variable that is written down, and a rule that starts with that variable; Then, replace the variable with the rule Repeat the above step until no variable is left

How does CFG generate strings? A 0A1 A B B # Step 1. A (write down the start variable) Step 2. 0A1 (find a rule and replace) Step 3. 00A11 (find a rule and replace) Step 4. 00B11 (find a rule and replace) Step 5. 00#11 (find a rule and replace) Now, the string 00#11 does not have any variable. We can stop.

How does CFG generate strings? The sequence of substitutions to generate a string is called a derivation E.g., A derivation of the string 000#111 in the previous grammar is A 0A1 00A11 000A111 000B111 000#111 The same information can be represented pictorially by a parse tree (next slide)

0 0 Parse Tree A 0 A A A A B 1 1 1 #

Language of the Grammar In fact, the previous grammar can generate strings #, 0#1, 00#11, 000#111, The set of all strings that can be generated by a grammar G is called the language of G, denoted by L(G) Thus, the language of the previous grammar is {0 n #1 n n 0 }

CFG (Formal Definition) A CFG is a 4-tuple (V,T, R, S), where V is a finite set of variables T is a finite set of terminals R is a set of substitution rules, where each rule consists of a variable (left side of the arrow) and a string of variables and terminals (right side of the arrow) S2V called the start variable

CFG (terminology) Let u and v be strings of variables and terminals We say u derives v, denoted by u v, if u = v, or there exists u 1, u 2,, u k, k 0 such that u u 1 u 2 u k v In other words, for a grammar G = (V,T,R,S), * L(G) = { w2t* S w } *

CFG (more examples) Let G = ( {S}, {a,b}, R, S ), and the set of rules, R, is S asb SS This notation is an abbreviation for S asb S SS S What will this grammar generate? If we think of a as ( and b as ), G generates all strings of properly nested parentheses

Quick Quiz Is the following a CFG? G = { {A,B}, {0,1}, R, A } A 0B1 A 0 B 1B0 1 0B A

Designing CFG Can we design CFG for {0 n 1 n n 0}[ {1 n 0 n n 0}? Do we know CFG for {0 n 1 n n 0}? Do we know CFG for {1 n 0 n n 0}?

Designing CFG CFG for the language L1 = {0 n 1 n n 0} S 0S1 CFG for the language L2 = {1 n 0 n n 0} S 1S0 CFG for L1[ L2 S S 1 S 2 S 1 0S 1 1 S 2 1S 2 0

Designing CFG Can we design CFG for {0 2n 1 3n n 0}? Yes, by linking the occurrence of 0 s with the occurrence of 1 s The desired CFG is: S 00S111

Quick Quiz Can we construct the CFG for the language { w w is a palindrome }? Assume that the alphabet of w is {0,1} Examples for palindrome: 010, 0110, 001100, 01010, 1101011,

Regular Language & CFG Theorem: Any regular language can be described by a CFG. How to prove? (By construction)

Regular Language & CFG Proof: Let D be the DFA recognizing the language. Create a distinct variable V i for each state q i in D. Make V 0 the start variable of CFG Assume that q 0 is the start state of D Add a rule V i av j if (q i,a) = q j Add a rule V i if q i is an accept state Then, we can show that the above CFG generates exactly the same language as D (how to show?)

Regular Language & CFG (Example) DFA 0 1 1 start q 0 q 1 CFG G = ( {V 0, V 1 }, {0,1}, R, V 0 ), where R is 0 V 0 0V 0 1V 1 V 1 1V 1 0V 0

Leftmost Derivation A derivation which always replace the leftmost variable in each step is called a leftmost derivation E.g., Consider the CFG for the properly nested parentheses ( {S}, {(,)}, R, S ) with rule R: S ( S ) SS Then, S SS (S)S ( )S ( ) ( S ) ( ) ( ) is a leftmost derivation But, S SS S(S) (S)(S) ( ) ( S ) ( ) ( ) is not a leftmost derivation However, we note that both derivations correspond to the same parse tree

Ambiguity Sometimes, a string can have two or more leftmost derivations!! E.g., Consider CFG ( {S}, {+,x,a}, R, S) with rules R: S S + S S x S a The string a + a x a has two leftmost derivations as follows: S S + S a + S a +S x S a + a x S a + a x a S S x S S + S x S a +S x S a + a x S a + a x a

Ambiguity If a string has two or more leftmost derivations in a CFG G, we say the string is derived ambiguously in G A grammar is ambiguous if some strings is derived ambiguously Note that the two leftmost derivations in the previous example correspond to different parse trees (see next slide) In fact, each leftmost derivation corresponds to a unique parse tree

Two parse trees for a + a x a S S S + S S x S a S x S S + S a a a a a

Inherently Ambiguous Sometimes when we have an ambiguous grammar, we can find an unambiguous grammar that generates the same language However, some language can only be generated by ambiguous grammar E.g., { a n b n c m n, m 0} [ {a n b m c m n, m 0} This will be a bonus exercise in Homework 2 Such language is called inherently ambiguous

Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule is of the form A BC A a where a is any terminal A,B,C are variables B, C cannot be start variable However, S is allowed

Converting a CFG to CNF Theorem: Any context-free language can be generated by a context-free grammar in Chomsky Normal Form. Question: Why is a general CFG not in Chomsky Normal Form?

Proof Idea The only reasons for a CFG not in CNF: 1. Start variable appears on right side 2. It has rules, such as A 3. It has unit rules, such as A A 4. Some rules does not have exactly two variables or one terminal on right side Prove idea: Convert a grammar into CNF by handling the above cases

The Conversion (step 1) Proof: Let G be the context-free grammar generating the context-free language. We want to convert G into CNF. Step 1: Add a new start variable S 0 and the rule S 0 S, where S is the start variable of G This ensures that start variable of the new grammar does not appear on right side

The Conversion (step 2) Step 2: We take care of all rules. To remove the rule A, for each occurrence of A on the right side of a rule, we add a new rule with that occurrence deleted. E.g., R uavaw causes us to add the rules: R uavw, R uvaw, R uvw If we have the rule R A, we add R unless we had previously removed R After removing A, the new grammar still generates the same language as G.

The Conversion (step 3) Step 3: We remove the unit rule A B. To do so, for each rule B u (where u is a string of variables and terminals), we add the rule A u. E.g., if we have A B, B ac, B CC, we add: A ac, A CC After removing A B, the new grammar still generates the same language as G.

The Conversion (step 4) Step 4: Suppose we have a rule A u 1 u 2 u k, where k > 2 and each u i is a variable or a terminal. We replace this rule by A u 1 A 1, A 1 u 2 A 2, A 2 u 3 A 3,, A k-2 u k-1 u k After the change, the string on the right side of any rule is either of length 1 (a terminal) or length 2 (two variables, or 1 variable + 1 terminal, or two terminals)

The Conversion (step 4 cont.) To remove a rule A u 1 u 2 with some terminals on the right side, we replace the terminal u i by a new variable U i and add the rule U i u i After the change, the string on the right side of any rule is exactly a terminal or two variables

The Conversion (example) Let G be the grammar on the left side. We get the new grammar on the right side after the first step. S ASA ab A B S B b S 0 S S ASA ab A B S B b

The Conversion (example) After that, we remove B S 0 S S ASA ab A B S B b S 0 S S ASA ab a A B S B b Before removing B After removing B

The Conversion (example) After that, we remove A S 0 S S ASA ab a A B S B b Before removing A S 0 S S ASA ab a SA AS S A B S B b After removing A

The Conversion (example) Then, we remove S S and S 0 S S 0 S S ASA ab a SA AS A B S B b After removing S S S 0 ASA ab a SA AS S ASA ab a SA AS A B S B b After removing S 0 S

The Conversion (example) Then, we remove A B S 0 ASA ab a SA AS S ASA ab a SA AS A B S B b Before removing A B S 0 ASA ab a SA AS S ASA ab a SA AS A b S B b After removing A B

The Conversion (example) Then, we remove A S S 0 ASA ab a SA AS S ASA ab a SA AS A b S B b Before removing A S S 0 ASA ab a SA AS S ASA ab a SA AS A b ASA ab B b a SA AS After removing A S

The Conversion (example) Then, we apply Step 4 S 0 ASA ab a SA AS S ASA ab a SA AS A b ASA ab a SA AS B b Before Step 4 S 0 AA 1 UB a SA AS S AA 1 UB a SA AS A b AA 1 UB a SA B b AS A 1 SA U a After Step 4 Grammar is in CNF

Next Time Pushdown Automaton (PDA) An NFA equipped with a stack Power of CFG = Power of PDA In terms of describing a language