CPS 220 Theory of Computation

Similar documents
CPS 220 Theory of Computation Pushdown Automata (PDA)

CISC4090: Theory of Computation

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 10 : Context-Free Grammars

Theory of Computation (II) Yijia Chen Fudan University

Notes for Comp 497 (Comp 454) Week 10 4/5/05

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

MA/CSSE 474 Theory of Computation

Introduction to Theory of Computing

Notes for Comp 497 (454) Week 10

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

Computational Models - Lecture 4

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Pushdown Automata: Introduction (2)

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Foundations of Informatics: a Bridging Course

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Lecture 17: Language Recognition

Properties of Context-Free Languages

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

CS 455/555: Finite automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Pushdown Automata (2015/11/23)

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Computational Models - Lecture 3

Einführung in die Computerlinguistik

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

Context Free Grammars

CS20a: summary (Oct 24, 2002)

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Section 1 (closed-book) Total points 30

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

Non-context-Free Languages. CS215, Lecture 5 c

Theory of Computation

Properties of Context-Free Languages. Closure Properties Decision Properties

Chap. 7 Properties of Context-free Languages

CSE 355 Test 2, Fall 2016

Context-Free and Noncontext-Free Languages

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Computational Models - Lecture 4 1

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

NPDA, CFG equivalence

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Introduction to Formal Languages, Automata and Computability p.1/42

Computability Theory

MA/CSSE 474 Theory of Computation

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Computational Models - Lecture 5 1

Context Free Languages and Grammars

Context Free Language Properties

5 Context-Free Languages

Chapter 6. Properties of Regular Languages

Automata and Computability. Solutions to Exercises

Computer Sciences Department

Theory of Computation - Module 3

CS375: Logic and Theory of Computing

10. The GNFA method is used to show that

The Pumping Lemma for Context Free Grammars

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

Automata and Computability. Solutions to Exercises

Pushdown Automata (Pre Lecture)

Computational Models - Lecture 4 1

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Properties of Context-free Languages. Reading: Chapter 7

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

2.1 Solution. E T F a. E E + T T + T F + T a + T a + F a + a

Properties of context-free Languages

CS481F01 Prelim 2 Solutions

Final exam study sheet for CS3719 Turing machines and decidability.

MTH401A Theory of Computation. Lecture 17

Context-Free Grammars (and Languages) Lecture 7

Context-Free Languages (Pre Lecture)

DM17. Beregnelighed. Jacob Aae Mikkelsen

CSE 105 THEORY OF COMPUTATION

Chapter 3. Regular grammars

St.MARTIN S ENGINEERING COLLEGE Dhulapally, Secunderabad

CSE 105 THEORY OF COMPUTATION

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

This lecture covers Chapter 7 of HMU: Properties of CFLs

Context-Free Grammars and Languages. Reading: Chapter 5

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

CISC 4090 Theory of Computation

CS5371 Theory of Computation. Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)

Context-Free Languages

Pushdown Automata. Chapter 12

Miscellaneous. Closure Properties Decision Properties

Einführung in die Computerlinguistik

Plan for 2 nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form

Finite Automata and Regular languages

HW6 Solutions. Micha l Dereziński. March 20, 2015

Computational Models: Class 5

Pushdown Automata. Reading: Chapter 6

Theory of Computation (IV) Yijia Chen Fudan University

CISC 4090 Theory of Computation

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Computability and Complexity

Transcription:

CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of states and no extra memory, recognizing exactly the Regular Languages. Finite Automata can be Deterministic or Nondeterministic, and these two kinds are equivalent in the sense that they recognize the same languages, but it is interesting to note that Nondeterministic Finite Automata can provide much more concise (in the extreme case, exponentially shorter!) descriptions of the same language as Deterministic ones. 2 Syntactic description: Regular Expressions are a class of expressions built out of a given alphabet Σ { ε }, and with operations Union, Concatenation, and Star. Some important properties of Regular Languages are: 1. If a language L is finite, then it is regular. 2. If a language L is regular, then the complement of L (Σ * - L ) is also regular. This closure under complementation is a very important and rare property. 3. If L is regular, then L R, the language of the strings of L reversed, is regular. 4. If L 1 and L 2 are regular, then L 1 L 2, L 1 L 2, and L 1 L 2 are regular. The closure under intersection follows from closure under complementation, because: L 1I L2 = L 1 U L2 [Note: stay tuned for project 1 - dealing with regular expressions] In order to prove that a language is not regular, a very useful tool is the Pumping Lemma

New Topic: Now we will start on a new topic, and we will describe a more powerful model of computation: the Context-Free Languages (CFL). This category of languages again has two ways to be described: 1 Machine description: CFLs can be recognized by Pushdown Automata (PDA). Those are basically Nondeterministic Finite Automata with an additional memory device, the stack. The stack is an unlimited size, First-In First-Out (FIFO) memory. One important thing to keep in mind is that the PDAs need nondeterminism in order to recognize the CFLs. So when the stack comes to play, the equivalence of the deterministic and nondeterministic machines breaks. 2 Syntactic description: CFLs can be recognized by a syntactic way of producing strings according to a finite set of rules, the Context-Free Grammars. Context Free Grammars and Languages are commonly used in syntactic parsers, such as those seen in compilers or in the XML language. Regular Languages Some nonregular languages Languages generated by CFGs

1. Context-Free Grammars - "more powerful method for describing languages" Let s start with an example: A A 1 A B B ε Start: A; Σ = {, 1} Definitions: substitution rules, variable, terminal, start variable, derivation This is a Context-free Grammar (CFG). Here is an example of using it to produce (or derive) a string: A A 1 A 1 1 A 1 1 1... n A 1 n n B 1 n n 1 n L(G) = all strings which can be generated (language of the grammar) L(G) is not a regular language. Why? FA does not have enough memory - however a new model (PDA) has stack memory. Definition: A context-free grammar (CFG) is a 4-tuple (V, T, P, S) such that: 1. V is a finite set of variables, or nonterminals. 2. Σ is the alphabet, here also called the set of terminals. 3. R is a set of derivation rules, or productions. Each rule is of the form: Variable String of variables & terminals. 4. S is a designated start symbol. (S V)

2. Derivations Definition: A Derivation. 1 We say that string u yields string v, denoted u v, if u turns to v after one application of a derivation rule. example: A 1 A 1 1 2 If u turns to v after many rule applications then we say that u * v. example: A 1 * A 1 1 1 1 1 1 3 The sequence u v 1 v 2 v k v is called a derivation of v from u. Definition: The language of a grammar G, L(G) = { w Σ * S * w } Definition: A Context-free language (CFL) is a language generated by a CFG. Practice Problem: Let Σ={,1} and let L(G)={w w contains an equal number of occurrences of the substrings 1 and 1}. Solution: G=({S},{,1},R,S) Set of rules - R S Α Β A 1NZN1A ε B ZNZB ε Z Ζ ε N 1Ν ε Z ε

S A 1NZN1A 1NZN1A 1NZN1A 1! 11 1 1!! 1! [parse tree] 3. Parse Trees. A derivation can be depicted in a parse tree. Example: Σ = {, 1, # }; V = { A } A A 1 # What would the parse tree look like for #111? Reading the leaves of the tree from left to right gives the produced string. The language makes even more sense like this: A A 1 ε Example: A grammar of arithmetic expressions with parentheses.

E E + T T T T F F F ( E ) a Variables (nonterminals): { E, T, F } Symbols (terminals): a + ( ) Parse tree for strings a + a a and (a + a) a What would the parse tree look like for result: ( a + a ) ( a + a )? Note: This language remembers to close the right number of parentheses opened. A FA cannot do that because it has only a finite amount of memory hardwired in its states. Example: L = { n 1 k 2 n k, n } A grammar for L: A A 2 B B 1 B ε Example: L = { w w R w Σ * } A grammar for L: A A 1 1 A ε Designing CFL 1. Divide and conquer

For example to get grammar: { n 1 n n } { n 1 n n } G 1 S 1 " S11! G S 2 " 1S! 2 2 add this starting substitution rule S! S 1 S 2 2. If the language is regular - create a DFA and then convert to CFG as follows: a. Make a variable Ri for each state qi of the DFA. b. Add the rule Ri arj to the CFG if there is a DFA transistion (with a) from state Ri to Rj. c. Add the rule Ri ε if qi is an accept state. d. Make R the start variable - where q is the start state of DFA. example: 1 1 1 1 1 let's start with: 1 1 R1 --> R2 1R1 ε R2 --> 1R1 R2 1 1 R3 --> 1R4 R3 ε R4 --> R3 1R4

1 R --> R1 R3 ε Test out the resulting grammar. c. use this substitution rule R-->uRv Example: 111 S-->S1 ε =============================================================== == Ambiguous grammar - grammar generates the same string in multiple ways (has several different parse trees) E E+E E x E (Ε) a Two different parse tree for same string a+a x a E-->ExE-->E+ExE-->a+ExE-->a+axE-->a+axa E-->E+E-->a+E-->a+ExE-->a+axE-->a+axa Generated ambiguously - 2 different parse trees, not 2 different derivations (same derivated string) Leftmost derivation - leftmost variable is the one replaced <<some CFL can be generated only my ambiguous grammars (inherently ambiguous)>>

=============================================================== ==

The Chomsky Normal Form The Chomsky Normal Form (CNF) allows only the following two kinds of productions: A BC, where B,C are nonterminals, and A a, where a is a terminal. Two more details: 1. The start symbol, S, cannot be at the rhs of any production. 2. We permit the special rule S ε. This is a very useful form when designing algorithms on CFGs, and it is very simple to study mathematically. Algorithm for converting a CFG into Chomsky Normal Form: 1. Create a new start symbol, S, and add the rule S S. Add the rule S ε if ε could be produced by the grammar. 2. Remove ε-productions, except the possible one from S. To do so, whenever R u A 1 u 1 A 2 u k-1 A k u k is a rule, and A yields ε (in any number of steps!), add the 2 k - 1 rules: Example: R u u 1. u k R u A 1 u 1 u 2 u k R u A 1 u 1. u k-2 A k-1 u k-1 u k A BaBaD B b ε C c ε These rules become: A aa Baa aba aad BaaD BaBa abad BaBaD B b C c 3. Remove unit productions. Those are productions of the form A B where A and B are nonterminals. To do so, find all nonterminal pairs X, Y such that X B 1 B k Y is a series of unit productions. For every such pair (X, Y), and for every production

Y u other than the unit productions, add the production X u to the grammar. Example: S AB A A C a aa B b C cc Becomes: S AB a aa cc A a aa cc B b C cc Notice now that C is useless we can remove it and its productions, if we wish. 4. Arrange all remaining productions A u with u 2, to contain only nonterminals. Example: A cdace Generate new nonterminals C, D, and E, and change the above rule to: A CDACE C c D d E e 5. Now each production is of the form A a, where a is a terminal, or A B 1 B k, where each B i is a nonterminal. If k > 2, change the production to: A B 1 C 2 C 2 B 2 C 3 C k-1 B k-1 B k

Example: Grammar G: S S 1 A ε A 1 A ε 1. New start symbol S New grammar: S S ε S S 1 A ε A 1 A ε 2. Eliminate ε productions (except from S ): New grammar: S S ε S S 1 1 A A 1 A 1 Notice, now the new grammar does not produce ε (G did!) 2. Eliminate unit productions. We have to eliminate S A New grammar: S S 1 1 1 A 1 ε S S 1 1 1 A 1 A 1 A 1 3. Arrange all remaining productions X u where u 2 to contain only nonterminals. The grammar becomes: S C S D C D D A 1 ε S C S D C D D A 1 A C A 1 C D 1 4. Finally, arrange all X u to have u 2, by adding new nonterminals if needed. New grammar: S C S C D D A 1 S C S C D D A 1 S S D A C A 1 C D 1