Theory Of Computation UNIT-II


Regular Expressions and Context Free Grammars: Regular expression formalism - equivalence with finite automata - regular sets and closure properties - pumping lemma for regular languages - decision algorithms for regular sets - applications. Context-Free Grammars - derivation trees - ambiguous and unambiguous grammars - equivalence of regular grammar and finite automata - Chomsky Normal Forms and Greibach Normal Forms - pumping lemma for context free languages - applications.

REGULAR SETS

Regular languages can also be defined, from the empty set and from some finite number of singleton sets, by the operations of union, concatenation, and Kleene closure. Specifically, consider any alphabet Σ. Then a regular set over Σ is defined in the following way.
The empty set Ø, the set {ε} containing only the empty string, and the set {a} for each symbol a in Σ, are regular sets.
If L1 and L2 are regular sets, then so are the union L1 ∪ L2, the concatenation L1L2, and the Kleene closure L1*.
No other set is regular.

The Pumping Lemma for Regular Sets

Applications:
o The pumping lemma is a powerful tool for proving that certain languages are not regular.
o It is also useful in the development of algorithms to answer certain questions concerning finite automata, such as whether the language accepted by a given FA is finite or infinite.

Statement: Let L be a regular set. Then there is a constant n such that if z is any word in L and |z| ≥ n, we may write z = uvw in such a way that |uv| ≤ n, |v| ≥ 1, and for all i ≥ 0, u v^i w is in L.

Proof: If a language is regular, it is accepted by a DFA M = (Q, Σ, δ, q0, F) with some particular number of states, say n. Consider an input of n or more symbols a1 a2 ... am, m ≥ n, and for i = 1, 2, ..., m let δ(q0, a1 a2 ... ai) = qi. It is not possible for each of the n+1 states q0, q1, ..., qn to be distinct, since there are only n different states. Thus there are two integers j and k, 0 ≤ j < k ≤ n, such that qj = qk. Consider the path labeled a1 a2 ... am in the transition diagram of M. Since j < k, the string a(j+1) ... ak is of length at least 1, and since k ≤ n, its length is no more than n.

[Figure: path from q0 to qj = qk labeled a1 ... aj, a loop at qj = qk labeled a(j+1) ... ak, and a path from qj = qk to qm labeled a(k+1) ... am]

If qm is in F, that is, a1 a2 ... am is in L(M), then a1 a2 ... aj a(k+1) a(k+2) ... am is also in L(M), since
δ(q0, a1 ... aj a(k+1) ... am) = δ(δ(q0, a1 ... aj), a(k+1) ... am)
= δ(qj, a(k+1) ... am) (since δ(q0, a1 ... aj) = qj)
= δ(qk, a(k+1) ... am) (since qj = qk)
= qm
which is a final state, so the string is accepted. In the same way the loop from qj back to qk can be followed any number of times, so a1 ... aj (a(j+1) ... ak)^i a(k+1) ... am is in L(M) for every i ≥ 0. Taking u = a1 ... aj, v = a(j+1) ... ak and w = a(k+1) ... am gives z = uvw with |uv| ≤ n, |v| ≥ 1 and u v^i w in L for all i ≥ 0.

J. Veerendeswari/IT/RGCET
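The proof can be followed step by step on a concrete machine. The sketch below is only an illustration: the DFA (counting a's modulo 3), the dictionary encoding of δ, and the function names pump_split and accepts are assumptions made for this example and are not part of the notes. It records the states visited while reading z, stops at the first repeated state (qj = qk), and returns the corresponding u, v, w.

```python
def pump_split(delta, start, z):
    """Split z into (u, v, w) at the first repeated state on the DFA's path."""
    seen = {start: 0}  # state -> number of symbols read when it was first visited
    state = start
    for pos, symbol in enumerate(z, start=1):
        state = delta[(state, symbol)]
        if state in seen:  # q_j = q_k with j = seen[state] and k = pos
            j, k = seen[state], pos
            return z[:j], z[j:k], z[k:]
        seen[state] = pos
    raise ValueError("no state repeats: z is shorter than the number of states")


def accepts(delta, start, finals, word):
    """Run the DFA on word and report whether it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical 3-state DFA over {a, b}: accepts words whose number of a's is divisible by 3.
DELTA = {(q, "a"): (q + 1) % 3 for q in range(3)}
DELTA.update({(q, "b"): q for q in range(3)})

u, v, w = pump_split(DELTA, 0, "abaab")
print(u, v, w)
print([accepts(DELTA, 0, {0}, u + v * i + w) for i in range(4)])
```

Because the repeated state is found within the first n symbols, the split always satisfies |uv| ≤ n and |v| ≥ 1, which mirrors the argument in the proof.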

PROPERTIES OF REGULAR SETS

Property 1: The regular sets are closed under union, concatenation, and Kleene closure.
Proof: Let Σ be a finite set of symbols and let L, L1 and L2 be sets of strings from Σ*. The concatenation of L1 and L2, denoted L1L2, is the set {xy | x is in L1 and y is in L2}. That is, the strings in L1L2 are formed by choosing a string in L1 and following it by a string in L2, in all possible combinations. Define L^0 = {ε} and L^i = L L^(i-1) for i ≥ 1. The Kleene closure of L, denoted L*, is the set
L* = ∪ (i = 0 to ∞) L^i
and the positive closure of L, denoted L+, is the set
L+ = ∪ (i = 1 to ∞) L^i
That is, L* denotes words constructed by concatenating any number of words from L. L+ is the same, but the case of zero words, whose concatenation is defined to be ε, is excluded. Note that L+ contains ε if and only if L does. Closure under the three operations is immediate from the definition of regular sets, which builds them using exactly these operations.

Property 2: The class of regular sets is closed under complementation. That is, if L is a regular set and L ⊆ Σ*, then Σ* - L is a regular set.
Proof: Let L be L(M) for DFA M = (Q, Σ1, δ, q0, F) and L ⊆ Σ*. First, we may assume Σ1 = Σ, for if there are symbols in Σ1 not in Σ, we may delete all transitions of M on symbols not in Σ. The fact that L ⊆ Σ* assures us that we shall not thereby change the language of M. If there are symbols in Σ not in Σ1, then none of these symbols appear in words of L. We may therefore introduce a dead state d into M with δ(d, a) = d for all a in Σ and δ(q, a) = d for all q in Q and a in Σ - Σ1. Now, to accept Σ* - L, complement the final states of M. That is, let M' = (Q, Σ, δ, q0, Q - F). Then M' accepts a word w if and only if δ(q0, w) is in Q - F, that is, w is in Σ* - L. Note that it is essential to the proof that M is deterministic and without ε moves.

Property 3: The regular sets are closed under intersection.
Proof: L1 ∩ L2 = Σ* - ((Σ* - L1) ∪ (Σ* - L2)), where complementation is taken with respect to an alphabet Σ including the alphabets of L1 and L2. Closure under intersection then follows from closure under union and complementation. It is worth noting that a direct construction of a DFA for the intersection of two regular sets exists. The construction involves taking the Cartesian product of states, and we sketch the construction as follows: Let M1 = (Q1, Σ, δ1, q1, F1) and M2 = (Q2, Σ, δ2, q2, F2) be two deterministic finite automata. Let M = (Q1 × Q2, Σ, δ, [q1, q2], F1 × F2) where for all p1 in Q1, p2 in Q2, and a in Σ,
δ([p1, p2], a) = [δ1(p1, a), δ2(p2, a)]
M accepts a word exactly when both M1 and M2 accept it.

Property 4: The class of regular sets is closed under substitution.

Proof: Let R ⊆ Σ* be a regular set and for each a in Σ let Ra ⊆ Δ* be a regular set. Let f be the substitution defined by f(a) = Ra. Select regular expressions denoting R and each Ra. Replace each occurrence of the symbol a in the regular expression for R by the regular expression for Ra. To prove that the resulting regular expression denotes f(R), observe that the substitution of a union, product or closure is the union, product or closure of the substitution. [Thus, for example, f(L1 ∪ L2) = f(L1) ∪ f(L2).] A simple induction on the number of operators in the regular expression completes the proof.

A type of substitution that is of special interest is the homomorphism. A homomorphism h is a substitution such that h(a) contains a single string for each a. We generally take h(a) to be the string itself, rather than the set containing that string. It is useful to define the inverse homomorphic image of a language L to be
h^-1(L) = {x | h(x) is in L}
We also use, for a string w,
h^-1(w) = {x | h(x) = w}

Property 5: The class of regular sets is closed under homomorphism and inverse homomorphism.
Proof: Closure under homomorphism follows immediately from closure under substitution, since every homomorphism is a substitution in which h(a) has one member. To show closure under inverse homomorphism, let M = (Q, Σ, δ, q0, F) be a DFA accepting L, and let h be a homomorphism from Δ to Σ*. We construct a DFA M' that accepts h^-1(L) by reading a symbol a in Δ and simulating M on h(a). Formally, let M' = (Q, Δ, δ', q0, F) and define δ'(q, a) for q in Q and a in Δ to be δ(q, h(a)). Note that h(a) may be a long string, or ε, but δ is defined on all strings by extension. It is easy to show by induction on |x| that δ'(q0, x) = δ(q0, h(x)). Therefore M' accepts x if and only if M accepts h(x). That is, L(M') = h^-1(L(M)).

Property 6: The class of regular sets is closed under quotient with arbitrary sets.
Proof: Let M = (Q, Σ, δ, q0, F) be a finite automaton accepting some regular set R, and let L be an arbitrary language.
The quotient R/L = {x | there is some y in L such that xy is in R} is accepted by a finite automaton M' = (Q, Σ, δ, q0, F'), which behaves like M except that the final states of M' are all states q of M for which there is some y in L with δ(q, y) in F. Then δ(q0, x) is in F' if and only if there is a y in L such that δ(q0, xy) is in F. Thus M' accepts R/L.

CONTEXT FREE GRAMMARS

A context-free grammar is a 4-tuple (V, T, P, S), where
1. V is a finite set called the variables,
2. T is a finite set, disjoint from V, called the terminals,
3. P is a finite set of production rules, with each rule consisting of a variable and a string of variables and terminals, and
4. S ∈ V is the start symbol.

Rules to be followed while writing a CFG:
1. A single non terminal should be on the LHS.
2. A production should always be of the form LHS -> RHS, where RHS may be any combination of non terminal and terminal symbols.
3. The null derivation can be specified as NT -> ε.
4. One of the non terminals should be the start symbol.

Any language that can be generated by some context-free grammar is called a context-free language (CFL).

Derivation
Derivation from S means generation of a string ω from S.

Types of derivation

1. Left most derivation
The derivation in which the left most non terminal is always replaced at each step (choose the leftmost non terminal in a sentential form).
Example: Given the grammar (set of productions)
E -> E + E
E -> E * E
E -> id
obtain the left most derivation for the string id*id.
E => E * E => id * E => id * id

2. Right most derivation
The derivation in which the right most non terminal is always replaced at each step (choose the rightmost non terminal in a sentential form).
Example: Given the grammar (set of productions)
E -> E + E
E -> E * E
E -> id
obtain the right most derivation for the string id*id.
E => E * E => E * id => id * id

PARSE TREE OR DERIVATION TREE

The parse tree is the diagrammatic representation of a derivation, which can be defined in the following way:
A vertex with a label which is a non terminal symbol is a parse tree.
If A -> y1 y2 ... yn is a rule in P, then the tree with root A and children y1, y2, ..., yn, in that order, is a parse tree.
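The two derivation orders above can be replayed mechanically. The following is a minimal sketch, assuming a grammar stored as a dictionary from non terminals to lists of right-hand sides; the function name derive and this encoding are illustrative choices, not something the notes prescribe. It applies a given sequence of productions, always rewriting either the leftmost or the rightmost non terminal.

```python
def derive(start, productions, choices, leftmost=True):
    """Apply the chosen right-hand sides in order, always rewriting the
    leftmost (or rightmost) non terminal, and return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        positions = [i for i, sym in enumerate(form) if sym in productions]
        if not positions:
            raise ValueError("no non terminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if rhs not in productions[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(rhs)} is not a production")
        form[i:i + 1] = rhs  # replace the chosen non terminal by the right-hand side
        steps.append(" ".join(form))
    return steps


# The expression grammar from the examples above.
PRODUCTIONS = {"E": [["E", "+", "E"], ["E", "*", "E"], ["id"]]}
MUL, ID = ["E", "*", "E"], ["id"]

left = derive("E", PRODUCTIONS, [MUL, ID, ID], leftmost=True)
right = derive("E", PRODUCTIONS, [MUL, ID, ID], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

The same sequence of productions gives both derivations here; only the position chosen at each step differs.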

Example: Given the grammar (set of productions)
E -> E + E
E -> E * E
E -> ( E )
E -> id
obtain the left most derivation for the string id*id+id.

[Figures: Left Most Derivation Tree, Right Most Derivation Tree]

Example: Consider the following productions
S -> aB | bA
A -> aS | bAA | a
B -> bS | aBB | b
For the string aaabbabbba find the LMD and RMD.

Left Most Derivation:
S => aB => aaBB => aaaBBB => aaabBB => aaabbB => aaabbaBB => aaabbabB

=> aaabbabbS => aaabbabbbA => aaabbabbba

Right Most Derivation:
S => aB => aaBB => aaBbS => aaBbbA => aaBbba => aaaBBbba => aaaBbbba => aaabSbbba => aaabbAbbba => aaabbabbba

SIMPLIFICATION OF CFG

Not all grammars are optimized. A grammar may contain extra symbols (non terminals), and such symbols unnecessarily increase the length of the grammar. Simplification of a grammar means reduction of the grammar by removing useless symbols.

The properties of the reduced grammar:
1. Each variable (i.e. non terminal) and each terminal of G appears in the derivation of some word in L.
2. There should not be any production X -> Y where X and Y are non terminals.
3. If ε is not in the language L then there need not be any production X -> ε.

A reduced grammar is obtained by:
- removal of useless symbols
- elimination of ε productions
- removal of unit productions

Removal of useless symbols
A symbol is useful when it appears in the derivation of some terminal string from the start symbol. If no such derivation exists then it is a useless symbol. Formally, a symbol P is useful if there exists a derivation of the form
S =>* αPβ and αPβ =>* w
where α and β are strings of terminal or non terminal symbols and w is a terminal string derived with the help of P.

Example 1:
G = (V, T, P, S) where V = {S, T, X}, T = {0, 1}
S -> 0T | 1T | X | 0 | 1 (rule 1)

T -> 00 (rule 2)
In the above CFG the non terminals are S, T, X. To derive a string we have to start from the start symbol S:
S => 0T => 000 (using T -> 00)
Thus we can reach a terminal string by following these rules. But if we use S -> X, there is no further rule defining X. That means there is no point in the rule S -> X, hence X is a useless symbol and can be removed. After removal of this useless symbol the CFG becomes G = (V, T, P, S) where V = {S, T}, T = {0, 1} and
P = {S -> 0T | 1T | 0 | 1, T -> 00}
S is the start symbol.

Example 2:
Consider the CFG G = (V, T, P, S) where V = {S, A, B}, T = {0, 1} and
P = {S -> A11B | 11A, S -> B | 11, A -> 0, B -> BB}
Solution: If we try to derive a string in the given CFG, A gives the terminal symbol 0 but B does not give any terminal string. By following the rules with B we simply get some number of B's and no terminal string. Hence B is a useless symbol and the rules associated with it can be removed. After removal of the useless symbol we get
S -> 11A | 11
A -> 0

Elimination of ε productions
If there is an ε production we can remove it without changing the meaning of the grammar, apart from the empty string itself, which the resulting grammar no longer generates. Thus ε productions are not necessary in a grammar.

Example 1:
S -> 0S | 1S | ε
We remove the ε production, but we have to take care that the meaning of the CFG does not change. If we place S -> ε in the other rules we get S -> 0 from S -> 0S and S -> ε,

as well as S -> 1 from S -> 1S and S -> ε. Hence we can rewrite the rules as
S -> 0S | 1S | 0 | 1
Thus the ε production is removed.

Example 2:
S -> XYX
X -> 0X | ε
Y -> 1Y | ε
While removing the ε productions we delete the rules X -> ε and Y -> ε. To preserve the meaning of the CFG we place ε on the right hand side wherever X and Y appear.
Take S -> XYX.
If the first X on the right hand side is ε, then S -> YX.
Similarly if the last X on the RHS is ε, then S -> XY.
If Y = ε, then S -> XX.
If Y and one X are ε, then S -> X.
If both X are replaced by ε, then S -> Y.
Hence S -> XYX | XY | YX | XX | X | Y.
Now consider X -> 0X. If we place ε on the right hand side for X then X -> 0, so X -> 0X | 0. Similarly Y -> 1Y | 1.
We can rewrite the CFG with the ε productions removed as
S -> XYX | XY | YX | XX | X | Y
X -> 0X | 0
Y -> 1Y | 1

Removing unit productions
The unit productions are the productions in which one non terminal gives another non terminal. E.g.
X -> Y
Y -> Z
Here X -> Y and Y -> Z are unit productions. To optimize the grammar we need to remove the unit productions. If A -> B is a production and B -> x1 x2 x3 ... xn, then we add the rule A -> x1 x2 x3 ... xn and remove A -> B.

Example 1:
S -> 0A | 1B | C
A -> 0S | 00
B -> 1 | A
C -> 01
SOLUTION: Clearly S -> C is a unit production. While removing S -> C we have to consider what C gives, so we add the rule to S:
S -> 0A | 1B | 01
Similarly B -> A is also a unit production, so we modify it as
B -> 1 | 0S | 00
Thus finally we can write the CFG without unit productions as
S -> 0A | 1B | 01
A -> 0S | 00
B -> 1 | 0S | 00
C -> 01
(C is no longer reachable from S, so it could now be removed as a useless symbol.)

Example 2:
S -> A | 0C1
A -> B | 01 | 10
C -> ε | CD
SOLUTION:
S -> A and A -> B are unit productions.
C -> ε is a null production.
In A -> B and C -> CD, B and D are useless symbols because they are not defined any further.
To reduce the grammar we have to remove all of the above.
Since B is useless, A -> B is dropped and A -> 01 | 10.
Replacing the unit production S -> A gives S -> 01 | 10 | 0C1. Since D is useless, C -> CD is dropped and only C -> ε remains, so 0C1 gives 01. Hence ultimately S -> 01 | 10.
A no longer appears in any derivation from S, so A is also a useless symbol, and we get the final CFG as
S -> 01 | 10
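Two of the simplification steps described in this section can be sketched in code. This is a hedged illustration rather than a reference implementation: it assumes every symbol is a single character, that a grammar is a dictionary from variables to lists of right-hand-side strings, and that the empty string "" stands for ε. The function names remove_useless and eliminate_epsilon are hypothetical. The first function keeps only variables that derive a terminal string and are reachable from the start symbol; the second drops ε productions and adds every variant of a right-hand side obtained by omitting nullable variables.

```python
from itertools import product


def remove_useless(variables, productions, start):
    """Keep only variables that derive a terminal string and are reachable from start."""
    def usable(body, generating):
        return all(s not in variables or s in generating for s in body)

    generating, changed = set(), True
    while changed:  # repeat until no new generating variable is found
        changed = False
        for head, bodies in productions.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True

    kept = {h: [b for b in bodies if usable(b, generating)]
            for h, bodies in productions.items() if h in generating}

    reachable, stack = {start}, [start]
    while stack:  # depth-first search over the remaining productions
        for body in kept.get(stack.pop(), []):
            for s in body:
                if s in variables and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {h: bodies for h, bodies in kept.items() if h in reachable}


def eliminate_epsilon(productions):
    """Remove epsilon productions ("" bodies), expanding nullable variables."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True

    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # each nullable symbol is either kept or replaced by the empty string
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


# Useless symbols, Example 2: S -> A11B | 11A | B | 11, A -> 0, B -> BB
reduced = remove_useless({"S", "A", "B"},
                         {"S": ["A11B", "11A", "B", "11"], "A": ["0"], "B": ["BB"]}, "S")
# Epsilon productions, Example 2: S -> XYX, X -> 0X | epsilon, Y -> 1Y | epsilon
no_eps = eliminate_epsilon({"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]})
print(reduced)
print(no_eps)
```

The result of eliminate_epsilon still contains unit productions such as S -> X, which the unit-production step would remove next.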

CHOMSKY NORMAL FORM (CNF)

A Context-Free Grammar G is in Chomsky Normal Form if every production is of the form
Non Terminal -> Terminal
Non Terminal -> Non Terminal * Non Terminal
(here * only separates the two symbols on the right hand side). The given CFG should be converted to the above format; then we can say that the grammar is in CNF. Before converting the grammar into CNF it should be in reduced form. That means all useless symbols, ε productions and unit productions are removed from it. Then this reduced grammar is converted into CNF.
For example, consider G whose productions are S -> AB, A -> a, B -> b. Then G is in Chomsky Normal Form.

Reduction to Chomsky Normal Form
The steps involved in the reduction of a CFG to CNF are:

Step 1: Elimination of null productions and unit productions. Let the resultant grammar be G = (VN, Σ, P, S).

Step 2: Elimination of terminals on the RHS. We define G1 = (VN', Σ, P1, S) where P1 and VN' are constructed as follows:
(i) All the productions in P of the form A -> a or A -> BC are included in P1. All the variables in VN are included in VN'.
(ii) Consider A -> X1 X2 ... Xn with some terminal on the RHS. If Xi is a terminal, say ai, add a new variable C_ai to VN' and the production C_ai -> ai to P1. In the production A -> X1 X2 ... Xn, every terminal on the RHS is replaced by the corresponding new variable and the variables on the RHS are retained. The resulting production is added to P1.
Thus we get G1 = (VN', Σ, P1, S).

Step 3: Restricting the number of variables on the RHS. For any production in P1, the RHS consists of either a single terminal or two or more variables. We define G2 = (VN'', Σ, P2, S) as follows:
(i) All productions in P1 are added to P2 if they are in the required form. All the variables in VN' are added to VN''.
(ii) Consider A -> A1 A2 ... Am, where m ≥ 3. We introduce the new productions A -> A1 C1, C1 -> A2 C2, ..., C(m-2) -> A(m-1) Am and the new variables C1, C2, ..., C(m-2). These are added to P2 and VN'' respectively.

Example 1: Find the grammar in CNF equivalent to S -> aAbB, A -> aA | a, B -> bB | b
Step 1: As there are no unit productions or null productions, we need not carry out step 1. We proceed to step 2.

Step 2: Let G1 = (VN', {a, b}, P1, S) where P1 and VN' are constructed as follows. Add the productions A -> a, B -> b to P1.
a) S -> aAbB is replaced by S -> Ca A Cb B, adding the new productions Ca -> a and Cb -> b
b) A -> aA becomes A -> Ca A
c) B -> bB becomes B -> Cb B

Step 3:
a) S -> Ca A Cb B is converted as S -> Ca D1 where D1 -> A Cb B. This is again converted as D1 -> A D2 where D2 -> Cb B
b) A -> Ca A is already in CNF
c) B -> Cb B is already in CNF
Hence the resultant CNF productions are
S -> Ca D1
D1 -> A D2
D2 -> Cb B
A -> Ca A
B -> Cb B
A -> a
B -> b
Ca -> a
Cb -> b

Example 2: Find the CNF equivalent to the grammar S -> ~S | [S ⊃ S] | p | q where ~, [, ⊃, ], p, q are terminals.
Step 1: Consider the grammar G: S -> ~S | [S ⊃ S] | p | q. Since there are no null productions and unit productions we can go on to the reduction.

Step 2: Consider the production (1), S -> ~S. Add the new production C1 -> ~. The production S -> ~S becomes S -> C1 S.

Step 3: Consider the production (2), S -> [S ⊃ S]. Add the new productions
C2 -> [
C3 -> ⊃
C4 -> ]
Now S -> C2 S C3 S C4.
S -> C2 D1 where D1 -> S C3 S C4
D1 -> S D2 where D2 -> C3 S C4
D2 -> C3 D3 where D3 -> S C4
The resultant CNF productions are
S -> C1 S
C1 -> ~
S -> C2 D1
D1 -> S D2
D2 -> C3 D3
D3 -> S C4
C2 -> [
C3 -> ⊃
C4 -> ]
S -> p
S -> q
Hence derived.

Note: NT = Non Terminal, T = Terminal. In the forms below, * only separates the symbols on the right hand side.

1. Convert the following grammar to Chomsky Normal Form (CNF)
G: S -> aAD
A -> aB | bAB
B -> b
D -> d

Solution: Consider S -> aAD. The production is of the form NT -> T*NT*NT. Replace a by Ca, with Ca -> a. We get S -> Ca A D. Let D1 -> AD. Then the production becomes S -> Ca D1, which is in CNF.
Now consider A -> aB. This is of the form NT -> T*NT. Since Ca -> a, A -> aB becomes A -> Ca B, which is in CNF.
Let us now consider A -> bAB. This production is of the form NT -> T*NT*NT. Replace b by Cb, with Cb -> b, and AB by D2. We get A -> Cb D2.
The other two productions B -> b and D -> d are already in CNF.
Hence the resultant CNF productions are
S -> Ca D1
D1 -> AD
A -> Ca B
A -> Cb D2
D2 -> AB
B -> b
D -> d
Ca -> a
Cb -> b

2. Convert the following grammar to Chomsky Normal Form (CNF)
G: S -> abSb | a | aAb
A -> bS | aAAb
Solution: Consider S -> abSb. This production is of the form NT -> T*T*NT*T. Replacing a by Ca and b by Cb we get S -> Ca Cb S Cb. This is of the form NT -> NT*NT*NT*NT. So we replace Cb S Cb by D1. Hence the production becomes S -> Ca D1, which is in CNF. (Replace S Cb by D to convert D1 into CNF: D1 -> Cb D and D -> S Cb.)
Since S -> a is already in CNF we now consider S -> aAb. This production is of the form NT -> T*NT*T. Replace a by Ca and b by Cb. So now S -> Ca A Cb. Replace A Cb by D2 to convert the production to CNF. The production now becomes S -> Ca D2 with D2 -> A Cb.

Let us now consider the production A -> bS. Replacing b by Cb the production gets converted into CNF: A -> Cb S.
The last production to be converted to CNF is A -> aAAb. Replacing a by Ca and b by Cb we get A -> Ca A A Cb. Let D3 -> A D2 where D2 -> A Cb. So the production becomes A -> Ca D3, which is in CNF.
Hence the resultant CNF productions are
S -> Ca D1
D1 -> Cb D
D -> S Cb
S -> Ca D2
D2 -> A Cb
S -> a
A -> Cb S
A -> Ca D3
D3 -> A D2
Ca -> a
Cb -> b

3. Convert the following grammar to Chomsky Normal Form (CNF)
G: S -> ASA | bA
A -> B | S
B -> C
Solution: The productions A -> B, A -> S and B -> C are unit productions. They cannot be brought into CNF by writing them as A -> εB with a new variable deriving ε, because CNF allows no ε productions. They have to be eliminated first, as required by Step 1 of the reduction.
B -> C: C has no productions, so nothing can be substituted for it. B -> C is dropped, B is left without productions, and A -> B is dropped as well.
A -> S: replace it by the productions of S, giving A -> ASA | bA.
Now consider S -> ASA. This production is of the form NT -> NT*NT*NT. Let D1 -> SA. Replacing SA the production becomes S -> A D1, which is in CNF. In the same way A -> ASA becomes A -> A D1.
Now consider S -> bA. Replace b by Cb, with Cb -> b. S -> bA becomes S -> Cb A, and likewise A -> bA becomes A -> Cb A.
Hence the resultant CNF productions are
S -> A D1
S -> Cb A
A -> A D1
A -> Cb A
D1 -> SA
Cb -> b
Note that, as given, no production of G has a right hand side made only of terminals, so no variable derives a terminal string and L(G) is empty. The steps above only illustrate the mechanical conversion.

4. Convert the following grammar to Chomsky Normal Form (CNF)
G: S -> 1A | 0B
A -> 1AA | 0S | 0
B -> 0BB | 1S | 1
Solution: Consider S -> 1A. This production is of the form NT -> T*NT. In order to convert it to CNF, we replace 1 by Ca, with Ca -> 1. We get S -> Ca A, which is in CNF.
Now consider S -> 0B. This production is of the form NT -> T*NT. We replace 0 by Cb, with Cb -> 0. We get S -> Cb B, which is in CNF.
The next production to be considered is A -> 1AA, which is of the form NT -> T*NT*NT. Let D1 -> AA. Since Ca -> 1, the production becomes A -> Ca D1, which is in CNF.
The next production A -> 0S is of the form NT -> T*NT. Since Cb -> 0, we get A -> Cb S, which is in CNF. The other production A -> 0 is already in CNF.
Now take the production B -> 0BB, which is of the form NT -> T*NT*NT. Let D2 -> BB. Since Cb -> 0, the production becomes B -> Cb D2, which is in CNF.
The next production B -> 1S is of the form NT -> T*NT. Since Ca -> 1, we get B -> Ca S, which is in CNF. The other production B -> 1 is already in CNF.
Hence the resultant CNF productions are
S -> Ca A
S -> Cb B
A -> Ca D1
D1 -> AA
A -> Cb S
A -> 0
B -> Cb D2
D2 -> BB
B -> Ca S
B -> 1
Ca -> 1
Cb -> 0
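Steps 2 and 3 of the reduction to CNF are mechanical enough to sketch in code. The version below is an illustration under stated assumptions: the grammar has already been through Step 1 (no ε or unit productions), right-hand sides are tuples of symbol names, and the generated variable names C_a, C_b, D1, D2, ... are assumed not to clash with existing variables. The function name to_cnf is hypothetical.

```python
def to_cnf(productions, variables):
    """Steps 2 and 3 of the CNF reduction for a grammar without
    epsilon productions and unit productions."""
    # Step 2: in bodies of length >= 2, replace each terminal a by a new variable C_a.
    term_var = {}
    step2 = []
    for head, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:  # A -> a is already in the required form
                step2.append((head, tuple(body)))
                continue
            new_body = []
            for s in body:
                if s in variables:
                    new_body.append(s)
                else:
                    new_body.append(term_var.setdefault(s, "C_" + s))
            step2.append((head, tuple(new_body)))
    for terminal, var in term_var.items():
        step2.append((var, (terminal,)))

    # Step 3: split bodies longer than two variables using new variables D1, D2, ...
    result, counter = {}, 0
    for head, body in step2:
        while len(body) > 2:
            counter += 1
            new_var = "D" + str(counter)
            result.setdefault(head, []).append((body[0], new_var))
            head, body = new_var, body[1:]
        if body not in result.setdefault(head, []):
            result[head].append(body)
    return result


# Example 1 above: S -> aAbB, A -> aA | a, B -> bB | b
GRAMMAR = {
    "S": [("a", "A", "b", "B")],
    "A": [("a", "A"), ("a",)],
    "B": [("b", "B"), ("b",)],
}
cnf = to_cnf(GRAMMAR, {"S", "A", "B"})
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

On Example 1 this reproduces the productions obtained by hand, with C_a and C_b playing the roles of Ca and Cb.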

GREIBACH NORMAL FORM (GNF)

A Context-Free Grammar G is in Greibach Normal Form if every production is of the form
A -> aα, where α ∈ VN* and a ∈ Σ (Non Terminal -> Terminal followed by any number of Non Terminals), or
A -> a (Non Terminal -> Terminal).
For example, the productions
S -> aAB
A -> bC
B -> b
C -> c
are in GNF.

Greibach Normal Form Algorithm:
The variables are numbered A1, A2, ..., Am, and Bk is a new variable introduced for Ak.

begin
  for k := 1 to m do
  begin
    for j := 1 to k-1 do
      for each production of the form Ak -> Aj α do
      begin
        for all productions Aj -> β do
          add production Ak -> βα;
        remove production Ak -> Aj α
      end;
    for each production of the form Ak -> Ak α do
    begin
      add productions Bk -> α and Bk -> α Bk;
      remove production Ak -> Ak α
    end;
    for each production Ak -> β, where β does not begin with Ak do
      add production Ak -> β Bk
  end
end

Reduction to Greibach Normal Form

Lemma (1): Let G = (V, T, P, S) be a CFG. Let A -> Bγ be an A-production in P, and let the B-productions be B -> β1 | β2 | ... | βn. Then replacing A -> Bγ by the productions A -> β1γ | β2γ | ... | βnγ gives a grammar equivalent to G.

Lemma (2): Let G = (V, Σ, P, S) be a CFG. Let the set of A-productions be A -> Aα1 | ... | Aαr | β1 | ... | βn, where no βi begins with A. Let z be a new variable, and let P1 be defined as follows:
(i) The set of A-productions in P1 is
A -> β1 | β2 | ... | βn
A -> β1 z | β2 z | ... | βn z
(ii) The set of z-productions in P1 is
z -> α1 | α2 | ... | αr
z -> α1 z | α2 z | ... | αr z
(iii) The productions for the other variables are as in P.
Then the grammar with variables V ∪ {z} and productions P1 is equivalent to G.

Example: Construct an equivalent GNF for the CFG S -> AA | a, A -> SS | b

Step 1: The given grammar is in CNF. S and A are renamed as A1 and A2. Hence the productions become
A1 -> A2 A2 | a
A2 -> A1 A1 | b
There is no need for null production or unit production elimination because the grammar is already in CNF.

Step 2: The A1 productions are in the required form. They are A1 -> A2 A2 | a. The production A2 -> b is also in the required form, whereas we have to apply lemma 1 to convert the production A2 -> A1 A1 to the required form. Applying lemma 1 we get
A2 -> A2 A2 A1
A2 -> a A1

Step 3: We apply lemma 2 to the A2 productions as we have A2 -> A2 A2 A1. Let z2 be a new variable. The resulting productions are
A2 -> a A1
A2 -> b
A2 -> a A1 z2
A2 -> b z2
z2 -> A2 A1
z2 -> A2 A1 z2

Step 4:
(i) The A2 productions are A2 -> a A1 | b | a A1 z2 | b z2
(ii) Among the A1 productions we retain A1 -> a and eliminate A1 -> A2 A2 using lemma 1. The resulting productions are A1 -> a A1 A2 | b A2 and A1 -> a A1 z2 A2 | b z2 A2. The set of all A1 productions is
A1 -> a | a A1 A2 | b A2 | a A1 z2 A2 | b z2 A2

Step 5: The z2 productions to be modified are z2 -> A2 A1 and z2 -> A2 A1 z2. Applying lemma 1 we get
z2 -> a A1 A1 | b A1 | a A1 z2 A1 | b z2 A1
z2 -> a A1 A1 z2 | b A1 z2 | a A1 z2 A1 z2 | b z2 A1 z2
Hence the equivalent grammar is
A1 -> a | a A1 A2 | b A2 | a A1 z2 A2 | b z2 A2
A2 -> a A1 | b | a A1 z2 | b z2
z2 -> a A1 A1 | b A1 | a A1 z2 A1 | b z2 A1
z2 -> a A1 A1 z2 | b A1 z2 | a A1 z2 A1 z2 | b z2 A1 z2

Note: NT = Non Terminal, T = Terminal

1. Let us convert to Greibach normal form the grammar G = ({A1, A2, A3}, {a, b}, P, A1), where P consists of the following:
A1 -> A2 A3
A2 -> A3 A1 | b
A3 -> A1 A2 | a
Solution:
Step 1: Since the right-hand sides of the productions for A1 and A2 start with terminals or higher-numbered variables, we begin with the production A3 -> A1 A2 and substitute the string A2 A3 for A1. Note that A1 -> A2 A3 is the only production with A1 on the left. The resultant set of productions is:
A1 -> A2 A3
A2 -> A3 A1 | b
A3 -> A2 A3 A2 | a
Since the right side of the production A3 -> A2 A3 A2 begins with a lower-numbered variable, we substitute for the first occurrence of A2 both A3 A1 and b. Thus A3 -> A2 A3 A2 is replaced by A3 -> A3 A1 A3 A2 and A3 -> b A3 A2. The new set is
A1 -> A2 A3
A2 -> A3 A1 | b
A3 -> A3 A1 A3 A2 | b A3 A2 | a
We now apply lemma 2 to the productions A3 -> A3 A1 A3 A2 | b A3 A2 | a. Let z2 be a new variable. The resulting productions are
A1 -> A2 A3
A2 -> A3 A1 | b
A3 -> b A3 A2 z2 | a z2 | b A3 A2 | a
z2 -> A1 A3 A2 | A1 A3 A2 z2

Step 2: Now all the productions with A3 on the left have right-hand sides that start with terminals. These are used to replace A3 in the production A2 -> A3 A1, and then the productions with A2 on the left are used to replace A2 in the production A1 -> A2 A3. The result is the following.
A3 -> b A3 A2 z2 | b A3 A2 | a z2 | a
A2 -> b A3 A2 z2 A1 | b A3 A2 A1 | a z2 A1 | a A1 | b
A1 -> b A3 A2 z2 A1 A3 | b A3 A2 A1 A3 | a z2 A1 A3 | a A1 A3 | b A3
z2 -> A1 A3 A2
z2 -> A1 A3 A2 z2

Step 3: The two z2 productions are converted to proper form, resulting in 10 more productions. That is, the productions
z2 -> A1 A3 A2
z2 -> A1 A3 A2 z2
are altered by substituting the right side of each of the five productions with A1 on the left for the first occurrence of A1. Thus z2 -> A1 A3 A2 becomes
z2 -> b A3 A2 z2 A1 A3 A3 A2
z2 -> b A3 A2 A1 A3 A3 A2
z2 -> a z2 A1 A3 A3 A2
z2 -> a A1 A3 A3 A2
z2 -> b A3 A3 A2
The other production for z2 is replaced similarly. The final set of productions is
A3 -> b A3 A2 z2 | b A3 A2 | a z2 | a
A2 -> b A3 A2 z2 A1 | b A3 A2 A1 | a z2 A1 | a A1 | b
A1 -> b A3 A2 z2 A1 A3 | b A3 A2 A1 A3 | a z2 A1 A3 | a A1 A3 | b A3
z2 -> b A3 A2 z2 A1 A3 A3 A2 z2 | b A3 A2 z2 A1 A3 A3 A2
z2 -> a z2 A1 A3 A3 A2 z2 | a z2 A1 A3 A3 A2
z2 -> b A3 A3 A2 z2 | b A3 A3 A2
z2 -> b A3 A2 A1 A3 A3 A2 z2 | b A3 A2 A1 A3 A3 A2
z2 -> a A1 A3 A3 A2 z2 | a A1 A3 A3 A2
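The two lemmas used throughout this section can be written as small functions. This is a sketch under assumptions: right-hand sides are tuples of symbol names, and the function names substitute_first (lemma 1) and remove_left_recursion (lemma 2) are hypothetical labels chosen for this example. It replays Steps 2 and 3 of the worked example with A1 -> A2 A2 | a and A2 -> A1 A1 | b.

```python
def substitute_first(bodies_of_a, b, bodies_of_b):
    """Lemma 1: replace every production A -> B gamma by A -> beta gamma
    for each production B -> beta."""
    out = []
    for body in bodies_of_a:
        if body and body[0] == b:
            out.extend(beta + body[1:] for beta in bodies_of_b)
        else:
            out.append(body)
    return out


def remove_left_recursion(head, bodies, new_var):
    """Lemma 2: eliminate productions head -> head alpha using a new variable."""
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:  # nothing to do when there is no left recursion
        return {head: list(bodies)}
    return {
        head: betas + [beta + (new_var,) for beta in betas],
        new_var: alphas + [alpha + (new_var,) for alpha in alphas],
    }


A1_BODIES = [("A2", "A2"), ("a",)]
A2_BODIES = [("A1", "A1"), ("b",)]

# Step 2 of the example: apply lemma 1 to A2 -> A1 A1.
a2_after_lemma1 = substitute_first(A2_BODIES, "A1", A1_BODIES)
# Step 3 of the example: apply lemma 2 to A2 with the new variable z2.
after_lemma2 = remove_left_recursion("A2", a2_after_lemma1, "z2")

print(a2_after_lemma1)
print(after_lemma2)
```

Repeating substitute_first on the A1 and z2 productions, as in Steps 4 and 5, would bring every right-hand side to start with a terminal.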

AMBIGUOUS AND UNAMBIGUOUS GRAMMARS

Ambiguous Grammar: If some string generated by the grammar has more than one leftmost derivation, or equivalently more than one derivation tree, then the grammar is termed an ambiguous grammar. For example consider the productions S -> S+S | S*S | a | b. We have to derive the string a+a*b. The derivation is as follows:
S => S+S => S+S*S =>* a+a*b
Hence derived. The same string can also be derived in the following manner:
S => S*S => S+S*S =>* a+a*b
In the first derivation the tree has + at the root, in the second it has * at the root. From the above two derivations it is clear that the string has more than one derivation tree. Therefore the grammar is ambiguous.

Unambiguous Grammar: If every string generated by the grammar has exactly one leftmost derivation, or equivalently exactly one derivation tree, then the grammar is termed an unambiguous grammar. To show that a grammar is not unambiguous it is enough to find one string with two derivation trees. Consider an example: S -> a | abSb | aAb, A -> bS | aAAb, and the word to be derived is W = abab.
Derivation:
S => abSb => abab
Hence derived. Now consider
S => aAb => abSb => abab
which uses A -> bS and then S -> a. The two derivations start with different productions of S and so give different derivation trees for abab. Hence this grammar is ambiguous, not unambiguous.

THE MYHILL-NERODE THEOREM: ALGORITHM

begin
  for p in F and q in Q-F do mark (p, q);
  for each pair of distinct states (p, q) in F × F or (Q-F) × (Q-F) do
    if for some input symbol a, (δ(p, a), δ(q, a)) is marked then
    begin
      mark (p, q);
      recursively mark all unmarked pairs on the list for (p, q) and on the lists of the other pairs that are marked at this step
    end
    else
      for all input symbols a do
        put (p, q) on the list for (δ(p, a), δ(q, a)) unless δ(p, a) = δ(q, a)
end

PROCEDURE FOR MINIMIZATION OF AUTOMATA:
1. Write down all the states of the DFA.
2. Divide the states into accepting states and non-accepting states: Q1^0, which contains the final states, and Q2^0 = Q - Q1^0, so that Q = Q1^0 ∪ Q2^0.
3. Using the transitions, divide each group again: two states stay in the same group only if, for every input, their transitions lead to the same group.
4. Repeat the same procedure until there is no possibility to divide.
5. Finally take one representative from each group. In the transition table, substitute the representative for the other states of the same group.

EXAMPLE: Construct Minimal DFA
[Figure: transition diagram of a DFA with states a, b, c, d, e, f, g, h and inputs 0 and 1]

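Let the DFA be M = (Q, Σ, δ, q0, F) where
Q = {a, b, c, d, e, f, g, h}
Σ = {0, 1}
Initial state = a
Final state = d

Class A contains the final state: A = {d}
Class B contains the states other than the final state: B = {a, b, c, e, f, g, h}

Transition table for all classes.

For class A
A | d
0 | A
1 | B

For class B
B | a b c e f g h
0 | B B A A B B B
1 | B B B B B B A

On input 0 the states c and e go to class A, and on input 1 the state h goes to class A, so they are separated from class B:
Class C = {c, e}
Class D = {h}

For the remaining states of class B
B | a b f g
0 | B B B B
1 | B C C B

On input 1 the states b and f go to class C while a and g stay in class B, so class B is divided again:
Class E = {b, f}
Class B = {a, g}

For Class B
B | a g
0 | E E
1 | B B

For Class C
C | c e
0 | A A
1 | E E

For Class D
D | h
0 | B
1 | A

For Class E
E | b f
0 | B B
1 | C C

No class can be divided further.

Reduced Transition diagram
[Figure: transition diagram of the reduced DFA with states A, B, C, D, E and inputs 0 and 1]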
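The minimization procedure (split into final and non-final states, then keep dividing groups by where their transitions lead) can be sketched as follows. The DFA used here is a small hypothetical one with states p, q, r, s, t, chosen for illustration; it is not the DFA of the example above, whose individual transitions are only given in the figure. The function name minimize_partition and the dictionary encoding of δ are likewise assumptions.

```python
def minimize_partition(states, alphabet, delta, finals):
    """Refine the partition {finals, non-finals} until no group can be divided."""
    partition = [g for g in (set(finals), set(states) - set(finals)) if g]
    while True:
        def group_index(state):
            return next(i for i, g in enumerate(partition) if state in g)

        new_partition = []
        for group in partition:
            buckets = {}
            for s in sorted(group):
                # states stay together only if every input leads to the same group
                signature = tuple(group_index(delta[(s, a)]) for a in alphabet)
                buckets.setdefault(signature, set()).add(s)
            new_partition.extend(buckets.values())
        if len(new_partition) == len(partition):  # nothing was divided
            return new_partition
        partition = new_partition


# Hypothetical DFA over {0, 1} with final states r and s.
DELTA = {
    ("p", "0"): "q", ("p", "1"): "r",
    ("q", "0"): "p", ("q", "1"): "r",
    ("r", "0"): "q", ("r", "1"): "s",
    ("s", "0"): "t", ("s", "1"): "r",
    ("t", "0"): "t", ("t", "1"): "t",
}
classes = minimize_partition("pqrst", "01", DELTA, {"r", "s"})
print(sorted(sorted(g) for g in classes))
```

In this example p and q end up in one class, so the minimal DFA has four states; one representative per class is then used in the transition table, as in step 5 of the procedure.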