Proceedings of BITCON-2015 Innovations For National Development
National Conference on: Research and Development in Computer Science and Applications

Research Paper

SUBSTRING MATCHING IN CONTEXT FREE GRAMMAR

Pawan Kumar Patnaik (1), M. V. Padmavati (1), Jyoti Singh (2)

Address for Correspondence
(1) Deptt. of Computer Science & Engg., BIT, Durg
(2) Directorate of Technical Education, Raipur (C.G.), India

ABSTRACT

This paper studies the complexity of the membership and substring testing problems, where the substring is not necessarily contiguous, for context-free grammars in Chomsky Normal Form (CNF). We describe a new algorithm that extends the CYK algorithm for string languages and preserves its polynomial time complexity.

1. INTRODUCTION

Pattern matching has a wide range of applications in pattern recognition, image processing, computer vision and related fields. In one dimension, this problem is referred to as string matching. String matching has applications in text editing, text searching, database search, artificial intelligence, information retrieval and similar areas. There are many instances in which one needs to find the occurrences of more than one user-defined pattern in a given text. This problem is known as multiple pattern matching; a library bibliographic search program is one such application. In two dimensions the problem is referred to as pattern matching. In many applications, such as computational biology, it is desirable to find approximate matches of the pattern in the given text rather than exact matches.

In recent years, many efficient algorithms have been developed to locate all occurrences of any of a finite number of keywords and phrases in an arbitrary text string. More recently, several authors [1-5] have investigated exact-match substring identification in context-free languages.
Building on the decision problem of substrings in context-free languages [1], this paper presents a method for substring matching in context-free grammars that runs in O(n^3) time. First we consider the problem of substring testing. Let w = a_1 a_2 ... a_n be a string. A substring of w, not necessarily contiguous, is a string a_{i_1} a_{i_2} ... a_{i_k} with 1 ≤ i_1 < i_2 < ... < i_k ≤ n. The substring problem is: given a CFG G and a string w, does there exist a w′ ∈ L(G) such that w is a substring of w′? We give algorithms for solving these problems by modifying the CYK algorithm.

Section 2 of this paper describes the background of the work. Section 3 presents the CYK algorithm with its complexity analysis and an example. Section 4 discusses a procedure for substring matching problems in CFGs. Finally, conclusions are drawn in Section 5.

2. Preliminaries

For simplicity, we assume that the grammar is given in Chomsky Normal Form (CNF). Let G = (V, T, P, S) be that grammar, where

  V denotes the set of non-terminal symbols,
  T denotes the set of terminals,
  P denotes the set of production rules of the form A → α,
  S ∈ V is the start symbol.

The language generated by the grammar is defined as

  L(G) = { w | w ∈ T*, S ⇒* w }

Note:
1. Two strings α, β are said to be related by ⇒, written α ⇒ β, when the second string is obtained from the first by one application of some production rule in P.
2. Suppose α_1, α_2, ..., α_m are strings from (V ∪ T)*, m ≥ 1, and α_1 ⇒ α_2, α_2 ⇒ α_3, ..., α_{m-1} ⇒ α_m. Then we say α_1 ⇒* α_m. The relation ⇒* is the reflexive and transitive closure of ⇒.

Definition 2.1 (Chomsky Normal Form or CNF)
Any context-free language without ε is generated by a grammar in which all production rules are of the form A → BC or A → b. Here A, B and C are non-terminals and b is a terminal.

Membership testing of CFG

Problem: Given a CFG G = <V, T, P, S> in CNF and a string s in T*, test whether s ∈ L(G).

Solution: We present a simple cubic-time algorithm known as the Cocke-Younger-Kasami or CYK algorithm.
It is based on the dynamic programming technique. Given a string s of length n ≥ 1 and a grammar G, which we may assume is in CNF, determine for each i and j and for each non-terminal A whether A ⇒* x_ij, where x_ij is the substring of s from position i to position j. We proceed by induction on the length. For length = 1 (that is, i = j), A ⇒* x_ii if and only if A → x_ii is a production, since x_ii is the i-th symbol of s. For length > 1, A ⇒* x_ij if and only if there is some production A → BC and some k, i ≤ k ≤ j-1, such that B ⇒* x_ik and C ⇒* x_{k+1,j}. Finally, when we reach i = 1 and j = n, we can determine whether S ⇒* x_1n. As x_1n = s, s is in L(G) if and only if S ⇒* x_1n.

3. Algorithm

CYK Algorithm

  1. For i = 1 to n do
       V[i,i] ← { A | A → a_i ∈ P }
  2. For len = 2 to n do
       For i = 1 to n - len + 1 do
         j ← i + len - 1
         V[i,j] ← ∅
         For k = i to j - 1 do
           V[i,j] ← V[i,j] ∪ { A | A → BC ∈ P, B ∈ V[i,k] and C ∈ V[k+1,j] }
  3. If S ∈ V[1,n] then the output is Yes, else the output is No.

Analysis: The time complexity of the algorithm is O(n^3). More precisely, the algorithm takes O(n^3 |P|) time.

Example: Consider the CFG

  S → AB | BC
  A → BA | a
  B → CC | b
  C → AB | a

and the input string x = baaba. The sets V[i,j] are shown in Table 3.1. S ∈ V[1,5] implies that the string x = baaba belongs to the language generated by the CFG.

Table 3.1: CYK Algorithm

  i \ j   1 (b)   2 (a)   3 (a)   4 (b)   5 (a)
  1       {B}     {A,S}   ∅       ∅       {S,A,C}
  2       -       {A,C}   {B}     {B}     {S,C,A}
  3       -       -       {A,C}   {S,C}   {B}
  4       -       -       -       {B}     {S,A}
  5       -       -       -       -       {A,C}

4. Design of Substring Matching in Context Free Grammar

In this section, we present algorithms for substring testing problems in CFGs. Let w = a_1 a_2 ... a_n be a string. A string a_i ... a_j is a substring of w if 1 ≤ i ≤ j ≤ n. The substring problem can be stated as follows: given a CFG G and a string w, does there exist a w′ ∈ L(G) such that w is a substring of w′? We give algorithms for solving these problems by modifying the CYK algorithm.

Problem: Given a CFG G = <V, T, P, S> in CNF and a string w = a_1 a_2 ... a_n, determine whether there exists a string w′ ∈ L(G) such that w is a substring of w′.

Solution: Let G = <V, T, P, S> be a CFG in CNF. Without loss of generality, we assume L(G) ≠ ∅. The algorithm is based on the CYK algorithm. We make use of the notions of Left Closure and Right Closure of a set of non-terminals. The algorithms for these are given below.

  LeftClosure(V ⊆ N)
    lc ← V                                                  (1.1)
    Repeat
      Add A to lc if A → BC ∈ P for some C ∈ lc
    until no new non-terminal was added to lc in this iteration
    return (lc)

  RightClosure(V ⊆ N)
    rc ← V
    Repeat
      Add A to rc if A → CD ∈ P for some C ∈ rc
    until no new non-terminal was added to rc in this iteration
    return (rc)

Analysis: The complexity of the LeftClosure algorithm given above is O(|N| |P|). Note that LeftClosure(V) = ∪_{A ∈ V} LeftClosure({A}). The LeftClosure of each individual non-terminal can be precomputed; using the above relation, the required LeftClosure algorithm can then be implemented in O(|N|^2). The analysis of the RightClosure algorithm is the same as that of the LeftClosure algorithm, and its overall complexity is also the same.

4.1 Algorithm Substring

Input: A CFG G = <V, T, P, S> in CNF and a string w ∈ T+ where w = a_1 a_2 ... a_n.
Assumption: G does not contain any useless productions or useless symbols.
Output: If w is a substring of some w′ ∈ L(G) then the output is Yes, else the output is No.
Data structure: V[0:n+1, 0:n+1], where each entry is a set of non-terminals.

Algorithm:

  Step 1: CYK Algorithm
    For i = 1 to n do
      V[i,i] ← { A | A → a_i ∈ P }
    For len = 2 to n do
      For i = 1 to n - len + 1 do
        j ← i + len - 1
        V[i,j] ← ∅
        For k = i to j - 1 do
          V[i,j] ← V[i,j] ∪ { A | A → BC ∈ P, B ∈ V[i,k] and C ∈ V[k+1,j] }

  Step 2:
    For j = 1 to n do
      For k = 0 to j - 1 do
        If (k = 0)
          V[0,j] ← LeftClosure(V[1,j])                                          (2.1)
        Else
          V[0,j] ← V[0,j] ∪ { A | A → BC ∈ P, B ∈ V[0,k], C ∈ V[k+1,j] }        (2.2)
      V[0,j] ← LeftClosure(V[0,j])                                              (2.3)

  Step 3:
    For i = n downto 1 do
      For k = n+1 downto i+1 do
        If (k = n+1)
          V[i,n+1] ← RightClosure(V[i,n])
        Else
          V[i,n+1] ← V[i,n+1] ∪ { A | A → BC ∈ P, B ∈ V[i,k-1], C ∈ V[k,n+1] }
      V[i,n+1] ← RightClosure(V[i,n+1])

  Step 4:
    4.1 If V[1,n] ≠ ∅, output Yes
    4.2 If V[0,n] ≠ ∅, output Yes
    4.3 If V[1,n+1] ≠ ∅, output Yes
    4.4 V[0,n+1] ← ∅
        For k = 1 to n - 1 do
          V[0,n+1] ← V[0,n+1] ∪ { A | A → BC ∈ P, B ∈ V[0,k], C ∈ V[k+1,n+1] }
        If (V[0,n+1] ≠ ∅) output Yes, else output No

Example: Grammar G for a+ b+ c+ d+ e+ f+

  S → AX    X → YF    Y → BZ    Z → WE    W → CD
  A → AA    B → BB    C → CC    D → DD    E → EE    F → FF
  A → a     B → b     C → c     D → d     E → e     F → f

Because each closure starts from the given set (lc ← V, rc ← V), the closures of the single non-terminals are:

  LeftClosure({S}) = {S}          RightClosure({S}) = {S}
  LeftClosure({A}) = {A}          RightClosure({A}) = {A,S}
  LeftClosure({B}) = {B}          RightClosure({B}) = {B,Y,X}
  LeftClosure({C}) = {C}          RightClosure({C}) = {C,W,Z}
  LeftClosure({D}) = {D,W}        RightClosure({D}) = {D}
  LeftClosure({E}) = {E,Z,Y}      RightClosure({E}) = {E}
  LeftClosure({F}) = {F,X,S}      RightClosure({F}) = {F}
  LeftClosure({X}) = {X,S}        RightClosure({X}) = {X}
  LeftClosure({Y}) = {Y}          RightClosure({Y}) = {Y,X}
  LeftClosure({Z}) = {Z,Y}        RightClosure({Z}) = {Z}
  LeftClosure({W}) = {W}          RightClosure({W}) = {W,Z}

Table 4.1: CYK Algorithm (input abcdef)

  i \ j   1 (a)   2 (b)   3 (c)   4 (d)   5 (e)   6 (f)
  1       {A}     ∅       ∅       ∅       ∅       {S}
  2       -       {B}     ∅       ∅       {Y}     {X}
  3       -       -       {C}     {W}     {Z}     ∅
  4       -       -       -       {D}     ∅       ∅
  5       -       -       -       -       {E}     ∅
  6       -       -       -       -       -       {F}

Input String: cd
Output is Yes, and the sets V[i,j] are shown in Table 4.2.

Analysis: Let |w| = n. Step 1 is the CYK algorithm, so it takes O(|P| n^3) time. Step 2.1 takes O(|N|^2) time and Step 2.2 takes O(|P|) time. Hence Step 2 and Step 3 take O(n |N|^2 + n^2 |P|) time. Step 4 takes O(n |P|) time. Thus the substring algorithm runs in O(|P| n^3) time. For a fixed grammar G, |N| and |P| are constants, so the overall complexity of the algorithm is O(n^3).
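The four steps of Algorithm Substring can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names BINARY, UNARY, combine, left_closure, right_closure and is_substring are hypothetical, the example grammar for a+ b+ c+ d+ e+ f+ is hard-coded, and closures include the starting set itself, following lc ← V and rc ← V.

```python
# Example grammar in CNF (assumed encoding): (head, left, right) and (head, terminal).
BINARY = {
    ("S", "A", "X"), ("X", "Y", "F"), ("Y", "B", "Z"), ("Z", "W", "E"),
    ("W", "C", "D"), ("A", "A", "A"), ("B", "B", "B"), ("C", "C", "C"),
    ("D", "D", "D"), ("E", "E", "E"), ("F", "F", "F"),
}
UNARY = {("A", "a"), ("B", "b"), ("C", "c"), ("D", "d"), ("E", "e"), ("F", "f")}


def combine(left, right):
    """Heads A of rules A -> B C with B in `left` and C in `right`."""
    return {a for (a, b, c) in BINARY if b in left and c in right}


def _closure(seed, child):
    """Repeatedly add heads of rules whose child at index `child` is in the set."""
    result = set(seed)
    changed = True
    while changed:
        changed = False
        for rule in BINARY:
            if rule[child] in result and rule[0] not in result:
                result.add(rule[0])
                changed = True
    return result


def left_closure(seed):
    return _closure(seed, 2)   # A -> B C with the right child C in the set


def right_closure(seed):
    return _closure(seed, 1)   # A -> C D with the left child C in the set


def is_substring(w):
    """Return (answer, V) for a non-empty string w; V is indexed V[i, j]."""
    n = len(w)
    V = {(i, j): set() for i in range(n + 2) for j in range(n + 2)}
    # Step 1: ordinary CYK table for w.
    for i in range(1, n + 1):
        V[i, i] = {a for (a, t) in UNARY if t == w[i - 1]}
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                V[i, j] |= combine(V[i, k], V[k + 1, j])
    # Step 2: row 0, non-terminals deriving some string that ends in a_1..a_j.
    for j in range(1, n + 1):
        V[0, j] = left_closure(V[1, j])                      # (2.1)
        for k in range(1, j):
            V[0, j] |= combine(V[0, k], V[k + 1, j])         # (2.2)
        V[0, j] = left_closure(V[0, j])                      # (2.3)
    # Step 3: column n+1, non-terminals deriving some string that starts with a_i..a_n.
    for i in range(n, 0, -1):
        V[i, n + 1] = right_closure(V[i, n])
        for k in range(n, i, -1):
            V[i, n + 1] |= combine(V[i, k - 1], V[k, n + 1])
        V[i, n + 1] = right_closure(V[i, n + 1])
    # Step 4: w split across the two children of a single rule.
    for k in range(1, n):
        V[0, n + 1] |= combine(V[0, k], V[k + 1, n + 1])
    found = bool(V[1, n] or V[0, n] or V[1, n + 1] or V[0, n + 1])
    return found, V
```

Calling is_substring("cd") returns a table whose row 0 and column n+1 entries can be compared against Table 4.2. The sketch stores the table in a dictionary keyed by (i, j) so that the extra row 0 and column n+1 need no index shifting.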

Table 4.2: Output of the Substring Algorithm (input cd)

  i \ j   0     1 (c)   2 (d)   3
  0       -     {C}     {W}     {W}
  1       -     {C}     {W}     {Z,W}
  2       -     -       {D}     {D}

4.2 Algorithm Substring

Problem: Given a CFG G = <V, T, P, S> in CNF and a string w = a_1 a_2 ... a_n, determine whether there exists a string w′ ∈ L(G) such that w is a (not necessarily contiguous) substring of w′.

Solution: We make use of the notion of closure of a set of non-terminals. The algorithm is given below.

  Closure(V ⊆ N)
    cc ← V
    Repeat
      Add A to cc if A → CD ∈ P for some C ∈ cc or D ∈ cc
    until no new non-terminal was added to cc in this iteration
    return (cc)

Input: A CFG G = <V, T, P, S> in CNF and a string w ∈ T+ where w = a_1 a_2 ... a_n.
Assumption: G does not contain any useless productions or useless symbols.
Output: If w is a substring of some w′ ∈ L(G) then the output is Yes, else the output is No.
Data structure: V[1:n, 1:n], where each entry is a set of non-terminals.

Algorithm:

  1. For i = 1 to n do
       V[i,i] ← { A | A → a_i ∈ P }
       V[i,i] ← Closure(V[i,i])
  2. For len = 2 to n do
       For i = 1 to n - len + 1 do
         j ← i + len - 1
         V[i,j] ← ∅
         For k = i to j - 1 do
           V[i,j] ← V[i,j] ∪ { A | A → BC ∈ P, B ∈ V[i,k], C ∈ V[k+1,j] }
         V[i,j] ← Closure(V[i,j])
  3. If S ∈ V[1,n] then the output is Yes, else the output is No.

Example 4.2.1: Grammar G for a+ b+ c+ d+ e+ f+

  S → AX    X → YF    Y → BZ    Z → WE    W → CD
  A → AA    B → BB    C → CC    D → DD    E → EE    F → FF
  A → a     B → b     C → c     D → d     E → e     F → f

Input String: bdf
Output: Yes, and the sets V[i,j] are shown in the table below.

Analysis: By precomputing Closure({A}) for every A ∈ V, the Closure algorithm can be implemented in O(|N|^2). Hence the substring algorithm takes O(|P| n^3 + n^2 |N|^2) time. For a fixed grammar G, |N| and |P| are constants, so the algorithm runs in O(n^3) time.

Table: Output of the substring algorithm (input bdf)

  i \ j   1 (b)          2 (d)              3 (f)
  1       {B,Y,X,S}      {Y,X,S}            {X,S}
  2       -              {D,W,Z,Y,X,S}      {X,S}
  3       -              -                  {F,X,S}
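The Closure-based variant can be sketched in the same way. Again this is only an illustrative sketch under assumptions: the names BINARY, UNARY, closure, closure_table and is_scattered_substring are hypothetical, and the grammar of Example 4.2.1 is hard-coded rather than passed in.

```python
# Grammar of Example 4.2.1 (assumed encoding): (head, left, right) and (head, terminal).
BINARY = {
    ("S", "A", "X"), ("X", "Y", "F"), ("Y", "B", "Z"), ("Z", "W", "E"),
    ("W", "C", "D"), ("A", "A", "A"), ("B", "B", "B"), ("C", "C", "C"),
    ("D", "D", "D"), ("E", "E", "E"), ("F", "F", "F"),
}
UNARY = {("A", "a"), ("B", "b"), ("C", "c"), ("D", "d"), ("E", "e"), ("F", "f")}


def closure(seed):
    """Add every head A of a rule A -> C D with C or D already in the set."""
    result = set(seed)
    changed = True
    while changed:
        changed = False
        for head, left, right in BINARY:
            if head not in result and (left in result or right in result):
                result.add(head)
                changed = True
    return result


def closure_table(w):
    """CYK-style table in which every cell is closed before it is reused."""
    n = len(w)
    V = {}
    # Step 1: cells for single symbols, followed by Closure.
    for i in range(1, n + 1):
        V[i, i] = closure({a for (a, t) in UNARY if t == w[i - 1]})
    # Step 2: longer spans, combining closed cells and closing the result.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                cell |= {a for (a, b, c) in BINARY
                         if b in V[i, k] and c in V[k + 1, j]}
            V[i, j] = closure(cell)
    return V


def is_scattered_substring(w, start="S"):
    """Step 3: answer Yes exactly when the start symbol is in V[1, n]."""
    return start in closure_table(w)[1, len(w)]
```

For the input bdf the cells produced by closure_table can be checked against the table above; an input such as db, whose symbols appear in the wrong order for this grammar, yields an empty V[1,2].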

5. CONCLUSION

The substring matching problem can be solved efficiently by combining the CYK algorithm with the Left Closure and Right Closure algorithms. The procedure described here, together with the behaviour of CYK, implies that the substring matching problem for context-free grammars can be solved in O(n^3) time. The problem can also be solved by modifying the grammar G, but the approach presented here leaves the grammar unchanged.

REFERENCES

1. Mauricio Osorio and Juan Antonio Navarro Perez. Decision problem of substrings in context free languages. In Juan Humberto Sossa Azuela, Herbert Freeman, and C. Vizcaino, editors, CIC-X: Memorias del X Congreso Internacional de Computacion. CIC-IPN.
2. Heron Molina-Lozano. A new fast fuzzy Cocke Younger Kasami algorithm for DNA strings analysis. Int. J. Mach. Learn. & Cyber., 2.
3. R. Axelsson, K. Heljanko, and M. Lange. Analyzing context-free grammars using an incremental SAT solver. In Proc. 35th Int. Coll. on Automata, Languages and Programming, ICALP 08, Part II, volume 5126 of LNCS.
4. Stefano Crespi Reghizzi and Matteo Pradella. A CKY parser for picture grammars. Information Processing Letters, 105.
5. D. C. Kozen. Automata and Computability. Springer.
6. Kamala Krithivasan and Rama R. Introduction to Formal Languages, Automata Theory and Computation. Pearson.

Note: This Paper/Article is scrutinised and reviewed by Scientific Committee, BITCON-2015, BIT, Durg, CG, India

Lecture 12 Simplification of Context-Free Grammars and Normal Forms

Lecture 12 Simplification of Context-Free Grammars and Normal Forms Lecture 12 Simplification of Context-Free Grammars and Normal Forms COT 4420 Theory of Computation Chapter 6 Normal Forms for CFGs 1. Chomsky Normal Form CNF Productions of form A BC A, B, C V A a a T

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 14 Ana Bove May 14th 2018 Recap: Context-free Grammars Simplification of grammars: Elimination of ǫ-productions; Elimination of

More information

Decision problem of substrings in Context Free Languages.

Decision problem of substrings in Context Free Languages. Decision problem of substrings in Context Free Languages. Mauricio Osorio, Juan Antonio Navarro Abstract A context free grammar (CFG) is a set of symbols and productions used to define a context free language.

More information

Properties of Context-Free Languages

Properties of Context-Free Languages Properties of Context-Free Languages Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University Simplification of CFG and Normal Forms Wen-Guey Tzeng Computer Science Department National Chiao Tung University Normal Forms We want a cfg with either Chomsky or Greibach normal form Chomsky normal form

More information

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University Simplification of CFG and Normal Forms Wen-Guey Tzeng Computer Science Department National Chiao Tung University Normal Forms We want a cfg with either Chomsky or Greibach normal form Chomsky normal form

More information

Properties of context-free Languages

Properties of context-free Languages Properties of context-free Languages We simplify CFL s. Greibach Normal Form Chomsky Normal Form We prove pumping lemma for CFL s. We study closure properties and decision properties. Some of them remain,

More information

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar TAFL 1 (ECS-403) Unit- III 3.1 Definition of CFG (Context Free Grammar) and problems 3.2 Derivation 3.3 Ambiguity in Grammar 3.3.1 Inherent Ambiguity 3.3.2 Ambiguous to Unambiguous CFG 3.4 Simplification

More information

Non-context-Free Languages. CS215, Lecture 5 c

Non-context-Free Languages. CS215, Lecture 5 c Non-context-Free Languages CS215, Lecture 5 c 2007 1 The Pumping Lemma Theorem. (Pumping Lemma) Let be context-free. There exists a positive integer divided into five pieces, Proof for for each, and..

More information

NPDA, CFG equivalence

NPDA, CFG equivalence NPDA, CFG equivalence Theorem A language L is recognized by a NPDA iff L is described by a CFG. Must prove two directions: ( ) L is recognized by a NPDA implies L is described by a CFG. ( ) L is described

More information

CSCI Compiler Construction

CSCI Compiler Construction CSCI 742 - Compiler Construction Lecture 12 Cocke-Younger-Kasami (CYK) Algorithm Instructor: Hossein Hojjat February 20, 2017 Recap: Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule

More information

A parsing technique for TRG languages

A parsing technique for TRG languages A parsing technique for TRG languages Daniele Paolo Scarpazza Politecnico di Milano October 15th, 2004 Daniele Paolo Scarpazza A parsing technique for TRG grammars [1]

More information

Foundations of Informatics: a Bridging Course

Foundations of Informatics: a Bridging Course Foundations of Informatics: a Bridging Course Week 3: Formal Languages and Semantics Thomas Noll Lehrstuhl für Informatik 2 RWTH Aachen University noll@cs.rwth-aachen.de http://www.b-it-center.de/wob/en/view/class211_id948.html

More information

straight segment and the symbol b representing a corner, the strings ababaab, babaaba and abaabab represent the same shape. In order to learn a model,

straight segment and the symbol b representing a corner, the strings ababaab, babaaba and abaabab represent the same shape. In order to learn a model, The Cocke-Younger-Kasami algorithm for cyclic strings Jose Oncina Depto. de Lenguajes y Sistemas Informaticos Universidad de Alicante E-03080 Alicante (Spain) e-mail: oncina@dlsi.ua.es Abstract The chain-code

More information

Context Free Grammars

Context Free Grammars Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.

More information

Formal Languages and Automata

Formal Languages and Automata Formal Languages and Automata Lecture 6 2017-18 LFAC (2017-18) Lecture 6 1 / 31 Lecture 6 1 The recognition problem: the Cocke Younger Kasami algorithm 2 Pushdown Automata 3 Pushdown Automata and Context-free

More information

Ogden s Lemma for CFLs

Ogden s Lemma for CFLs Ogden s Lemma for CFLs Theorem If L is a context-free language, then there exists an integer l such that for any u L with at least l positions marked, u can be written as u = vwxyz such that 1 x and at

More information

Properties of Context-free Languages. Reading: Chapter 7

Properties of Context-free Languages. Reading: Chapter 7 Properties of Context-free Languages Reading: Chapter 7 1 Topics 1) Simplifying CFGs, Normal forms 2) Pumping lemma for CFLs 3) Closure and decision properties of CFLs 2 How to simplify CFGs? 3 Three ways

More information

Even More on Dynamic Programming

Even More on Dynamic Programming Algorithms & Models of Computation CS/ECE 374, Fall 2017 Even More on Dynamic Programming Lecture 15 Thursday, October 19, 2017 Sariel Har-Peled (UIUC) CS374 1 Fall 2017 1 / 26 Part I Longest Common Subsequence

More information

Chap. 7 Properties of Context-free Languages

Chap. 7 Properties of Context-free Languages Chap. 7 Properties of Context-free Languages 7.1 Normal Forms for Context-free Grammars Context-free grammars A where A N, (N T). 0. Chomsky Normal Form A BC or A a except S where A, B, C N, a T. 1. Eliminating

More information

Accepting H-Array Splicing Systems and Their Properties

Accepting H-Array Splicing Systems and Their Properties ROMANIAN JOURNAL OF INFORMATION SCIENCE AND TECHNOLOGY Volume 21 Number 3 2018 298 309 Accepting H-Array Splicing Systems and Their Properties D. K. SHEENA CHRISTY 1 V.MASILAMANI 2 D. G. THOMAS 3 Atulya

More information

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26 Parsing Context-Free Grammars (CFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 26 Table of contents 1 Context-Free Grammars 2 Simplifying CFGs Removing useless symbols Eliminating

More information

Improved TBL algorithm for learning context-free grammar

Improved TBL algorithm for learning context-free grammar Proceedings of the International Multiconference on ISSN 1896-7094 Computer Science and Information Technology, pp. 267 274 2007 PIPS Improved TBL algorithm for learning context-free grammar Marcin Jaworski

More information

Introduction to Formal Languages, Automata and Computability p.1/42

Introduction to Formal Languages, Automata and Computability p.1/42 Introduction to Formal Languages, Automata and Computability Pushdown Automata K. Krithivasan and R. Rama Introduction to Formal Languages, Automata and Computability p.1/42 Introduction We have considered

More information

Einführung in die Computerlinguistik

Einführung in die Computerlinguistik Einführung in die Computerlinguistik Context-Free Grammars formal properties Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Summer 2018 1 / 20 Normal forms (1) Hopcroft and Ullman (1979) A normal

More information

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF) CS5371 Theory of Computation Lecture 7: Automata Theory V (CFG, CFL, CNF) Announcement Homework 2 will be given soon (before Tue) Due date: Oct 31 (Tue), before class Midterm: Nov 3, (Fri), first hour

More information

CSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages

CSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages CSCI 1010 Models of Computa3on Lecture 17 Parsing Context-Free Languages Overview BoCom-up parsing of CFLs. BoCom-up parsing via the CKY algorithm An O(n 3 ) algorithm John E. Savage CSCI 1010 Lect 17

More information

MTH401A Theory of Computation. Lecture 17

MTH401A Theory of Computation. Lecture 17 MTH401A Theory of Computation Lecture 17 Chomsky Normal Form for CFG s Chomsky Normal Form for CFG s For every context free language, L, the language L {ε} has a grammar in which every production looks

More information

CFG Simplification. (simplify) 1. Eliminate useless symbols 2. Eliminate -productions 3. Eliminate unit productions

CFG Simplification. (simplify) 1. Eliminate useless symbols 2. Eliminate -productions 3. Eliminate unit productions CFG Simplification (simplify) 1. Eliminate useless symbols 2. Eliminate -productions 3. Eliminate unit productions 1 Eliminating useless symbols 1. A symbol X is generating if there exists: X * w, for

More information

Computability Theory

Computability Theory CS:4330 Theory of Computation Spring 2018 Computability Theory Decidable Problems of CFLs and beyond Haniel Barbosa Readings for this lecture Chapter 4 of [Sipser 1996], 3rd edition. Section 4.1. Decidable

More information

Theory of Computation - Module 3

Theory of Computation - Module 3 Theory of Computation - Module 3 Syllabus Context Free Grammar Simplification of CFG- Normal forms-chomsky Normal form and Greibach Normal formpumping lemma for Context free languages- Applications of

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY 15-453 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY Chomsky Normal Form and TURING MACHINES TUESDAY Feb 4 CHOMSKY NORMAL FORM A context-free grammar is in Chomsky normal form if every rule is of the form:

More information

Formal Languages, Grammars and Automata Lecture 5

Formal Languages, Grammars and Automata Lecture 5 Formal Languages, Grammars and Automata Lecture 5 Helle Hvid Hansen helle@cs.ru.nl http://www.cs.ru.nl/~helle/ Foundations Group Intelligent Systems Section Institute for Computing and Information Sciences

More information

This lecture covers Chapter 7 of HMU: Properties of CFLs

This lecture covers Chapter 7 of HMU: Properties of CFLs This lecture covers Chapter 7 of HMU: Properties of CFLs Chomsky Normal Form Pumping Lemma for CFs Closure Properties of CFLs Decision Properties of CFLs Additional Reading: Chapter 7 of HMU. Chomsky Normal

More information

Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove

Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove Finite Automata Theory and Formal Languages TMV026/TMV027/DIT321 Responsible: Ana Bove Tuesday 28 of May 2013 Total: 60 points TMV027/DIT321 registration VT13 TMV026/DIT321 registration before VT13 Exam

More information

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation MA/CSSE 474 Theory of Computation CFL Hierarchy CFL Decision Problems Your Questions? Previous class days' material Reading Assignments HW 12 or 13 problems Anything else I have included some slides online

More information

Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings

Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings 11-711: Algorithms for NLP October 10, 2014 Conversion to CNF Example grammar

More information

Properties of Context-Free Languages. Closure Properties Decision Properties

Properties of Context-Free Languages. Closure Properties Decision Properties Properties of Context-Free Languages Closure Properties Decision Properties 1 Closure Properties of CFL s CFL s are closed under union, concatenation, and Kleene closure. Also, under reversal, homomorphisms

More information

CPS 220 Theory of Computation

CPS 220 Theory of Computation CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of

More information

RNA Secondary Structure Prediction

RNA Secondary Structure Prediction RNA Secondary Structure Prediction 1 RNA structure prediction methods Base-Pair Maximization Context-Free Grammar Parsing. Free Energy Methods Covariance Models 2 The Nussinov-Jacobson Algorithm q = 9

More information

Chomsky Normal Form and TURING MACHINES. TUESDAY Feb 4

Chomsky Normal Form and TURING MACHINES. TUESDAY Feb 4 Chomsky Normal Form and TURING MACHINES TUESDAY Feb 4 CHOMSKY NORMAL FORM A context-free grammar is in Chomsky normal form if every rule is of the form: A BC A a S ε B and C aren t start variables a is

More information

Grammars and Context Free Languages

Grammars and Context Free Languages Grammars and Context Free Languages H. Geuvers and A. Kissinger Institute for Computing and Information Sciences Version: fall 2015 H. Geuvers & A. Kissinger Version: fall 2015 Talen en Automaten 1 / 23

More information

CS 373: Theory of Computation. Fall 2010

CS 373: Theory of Computation. Fall 2010 CS 373: Theory of Computation Gul Agha Mahesh Viswanathan Fall 2010 1 1 Normal Forms for CFG Normal Forms for Grammars It is typically easier to work with a context free language if given a CFG in a normal

More information

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014 Q.2 a. Show by using Mathematical Induction that n i= 1 i 2 n = ( n + 1) ( 2 n + 1) 6 b. Define language. Let = {0; 1} denote an alphabet. Enumerate five elements of the following languages: (i) Even binary

More information

Finite Automata and Formal Languages TMV026/DIT321 LP Useful, Useless, Generating and Reachable Symbols

Finite Automata and Formal Languages TMV026/DIT321 LP Useful, Useless, Generating and Reachable Symbols Finite Automata and Formal Languages TMV026/DIT321 LP4 2012 Lecture 13 Ana Bove May 7th 2012 Overview of today s lecture: Normal Forms for Context-Free Languages Pumping Lemma for Context-Free Languages

More information

CS311 Computational Structures. NP-completeness. Lecture 18. Andrew P. Black Andrew Tolmach. Thursday, 2 December 2010

CS311 Computational Structures. NP-completeness. Lecture 18. Andrew P. Black Andrew Tolmach. Thursday, 2 December 2010 CS311 Computational Structures NP-completeness Lecture 18 Andrew P. Black Andrew Tolmach 1 Some complexity classes P = Decidable in polynomial time on deterministic TM ( tractable ) NP = Decidable in polynomial

More information

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules). 1a) G = ({R, S, T}, {0,1}, P, S) where P is: S R0R R R0R1R R1R0R T T 0T ε (S generates the first 0. R generates

More information

Weak vs. Strong Finite Context and Kernel Properties

Weak vs. Strong Finite Context and Kernel Properties ISSN 1346-5597 NII Technical Report Weak vs. Strong Finite Context and Kernel Properties Makoto Kanazawa NII-2016-006E July 2016 Weak vs. Strong Finite Context and Kernel Properties Makoto Kanazawa National

More information

CS20a: summary (Oct 24, 2002)

CS20a: summary (Oct 24, 2002) CS20a: summary (Oct 24, 2002) Context-free languages Grammars G = (V, T, P, S) Pushdown automata N-PDA = CFG D-PDA < CFG Today What languages are context-free? Pumping lemma (similar to pumping lemma for

More information

Chomsky and Greibach Normal Forms

Chomsky and Greibach Normal Forms Chomsky and Greibach Normal Forms Teodor Rus rus@cs.uiowa.edu The University of Iowa, Department of Computer Science Computation Theory p.1/25 Simplifying a CFG It is often convenient to simplify CFG One

More information

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0,

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0, Introduction to Formal Language, Fall 2016 Due: 21-Apr-2016 (Thursday) Instructor: Prof. Wen-Guey Tzeng Homework 4 Solutions Scribe: Yi-Ruei Chen 1. Find context-free grammars for the language L = {a n
