A representation of context-free grammars with the help of finite digraphs

Similar documents
Chapter 1. Formal Definition and View. Lecture Formal Pushdown Automata on the 28th April 2009

Peter Wood. Department of Computer Science and Information Systems Birkbeck, University of London Automata and Formal Languages

Introduction to Formal Languages, Automata and Computability p.1/42

Foundations of Informatics: a Bridging Course

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

Chomsky and Greibach Normal Forms

Einführung in die Computerlinguistik

CS 275 Automata and Formal Language Theory

Theory of Computation

X-machines - a computational model framework.

Finite Automata and Regular Languages (part III)

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Introduction to Finite-State Automata

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Discrete Mathematics. CS204: Spring, Jong C. Park Computer Science Department KAIST

Introduction to Theory of Computing

Pushdown Automata (Pre Lecture)

Theory of Computation Turing Machine and Pushdown Automata

Finite Automata. Finite Automata

Parikh s theorem. Håkan Lindqvist

Left-Forbidding Cooperating Distributed Grammar Systems

Computability and Complexity

Introduction to Automata

Einführung in die Computerlinguistik

DM17. Beregnelighed. Jacob Aae Mikkelsen

Automata Theory, Computability and Complexity

Chapter 6. Properties of Regular Languages

Theory of Computer Science

CPS 220 Theory of Computation Pushdown Automata (PDA)

Derivation in Scattered Context Grammar via Lazy Function Evaluation

CPS 220 Theory of Computation

3515ICT: Theory of Computation. Regular languages

On Rice s theorem. Hans Hüttel. October 2001

Finite-State Transducers

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Context Sensitive Grammar

CMP 309: Automata Theory, Computability and Formal Languages. Adapted from the work of Andrej Bogdanov

CS Lecture 29 P, NP, and NP-Completeness. k ) for all k. Fall The class P. The class NP

INF210 Datamaskinteori (Models of Computation)

MTH401A Theory of Computation. Lecture 17

C1.1 Introduction. Theory of Computer Science. Theory of Computer Science. C1.1 Introduction. C1.2 Alphabets and Formal Languages. C1.

Improved TBL algorithm for learning context-free grammar

XMA2C011, Annual Examination 2012: Worked Solutions

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

CS243, Logic and Computation Nondeterministic finite automata

Theory of Computation

What You Must Remember When Processing Data Words

Section 7.1 Relations and Their Properties. Definition: A binary relation R from a set A to a set B is a subset R A B.

CISC4090: Theory of Computation

CSE 105 THEORY OF COMPUTATION

Decision, Computation and Language

Non Determinism of Finite Automata

Pushdown Automata: Introduction (2)

Theory of Computation

Music as a Formal Language

Automata Theory - Quiz II (Solutions)

Formal Languages and Automata

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

Models of Computation. by Costas Busch, LSU

CSE 105 THEORY OF COMPUTATION

Closure under the Regular Operations

ω-automata Automata that accept (or reject) words of infinite length. Languages of infinite words appear:

UNIT-VIII COMPUTABILITY THEORY

Chapter 5. Finite Automata

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Formal Models in NLP

Theory of Computation (II) Yijia Chen Fudan University

Metamorphosis of Fuzzy Regular Expressions to Fuzzy Automata using the Follow Automata

Language Hierarchies and Algorithmic Complexity

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Computational Models - Lecture 4

Automata Theory (2A) Young Won Lim 5/31/18

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Hashing Techniques For Finite Automata

3130CIT Theory of Computation

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Nondeterministic Finite Automata

NOTES ON AUTOMATA. Date: April 29,

COM364 Automata Theory Lecture Note 2 - Nondeterminism

A Lower Bound of 2 n Conditional Jumps for Boolean Satisfiability on A Random Access Machine

Automata and Languages

How to Pop a Deep PDA Matters

UNIT-I. Strings, Alphabets, Language and Operations

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

A parsing technique for TRG languages

COMP/MATH 300 Topics for Spring 2017 June 5, Review and Regular Languages

Finite Automata. Seungjin Choi

Incompleteness Theorems, Large Cardinals, and Automata ov

CSC173 Workshop: 13 Sept. Notes

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Transcription:

A representation of context-free grammars with the help of finite digraphs Krasimir Yordzhev Faculty of Mathematics and Natural Sciences, South-West University, Blagoevgrad, Bulgaria Email address: yordzhev@swu.bg Abstract: For any context-free grammar, we build a transition diagram, that is, a finite directed graph with labeled arcs, which describes the work of the grammar. This approach is new, and it is different from previously known graph models. We define the concept of proper walk in this transition diagram and we prove that a word belongs to a given context-free language if and only if this word can be obtained with the help of a proper walk. Keywords: Context-free grammar, Finite digraph, Transition diagram, Proper walk, Chomsky normal form 1. Introduction Context-free grammars are widely used to describe the syntax of programming languages and natural languages [1,2]. On the other hand, graph theory is a good apparatus for the modeling of computing devices and computational processes. So a lot of graph algorithms have been developed [4,5]. The purpose of this article is to show how for an arbitrary context-free grammar, we can construct a finite digraph, which describes the work of the grammar. As is well known, any finite state machine (deterministic or nondeterministic finite automaton) and any pushdown automaton can be described by a transition diagram, which is essentially a finite directed graph with labeled arcs [2]. The finite digraph, which we will construct in this work and who will simulate a given context-free grammar will be different from previously known graph models [1,2]. 2. The Main Definitions and Notations If Σ = {x 1, x 2,, x p } is a finite set, then with Σ we will denote the free monoid over Σ, ie the set of all sequences x i1 x i2 x in, including the empty one which we denote with ε. Elements of Σ is called words, and any subset of Σ (formal) language. Formal (or generative) grammar Γ is a triple Γ = (N, Σ, Π), where N and Σ are finite sets, N Σ =, which are called, respectively, alphabet of nonterminals (or variables) and alphabet of terminals, where Π is a finite subset of the Cartesian product (N Σ) (N Σ), and we will write u v if (u, v) Π (see [3]). We will call Π the set of productions or rules. Let Γ = (N, Σ, Π) be a formal grammar and let w, y (N Σ). We will write w y, if there are z 1, z 2, u, v (N Σ), such that w = z 1 uz 2, y = z 1 vz 2 and u v is a production of Π. We will write w y, if w = y, or there are w 0, w 1,, w r such that w 0 = w and w i w i+1 for all i = 0,1,, r 1. The sequence w 0 w 1 w r is called a derivation of length r. For every A N, a subset L(Γ, A) Σ defined by L(Γ, A) = {α Σ A α} is called the language generated by the grammar Γ with the start symbol A (see [3]). The formal grammar Γ = {N, Σ, Π} is called context-free if all the productions of Π are of the form A ω, where A N, ω (N Σ). The language L is called context-free if there is a context-free grammar Γ = (N, Σ, Π) and a nonterminal A N, such that L = L(Γ, A). Transition diagram is called a finite directed graph, all of whose arcs are labelled by an element of a semigroup (see also [2]). If π is a walk in a transition diagram, then the label of this walk l(π) is the product of the labels of

2 K. Yordzhev. A Representation of Context-free Grammars with the Help of Finite Digraphs the arcs that make up this walk, taken in passing the arcs. Let Γ = (N, Σ, Π) be a context-free grammar and let N = {A A N} Σ = {x x Σ}. We consider the monoid T with the set of generators N Σ N Σ, and with the defining relations AA = ε, xx = ε, (1) where A N, A N, x Σ, x Σ and ε is the empty word (the identity of the monoid T). We set by definition (x) = x, (A) = A, if ω = z 1 z 2 z k T, then ω will mean z k z k 1 z 1 and ε = ε. Let Figure 1. Production A α, A N, α Σ and the corresponding part of the transition diagram H(Γ). If the production is of the form A α 0 B 1 α 1 B 2 α 2 α k 1 B k α k, where α i Σ ( i = 0,1, k ), A, B j N ( j = 1,2,, k ), then the corresponding part of the transition diagram is shown in Figure 2 R = Σ T = {(ω, z) ω Σ, z T}. In R we introduce the operation '' '' as follows: If (ω 1, z 1 ), (ω 2, z 2 ) R, then (ω 1, z 1 ) (ω 2, z 2 ) = (ω 1 ω 2, z 2 z 1 ). (2) It is easy to see that R with this operation is a monoid with identity (ε, ε). 3. Finite Digraphs and Context-free Grammars Definition 1. For the context-free grammar Γ, we construct a transition diagram H(Γ) with the set of vertices V = {u A, v A A N} and the set of arcs E V V which is made as follows: 1. For each production A α, where A N, α Σ there is an arc from u A to v A with a bel(α, ε) (see Fig. 1); 2. For each production A αbz, where A, B N, α Σ, z (N Σ) there is an arc from u A to u B with a label(α, z); 3. For each production A z 1 B 1 αb 2 z 2 where A, B 1, B 2 N, α Σ, z 1, z 2 (N Σ), there is an arc from v B1 to u B2 with a label(α, B 2 α); 4. For each production A zbα, where A, B N, α Σ, z (N Σ) there is an arc from v B to v A with a label(α, α); 5. There are no other arcs in H(Γ) except described in 1 4. Figure 2. Production A α 0 B 1 α 1 B 2 α 2 α k 1 B k α k and the corresponding part of the transition diagram H(Γ). Definition 2. A walk π H(Γ) will be called a proper walk if: 1. For each A N the arcs from u A to v A (if there exist) are proper walks; 2. Let π 1, π 2,, π k be proper walks, A N and let: there is an arc ρ from the vertex u A to the start vertex of π 1 ; for every i = 1,2,, k 1 there are arcs σ i from the final vertex of π i to the start vertex of π i+1 ; there is an arc τ from the final vertex of π k to the vertex v A. If l(ρπ 1 σ 1 π 2 σ 2 σ k 1 π k τ) = (α, ε) for some α Σ, then the walk π = ρπ 1 σ 1 π 2 σ 2 σ k 1 π k τ is the proper walk (see Figure 3).

K. Yordzhev. A Representation of Context-free Grammars with the Help of Finite Digraphs 3 3. There are no other proper walks in H(Γ) except described in 1 and 2. with a label (α i, B i+1 α i ). c) According to 1, item 1 there is an arc τ from v Bk to v A with a label (α k, α k ). We consider the walk π = ρπ 1 σ 1 π 2 σ k 1 π k τ H(Γ). For this walk, we have: Figure 3. A proper walk in H(Γ). The following theorem is true. Theorem 1 Let Γ = (N, Σ, Π) be a context-free grammar and let A N. Then α L(Γ, A) if and only if there is a proper walk in H(Γ) with the start vertex u A, the final vertex v A and with a label(α, ε). Proof. Necessity. Let α L(Γ, A). Then there is a derivation A α.the proof we will make by induction on the length of the derivation. Let a production A α exists. Then according to Definition 1, item 1 there exists an arc u A v A with a label (α, ε). Suppose that for all A N and for all α L(Γ, A) for which there is a derivation A α no longer than t, there is a proper walk with the start vertex u A, the final vertex v A and labeled (α, ε). Let α L(Γ, A), and there is a derivation A α of length t + 1. Let A α 0 B 1 α 1 B k α k be the first production of this derivation, where α i Σ, B j N, i = 0,1, k, j = 1,2, k. Then there is a derivation A α 0 B 1 α 1 B k α k α 0 β 1 α 1 β k α k = α, where β i L(Γ, B i ) and for all β i there exist derivations B i β i with the length less than or equal to t, i = 1,2, k. By the inductive hypothesis, for all i = 1,2,, k, there are proper walks π i with the start vertices u Bi, the final vertices v Bi and labeled (β i, ε). For the first production A α 0 B 1 α 1 B k α k of the derivation A α we have: a) According to Definition 1, item 2 there exists an arc ρ from u A to u B1 with a label (α 0, α 1 B 2 B k α k ). b) According to Definition 1, item 3 for all i = 1,2, k 1 there are arcs σ i from v Bi to u Bi+1 l(π) = = (α 0, α 1 B 2 α 2 B k α k ) (β 1, ε) (α 1, B 2 α 1 ) (β 2, ε) (α k 1, B k α k 1 )(α k, α k ) = (α 0 β 1 α 1 β k α k, α k B k α k 1 B 2 α 1 α 1 B 2 α k 1 B k α k ) = (α, ε) Since π 1, π 2 π k are proper walks then by definition 2 π is a proper walk in H(Γ) with the start vertex u A, the final vertex v A and labelled l(π) = (α, ε). The necessity is proved. Sufficiency. Let π be a proper walk in H(Γ) with the start vertex u A, the final vertex v A and a label l(π) = (α, ε). If the length of π is equal to 1, then π is an arc and this is possible if and only if there is a production A α, ie α L(Γ, A). Suppose that for all A N and for all proper walks with the start vertex u A, the final vertex v A, length less than or equal to t and with a label (α, ε) is satisfied α L(Γ, A). Let π be a proper walk with the start vertex u A, the final vertex v A, labelled (α, ε) and length t + 1 > 1. Since π is a proper walk, then there is a vertex B 1 N such that the first arc ρ in the walk π starts at u A and finishes at u B1. Hence there is a production A α 0 B 1 α 1 B 2 B k α k in Γ such that l(u A u B1 ) = (α 0, α 1 B 2 B k α k ) for some B i N, α j Σ, i = 2,3,, k, j = 0,1,, k. According to Definition 2, π is given by π = ρπ 1 σ 1 π 2 σ s 1 π s τ, where π i are proper walks with the start vertices u Ci and the the final vertices v Ci for some C i N (obviously C 1 = B 1 ), i = 1,2,, s, arcs σ i = v Ci u Ci+1 from v Ci to u Ci+1, i = 1,2,, s 1, and an arc τ = v Cs v A from v Cs to v A. Let l(σ i ) = (γ i, C i+1 γ i ), γ i Σ, i = 1,2,, s 1, l(τ) = (γ s, γ s ), γ Σ. Then we obtain l(π) = = l u A u C1 l(π 1 ) l(σ 1 ) l(π 2 ) l(σ s 1 ) l(π s ) l(τ) = (ω, γ s C s γ s 1 C 2 γ 1 α 1 B 2 B k α k ), where ω = α and γ s C s γ s 1 C 2 γ 1 α 1 B 2 B k α k = ε. Having in mind (1), it is easy to see that this is possible if and only if s = k, C i = B i for i = 2,3,, k and

4 K. Yordzhev. A Representation of Context-free Grammars with the Help of Finite Digraphs γ j = α j for j = 1,2,, k. Therefore π = ρπ 1 σ 1 σ k 1 π k τ, where π i are proper walks with the start vertices u Bi and the the final vertices v Bi (i = 1,2,, k); σ i are arcs from v Bi to u Bi+1 (i = 1,2,, k 1); ρ is an arc from u A to u B1 ; τ is an arc from v Bk to v A and there are β i Σ (i = 1,2,, k) such that l(π i ) = (β i, ε). By the inductive hypothesis, we have β i L(Γ, B i ) ie there exist derivations B i β i for all i = 1,2,, k. Hence, we have the following derivation A α 0 B 1 α 1 B 2 B k α k α 0 β 1 α 1 β 2 β k α k = α, which implies that α L(Γ, A). The theorem is proved. 4. Context-free Grammars in Chomsky Normal Form Discussed above model is particularly useful when applied to some context-free grammars having a special form, for example, when the grammar is in Chomsky normal form. Definition 3. Let Γ be а context-free grammar without the empty word ε in which all productions are in one of two simple forms, either: 1. A BC, where A, B, and C, are each nonterminals, or 2. A x, where A is а nonterminal and x is а terminal. Such а grammar is said to be in Chomsky Normal Form. As is well known [2], every context-free language L can be generated by a grammar in Chomsky normal form. An algorithm that recognizes whether a word α Σ belongs to the language L, that is generated by a grammar in Chomsky normal form. This algorithm works in time O(n 2 ), where n = α. Described in section 3 graph model will help us to understand in detail this algorithm for more accurate description. If the production is of the form A BC, i.e. it is one of the productions of grammar in Chomsky normal form, then the appropriate fragment of the transition diagram is shown in Figure 4. Figure 4. Production A BB and the corresponding part of the transition diagram H(Γ). Obviously, if we consider a context-free grammar in Chomsky normal form, then the corresponding transition diagram is in quite simple and convenient form. 5. Conclusions and future work Described in this article a graph model of an arbitrary context-free grammar is new and original. This is different from the familiar until now graph models. Some polynomial algorithms for testing the inclusion of any regular or linear language in a group language are described in [7]. These algorithms are based on the search for a cycle in the corresponding transition diagram. This idea can be used for any context-free language L by the transition diagram described in Section 3. The difficulty here is to find, in general, the proper walk, which corresponds to any given word α L. We hope this open problem to be solved in the near future. The research is partly supported by the project SRP-B3/13, funded by the Ministry of Education, Youth and Science (Bulgaria) and South-West University Neofit Rilsky, Blagoevgrad. References [1] A.V. Aho, J.D. Ullman, The theory of parsing, translation and computing. Vol. 1, 2, Prentice-Hall, 1972. [2] J.E. Hopcroft, R. Motwani, J.D. Ullman, Introduction to automata theory, languages, and computation. Addison-Wesley, 2001. [3] G. Lallemant, Semigroups and combinatorial applications. John Wiley & Sons, 1979. [4] I. Mirchev, Graphs. Optimization algorithms in networks. SWU N. Rilsky, Blagoevgrad, 2001. [5] M. Swami, K. Thulasirman, Graphs, networks and algorithms. John Wiley & Sons, 1981. [6] K. Yordzhev, An n 2 Algorithm for Recognition of Context-free Languages. Cybern. Syst. Anal. 29,

K. Yordzhev. A Representation of Context-free Grammars with the Help of Finite Digraphs 5 No.6, (1993), 922-927. [7] K. Yordzhev, Inclusion of Regular and Linear Languages in Group Languages. International J. of Math. Sci. & Engg. Appls. (IJMSEA) ISSN 0973-9424, Vol. 7 No. I ( 2013), 323-336