Parikh s Theorem and Descriptional Complexity

Similar documents
Two-Way Automata and Descriptional Complexity

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

CS481F01 Prelim 2 Solutions

ACS2: Decidability Decidability

Theory of Computation (IX) Yijia Chen Fudan University

Two-Way Automata Characterizations of L/poly versus NL

CSE 105 THEORY OF COMPUTATION

What languages are Turing-decidable? What languages are not Turing-decidable? Is there a language that isn t even Turingrecognizable?

Nondeterministic Finite Automata

Nondeterminism. September 7, Nondeterminism

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

September 7, Formal Definition of a Nondeterministic Finite Automaton

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

CPS 220 Theory of Computation

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Theory of Computation (II) Yijia Chen Fudan University

Decision, Computation and Language

Theory of Computation

DM17. Beregnelighed. Jacob Aae Mikkelsen

NPDA, CFG equivalence

Foundations of Informatics: a Bridging Course

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Pushdown Automata (2015/11/23)

Decidability (intro.)

Universal Disjunctive Concatenation and Star

Pushdown Automata. Reading: Chapter 6

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

CP405 Theory of Computation

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Classes and conversions

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

The Separating Words Problem

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

C2.1 Regular Grammars

C2.1 Regular Grammars

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

Theory of Computation (IV) Yijia Chen Fudan University

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

V Honors Theory of Computation

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

acs-07: Decidability Decidability Andreas Karwath und Malte Helmert Informatik Theorie II (A) WS2009/10

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Complementing two-way finite automata

Pushdown Automata (Pre Lecture)

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

Decidability (What, stuff is unsolvable?)

The Separating Words Problem

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Theory of Computation

Push-down Automata = FA + Stack

(pp ) PDAs and CFGs (Sec. 2.2)

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

CS 455/555: Finite automata

Büchi Automata and their closure properties. - Ajith S and Ankit Kumar

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Final exam study sheet for CS3719 Turing machines and decidability.

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Before We Start. The Pumping Lemma. Languages. Context Free Languages. Plan for today. Now our picture looks like. Any questions?

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

Context-Free Languages (Pre Lecture)

Theory of Computation (I) Yijia Chen Fudan University

Results on Transforming NFA into DFCA

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Midterm Exam 2 CS 341: Foundations of Computer Science II Fall 2018, face-to-face day section Prof. Marvin K. Nakayama

CSE 105 THEORY OF COMPUTATION

CISC4090: Theory of Computation

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Lecture 1: Finite State Automaton

MA/CSSE 474 Theory of Computation

Introduction to Theory of Computing

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Intro to Theory of Computation

Chapter 5. Finite Automata

Pushdown Automata: Introduction (2)

Lecture 3: Nondeterministic Finite Automata

Chapter 6: NFA Applications

Parikh s theorem. Håkan Lindqvist

Nondeterministic Finite Automata

OPTIMAL SIMULATIONS BETWEEN UNARY AUTOMATA

CPS 220 Theory of Computation Pushdown Automata (PDA)

CSCE 551: Chin-Tser Huang. University of South Carolina

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Theory of Computation p.1/?? Theory of Computation p.2/?? We develop examples of languages that are decidable

Theory of Languages and Automata

The Pumping Lemma. for all n 0, u 1 v n u 2 L (i.e. u 1 u 2 L, u 1 vu 2 L [but we knew that anyway], u 1 vvu 2 L, u 1 vvvu 2 L, etc.

Reversal of Regular Languages and State Complexity

NODIA AND COMPANY. GATE SOLVED PAPER Computer Science Engineering Theory of Computation. Copyright By NODIA & COMPANY

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Lecture 17: Language Recognition

Chapter 2: Finite Automata

Einführung in die Computerlinguistik

Grammars and Context Free Languages

I have read and understand all of the instructions below, and I will obey the University Code on Academic Integrity.

AC68 FINITE AUTOMATA & FORMULA LANGUAGES JUNE 2014

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Midterm Exam 2 CS 341: Foundations of Computer Science II Fall 2016, face-to-face day section Prof. Marvin K. Nakayama

TDDD65 Introduction to the Theory of Computation

2017/08/29 Chapter 1.2 in Sipser Ø Announcement:

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Transcription:

Parikh s Theorem and Descriptional Complexity Giovanna J. Lavado and Giovanni Pighizzini Dipartimento di Informatica e Comunicazione Università degli Studi di Milano SOFSEM 2012 Špindlerův Mlýn, Czech Republic January 21 27, 2012

Parikh s Image Σ = {a 1,..., a m } alphabet of m symbols Parikh s map ψ : Σ N m : for each string w Σ ψ(w) = ( w a1, w a2,..., w am ) w and w are Parikh equivalent iff ψ(w ) = ψ(w ) (in symbols w = π w ) Parikh s image of a language L Σ : ψ(l) = {ψ(w) w L} L and L are Parikh equivalent iff ψ(l ) = ψ(l ) (in symbols L = π L )

Parikh s Theorem Theorem ([Parikh 66]) The Parikh image of a context-free language is a semilinear set, i.e, each context-free language is Parikh equivalent to a regular language Example: L = {a n b n n 0} R = (ab) ψ(l) = ψ(r) = {(n, n) n 0} Different proofs after the original one of Parikh, e.g. [Goldstine 77]: a simplified proof [Aceto&Ésik&Ingólfsdóttir 02]: an equational proof...

Purpose of the Work Recent works investigating complexity aspects of Parikh s Theorem: [Kopczyński&To 10]: size of the semilinear descriptions of Parikh images of languages defined by NFAs and by CFGs [Esparza&Ganty&Kiefer&Luttenberger 11]: new proof of Parikh s Theorem solution to the problem below in the case of nondeterministic automata Problem Given a CFG G compare the size of G with the sizes of finite automata accepting languages that are Parikh equivalent to L(G) Our aim is to study the same problem for deterministic automata

Why this Problem? We came to this problem from the investigation of automata over a one letter alphabet Costs in states of optimal simulations between different variant unary automata (one-way/two-way, deterministic/nondeterministic) [Chrobak 86, Mereghetti&Pighizzini 01] Context-free languages over a unary terminal alphabet are regular [Ginsburg&Rice 62] The regularity of unary CFLs is also a corollary of Parikh s Theorem Hence, unary PDAs and unary CFGs can be transformed into finite automata

Size: Descriptional Complexity Measures Finite Automata number of states Context-Free Grammars number of variables after converting into Chomsky Normal Form [Gruska 73]

Unary Context-Free Languages Theorem ([Pighizzini&Shallit&Wang 02]) For each unary CFG in Chomsky normal form with h variables there are an equivalent NFA with at most 2 2h 1 + 1 states an equivalent DFA with less than 2 h2 Both bounds are tight states Can we extend this result to larger alphabets? The class of CLFs is larger than the class of regular: we cannot have a result of exactly the same form! However, we can ask about the number of states of DFAs or NFAs Parikh equivalent to the given grammar

Upper and Lower Bounds Problem Given a CFG G compare the size of G with the sizes of finite automata accepting languages that are Parikh equivalent to L(G) Nondeterministic automata (number of states wrt s, size of G) Upper bound: 2 2O(s2 ) (implicit construction from classical proof of Parikh s Th.) O(4 s ) [Esparza&Ganty&Kiefer&Luttenberger 11] Lower bound: Ω(2 s )

Upper and Lower Bounds Problem Given a CFG G compare the size of G with the sizes of finite automata accepting languages that are Parikh equivalent to L(G) Deterministic automata (number of states wrt s, size of G) Upper bound: 2 O(4s ) Lower bound: 2 s2 (subset construction) (from the unary case) Is it possible to reduce the gap between the upper and the lower bound? We reduced the upper bound to 2 so(1) in the following cases: bounded context-free languages i.e, context-free subsets of a 1 a 2... a m (m 2)

First Contribution: Bounded Context-Free Languages Theorem Σ = {a 1, a 2,..., a m } fixed alphabet G grammar in Chomsky normal form with h variables s.t. L(G) a 1 a 2... a m There exists a DFA A with at most 2 ho(1) states s.t. L(G) = π L(A)

First Contribution: Proof Outline Σ = {a 1, a 2,..., a m } Restriction to strongly bounded grammars G = (V, Σ, P, S) is strongly bounded iff for all A V, there are i j s.t. L A = {x Σ A x} a + i a i+1 a j 1 a+ j A V is said to be unary iff L A a + i for some i in this case L A is accepted by a DFA with < 2 h2 states [Pighizzini&Shallit&Wang 02] The use of nonunary variables is very restricted: If S α then α contains m 1 nonunary variables Hence a finite control of size O(h m 1 ) can keep track of them

Example Σ = {a, b, c} S a 5 b 6 c 3 S A Y a Z W A Z B W a Z B b B W A Z b b W C a A B c B C A A b B B C C Unary variables: A, A, B, B, C, C L S, L Y a + b c + L Z, L Z a + b + L W, L W b + c + a a b b c c

Example Σ = {a, b, c} S a 5 b 6 c 3 S A Y a Z W A Z B W a Z B b B W A Z b b W C a A B c B C A A b B B C C a a b b c c Our automaton recognizes a 2 baba 2 b 2 c 3 b 2 by simulating a particular derivation from S S a 2 Z W a 2 ZbW a 2 az bw a 3 AbW a 3 a 2 b 2 W a 5 b 2 b 2 W a 5 b 4 Bc 3 a 5 b 4 b 2 c 3 = a 5 b 6 c 3 = π a 2 baba 2 b 2 c 3 b 2

First Contribution: Proof Outline This derivation process is simulated by an automaton which tests the matching between generated terminals and input symbols At each step the automaton needs to remember at most #Σ 1 variables The process is nondeterministic It can be implemented using O(h #Σ 1 ) states Hence, a deterministic control can be implemented with 2 poly(h) states The unary parts can be simulated within the same state bound

Second Contribution: Binary Context-Free Languages Theorem Let G grammar in Chomsky normal form with h variables with a binary terminal alphabet. Then there is a DFA A with at most 2 ho(1) states s.t. L(A)= π L(G) The proof relies the following results: Lemma ([Kopczyński&To 10]) For G as in the theorem, it holds that ψ(l(g)) = i I Z i where: I is a set of indices with #I = O(h 2 ) Z i = α 0 W i {α 0 + α 1,i n + α 2,i m n, m 0} W i N 2 is finite integers in W i, α 1,i, α 2,i do not exceed 2 hc, where c > 0 From sets Z i it is possible to derive small DFAs and, by standard constructions, the DFA A s.t. L(A)= π L(G)

Optimality For each CFG in Chomsky normal form with h variables we provided a Parikh equivalent DFA with 2 ho(1) states in the following cases: bounded languages binary languages This upper bound cannot be reduced (consequence of the unary case)

Open Questions Is it possible to extend these results to all context-free languages? Bounded case crucial argument: it is enough to remember #Σ 1 variables Binary case the main lemma does not hold for alphabets with 3 letters Other questions: What about word bounded CFLs? i.e., subsets of w 1 w 2... w m, where each w i is a string In our construction the cost is double exponential in the size of the alphabet: state whether or not this is optimal