Parsing Regular Expressions and Regular Grammars

Similar documents
Lecture 1: Finite State Automaton

Introduction to Formal Languages, Automata and Computability p.1/51

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

Chapter 6. Properties of Regular Languages

Einführung in die Computerlinguistik

Lecture 2: Regular Expression

Algorithms for NLP

Regular Expressions. Definitions Equivalence to Finite Automata

Mildly Context-Sensitive Grammar Formalisms: Embedded Push-Down Automata

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Foundations of Informatics: a Bridging Course

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Nondeterministic Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

UNIT II REGULAR LANGUAGES

Automata and Languages

AC68 FINITE AUTOMATA & FORMULA LANGUAGES DEC 2013

Einführung in die Computerlinguistik

Inf2A: Converting from NFAs to DFAs and Closure Properties

Chapter 5. Finite Automata

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

Deterministic Finite Automaton (DFA)

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

Recitation 2 - Non Deterministic Finite Automata (NFA) and Regular OctoberExpressions

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Theory of Computation (I) Yijia Chen Fudan University

Equivalence of DFAs and NFAs

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

C2.1 Regular Grammars

Chapter Five: Nondeterministic Finite Automata

C2.1 Regular Grammars

Hellis Tamm Institute of Cybernetics, Tallinn. Theory Seminar, April 21, Joint work with Janusz Brzozowski, accepted to DLT 2011

CS 121, Section 2. Week of September 16, 2013

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Formal Models in NLP

Lecture 3: Nondeterministic Finite Automata

Finite Automata. Seungjin Choi

DFA to Regular Expressions

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

Introduction to Finite-State Automata

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Peter Wood. Department of Computer Science and Information Systems Birkbeck, University of London Automata and Formal Languages

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Finite Automata and Formal Languages

Computational Models Lecture 2 1

Einführung in die Computerlinguistik Kontextfreie Grammatiken - Formale Eigenschaften

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

2. Elements of the Theory of Computation, Lewis and Papadimitrou,

COMP4141 Theory of Computation

CSE 105 THEORY OF COMPUTATION

Extended transition function of a DFA

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Theory of Languages and Automata

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Computational Models Lecture 2 1

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

CSE 105 Homework 3 Due: Monday October 23, Instructions. should be on each page of the submission.

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Finite Automata and Regular Languages

Mathematics for linguists

What we have done so far

Chapter 6: NFA Applications

Theory of Computation (II) Yijia Chen Fudan University

September 7, Formal Definition of a Nondeterministic Finite Automaton

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Computational Models - Lecture 5 1

Finite Automata. Dr. Neil T. Dantam. Fall CSCI-561, Colorado School of Mines. Dantam (Mines CSCI-561) Finite Automata Fall / 35

Finite Automata. Dr. Neil T. Dantam. Fall CSCI-561, Colorado School of Mines. Dantam (Mines CSCI-561) Finite Automata Fall / 43

CS 455/555: Finite automata

Intro to Theory of Computation

Theory of Computation

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

Decision, Computation and Language

Subset construction. We have defined for a DFA L(A) = {x Σ ˆδ(q 0, x) F } and for A NFA. For any NFA A we can build a DFA A D such that L(A) = L(A D )

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

Chap. 1.2 NonDeterministic Finite Automata (NFA)

Introduction to Theoretical Computer Science. Motivation. Automata = abstract computing devices

V Honors Theory of Computation

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

CS 275 Automata and Formal Language Theory

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

CS481F01 Solutions 8

Equivalence of Regular Expressions and FSMs

CS 154 Formal Languages and Computability Assignment #2 Solutions

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

Regular Language Equivalence and DFA Minimization. Equivalence of Two Regular Languages DFA Minimization

C6.2 Push-Down Automata

Closure under the Regular Operations

Parsing Beyond Context-Free Grammars: Tree Adjoining Grammars

Finite Universes. L is a fixed-length language if it has length n for some

Sri vidya college of engineering and technology

Lecture 17: Language Recognition

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

Automata Theory CS F-08 Context-Free Grammars

Regular Expressions and Language Properties

Transcription:

Regular Expressions and Regular Grammars Laura Heinrich-Heine-Universität Düsseldorf Sommersemester 2011 Regular Expressions (1) Let Σ be an alphabet The set of regular expressions over Σ is recursively defined: is a regular expression denoting the set is a regular expression denoting the set {} For each a Σ: a is a regular expression denoting {a} If r and s are regular expressions denoting R and S, then (r + s), (rs) and (r ) are also regular expressions denoting R S, RS and R respectively Regular Grammars 1 12-13 April 2011 Regular Grammars 3 12-13 April 2011 Overview 1 Regular expressions 2 NFA with empty transitions 3 FSA and regular expressions 4 Regular Grammars 5 Regular Grammars and NFAs Regular Expressions (2) Assume that has a higher priority than concatenation which in turn has a higher priority than + a + is an abbreviation for aa For a regular expression x, we define L(x) as the set denoted by x Regular Grammars 2 12-13 April 2011 Regular Grammars 4 12-13 April 2011

Regular Expressions (3) The following holds: For each language L: there is a regular expression x with L = L(x) iff there is a FSA M with L = L(M) In order to show this, we define NFAs with empty transitions and show their equivalence to NFAs; show how to construct an equivalent NFA with empty transitions for a given regular expression; NFA with empty transitions (2) Let Q, Σ, δ, q 0, F be an NFA with empty transitions Definition of ˆδ: -closure(q) := {q there is a path from q to q containing only -transitions} {q} ˆδ(q, ) := -closure(q) ˆδ(q, wa) := p P -closure(p) where P = {p there is an r ˆδ(q, w) such that p δ(r, a)} show how to construct an equivalent regular expression for a given DFA Regular Grammars 5 12-13 April 2011 Regular Grammars 7 12-13 April 2011 NFA with empty transitions (1) A NFA with empty transitions is defined like an NFA except that δ allows for ε as second argument: A NFA with empty transitions M is a quintuple Q, Σ, δ, q 0, F such that Q, Σ, q 0 and F are like in a NFA δ : Q (Σ {ε}) P(Q) is the transition function NFA with empty transitions (3) Construction of an equivalent NFA Q, Σ, δ, q 0, F (without empty transitions) for a given NFA Q, Σ, δ, q 0, F with empty transitions: F := F {q 0 } if -closure(q 0 ) F Otherwise, F := F δ (q, a) := ˆδ(q, a) DFAs, NFAs and NFAs with empty transitions are equivalent Regular Grammars 6 12-13 April 2011 Regular Grammars 8 12-13 April 2011

FSA and regular expressions (1) To show: 1 For a given regular expression r there exists an NFA with empty transitions that accepts L(r) The set of languages denoted by regular expressions is a subset of the set of languages accepted by FSAs 2 For a given DFA, there exists a regular expression r such that L(r) is exactly the language acctepted by the DFA The set of languages accepted by FSAs is a subset of the set of languages denoted by regular expressions FSA and regular expressions (3) r = r 1 + r 2 : 0 r = r 1 r 2 : 1 2 1 f1 r = r 1 : q 0 qf1 qf2 2 1 q f1 qf2 qf0 Regular Grammars 9 12-13 April 2011 Regular Grammars 11 12-13 April 2011 FSA and regular expressions (2) Construction of an equivalent NFA with empty transitions for a given regular expression r: r = : q0 r = : 0 qf r = a, a Σ: a 0 qf FSA and regular expressions (4) To show: For each DFA, there is an equivalent regular expression Assume that {q 1,, q n } it the set of states Define Ri,j k as the set of sequences of input symbols that can be obtained from following all paths from q i to q j that do not pass through states q l with l > k Question: What do the Ri,j k look like and what are the corresponding regular expressions ri,j k? k = 0: Only paths of length 0 or 1 can be considered, since, between q i and q j, no other states q l are possible Each r 0 i,j has the form a 1 + a 2 + + a m or (if i = j) a 1 + a 2 + + a m + or (if i j and there is no edge from q i to q j, ) Regular Grammars 10 12-13 April 2011 Regular Grammars 12 12-13 April 2011

FSA and regular expressions (5) In general, R k ij contains all elements from R k 1 ij, all concatenation of a) an element from R k 1 ik, b) any number (including 0) of elements from R k 1 kk and c) an element from R k 1 Consequently, r k i,j = rk 1 ij + r k 1 ik (rk 1 kk ) r k 1 Finally, the regular expression for the language accepted by the automaton is r n 1f1 + rn 1f2 + + rn 1fm if F = {q f1,, q fm } FSA and regular expressions (7) The JFLAP conversion uses a different definition: Rij k is the set obtained by using no states q l with l < k Then ri,j k = rk+1 ij + r k+1 r 1 i,j ik (rk+1 kk ) r k+1 is the regular expression for the language accepted by the DFA With this, we obtain (ab c) ab for the sample automaton Regular Grammars 13 12-13 April 2011 Regular Grammars 15 12-13 April 2011 FSA and regular expressions (6) Regular Grammars (1) Example: abcsjff A type 2 grammar is called r 2 12 = r 1 12 + r 1 12(r 1 22) r 1 22 right-linear iff for all productions A β: β T or β = β X with β T, X N r12 1 = r0 12 + r0 11 (r0 11 ) r12 0 = a + ( )a = a left-linear iff for all productions A β: β T or β = Xβ with β T, X N r 1 22 = r0 22 + r0 21 (r0 11 ) r 0 12 regular if it is left-linear or right-linear = (b + ) + c() a = + b + ca For each left-linear grammar there is an equivalent right-linear grammar and vice versa r 2 12 = a + a( + b + ca)+ = a( + b + ca) = a(b + ca) Regular Grammars 14 12-13 April 2011 Regular Grammars 16 12-13 April 2011

Regular Grammars (2) The following holds: L = L(G) for a regular grammar G iff L is a regular set (ie, can be described by a regular expression) To show this, we show that for each right-linear grammar, there exists an equivalent NFA; and for each DFA, there is an equivalent right-linear grammar Regular Grammars (4) Construction of a right-linear grammar for given DFA: The states are the nonterminals The start state is the start symbol For each transition a Q Q add a production Q aq and, if Q F, a production Q a If we construct the right-linear grammar for L R (L reversed) and we mirror all right-hand sides of productions, we obtain a left-linear grammar for L Regular Grammars 17 12-13 April 2011 Regular Grammars 19 12-13 April 2011 Regular Grammars (3) Construction of equivalent NFA (with empty transitions) for given right-linear G: Start state [S] For all productions A β 1 β 2 there is a state [β 2 ] Transitions simulate top-down left-to-right traversal of derivation tree: For every A β: [A] [β] For every state [aβ] with a T: a [aβ] [β] Final state is [] Regular Grammars 18 12-13 April 2011