Regular Expressions Kleene s Theorem Equation-based alternate construction. Regular Expressions. Deepak D Souza

Similar documents
Regular Expressions Kleene s Theorem Equation-based alternate construction. Regular Expressions. Deepak D Souza

Algebraic Approach to Automata Theory

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 154, Lecture 3: DFA NFA, Regular Expressions

Recitation 2 - Non Deterministic Finite Automata (NFA) and Regular OctoberExpressions

CS 121, Section 2. Week of September 16, 2013

Constructions on Finite Automata

Finite Automata and Regular Languages

Lecture 4: More on Regexps, Non-Regular Languages

Theory of Computation (I) Yijia Chen Fudan University

UNIT-III REGULAR LANGUAGES

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

Finite Automata and Regular languages

Formal Definition of a Finite Automaton. August 26, 2013

Inf2A: Converting from NFAs to DFAs and Closure Properties

Closure under the Regular Operations

Introduction to Turing Machines

Regular Expressions [1] Regular Expressions. Regular expressions can be seen as a system of notations for denoting ɛ-nfa

Introduction to Theory of Computing

COSE312: Compilers. Lecture 2 Lexical Analysis (1)

(pp ) PDAs and CFGs (Sec. 2.2)

Constructions on Finite Automata

Lecture 1: Finite State Automaton

Lecture 3: Nondeterministic Finite Automata

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

TDDD65 Introduction to the Theory of Computation

Chapter 6: NFA Applications

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

Nondeterministic Finite Automata. Nondeterminism Subset Construction

Computational Theory

Nondeterministic Finite Automata

Incorrect reasoning about RL. Equivalence of NFA, DFA. Epsilon Closure. Proving equivalence. One direction is easy:

Context Free Grammars

Finite State Automata

Regular expressions and Kleene s theorem

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Non-deterministic Finite Automata (NFAs)

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

Recap from Last Time

Sri vidya college of engineering and technology

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

Author: Vivek Kulkarni ( )

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Introduction to Formal Languages, Automata and Computability p.1/42

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

CMSC 330: Organization of Programming Languages

Deterministic Finite Automata (DFAs)

Theory of Computation

Regular expressions and Kleene s theorem

(pp ) PDAs and CFGs (Sec. 2.2)

Computational Models - Lecture 5 1

Deterministic Finite Automata (DFAs)

Notes on State Minimization

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Closure Properties of Regular Languages

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Chomsky Normal Form for Context-Free Gramars

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

Author: Vivek Kulkarni ( )

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 3. Y.N. Srikant

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

Introduction to Formal Languages, Automata and Computability p.1/51

Introduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

CMPSCI 250: Introduction to Computation. Lecture #23: Finishing Kleene s Theorem David Mix Barrington 25 April 2013

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Deterministic Finite Automata (DFAs)

CS 133 : Automata Theory and Computability

FABER Formal Languages, Automata. Lecture 2. Mälardalen University

CS311 Computational Structures Regular Languages and Regular Expressions. Lecture 4. Andrew P. Black Andrew Tolmach

VTU QUESTION BANK. Unit 1. Introduction to Finite Automata. 1. Obtain DFAs to accept strings of a s and b s having exactly one a.

Uses of finite automata

Outline. Theorem 1. For every regular expression E, there exists an ε-nfa A such that L(E)=L(A).

Deterministic PDA s. Deepak D Souza. 21 October Department of Computer Science and Automation Indian Institute of Science, Bangalore.

Tasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

Chapter 5. Finite Automata

Mathematics. Exercise 9.3. (Chapter 9) (Sequences and Series) Question 1: Find the 20 th and n th terms of the G.P. Answer 1:

Properties of Context-Free Languages

Regular expressions and Kleene s theorem

1 Alphabets and Languages

Extended transition function of a DFA

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

Concatenation. The concatenation of two languages L 1 and L 2

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Theory of Computation (II) Yijia Chen Fudan University

(b) If G=({S}, {a}, {S SS}, S) find the language generated by G. [8+8] 2. Convert the following grammar to Greibach Normal Form G = ({A1, A2, A3},

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Languages, regular languages, finite automata

How do regular expressions work? CMSC 330: Organization of Programming Languages

BASIC MATHEMATICAL TECHNIQUES

Lecture 2: Regular Expression

Ogden s Lemma. and Formal Languages. Automata Theory CS 573. The proof is similar but more fussy. than the proof of the PL4CFL.

Transcription:

Regular Expressions Deepak D Souza Department of Computer Science and Automation Indian Institute of Science, Bangalore. 16 August 2012

Outline 1 Regular Expressions 2 Kleene s Theorem 3 Equation-based alternate construction

Examples of Regular Expressions Expressions built from a, b, ɛ, using operators +,, and. (a + b ) c Strings of only a s or only b s, followed by a c. (a + b) abb(a + b) contains abb as a subword. (a + b) b(a + b)(a + b) 3rd last letter is a b. (b ab a) b

Examples of Regular Expressions Expressions built from a, b, ɛ, using operators +,, and. (a + b ) c Strings of only a s or only b s, followed by a c. (a + b) abb(a + b) contains abb as a subword. (a + b) b(a + b)(a + b) 3rd last letter is a b. (b ab a) b Even number of a s.

Examples of Regular Expressions Expressions built from a, b, ɛ, using operators +,, and. (a + b ) c Strings of only a s or only b s, followed by a c. (a + b) abb(a + b) contains abb as a subword. (a + b) b(a + b)(a + b) 3rd last letter is a b. (b ab a) b Even number of a s. Ex. Give regexp for Every 4-bit block of the form w[4i, 4i + 1, 4i + 2, 4i + 3] has even parity.

Examples of Regular Expressions Expressions built from a, b, ɛ, using operators +,, and. (a + b ) c Strings of only a s or only b s, followed by a c. (a + b) abb(a + b) contains abb as a subword. (a + b) b(a + b)(a + b) 3rd last letter is a b. (b ab a) b Even number of a s. Ex. Give regexp for Every 4-bit block of the form w[4i, 4i + 1, 4i + 2, 4i + 3] has even parity. (0000 + 0011 + + 1111) (ɛ + 0 + 1 + + 111)

Formal definitions Syntax of regular expresions over an alphabet A: where a A. r ::= a r + r r r r Semantics: associate a language L(r) A with regexp r. L( ) = {} L(a) = {a} L(r + r ) = L(r) L(r ) L(r r ) = L(r) L(r ) L(r ) = L(r).

Formal definitions Syntax of regular expresions over an alphabet A: where a A. r ::= a r + r r r r Semantics: associate a language L(r) A with regexp r. L( ) = {} L(a) = {a} L(r + r ) = L(r) L(r ) L(r r ) = L(r) L(r ) L(r ) = L(r). Question: Do we need ɛ in syntax?

Formal definitions Syntax of regular expresions over an alphabet A: where a A. r ::= a r + r r r r Semantics: associate a language L(r) A with regexp r. L( ) = {} L(a) = {a} L(r + r ) = L(r) L(r ) L(r r ) = L(r) L(r ) L(r ) = L(r). Question: Do we need ɛ in syntax? No. ɛ.

Example: Semantics of regexp (a + b ) c + a b c

Example: Semantics of regexp (a + b ) c + {a} a b {b} c {c}

Example: Semantics of regexp (a + b ) c + {ɛ, a, aa,...} {ɛ, b, bb,...} {a} a b {b} c {c}

Example: Semantics of regexp (a + b ) c {ɛ, a, b, aa, bb,...} + {ɛ, a, aa,...} {ɛ, b, bb,...} {a} a b {b} c {c}

Example: Semantics of regexp (a + b ) c {c, ac, bc, aac, bbc,...} {ɛ, a, b, aa, bb,...} + {ɛ, a, aa,...} {ɛ, b, bb,...} {a} a b {b} c {c}

Kleene s Theorem: RE = DFA Class of languages defined by regular expressions coincides with regular languages. Proof RE DFA: Use closure properties of regular languages. DFA RE:

DFA RE: Kleene s construction Let A = (Q, s, δ, F ) be given DFA. Define L pq = {w A δ(p, w) = q}. Then L(A) = f F L sf. For X Q, define L X pq = {w A δ(p, w) = q via a path that stays in X except for first and last states} X p q Then L(A) = f F LQ sf.

DFA RE: Kleene s construction r X p q Advantage: L X {r} pq = L X pq + L X pr (L X rr ) L X rq.

DFA RE: Kleene s construction (2) Method: Begin with L Q sf for each f F. Simplify by using terms with strictly smaller X s: L X {r} pq For base terms, observe that L {} pq = = L X pq + L X pr (L X rr ) L X rq. { {a δ(p, a) = q} if p q {a δ(p, a) = q} {ɛ} if p = q. Exercise: convert NFA/DFA s below to RE s: a, b b a b b s f a

DFA RE: Kleene s construction (2) Method: Begin with L Q sf for each f F. Simplify by using terms with strictly smaller X s: L X {r} pq For base terms, observe that L {} pq = = L X pq + L X pr (L X rr ) L X rq. { {a δ(p, a) = q} if p q {a δ(p, a) = q} {ɛ} if p = q. Exercise: convert NFA/DFA s below to RE s: a, b b a b b s f e a o

DFA RE using system of equations Aim: to construct a regexp for L q = {w A δ(q, w) F }. Note that L(A) = L s. Example: b a b Set up equations to capture L q s: x e = b x e + a x o x o = a x e + b x o + ɛ. Solution is a RE for each x, such that languages denoted by LHS and RHS RE s coincide. a

DFA RE using system of equations Aim: to construct a regexp for L q = {w A δ(q, w) F }. Note that L(A) = L s. Example: b a b Set up equations to capture L q s: e a o x e = b x e + a x o x o = a x e + b x o + ɛ. Solution is a RE for each x, such that languages denoted by LHS and RHS RE s coincide.

Solutions to a system of equations L q s are a solution to the system of equations In general there could be many solutions to equations. Consider x = A x (Here A is the alphabet). What are the solutions to this equation? In the case of equations arising out of automata, L q s can be seen to the unique solution to the equations.

Computing the least solution to a system of equations Equations arising from our automaton can be viewed as: [ ] [ ] [ ] [ ] xe b a xe ɛ = + a b x o System of linear equations over regular expressions have the general form: X = AX + B where X is a column vector of n variables, A is an nxn matrix of regular expressions, and B is a column vector of n regular expressions. Claim: The column vector A B represents the least solution to the equations above. [See Kozen, Supplementary Lecture A]. Definition of A when A is a 2x2 matrix: [ ] [ a b (a + bd = c) (a + bd c) bd ] x o c d (d + ca b) ca (d + ca b)