Regular expressions and Kleene s theorem

Similar documents
Regular expressions and Kleene s theorem

Regular expressions and Kleene s theorem

Constructions on Finite Automata

Constructions on Finite Automata

Deterministic finite Automata

Lecture 3: Nondeterministic Finite Automata

Closure under the Regular Operations

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Deterministic Finite Automaton (DFA)

Regular Expressions. Definitions Equivalence to Finite Automata

Computational Models Lecture 2 1

CS 154, Lecture 3: DFA NFA, Regular Expressions

Inf2A: Converting from NFAs to DFAs and Closure Properties

Computational Models Lecture 2 1

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Nondeterministic Finite Automata. Nondeterminism Subset Construction

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

Uses of finite automata

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

UNIT II REGULAR LANGUAGES

Kleene Algebra and Arden s Theorem. Anshul Kumar Inzemamul Haque

CS 455/555: Mathematical preliminaries

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

COMP4141 Theory of Computation

Sri vidya college of engineering and technology

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

CS 455/555: Finite automata

Foundations of

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

How do regular expressions work? CMSC 330: Organization of Programming Languages

Formal Language and Automata Theory (CS21004)

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

UNIT-III REGULAR LANGUAGES

Regular Expressions [1] Regular Expressions. Regular expressions can be seen as a system of notations for denoting ɛ-nfa

CMPSCI 250: Introduction to Computation. Lecture #23: Finishing Kleene s Theorem David Mix Barrington 25 April 2013

Non-deterministic Finite Automata (NFAs)

Regular Expressions and Language Properties

Nondeterminism. September 7, Nondeterminism

3515ICT: Theory of Computation. Regular languages

CS 121, Section 2. Week of September 16, 2013

Regular Languages. Problem Characterize those Languages recognized by Finite Automata.

1 Alphabets and Languages

Theory of computation: initial remarks (Chapter 11)

Regular expressions. Regular expressions. Regular expressions. Regular expressions. Remark (FAs with initial set recognize the regular languages)

Automata: a short introduction

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Regular Expression Unit 1 chapter 3. Unit 1: Chapter 3

Lecture 2: Connecting the Three Models

Recap from Last Time

Formal Models in NLP

Formal Languages, Automata and Models of Computation

Nondeterministic Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

CMSC 330: Organization of Programming Languages

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

Equivalence of DFAs and NFAs

Harvard CS 121 and CSCI E-207 Lecture 4: NFAs vs. DFAs, Closure Properties

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

CS 275 Automata and Formal Language Theory

Equivalence of Regular Expressions and FSMs

Regular Expressions (Pre Lecture)

Unit 6. Non Regular Languages The Pumping Lemma. Reading: Sipser, chapter 1

CONCATENATION AND KLEENE STAR ON DETERMINISTIC FINITE AUTOMATA

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

Obtaining the syntactic monoid via duality

Finite Automata and Regular languages

This Lecture will Cover...

Languages. A language is a set of strings. String: A sequence of letters. Examples: cat, dog, house, Defined over an alphabet:

CS21 Decidability and Tractability

Finite Automata and Regular Languages

Theory of Computation (I) Yijia Chen Fudan University

Finite Automata and Formal Languages

CS243, Logic and Computation Nondeterministic finite automata

Non-Deterministic Finite Automata

Concatenation. The concatenation of two languages L 1 and L 2

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

CS311 Computational Structures Regular Languages and Regular Expressions. Lecture 4. Andrew P. Black Andrew Tolmach

Extended transition function of a DFA

Computational Theory

Formal Languages and Automata

Kleene Algebras and Algebraic Path Problems

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2017

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Theory of Computation

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

CMPSCI 250: Introduction to Computation. Lecture #29: Proving Regular Language Identities David Mix Barrington 6 April 2012

Nondeterministic Finite Automata and Regular Expressions

Lecture 7 Properties of regular languages

컴파일러입문 제 3 장 정규언어

Transcription:

and Informatics 2A: Lecture 5 Alex Simpson School of Informatics University of Edinburgh als@inf.ed.ac.uk 25 September, 2014 1 / 26

1 More closure properties of regular languages Operations on languages ɛ-nfas Closure under concatenation and Kleene star 2 From regular expressions to regular languages 3 From DFAs to regular expressions 2 / 26

Recap of Lecture 4 Regular languages are closed under union, intersection and complement. These closure properties are proved using explicit constructions on finite automata (sometimes using NFAs, sometimes DFAs). Every regular language has a unique minimum DFA that recognises it. An algorithm for minimizing a DFA. 3 / 26

Concatenation Operations on languages ɛ-nfas Closure under concatenation and Kleene star We write L 1.L 2 for the concatenation of languages L 1 and L 2, defined by: L 1.L 2 = {xy x L 1, y L 2 } For example, if L 1 = {aaa} and L 2 = {b, c} then L 1.L 2 is the language {aaab, aaac}. Later we will prove the following closure property. If L 1 and L 2 are regular languages then so is L 1.L 2. 4 / 26

Kleene star More closure properties of regular languages Operations on languages ɛ-nfas Closure under concatenation and Kleene star We write L for the Kleene star of the language L, defined by: L = {ɛ} L L.L L.L.L... For example, if L 3 = {aaa, b} then L 3 contains strings like aaaaaa, bbbbb, baaaaaabbaaa, etc. More precisely, L 3 contains all strings over {a, b} in which the letter a always appears in sequences of length some multiple of 3 Later we will prove the following closure property. If L is a regular language then so is L. 5 / 26

Self-assessment question Operations on languages ɛ-nfas Closure under concatenation and Kleene star Consider the language over the alphabet {a, b, c} L = {x x starts with a and ends with c} Which of the following strings are valid for the language L.L? 1 abcabc Ans: yes 2 acacac Ans: yes 3 abcbcac Ans: yes 4 abcbacbc Ans: no 6 / 26

Self-assessment question Operations on languages ɛ-nfas Closure under concatenation and Kleene star Consider the (same) language over the alphabet {a, b, c} L = {x x starts with a and ends with c} Which of the following strings are valid for the language L? 1 ɛ Ans: yes 2 acaca Ans: no 3 abcbc Ans: yes 4 acacacacac Ans: yes 7 / 26

NFAs with ɛ-transitions Operations on languages ɛ-nfas Closure under concatenation and Kleene star We can vary the definition of NFA by also allowing transitions labelled with the special symbol ɛ (not a symbol in Σ). The automaton may (but doesn t have to) perform a spontaneous ɛ-transition at any time, without reading an input symbol. This is quite convenient: for instance, we can turn any NFA into an ɛ-nfa with just one start state and one accepting state: ε ε ε............... ε ε ε (Add ɛ-transitions from new start state to each state in S, and from each state in F to new accepting state.) 8 / 26

Equivalence to ordinary NFAs Operations on languages ɛ-nfas Closure under concatenation and Kleene star Allowing ɛ-transitions is just a convenience: it doesn t fundamentally change the power of NFAs. If N = (Q,, S, F ) is an ɛ-nfa, we can convert N to an ordinary NFA with the same associated language, by simply expanding and S to allow for silent ɛ-transitions. To achieve this, perform the following steps on N. For every pair of transitions q a q (where a Σ) and q ɛ q, add a new transition q a q. For every transition q ɛ q, where q is a start state, make q a start state too. Repeat the two steps above until no further new transitions or new start states can be added. Finally, remove all ɛ-transitions from the ɛ-nfa resulting from the above process. This produces the desired NFA. 9 / 26

Closure under concatenation Operations on languages ɛ-nfas Closure under concatenation and Kleene star We use ɛ-nfas to show, as promised, that regular languages are closed under the concatenation operation: L 1.L 2 = {xy x L 1, y L 2 } If L 1, L 2 are any regular languages, choose ɛ-nfas N 1, N 2 that define them. As noted earlier, we can pick N 1 and N 2 to have just one start state and one accepting state. Now hook up N 1 and N 2 like this: N1 ε N2 Clearly, this NFA corresponds to the language L 1.L 2. 10 / 26

Closure under Kleene star Operations on languages ɛ-nfas Closure under concatenation and Kleene star Similarly, we can now show that regular languages are closed under the Kleene star operation: L = {ɛ} L L.L L.L.L... For suppose L is represented by an ɛ-nfa N with one start state and one accepting state. Consider the following ɛ-nfa: ε N Clearly, this ɛ-nfa corresponds to the language L. ε 11 / 26

From regular expressions to regular languages We ve been looking at ways of specifying regular languages via machines (often presented as pictures). But it s very useful for applications to have more textual ways of defining languages. A regular expression is a written mathematical expression that defines a language over a given alphabet Σ. The basic regular expressions are ɛ a (for a Σ) From these, more complicated regular expressions can be built up by (repeatedly) applying the two binary operations +,. and the unary operation. Example: (a.b + ɛ) + a We use brackets to indicate precedence. In the absence of brackets, binds more tightly than., which itself binds more tightly than +. So a + b.a means a + (b.(a )) Also the dot is often omitted: ab means a.b 12 / 26

From regular expressions to regular languages How do regular expressions define languages? A regular expression is itself just a written expression. However, every regular expression α over Σ can be seen as defining an actual language L(α) Σ in the following way. L( ) =, L(ɛ) = {ɛ}, L(a) = {a}. L(α + β) = L(α) L(β) L(α.β) = L(α). L(β) L(α ) = L(α) Example: a + ba defines the language {a, b, ba, baa, baaa,...}. The languages defined by, ɛ, a are obviously regular. What s more, we ve seen that regular languages are closed under union, concatenation and Kleene star. This means every regular expression defines a regular language. (Formal proof by induction on the size of the regular expression.) 13 / 26

Self-assessment question From regular expressions to regular languages Consider (again) the language {x {0, 1} x contains an even number of 0 s} Which of the following regular expressions define the above language? 1 (1 01 01 ) Ans: no 1 does not match expression 2 (1 01 0) 1 Ans: yes 3 1 (01 0) 1 Ans: no 00100 does not match expression 4 (1 + 01 0) Ans: yes 14 / 26

From DFAs to regular expressions We ve seen that every regular expression defines a regular language. The major goal of the lecture is to show the converse: every regular language can be defined by a regular expression. For this purpose, we introduce : the algebra of regular expressions. The equivalence between regular languages and expressions is: DFAs and regular expressions give rise to exactly the same class of languages (the regular languages). As we ve already seen, NFAs (with or without ɛ-transitions) also give rise to this class of languages. So the evidence is mounting that the class of regular languages is mathematically a very natural class to consider. 15 / 26

From DFAs to regular expressions give a textual way of specifying regular languages. This is useful e.g. for communicating regular languages to a computer. Another benefit: regular expressions can be manipulated using algebraic laws (). For example: α + (β + γ) = (α + β) + γ α + β = β + α α + = α α + α = α α(βγ) = (αβ)γ ɛα = αɛ = α α(β + γ) = αβ + αγ (α + β)γ = αγ + βγ α = α = ɛ + αα = ɛ + α α = α Often these can be used to simplify regular expressions down to more pleasant ones. 16 / 26

Other reasoning principles From DFAs to regular expressions Let s write α β to mean L(α) L(β) (or equivalently α + β = β). Then αγ + β γ α β γ β + γα γ βα γ Arden s rule: Given an equation of the form X = αx + β, its smallest solution is X = α β. What s more, if ɛ L(α), this is the only solution. Beautiful fact: The rules on this slide and the last form a complete set of reasoning principles, in the sense that if L(α) = L(β), then α = β is provable using these rules. (Beyond scope of Inf2A.) 17 / 26

DFAs to regular expressions From DFAs to regular expressions We use an example to show how to convert a DFA to an equivalent regular expression. 1 1 0 p q 0 For each state r, let the variable X r stand for the set of strings that take us from r to an accepting state. Then we can write some simultaneous equations: X p = 1X p + 0X q + ɛ X q = 1X q + 0X p 18 / 26

Where do the equations come from? From DFAs to regular expressions Consider: X p = 1X p + 0X q + ɛ This asserts the following. Any string that takes us from p to an accepting state is: a 1 followed by a string that takes us from p to an accepting state; or a 0 followed by a string that takes us from q to an accepting state; or the empty string. Note that the empty string is included because p is an accepting state. 19 / 26

Solving the equations From DFAs to regular expressions We solve the equations by eliminating one variable at a time: X q = 1 0X p by Arden s rule So X p = 1X p + 01 0X p + ɛ = (1 + 01 0)X p + ɛ So X p = (1 + 01 0) by Arden s rule Since the start state is p, the resulting regular expression for X p is the one we are seeking. Thus the language recognised by the automaton is: (1 + 01 0) The method we have illustrated here, in fact, works for arbitrary NFAs (without ɛ-transitions). 20 / 26

Theory of regular languages: overview From DFAs to regular expressions solve equations DFAs subset construction NFAs expand, S reg exps closure properties ɛ-nfas 21 / 26

End-of-lecture question From DFAs to regular expressions N1 ε N2 Suppose the above ɛ-nfa defining concatenation is modified by identifying the final state of N 1 with the start state of N 2 (and removing the then-redundant ɛ-transistion linking the two states). 1 Find a pair of ɛ-nfas, N 1 and N 2, each with a single start state and single accepting state, for which the modified construction does not recognise L(N 1 ).L(N 2 ). 2 Show that if N 1 has no loops from the accepting state back to itself, then the modified ɛ-nfa does recognise L(N 1 ).L(N 2 ). 3 Which construction of an ɛ-nfa in this lecture violates the assumption above about N 1? 22 / 26

Reading More closure properties of regular languages From DFAs to regular expressions Relevant reading: : Kozen chapters 7,8; J & M chapter 2.1. (Both texts actually discuss more general patterns see next lecture.) From regular expressions to NFAs: Kozen chapter 8; J & M chapter 2.3. : Kozen chapter 9. From NFAs to regular expressions: Kozen chapter 9. Next time: Some applications of all this theory. Pattern matching Lexical analysis 23 / 26

From DFAs to regular expressions Appendix: (non-examinable) proof of Given an NFA N = (Q,, S, F ) (without ɛ-transitions), we ll show how to define a regular expression defining the same language as N. In fact, to build this up, we ll construct a three-dimensional array of regular expressions α X uv: one for every u Q, v Q, X Q. Informally, α X uv will define the set of strings that get us from u to v allowing only intermediate states in X. We shall build suitable regular expressions α X u,v by working our way from smaller to larger sets X. Eventually, the language defined by N will be given by the sum (+) of the languages α Q sf for all states s S and f F. 24 / 26

From DFAs to regular expressions Construction of α X uv Here s how the regular expressions αuv X are built up. If X =, let a 1,..., a k be all the symbols a such that (u, a, v). Two subcases: If u v, take αuv = a 1 + + a k If u = v, take αuv = (a 1 + + a k ) + ɛ Convention: if k = 0, take a 1 +... + a k to mean. If X, choose any q X, let Y = X {q}, and define α X uv = α Y uv + α Y uq(α Y qq) α Y qv Applying these rules repeatedly gives us α X u,v for every u, v, X. 25 / 26

From DFAs to regular expressions NFAs to regular expressions: tiny example Let s revisit our old friend: 1 1 0 p q Here p is the only start state and the only accepting state. By the rules on the previous slide: α {p,q} p,p = α {p} p,p 0 + α p,q {p} (α q,q {p} ) α q,p {p} Now by inspection (or by the rules again), we have α p,p {p} = 1 α p,q {p} = 1 0 α q,q {p} = 1 + 01 0 α q,p {p} = 01 So the required regular expression is 1 + 1 0(1 + 01 0) 01 (A bit messy!) 26 / 26