This Lecture will Cover...

Similar documents
Regular Expressions. Definitions Equivalence to Finite Automata

DFA to Regular Expressions

Equivalence of Regular Expressions and FSMs

Regular Expression Unit 1 chapter 3. Unit 1: Chapter 3

UNIT II REGULAR LANGUAGES

3515ICT: Theory of Computation. Regular languages

Deterministic Finite Automaton (DFA)

Finite Automata and Formal Languages

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

CS311 Computational Structures Regular Languages and Regular Expressions. Lecture 4. Andrew P. Black Andrew Tolmach

Finite Automata and Formal Languages TMV026/DIT321 LP4 2012

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

Sri vidya college of engineering and technology

Lecture 3: Nondeterministic Finite Automata

Formal Languages. We ll use the English language as a running example.

Theory of computation: initial remarks (Chapter 11)

Regular Expressions and Language Properties

Regular Languages. Problem Characterize those Languages recognized by Finite Automata.

This lecture covers Chapter 7 of HMU: Properties of CFLs

Concatenation. The concatenation of two languages L 1 and L 2

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Context-Free Grammars (and Languages) Lecture 7

Closure Properties of Regular Languages

Properties of Context-Free Languages. Closure Properties Decision Properties

Regular expressions and Kleene s theorem

Lexical Analysis. DFA Minimization & Equivalence to Regular Expressions

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Regular expressions and Kleene s theorem

Uses of finite automata

jflap demo Regular expressions Pumping lemma Turing Machines Sections 12.4 and 12.5 in the text

Regular Expressions [1] Regular Expressions. Regular expressions can be seen as a system of notations for denoting ɛ-nfa

Lecture 7 Properties of regular languages

UNIT-III REGULAR LANGUAGES

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

Inf2A: Converting from NFAs to DFAs and Closure Properties

Ogden s Lemma for CFLs

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Properties of Regular Languages. BBM Automata Theory and Formal Languages 1

Foundations of

CS 455/555: Finite automata

Theory of computation: initial remarks (Chapter 11)

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Tasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering

Computational Models Lecture 2 1

CS 154, Lecture 3: DFA NFA, Regular Expressions

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Regular Expressions (Pre Lecture)

Formal Languages. We ll use the English language as a running example.

Finite Automata and Regular languages

CSE 105 Theory of Computation Professor Jeanne Ferrante

Kleene Algebra and Arden s Theorem. Anshul Kumar Inzemamul Haque

Formal Languages, Automata and Models of Computation

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2017

SYLLABUS. Introduction to Finite Automata, Central Concepts of Automata Theory. CHAPTER - 3 : REGULAR EXPRESSIONS AND LANGUAGES

Computational Models Lecture 2 1

Kleene Algebras and Algebraic Path Problems

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Chapter 6. Properties of Regular Languages

Equivalence of DFAs and NFAs

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Closure under the Regular Operations

Algorithms for NLP

CS 455/555: Mathematical preliminaries

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

Automata and Formal Languages - CM0081 Algebraic Laws for Regular Expressions

Introduction to Kleene Algebras

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Formal Languages and Automata

Constructions on Finite Automata

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Extended transition function of a DFA

UNIT-VIII COMPUTABILITY THEORY

Formal Language and Automata Theory (CS21004)

CSC173 Workshop: 13 Sept. Notes

Büchi Automata and their closure properties. - Ajith S and Ankit Kumar

COM364 Automata Theory Lecture Note 2 - Nondeterminism

NPDA, CFG equivalence

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Non-deterministic Finite Automata (NFAs)

The Pumping Lemma and Closure Properties

FS Properties and FSTs

Lecture 2: Regular Expression

CFLs and Regular Languages. CFLs and Regular Languages. CFLs and Regular Languages. Will show that all Regular Languages are CFLs. Union.

Theory of Computation 1 Sets and Regular Expressions

1 More finite deterministic automata

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

CSE 105 THEORY OF COMPUTATION

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

CSE 105 THEORY OF COMPUTATION

Text Search and Closure Properties

Theory of Computation

Text Search and Closure Properties

CMSC 330: Organization of Programming Languages

Context-Free Languages (Pre Lecture)

Transcription:

Last Lecture Covered... DFAs, NFAs, -NFAs and the equivalence of the language classes they accept Last Lecture Covered... This Lecture will Cover... Introduction to regular expressions and regular languages Equivalence of classes of regular languages and languages accepted Algebraic laws of (abstract) regular expressions Background Reading: Chapter 3 of HMU. by DFAs 1

Regular Expressions So far, DFAs, NFAs were given a machine-like description Regular expressions are user-friendly and declarative formulation Regular expressions find extensive use. - Searching/finding strings/pattern matching or conformance in text-formatting systems (e.g., UNIX grep, egrep, fgrep) - Lexical analyzers (in compilers) use regular expressions to identify tokens (e.g., Lex, Flex) -In Web forms to (structurally) validate entries (passwords, dates, email IDs) 2

Regular Expressions Given an alphabet of symbols disjoint from {+; ; (; )}: -A regular expression over is a string over [ { ; +; (; )} i.e., a regular expression consists only of: (1) constants: ;; (2) symbols from (3) operators: +, (4) parantheses: (, ) Regular expressions are defined via induction. 3

Regular Expressions Regular expressions are defined inductively as follows: Basis: (B1) ; and are regular expressions. (B2) For each a 2, a is a regular expression. Induction: If E and F are regular expressions: Option 1: Option 2: HMU approach Mine (I1) So is E (I1 ) So is (E ) (I2) So is E+F (I3) So is EF (I2 ) So is (E+F) (I3 ) So is (EF) (I4) So is (E) Only those generated by the above induction are regular. Remark 1: Some authors/texts use instead of +. HMU uses +. Remark 3: Some expressions are regular according to Option 1 but not Option E.g., ((0)), 0 + 11 4

Regular Expressions: Examples Let = {0; 1}. ((((0 + 1)1) )0) is a regular expression 0 + 11 0 is a regular expression Expression Rule Expression Rule 0 (B2) 0 (B2) 1 (B2) 1 (B2) (0+1) (I2 ) 0+1 (I2) ((0+1)1) (((0+1)1)*) ((((0+1)1)*)0) (I3 ) (I1 ) (I3 ) 0 + 11 0 + 11 0 + 11 0 (I3) (I1) (I3) 5

What Do Regular Expressions Stand For? Each properly paranthesized regular expression E (i.e., a regular expression that is generated by Option 2) is a shorthand for a language L(E) A language is said to be regular if it corresponds to a regular expression. Basis: (B1) L(;) =; L( ) ={ } (B2) L(a) ={a}, a 2 [Empty Language] [Language with only the empty string] [Language with only the symbol a] Induction: (I1 ) L((E )) = L(E), { } [ L(E) [ L(E) 2 [ L(E) 3 (I2 ) L((E + F )) = L(E) [ L(F ) (I3 ) L((EF)) = L(E)L(F ) [Kleen- closure of L(E)] [Union] [Concatenation] What if a regular expression is generated by Option 1? 6

What if an Expression isn t Bracketed Properly? Improperly parenthesized regular expressions =? Is 0 + 11 the same as ((0 + 1)1)? Or is it equal to (0 + (11))? 7

What if an Expression isn t Bracketed Properly? Improperly parenthesized regular expressions (generated by Option 1) must be converted to properly paranthesized expressions 1. We remove unwanted parentheses by replacing ((E)) by (E) inductively. -Additionally, if E is a symbol or a constant, we replace it by E + ; e.g., ((0)) (0 + ;), ((0 + 1)) (0 + 1) 2. Apply precedence rules: First: applies to the smallest (properly bracketable) expression preceding. e.g., 01 (0(1 )) Second: concatenation applies from left to right. e.g., 010 ((01)0) Third: + applies from left to right e.g., a + b + c ((a + b)+c) Example: 0 + 11 (0 + (1(1 ))) ((0)) + 11 ((0 + ;) + (1(1 ))) L(0 + 11 ) =L(((0)) + 11 ) =(L(0) [ (L(1)L(1) ) = {0; 1; 11; 111; 1111;:::} 8

2: Let L 1 and L 2 be regular languages. Then, L 1 [ L 2 and L 1 L 2 are also Theorem 1: Let w 2. Then {w} is regular Proof: Languages { } and {a} for a 2 are regular (B1, B2). [Induction] Theorem 2: Let L 1 and L 2 be regular languages. Then, L 1, L 1 [ L 2 and L 1 L 2 are also regular languages. regular languages. and L 1 L k are also regular languages. Corollary 1: The class of regular languages is closed under finite union and concatenation, i.e., if L 1 ;:::;L k are regular languages for any k 2 N, then L 1 [ L k Corollary 2: Any finite language is regular. 9

DFAs and Regular Languages Theorem 3: For every regular language M, there exists a DFA A such that M = L(A). Proof: WLOG, let = {0; 1}. Let M be a regular language. Then, M = L(E). For each regular expression, we will devise an -NFA. Basis: E= ; A : 0; 1 A : 0; 1 q 0 q 1 q 0 q 1 E= E= 0 E= 1 A : q 2 1 q 0 0 q 1 q 2 A : 0 1 q 0 q 1 10

Proof of Theorem 3 [Continued] Induction: [I1 ] E. (E ) E. 11

Proof of Theorem 3 [Continued] Induction: [I2 ] E (E + F ). E. F F.. 12

Proof of Theorem 3 [Continued] Induction: [I3 ] E. (EF) E F F... 13

So far... Regular Languages Languages accepted by DFAs, NFAs, -NFAs Finite languages Is the inclusion strict? Are there languages accepted by DFAs that are not regular? 14

Regular Languages Theorem 4: For every DFA A, there is a regular expression E such that L(A) =L(E). Proof: 1) Let DFA A =(Q; ; ;q 0 ;F) be given. 2) Let us rename the states so that Q = {q 0 ;q 1 ;q 2 ;:::;q n 1 ) 3) For any string s 1 :::s k 2 L(A), there is a path s 1 s 2 s k q 0! q i1! q i2! q ik 2 F 4) Let R(i;j;k) be the set of all input strings that move the internal state of A from q i to q j using paths whose intermediate nodes comprise only of q, <k. States q k,...,q n 1 q i q j States q 0,...,q k 1 15

Proof of Theorem 4 [Continued] 5) Then L(A) = S j:q j 2F R(0;j;n). [i.e., paths that start in q 0 and end in an accepting state with intermediate nodes q 0 ;q 1 ;:::;q n 1 (all nodes)] 6) L(A) will be regular if each R(i;j;k) to be regular. We now proceed to show that each R(i;j;k) is regular. 7) Induction: Base: Consider R(i;j;0) for i;j 2 {0; 1;:::;n 1}. R(i;j;0) consists of strings whose corresponding paths start in q i and end in q j with intermediate nodes q, <0. ) NO INTERMEDIATE NODES! ) R(i;j;0) contains strings that change state q i to q j directly ) R(i;j;0) { } [ ) R(i;j;0) is a regular language [Corollary 2] 16

Proof of Theorem 4 [Continued] Induction: Let R(i;j; ) be regular for i;j 2 {0;:::;n Consider R(i;j;k) for i;j 2 {0;:::;n 1}. 1} and 0 apple <k. The strings in R(i;j;k) correspond either to paths whose intermediate nodes q 0 ;:::;q k 1. Partition R(i;j;k) as follows: Case (a): Strings whose paths do not have q k Case (b): Strings whose paths do pass through q k 1 as an intermediate node 1 as an intermediate node q i case (b) q j States q 0 ;:::;q k 2 17

Proof of Theorem 4 [Continued] R(i;j;k)= {Case (a) strings} [ {Case (b) strings} q i q k 1 q j States q 0 ;:::;q k 2 Case (a) Strings are exactly those in R(i;j;k 1) Hence, R(i;j;k)= R(i;k 1;k 1) [ {Case (b) strings} 18

Proof of Theorem 4 [Continued] q i 1 2 3 q k 1 q k 1 q k 1 q j States q 0 ;:::;q k 2 States q 0 ;:::;q k 2 Each case (b) string is the concatenation of 3 strings: States q 0 ;:::;q k 2 {z } Case (b) path 1 2 3 A string that changes the state from q i to q k 1 through a path whose intermediate nodes are q 0 ;:::;q k 2 i.e., R(i;k 1;k 1) A finite concatenation of strings, each of which take q k 1 back to q k 1 via paths that use only q 0 ;:::;q k 2 as intermediate nodes. i.e., R(k 1;k 1;k 1) A string that takes q k 1 back to q j via a path that uses only q 0 ;:::;q k 2 as intermediate nodes. i.e., (R(k 1;j;k 1) R(i;j;k)=R(i;j;k 1) [ [R(i;k 1;k 1)R(k 1;k 1;k 1) R(k 1;j;k 1)] From Theorem 2, it follows that R(i;j;k) is regular for any i;j;k. Consequently, L(A) is regular. 19

Equivalence The following are indeed equivalent: The class of regular languages The class of languages accepted by DFAs The class of languages accepted by NFAs The class of languages accepted by -NFAs 20

Properties of Regular Languages Regular languages are closed under finite union, concatenation, and Kleene- operation. [Theorem 2] They are also closed under: Complementation [Given DFA A =(Q; ; ;q 0 ;F), DFA A 0 =(Q; ; ;q 0 ;F c ) accepts L(A) c ] Intersection [De Morgan s Law: R 1 \ R 2 =(R c 1 [ Rc 2 )c ] 21

Abstract Regular Expressions We can also define abstract regular expressions over languages over. Let V be a set of variables (which will be interpreted as languages) Use the induction definition for regular languages replacing B2 alone by: (B2) M is an (abstract) regular expression for every M 2 V Remark: Even though V could be infinite, every regular expression consists only of finitely many variables. Unlike concrete regular expressions (such as 1, 0+1), abstract regular expressions (such as M, M + N) don t stand for a unique language However, we can evaluate abstract regular expressions by assigning any languages to variables, and inductively interpreting: Variable Sum of variables Concatenation of variables! Kleene- closure of its language! union of the languages assigned to them! concatenation of their the languages We can introduce a notion of equality of (abstract) regular expression: Abstract regular expressions E 1 = E 2, 22 For any assignment of languages to the variables contained in E 1 ;E 2, their evaluations equal (i.e., L(E 1 )=L(E 2 ))

Commutativity: L + M = M + L LM 6= ML [Union is commutative] [Concatenation is not commutative] Associativity: (L + M)+N = L +(M + N) (LM)N = L(MN) [Union is associative] [Concatenation is associative] Identity: ; + L = L + ; = L [; is the identity element for +] L = L = L [ is the identity element for concatenation] Annihilator: ;L = L; = ; Idempotent: L + L = L Distributive: L(M + N) =LM + LN (M + N)L = ML + NL [Concatenation distributes over +] Kleene : (L ) = L ; ; = ; =. 23