CSCI 340: Computational Models. Regular Expressions. Department of Computer Science

Similar documents
Chapter 4. Regular Expressions. 4.1 Some Definitions

In English, there are at least three different types of entities: letters, words, sentences.

FABER Formal Languages, Automata. Lecture 2. Mälardalen University

Author: Vivek Kulkarni ( )

Automata: a short introduction

The assignment is not difficult, but quite labour consuming. Do not wait until the very last day.

CS Automata, Computability and Formal Languages

The Binomial Theorem.

Recap from Last Time

Theory of Computation

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

Theory of Computer Science

Nondeterministic Finite Automata and Regular Expressions

CS 133 : Automata Theory and Computability

COSE212: Programming Languages. Lecture 1 Inductive Definitions (1)

TAFL 1 (ECS-403) Unit- IV. 4.1 Push Down Automata. 4.2 The Formal Definition of Pushdown Automata. EXAMPLES for PDA. 4.3 The languages of PDA

1 Alphabets and Languages

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

A canonical semi-deterministic transducer

FLAC Context-Free Grammars

CS311 Computational Structures Regular Languages and Regular Expressions. Lecture 4. Andrew P. Black Andrew Tolmach

Theory of Computation

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0,

CS A Term 2009: Foundations of Computer Science. Homework 2. By Li Feng, Shweta Srivastava, and Carolina Ruiz.

Lecture 1 09/08/2017

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

Theory of Computation (Classroom Practice Booklet Solutions)

Stream Codes. 6.1 The guessing game

C1.1 Introduction. Theory of Computer Science. Theory of Computer Science. C1.1 Introduction. C1.2 Alphabets and Formal Languages. C1.

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

CMSC 330: Organization of Programming Languages

Regular Expressions [1] Regular Expressions. Regular expressions can be seen as a system of notations for denoting ɛ-nfa

Harvard CS121 and CSCI E-121 Lecture 2: Mathematical Preliminaries

EXAMPLE CFG. L = {a 2n : n 1 } L = {a 2n : n 0 } S asa aa. L = {a n b : n 0 } L = {a n b : n 1 } S asb ab S 1S00 S 1S00 100

download instant at Assume that (w R ) R = w for all strings w Σ of length n or less.

CS6902 Theory of Computation and Algorithms

Homework 4. Chapter 7. CS A Term 2009: Foundations of Computer Science. By Li Feng, Shweta Srivastava, and Carolina Ruiz

How do regular expressions work? CMSC 330: Organization of Programming Languages

Automata Theory Final Exam Solution 08:10-10:00 am Friday, June 13, 2008

Semigroup presentations via boundaries in Cayley graphs 1

CSEP 590 Data Compression Autumn Arithmetic Coding

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

CS 154, Lecture 3: DFA NFA, Regular Expressions

CSE 105 THEORY OF COMPUTATION

On Parsing Expression Grammars A recognition-based system for deterministic languages

Regular Expressions and Finite-State Automata. L545 Spring 2008

SFWR ENG 2FA3. Solution to the Assignment #4

Automata Theory CS F-08 Context-Free Grammars

Closure Properties of Regular Languages

UNIT-I. Strings, Alphabets, Language and Operations

HW 3 Solutions. Tommy November 27, 2012

Chapter 1. Introduction

HW6 Solutions. Micha l Dereziński. March 20, 2015

The Probability of Winning a Series. Gregory Quenell

Section 1.3 Ordered Structures

60-354, Theory of Computation Fall Asish Mukhopadhyay School of Computer Science University of Windsor

Section 1 (closed-book) Total points 30

Compiler Design. Spring Lexical Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Computational Theory

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Automata Theory and Formal Grammars: Lecture 1

Linear conjunctive languages are closed under complement

Automata, Computability and Complexity with Applications Exercises in the Book Solutions Elaine Rich

Figure 1: NFA N. Figure 2: Equivalent DFA N obtained through function nfa2dfa

CSE 105 THEORY OF COMPUTATION

6.4 Binomial Coefficients

Formal Definition of a Finite Automaton. August 26, 2013

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Chapter 6. Properties of Regular Languages

Chapter-6 Regular Expressions

Chapter 1. Formal Definition and View. Lecture Formal Pushdown Automata on the 28th April 2009

6.8 The Post Correspondence Problem

{a, b, c} {a, b} {a, c} {b, c} {a}

UNIT-III REGULAR LANGUAGES

Linear Classifiers (Kernels)

Finite State Machines. Languages g and Machines

Summary of Last Lectures

CS375 Midterm Exam Solution Set (Fall 2017)

Context-free Languages and Pushdown Automata

What is this course about?

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

Deciding Representability of Sets of Words of Equal Length

Theory Of Computation UNIT-II

CS 121, Section 2. Week of September 16, 2013

Grade 6 Math Circles October 20/21, Formalism and Languages: Beyond Regular Languages

Computability and Complexity

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction.

Fundamentele Informatica II

MA/CSSE 474 Theory of Computation

ÖVNINGSUPPGIFTER I SAMMANHANGSFRIA SPRÅK. 17 april Classrum Edition

Congruence based approaches

If f = ABC + ABC + A B C then f = AB C + A BC + AB C + A BC + A B C

Algebraic Expressions

Solutions to Problem Set 3

Pushdown Automata. Reading: Chapter 6

Multiplying Products of Prime Powers

Transcription:

CSCI 340: Computational Models Regular Expressions Chapter 4 Department of Computer Science

Yet Another New Method for Defining Languages Given the Language: L 1 = {x n for n = 1 2 3...} We could easily change the sequence for n: L 2 = {x n for n = 1 3 5 7...} But if we change the sequence for n it can be difficult: L 3 = {x n for n = 1 4 9 16...} Or just unwieldy / non-definitive: L 3 = {x n for n = 3 4 8 22...} We need a notation for something more precise than the ellipsis 1 / 21

Reappearance of Kleene Star Reconsider the language from Chapter 2: L 4 = {λ x xx xxx xxxx...} We presented one method for indicating this set as a closure: Let S = {x}. Then L 4 = S Or in shorthand: L 4 = {x} Let s now introduce a Kleene star applied to a letter rather than a set: We can think of the star as an unknown or undetermined power. x 2 / 21

Defining Languages We should not confuse x with L 4 as they are not equivalent L 4 is semantically a language, x is a language defining symbol We can define a language as follows: L 4 = language(x ) Example Σ = {a b} L = {a ab abb abbb abbbb...} L = language(a b ) L = language(ab ) Note: the Kleene star is applied to the letter immediately preceding 3 / 21

Applying Kleene Star to an Entire String Closure to entire substrings requires forced precedence We can accomplish this by grouping with parentheses For example: (ab) = λ or ab or abab or ababab... We can also use + to represent one-or-more Theorem xx = x + Proof. L 1 = language(xx ) L 2 = language(x + ) language(x ) = λ x xx xxx... language(x x ) = xλ xx xxx xxxx... language(xx ) = x xx xxx xxxx... language(xx ) = language(x + ) = x xx xxx xxxx... 4 / 21

Language Examples Example The language L 1 can be defined by any of the expressions below: xx x + xx x x xx x + x x x x xx Remember: x can always be λ Example The language defined by the expression ab a is the set of all strings of a s and b s that have at least two letters that 1 start and end with a 2 only have b s in between 5 / 21

Language Examples Example The language of the expression a b contains all of the strings of a s and b s in which all the a s (if any) come before all the b s (if any) language(a b ) = {λ a b aa ab bb aaa aab abb bbb aaaa... Note It is very important to note that a b (ab) 6 / 21

Language Examples Example Consider the language T defined over the alphabet Σ = {a b c} T = {a c ab cb abb cbb abbb cbbb abbbb cbbbb...} We may formally define the language as follows: T = language((a + c)b) Or in English as: T = language(either a or c followed by some b s) Note: parens force precedence change: selection before concatenation 7 / 21

Language Examples Example Consider the language L defined over the alphabet Σ = {a b} L = {aaa aab aba abb baa bab bba bbb} What is the pattern? How can we write a language expression for this? How can we generalize this? How can we represent choose any single character from Σ? 8 / 21

Regular Expressions Regular Language a language which can be expressed as a regular expression Definition for Regular Expression 1 Every letter of Σ can be made into a regular expression. λ is a regular expression. 2 If r 1 and r 2 are regular expressions, then so are: i (r 1 ) ii r 1 r 2 iii r 1 + r 2 iv (r 1 ) 3 Nothing else is a regular expression Note: we could add r 1 + but we can rewrite it as r 1 r 1 9 / 21

Defining Some Regular Expressions Chalkboard Problems 1 All words that begin with an a and end with a b 2 All words that contain exactly two a s 3 All words that contain exactly two a s and start with b 4 All words that contain two or more a s 5 All words that contain two or more a s that end in b 6 All words of length 3 or higher which contain two a s in a row 10 / 21

A More Complicated Example Language of all words that have at least one a and one b which can also be expressed as (a + b) a(a + b) b(a + b) <arbitrary> a <arbitrary> b <arbitrary> This mandates that a must be found before b. The unhandled case can be matched with: bb aa One of these must be true for our expression to be matched: (a + b) a(a + b) b(a + b) + bb aa 11 / 21

Confusing Equivalences Consider from the last slide (a + b) a(a + b) b(a + b) + bb aa If we wanted to include strings of all a s or b s we would use: a + b This would mean that we could define a regular expression which accepts any sequence of a s and b s: but this is simply just (a + b) a(a + b) b(a + b) + bb aa + a + b (a + b) These are not obviously equivalent 12 / 21

Algebraic Equivalence Need Not Apply An Analysis of (a + b) (a + b) = (a + b) + (a + b) (a + b) = (a + b) (a + b) (a + b) = a(a + b) + b(a + b) + λ (a + b) = (a + b) ab(a + b) + b a All of these are equal O_o 13 / 21

Some Algebra Works! Let V be the language of all strings of a s and b s in which the strings are either all b s or else there is an a followed by some b s. Let V also contain the word λ. V = {λ a b ab bb abb bbb abbb bbbb...} We can then define V by the expression: b + ab Where λ is embedded into the term b. Alternatively, we could define V by the expression (λ + a)b This gives us an option of having a a or nothing! Since we could always write b = λb, we demonstrate the distributive property λb + ab = (λ + a)b 14 / 21

Concatenation Definition If S and T are sets of strings of letters (whether they are finite or infinite sets), we define the product set of strings of letters to be ST = { all combinations of all string S followed with a string from T } Example S = {a aa aaa} T = {bb bbb} ST = {abb abbb aabb aabbb aaabb aaabbb} Rewritten as a Regular Expression (a + aa + aaa)(bb + bbb) = abb + abbb + aabb + aabbb + aaabb + aaabbb 15 / 21

Concatenation Definition If S and T are sets of strings of letters (whether they are finite or infinite sets), we define the product set of strings of letters to be ST = { all combinations of all string S followed with a string from T } Example S = {a bb bab} T = {a ab} ST = {aa aab bba bbab baba babab} Rewritten as a Regular Expression (a + bb + bab)(a + ab) = aa + aab + bba + bbab + baba + babab 16 / 21

Concatenation What are the regular expressions for the concatenation of the two sets in each example? Give both the simple and distributed forms Example P = {a bb bab} Q = {λ bbbb} Example M = {λ x xx} N = {λ y yy yyy yyyy...} 17 / 21

Associating a Language with Every RE The rules below define the language associated with any RE 1 The language associated with the regular expression that is just a single letter is that one-letter word alone and the language associated with λ is just { λ }, a one-word language 2 If r 1 is a regular expression associated with language L 1 and r 2 is a regular expression associated with the language L 2 then i RE (r 1 )(r 2 ) is associated with L 1 L 2 language(r 1 r 2 ) = L 1 L 2 ii RE r 1 + r 2 is associated with L 1 L 2 language(r 1 + r 2 ) = L 1 + L 2 iii RE r 1 is L 1 (the Kleene closure) language(r 1 ) = L 1 18 / 21

Expressing a Finite Language as RE Theorem If L is a finite language (a language with only finitely many words), then L can be defined by a regular expression Proof. To make one RE that defines the language L, turn all the words in L into boldface type and stick pluses between them. Violá. For example, the RE defining the language is L = {aa ab ba bb} aa + ab + ba + bb OR (a + b)(a + b) The reason this trick only works for finite languages is that an infinite language would yield an infinitely-long regular expression (which is forbidden) 19 / 21

EVEN-EVEN E = [ aa + bb + (ab + ba) (aa + bb) (ab + ba) ] This regular expression represents the collection of all words that are made up of syllables of three types: Question 1 type 1 = aa type 2 = bb type 3 = (ab + ba) (aa + bb) (ab + ba) E = [ type 1 + type 2 + type 3 ] What does this Regular Expression do? Question 2 What are the first 12 strings matched by this RE? 20 / 21

Homework 2a 1 For each of the problems below, give a regular expression which only accepts the following. Assume Σ = {a b} 1 All strings that begin and end with the same letter 2 All strings in which the total number of a s is divisible by 3 3 All strings that end in a double letter 2 Show the following pairs of regular expressions define the same language 1 (ab) a and a(ba) 2 (a bbb) a and a (bbba ) 3 Describe (in English phrases) the languages associated with the following regular expressions 1 (a + b) a(λ + bbbb) 2 (a(aa) b(bb) ) 3 ((a + b)a) 21 / 21