Concatenation. The concatenation of two languages L 1 and L 2

Similar documents
Recap from Last Time

Recap from Last Time

Finite Automata Part Two

Sri vidya college of engineering and technology

UNIT-III REGULAR LANGUAGES

CS 154, Lecture 3: DFA NFA, Regular Expressions

Closure under the Regular Operations

Regular Expressions. Definitions Equivalence to Finite Automata

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Finite Automata Part Two

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

CS 154. Finite Automata, Nondeterminism, Regular Expressions

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

How do regular expressions work? CMSC 330: Organization of Programming Languages

UNIT II REGULAR LANGUAGES

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Foundations of

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Closure Properties of Context-Free Languages. Foundations of Computer Science Theory

Text Search and Closure Properties

Lecture 2: Connecting the Three Models

Nondeterministic Finite Automata and Regular Expressions

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

Formal Languages. We ll use the English language as a running example.

Decision, Computation and Language

Announcements. Problem Set Four due Thursday at 7:00PM (right before the midterm).

CSC173 Workshop: 13 Sept. Notes

Automata: a short introduction

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

CS21 Decidability and Tractability

CS 455/555: Finite automata

Automata Theory and Formal Grammars: Lecture 1

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Computational Theory

CSE 105 THEORY OF COMPUTATION

COM364 Automata Theory Lecture Note 2 - Nondeterminism

CS243, Logic and Computation Nondeterministic finite automata

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

CPS 220 Theory of Computation REGULAR LANGUAGES

1.3 Regular Expressions

Non-deterministic Finite Automata (NFAs)

Text Search and Closure Properties

CS375 Midterm Exam Solution Set (Fall 2017)

Formal Languages. We ll use the English language as a running example.

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

CSE 105 Theory of Computation Professor Jeanne Ferrante

Regular Expression Unit 1 chapter 3. Unit 1: Chapter 3

CS21 Decidability and Tractability

TDDD65 Introduction to the Theory of Computation

Computational Models Lecture 2 1

Nondeterministic Finite Automata

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

3515ICT: Theory of Computation. Regular languages

CS 121, Section 2. Week of September 16, 2013

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

CS311 Computational Structures Regular Languages and Regular Expressions. Lecture 4. Andrew P. Black Andrew Tolmach

jflap demo Regular expressions Pumping lemma Turing Machines Sections 12.4 and 12.5 in the text

Name: Student ID: Instructions:

1 More finite deterministic automata

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Introduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions

Computational Models Lecture 2 1

Regular Expressions and Language Properties

Theory of computation: initial remarks (Chapter 11)

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Regular Languages. Problem Characterize those Languages recognized by Finite Automata.

This Lecture will Cover...

CSE 105 THEORY OF COMPUTATION

Deterministic Finite Automaton (DFA)

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Theory of Computation

Before we show how languages can be proven not regular, first, how would we show a language is regular?

CSE 105 THEORY OF COMPUTATION

What we have done so far

Inf2A: Converting from NFAs to DFAs and Closure Properties

Theory of Languages and Automata

1 Alphabets and Languages

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Finite Automata (contd)

Nondeterministic finite automata

Languages, regular languages, finite automata

Complexity Theory Part I

Formal Languages, Automata and Models of Computation

Regular expressions and Kleene s theorem

CS 154 Introduction to Automata and Complexity Theory

Mathematics for linguists

Regular expressions and Kleene s theorem

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Unit 6. Non Regular Languages The Pumping Lemma. Reading: Sipser, chapter 1

CSE 105 THEORY OF COMPUTATION

Lecture 3: Nondeterministic Finite Automata

Transcription:

Regular Expressions Problem Problem Set Set Four Four is is due due using using a late late period period in in the the box box up up front. front.

Concatenation The concatenation of two languages L 1 and L 2 over the alphabet Σ is the language L₁L₂ = { wx Σ* w L₁ x L₂ } Intuitively, the set of all strings formed by concatenating some string from L₁ and some string from L₂. Conceptually similar to the Cartesian product of two sets, only with strings.

Language Exponentiation We can define what it means to exponentiate a language as follows: L 0 = { ε } The set containing just the empty string. Idea: Any string formed by concatenating zero strings together is the empty string. L n+1 = LL n Idea: Concatenating (n+1) strings together works by concatenating n strings, then concatenating one more.

The Kleene Closure An important operation on languages is the Kleene Closure, which is defined as Mathematically: L * = L i i = 0 w L* iff n N. w L n Intuitively, all possible ways of concatenating any number of copies of strings in L together.

Closure Properties The regular languages are closed under the following operations: Complementation Union Intersection Concatenation Kleene closure

Another View of Regular Languages

Rethinking Regular Languages We currently have several tools for showing a language is regular. Construct a DFA for it. Construct an NFA for it. Apply closure properties to existing languages. We have not spoken much of this last idea.

Constructing Regular Languages Idea: Build up all regular languages as follows: Start with a small set of simple languages we already know to be regular. Using closure properties, combine these simple languages together to form more elaborate languages. A bottom-up approach to the regular languages.

Regular Expressions Regular expressions are a family of descriptions that can be used to capture the regular languages. Often provide a compact and human-readable description of the language. Used as the basis for numerous software systems (Perl, flex, grep, etc.)

Atomic Regular Expressions The regular expressions begin with three simple building blocks. The symbol Ø is a regular expression that represents the empty language Ø. The symbol ε is a regular expression that represents the language { ε } This is not the same as Ø! For any a Σ, the symbol a is a regular expression for the language { a }

Compound Regular Expressions We can combine together existing regular expressions in four ways. If R 1 and R 2 are regular expressions, R 1 R 2 is a regular expression for the concatenation of the languages of R 1 and R 2. If R 1 and R 2 are regular expressions, R 1 R 2 is a regular expression for the union of the languages of R 1 and R 2. If R is a regular expression, R* is a regular expression for the Kleene closure of the language of R. If R is a regular expression, (R) is a regular expression with the same meaning as R.

Operator Precedence Regular expression operator precedence: (R) R* R 1 R 2 R 1 R 2 So ab*c d is parsed as ((a(b*))c) d

Regular Expression Examples The regular expression trick treat represents the regular language { trick, treat } The regular expression booo* represents the regular language { boo, booo, boooo, } The regular expression candy!(candy!)* represents the regular language { candy!, candy!candy!, candy!candy!candy!, }

Regular Expressions, Formally The language of a regular expression is the language described by that regular expression. Formally: L(ε) = {ε} L(Ø) = Ø L(a) = {a} L(R 1 R 2 ) = L( R 1 ) L( R 2 ) L(R 1 R 2 ) = L( R 1 ) L( R 2 ) L(R*) = L( R)* L((R)) = L( R) Worthwhile Worthwhile activity: activity: Apply Apply this this recursive recursive definition definition to to a(b c)((d)) and and see see what what you you get. get.

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w contains 00 as a substring } (0 1)*00(0 1)* 11011100101 0000 11111011110011111

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w contains 00 as a substring } Σ*00Σ* 11011100101 0000 11111011110011111

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w = 4 } The The length of of a string w is is denoted w w

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w = 4 } ΣΣΣΣ 0000 1010 1111 1000

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w = 4 } Σ 4 0000 1010 1111 1000

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w contains at most one 0 } 1*(0 ε)1* 11110111 111111 0111 0

Regular Expressions are Awesome Let Σ = {0, 1} Let L = { w Σ* w contains at most one 0 } 1*0?1* 11110111 111111 0111 0

Regular Expressions are Awesome Let Σ = { a,., @ }, where a represents some letter. Regular expression for email addresses:

Regular Expressions are Awesome Let Σ = { a,., @ }, where a represents some letter. Regular expression for email addresses: aa*(.aa*)* @ aa*.aa* (.aa*)* cs103@cs.stanford.edu first.middle.last@mail.site.org barack.obama@whitehouse.gov

Regular Expressions are Awesome Let Σ = { a,., @ }, where a represents some letter. Regular expression for email addresses: a + (.a + )*@a + (.a + ) + cs103@cs.stanford.edu first.middle.last@mail.site.org barack.obama@whitehouse.gov

Regular Expressions are Awesome a + (.a + )*@a + (.a + ) + @,. a, @,. @,. @,. q 2 q 8 q 7. a @., @ @ @,.. a start q 0 a q 1 @ q 3 a q 04. q 5 a q 6 a a a

Shorthand Summary R n is shorthand for RR R (n times). Σ is shorthand for any character in Σ. R? is shorthand for (R ε), meaning zero or one copies of R. R+ is shorthand for RR*, meaning one or more copies of R.

Break for Announcements!

Midterm Logistics Midterm is tomorrow, October 29, from 7PM - 10PM Room determined by last name: A G: Go to Gates B01 H K: Go to Gates B03 L P: Go to 200-002 Q V: Go to 420-041 W Z: Go to Herrin T175

Your Questions

When writing a logic statement, do you have to include the universal or existential quantifier for every variable that you state? I thought you had to, but this one from lecture doesn't: Tallest(x) y. (x y IsShorterThan(y, x)) This This example example is is a a sentence sentence fragment fragment in in first-order first-order logic; logic; without without a a definition definition of of x, x, this this isn't isn't a a valid valid statement. statement. All All variables variables need need to to be be quantified. quantified.

"When writing first-order logic statements with quantifiers, which one out of the following would be correct? x P(x). y. R(y) or x. (P(x) y. R(y))

If you find that the function f: A B is not surjective, have you proven that A < B? Or do you still need to do additional proof steps? f : N N f(n) = 137

What is the best thing to do to prepare for the exam between now and 7PM tomorrow?

Is there some mathematical automaton that can determine whether or not two first-order logical statements are equivalent? More More on on that that later later in in the the quarter...

Back to Regular Expressions!

The Power of Regular Expressions Theorem: If R is a regular expression, then L( R) is regular. Proof idea: Show how to convert a regular expression into an NFA.

A Marvelous Construction The following theorem proves the language of any regular expression is regular: Theorem: For any regular expression R, there is an NFA N such that L(R) = L( N) N has exactly one accepting state. N has no transitions into its start state. N has no transitions out of its accepting state. start

Base Cases start ε Automaton for ε start Automaton for Ø start a Automaton for single character a

Construction for R 1 R 2 start ε Machine for R 1 Machine for R 2

Construction for R 1 R 2 ε ε start Machine for R 1 ε ε Machine for R 2

Construction for R* start ε ε ε Machine for R ε

Why This Matters Many software tools work by matching regular expressions against text. One possible algorithm for doing so: Convert the regular expression to an NFA. (Optionally) Convert the NFA to a DFA using the subset construction. Run the text through the finite automaton and look for matches. Runs extremely quickly!

The Power of Regular Expressions Theorem: If L is a regular language, then there is a regular expression for L. This is not obvious! Proof idea: Show how to convert an arbitrary NFA into a regular expression.

From NFAs to Regular Expressions start s 1 s 2 s n Regular expression: (s 1 s 2 s n )* Key Key idea: idea: Label Label transitions transitions with with arbitrary arbitrary regular regular expressions. expressions.

From NFAs to Regular Expressions start R Regular expression: R Key Key idea: idea: If If we we can can convert convert any any NFA NFA into into something something that that looks looks like like this, this, we we can can easily easily read read off off the the regular regular expression. expression.

From NFAs to Regular Expressions R 11 R 22 R 12 start q 1 R q 21 2

From NFAs to Regular Expressions R 11 R 22 R 12 start ε ε q s q 1 q 2 R 21 q f

From NFAs to Regular Expressions R 11 R 22 R 12 start ε ε q s q 1 q 2 R 21 q f Could Could we we eliminate this this state state from from the the NFA? NFA?

From NFAs to Regular Expressions ε R 11 * R 12 R 11 R 22 R 12 start ε ε q s q 1 q 2 R 21 q f Note: Note: We're We're using using concatenation concatenation and and Kleene Kleene closure closure in in order order to to skip skip this this state. state.

From NFAs to Regular Expressions R 11 * R 12 start q s q 2 ε q f R 22 R 21 R 11 * R 12 Note: Note: We're We're using using union union to to combine combine these these transitions transitions together. together.

From NFAs to Regular Expressions R 11 * R 12 (R 22 R 21 R 11 *R 12 )* ε start q s R 11 * R 12 q 2 ε q f R 22 R 21 R 11 * R 12

From NFAs to Regular Expressions R 11 * R 12 (R 22 R 21 R 11 *R 12 )* start q s q f

From NFAs to Regular Expressions R 11 * R 12 (R 22 R 21 R 11 *R 12 )* start q s q f R 11 R 22 R 12 start q 1 R q 21 2

The Construction at a Glance Start with an NFA for the language L. Add a new start state q s and accept state q f to the NFA. Add ε-transitions from each original accepting state to q f, then mark them as not accepting. Repeatedly remove states other than q s and q f from the NFA by shortcutting them until only two states remain: q s and q f. The transition from q s to q f is then a regular expression for the NFA.

Our Transformations direct conversion state elimination DFA NFA Regexp subset construction recursive transform

Theorem: The following are all equivalent: L is a regular language. There is a DFA D such that L( D) = L. There is an NFA N such that L( N) = L. There is a regular expression R such that L( R) = L.

Next Time Applications of Regular Languages Answering so what? Intuiting Regular Languages What makes a language regular? The Pumping Lemma Proving languages aren't regular.