Nonregular Languages

Similar documents
Nonregular Languages

Complexity Theory Part I

Recap from Last Time

Finite Automata Part One

Finite Automata Part One

A Note on Turing Machine Design

CS 154 Introduction to Automata and Complexity Theory

Congratulations you're done with CS103 problem sets! Please evaluate this course on Axess!

Announcements. Final exam review session tomorrow in Gates 104 from 2:15PM 4:15PM Final Friday Four Square of the quarter!

CSC236 Week 11. Larry Zhang

Turing Machines Part Two

Notes on State Minimization

Finite Automata Part One

CSE 105 THEORY OF COMPUTATION

Finite Automata Part Two

Theory of Computation

Turing Machines Part III

Congratulations you're done with CS103 problem sets! Please evaluate this course on Axess!

CSE 105 THEORY OF COMPUTATION

Proving languages to be nonregular

Decidability and Undecidability

Computational Models: Class 3

CS375 Midterm Exam Solution Set (Fall 2017)

Announcements. Problem Set Four due Thursday at 7:00PM (right before the midterm).

Congratulations you're done with CS103 problem sets! Please evaluate this course on Axess!

CSE 105 THEORY OF COMPUTATION

Turing Machines Part One

CS103 Handout 47 Spring 2018 Practice CS103 Final Exam V

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

Guide to Proofs on Sets

NP-Completeness Part II

Outline for Today. Binary Relations. Equivalence Relations. Partial Orders. Quick Midterm Review

(Refer Slide Time: 0:21)

What we have done so far

CSC236 Week 10. Larry Zhang

The Inductive Proof Template

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Computational Models - Lecture 3 1

A non-turing-recognizable language

CSE 105 THEORY OF COMPUTATION

Nondeterministic finite automata

Pushdown Automata. Reading: Chapter 6

Announcements. Using a 72-hour extension, due Monday at 11:30AM.

1 More finite deterministic automata

Lecture 4: More on Regexps, Non-Regular Languages

Finite Automata (contd)

Foundations of (Theoretical) Computer Science

Nondeterministic Finite Automata and Regular Expressions

Announcements. Problem Set 6 due next Monday, February 25, at 12:50PM. Midterm graded, will be returned at end of lecture.

CSE 311: Foundations of Computing. Lecture 26: More on Limits of FSMs, Cardinality

CS103 Handout 12 Winter February 4, 2013 Practice Midterm Exam

Outline for Today. Revisiting our first lecture with our new techniques! How do we know when two sets have the same size?

Lecture 23: Rice Theorem and Turing machine behavior properties 21 April 2009

Fooling Sets and. Lecture 5

Theory of Computation

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

10. The GNFA method is used to show that

CSCI 2670 Introduction to Theory of Computing

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

Turing Machines Part II

Automata: a short introduction

Computational Theory

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Theory of Computation Lecture 1. Dr. Nahla Belal

MITOCW ocw f99-lec01_300k

Concatenation. The concatenation of two languages L 1 and L 2

Mathematical Induction Part Two

Finite Automata Part One

Name: Student ID: Instructions:

Nondeterministic Finite Automata

ECS120 Fall Discussion Notes. October 25, The midterm is on Thursday, November 2nd during class. (That is next week!)

Computational Models - Lecture 3 1

Instructor (Brad Osgood)

Before we show how languages can be proven not regular, first, how would we show a language is regular?

Recap from Last Time

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

CSci 311, Models of Computation Chapter 4 Properties of Regular Languages

6.045: Automata, Computability, and Complexity Or, Great Ideas in Theoretical Computer Science Spring, Class 8 Nancy Lynch

False. They are the same language.

Turing Machines Part Three

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Time-bounded computations

I have read and understand all of the instructions below, and I will obey the University Code on Academic Integrity.

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Turing Machines Part II

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

CSE 105 Theory of Computation Professor Jeanne Ferrante

CISC4090: Theory of Computation

Mapping Reducibility. Human-aware Robotics. 2017/11/16 Chapter 5.3 in Sipser Ø Announcement:

MITOCW ocw f99-lec30_300k

Languages, regular languages, finite automata

More on Regular Languages and Non-Regular Languages

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Computational Models - Lecture 4

Chapter 6. Properties of Regular Languages

Introduction to the Theory of Computing

Regular Languages and Finite Automata

Transcription:

Nonregular Languages

Recap from Last Time

Theorem: The following are all equivalent: L is a regular language. There is a DFA D such that L( D) = L. There is an NFA N such that L( N) = L. There is a regular expression R such that L( R) = L.

New Stuff!

Why does this matter?

Buttons as Finite-State Machines: http://cs103.stanford.edu/tools/button-fsm/

http://www.tti.unipa.it/~gneglia/ip_networks06/slides/tcpip_state_transition_diagram.pdf

What exactly is a finite-state machine?

Ready! Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Working Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Thinking Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Thinking Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Working Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Ready! Finite-Memory Computing Device a b c

Working Finite-Memory Computing Device a b c

YES Finite-Memory Computing Device a b c

The Model The computing device has internal workings that can be in one of finitely many possible configurations. Each state in a DFA corresponds to some possible configuration of the internal workings. After each button press, the computing device does some amount of processing, then gets to a configuration where it's ready to receive more input. Each transition abstracts away how the computation is done and just indicates what the ultimate configuration looks like. After the user presses the done button, the computer outputs either YES or NO. The accepting and rejecting states of the machine model what happens when that button is pressed.

Computers as Finite Automata My computer has 12GB of RAM and 150GB of hard disk space. That's a total of 162GB of memory, which is 1,391,569,403,904 bits. There are only 2 1,391,569,403,904 possible configurations of the memory in my computer. You could in principle build a DFA representing my computer, where there's one symbol per type of input the computer can receive.

A Powerful Intuition Regular languages correspond to problems that can be solved with finite memory. At each point in time, we only need to store one of finitely many pieces of information. Nonregular languages, in a sense, correspond to problems that cannot be solved with finite memory. Since every computer ever built has finite memory, in a sense, nonregular languages correspond to problems that cannot be solved by physical computers!

Finding Nonregular Languages

Finding Nonregular Languages To prove that a language is regular, we can just find a DFA, NFA, or regex for it. To prove that a language is not regular, we need to prove that there are no possible DFAs, NFAs, or regexes for it. Claim: We can actually just prove that there's no DFA for it. Why is this? This sort of argument will be challenging. Our arguments will be somewhat technical in nature, since we need to rigorously establish that no amount of creativity could produce a DFA for a given language. Let's see an example of how to do this.

A Simple Language Let Σ = {a, b} and consider the following language: E = {a n b n n N } E is the language of all strings of n a's followed by n b's: { ε, ab, aabb, aaabbb, aaaabbbb, } Is this language regular? Let's see!

An Attempt Let's see if we can make a regular expression for the language Does a*b* work? How about (ab)*? E = {a n b n n N }. How about ε ab a 2b 2 a 3 b 3?

Another Attempt Perhaps we can make an NFA for E = {a n b n n N }. Does this machine work? start a b

Another Attempt Perhaps we can make an NFA for E = {a n b n n N }. How about this one? a b start ε ε

Another Attempt Perhaps we can make an NFA for E = {a n b n n N }. What about this? start a a b b b

We seem to be running into some trouble. Why is that?

Let's imagine what a DFA for the language { a n b n n N} would have to look like. Can we say anything about it?

This This isn't isn't a a single single transition. transition. Think Think of of it it as as after after reading reading aaaa, aaaa, we we end end up up at at this this state. state. aaaa bbbb bb aaaabbbb aaaabb start These These cannot cannot be be the the same same state! state! bbbb aabbbb aa bb aabb

A Different Perspective start aaaa aaaabbbb q₀ qₖ qₙ aa aabbbb

A Different Perspective start aaaa aaaabbbb q₀ qₖ qₙ aa aabbbb What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? We We accept accept aabbbb aabbbb E! E! a a rejecting rejecting state? state? We We reject reject aaaabbbb aaaabbbb E! E!

A Different Perspective start aaaa aaaabbbb q₀ qₖ qₙ aa aabbbb What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? We We accept accept aabbbb aabbbb E! E! a a rejecting rejecting state? state? We We reject reject aaaabbbb aaaabbbb E! E!

A Different Perspective start aaaa aaaabbbb q₀ qₖ qₙ aa aabbbb What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? We We accept accept aabbbb aabbbb E! E! a a rejecting rejecting state? state? We We reject reject aaaabbbb aaaabbbb E! E!

What s Going On? As you just saw, the strings a 4 and a 2 can't end up in the same state in any DFA for E = {a n b n n N}. Two proof routes: Direct: The states you reach for a4 and a 2 have to behave differently when reading b 4 in one case it should lead to an accept state, in the other it should lead to a reject state. Therefore, they must be different states. Contradiction: Suppose you do end up in the same state. Then a 4 b 4 and a 2 b 4 end up in the same state, so we either reject a 4 b 4 (oops) or accept a 2 b 4 (oops). Powerful intuition: Any DFA for E must keep a 4 and a 2 separated. It needs to remember something fundamentally different after reading those strings.

This idea that two strings shouldn't end up in the same DFA state is fundamental to discovering nonregular languages. Let's go formalize this!

Distinguishability Let L be an arbitrary language over Σ. Two strings x Σ* and y Σ* are called distinguishable relative to L if there is a string w Σ* such that exactly one of xw and yw is in L. We denote this by writing x L y. In our previous example, we saw that a 2 and a 4 are distinguishable relative to E = {a n b n n N}. Try appending b 4 to both of them. Formally, we say that x L y if the following is true: w Σ*. (xw L yw L)

Distinguishability start Theorem: Let L be an arbitrary language over Σ. Let x Σ* and y Σ* be strings where x L y. Then if D is any DFA for L, then D must end in different states when run on inputs x and y. Proof sketch: x xw q₀ qₖ qₙ y yw

bbbb aaaa bbb bb aaa bbbb bbb start bbbb bb aa bb bbb

bbbb aaaa bbb bb aaa bbbb bbb start bbbb bb aa bb bbb

Distinguishability Let's focus on this language for now: E = {a n b n n N } Lemma: If m, n N and m n, then a m E a n. Proof: Let a m and a n be strings where m n. Then a m b m E and a n b m E. Therefore, we see that a m E a n, as required.

A Bad Combination Suppose there is a DFA D for the language E = {a n b n n N }. We know the following: Any two strings of the form a m and a n, where m n, cannot end in the same state when run through D. There are infinitely many pairs of strings of the form a m and a n. However, there are only finitely many states they can end up in, since D is a deterministic finite automaton! If we put the pieces together, we see that...

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n E cannot end in the same state when run through D. But this is impossible, since we know that a m and a n must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n E cannot end in the same state when run through D. But this is impossible, since we know that a m and a n do end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n E cannot end in the same state when run through D. But this is impossible, since we know that a m and a n do end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n E cannot end in the same state when run through D. But this is impossible, since we know that a m and a n do end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E, and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m a n, so by our earlier theorem we know that a m and a n E cannot end in the same state when run through D. But this is impossible, since we know that a m and a n do end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular. We're We're going going to to see see a a simpler simpler proof proof of of this this result result later later on on once once we've we've built built more more machinery. machinery. If If (hypothetically (hypothetically speaking) speaking) you you want want to to prove prove something something like like this this on on the the problem problem set, set, we'd we'd recommend recommend not not using using this this proof proof as as a a template. template.

What Just Happened? We've just hit the limit of finitememory computation. To build a DFA for E = { a nb n n N }, we need to have different memory configurations (states) for all possible strings of the form a n. There's no way to do this with finitely many possible states!

Where We're Going We just used the idea of distinguishability to show that no possible DFA can exist for some language. This technique turns out to be pretty powerful. We're going to see one more example of this technique in action, then generalize it to an extremely powerful theorem for finding nonregular languages.

More Nonregular Languages

Another Language Consider the following language L over the alphabet Σ = {a, b, }: EQ = { w w w {a, b}*} EQ is the language all strings consisting of the same string of a's and b's twice, with a symbol in-between. Examples: ab ab EQ bbb bbb EQ EQ ab ba EQ bbb aaa EQ b EQ

Another Language EQ = { w w w {a, b}*} This language corresponds to the following problem: Given strings x and y, does x = y? Justification: x = y iff x y EQ. Is this language regular?

The Intuition EQ = { w w w {a, b}*} Intuitively, any machine for EQ has to be able to remember the contents of everything to the left of the so that it can match them against the contents of the string to the right of the. There are infinitely many possible strings we can see, but we only have finite memory to store which string we saw. That's a problem... can we formalize this?

The Intuition start x x x q₀ qₖ qₙ y y x

The Intuition start x x x q₀ qₖ qₙ y y x What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? We We accept accept aabbbb aabbbb L! L! a a rejecting rejecting state? state? We We reject reject aaaabbbb aaaabbbb L! L!

The Intuition start x x x q₀ qₖ qₙ y y x What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? We We accept accept y x y x EQ! EQ! a a rejecting rejecting state? state? We We reject reject aaaabbbb aaaabbbb L! L!

The Intuition start x x x q₀ qₖ qₙ y y x What What happens happens if if qₙ qₙ is is an an accepting accepting state? state? a a rejecting rejecting state? state? We We accept accept y x y x EQ! EQ! We We reject reject x x x x EQ! EQ!

Distinguishability Let's focus on this language for now: EQ = { w w w {a, b}*} Lemma: If x, y {a, b}* and x y, then x EQ y. Proof: Let x and y be two distinct, arbitrary strings from {a, b}*. Then we see that x x EQ and y x EQ, so we conclude that x EQ y, as required.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*. Because D only has k states, by the pigeonhole principle there must be at least two strings x and y that, when run through D, end in the same state. Our lemma tells us that x y. By our earlier theorem, EQ this means that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We're We're going going to to see see a a simpler simpler proof proof of of this this result result later on once we've built more machinery. If We have reached a contradiction, later once so we've our built assumption more machinery. must If (hypothetically (hypothetically speaking) speaking) you you want want to to prove prove have been wrong. Thus EQ is not regular. something something like like this this on on the the problem problem set, set, we'd we'd recommend recommend not not using using this this proof proof as as a a template. template.

Time-Out for Announcements!

WiCS Casual CS Dinner WiCS is holding their second Casual CS Dinner tonight from 6:00PM 7:00PM in the WCC. Show up, meet cool folks, get some dinner, and have fun! Everyone is welcome!

Midterm Exam Logistics The second midterm exam is next Tuesday, May 23rd, from 7:00PM 10:00PM. Locations are divvied up by last (family) name: Abb Pag: Go to Hewlett 200. Par Tak: Go to Sapp 114. Tan Val: Go to Hewlett 101. Var Yim: Go to Hewlett 102. You Zuc: Go to Hewlett 103. You re responsible for Lectures 00 13 and topics covered in PS1 PS5. Later lectures and problem sets won t be tested. The focus is on PS3 PS5 and Lectures 06 13. The exam is closed-book, closed-computer, and limited-note. You can bring a double-sided, 8.5 11 sheet of notes with you to the exam, decorated however you d like. Students with OAE accommodations: we ll be contacting you later this week with alternate exam logistics.

Preparing for the Exam We will be holding a practice midterm exam tomorrow night from 7PM 10PM in room 320-105. This is a great way to practice. We ll release three practice exams and the solutions to EPP4 EPP7 on Wednesday morning. As always, feel free to reach out to us if you have any questions. We re happy to help out! The advice in Handout 16 (Preparing for the Exam) is still really valuable. Read over it for some accumulated wisdom!

Your Questions

Besides nifty, what is your favorite word? Nixtamalization. Nixtamalization. It s It s a a great great word word with with a a great great history. history.

Back to CS103!

Comparing Proofs

Theorem: The language E = { a n b n n N } is not a regular language. Proof: Suppose for the sake of contradiction that E is regular. Let D be a DFA for E and let k be the number of states in D. Consider the strings a 0, a 1, a 2,, a k. This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings a m and a n that end in the same state when run through D. Our lemma tells us that a m E an. By our earlier theorem we know that a m and a n cannot end in the same state when run through D. But this is impossible, since we know that a m and a n do end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, E is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not a regular language. Proof: Suppose for the sake of contradiction that EQ is regular. Let D be a DFA for EQ and let k be the number of states in D. Consider any k+1 distinct strings in {a, b}*.these are k+1 strings and there are only k states in D. By the pigeonhole principle, there must be two distinct strings x and y from this group that end in the same state when run through D. Our lemma tells us that x y. By our earlier theorem we EQ know that x and y cannot end in the same state when run through D. But this is impossible, since specifically chose x and y to end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, EQ is not regular.

Theorem: The language L = [ fill in the blank ] is not a regular language. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for L and let k be the number of states in D. Consider [ some k+1 specific strings. ] This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings x and y that end in the same state when run through D. [ Somehow we know ] that x y. By our earlier theorem we L know that x and y cannot end in the same state when run through D. But this is impossible, since we know that x and y must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, L is not regular.

Theorem: The language L = [ fill in the blank ] is not a regular language. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for L and let k be the number of states in D. Consider [ some k+1 specific strings. ] This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings x and y that end in the same state when run through D. [ Somehow we know ] that x y. By our earlier theorem we L know that x and y cannot end in the same state when run through D. But this is impossible, since we know that x and y must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, L is not regular.

For For any any number number of of states states k, k, we we need need a a way way to to find find k+1 k+1 strings strings so so that that two two of of them them get get into into the the same same state. state. Theorem: The language L = [ fill in the blank ] is not a regular language. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for L and let k be the number of states in D. Consider [ some k+1 specific strings. ] This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings x and y that end in the same state when run through D. [ Somehow we know ] that x y. By our earlier theorem we L know that x and y cannot end in the same state when run through D. But this is impossible, since we know that x and y must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, L is not regular.

For For any any number number of of states states k, k, we we need need a a way way to to find find k+1 k+1 strings strings so so that that two two of of them them get get into into the the same same state. state. Theorem: The language L = [ fill in the blank ] is not a regular language. Proof: Suppose for the sake of contradiction that L is regular. Let D be a DFA for L and let k be the number of states in D. Consider [ some k+1 specific strings. ] This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings x and y that end in the same state when run through D. [ Somehow we know ] that x y. By our earlier theorem we L know that x and y cannot end in the same state when run through D. But this is impossible, since we know that x and y must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, L is not regular... and and all all those those strings strings need need to to be be distinguishable distinguishable so so that that we we get get a a contradiction. contradiction.

Imagine Imagine we we have have an an infinite infinite set set of of strings strings S S with with the the following following For For any any number number of of states states k, k, we we need need a a way way to to find find k+1 k+1 strings strings so so that that two two of of them them get get into into the the same same state. state. Theorem: property: The property: language L = [ fill in the blank ] is not a regular language. x x S. S. y y S. S. (x (x y y x x L y) L y) Proof: Suppose for the sake of contradiction that L is regular. What Let D be What happens? a happens? DFA for L and let k be the number of states in D. Consider [ some k+1 specific strings. ] This is a collection of k+1 strings and there are only k states in D. Therefore, by the pigeonhole principle, there must be two distinct strings x and y that end in the same state when run through D. [ Somehow we know ] that x y. By our earlier theorem we L know that x and y cannot end in the same state when run through D. But this is impossible, since we know that x and y must end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Therefore, L is not regular... and and all all those those strings strings need need to to be be distinguishable distinguishable so so that that we we get get a a contradiction. contradiction.

The Myhill-Nerode Theorem Theorem: Let L be a language over Σ. If there is a set S Σ* with the following properties, then L is not regular: S is infinite (that is, S contains infinitely many strings). The strings in S are pairwise distinguishable relative to L. That is, x S. y S. (x y x L y).

Proof: Let L be an arbitrary language over Σ. Let S Σ* be an infinite set of strings with the following property: if x, y S and x y, then x y. We will show that L is not regular. L Suppose for the sake of contradiction that L is regular. This means that there must be some DFA D for L. Let k be the number of states in D. Since there are infinitely many strings in S, we can choose k+1 distinct strings from S and consider what happens when we run D on all of those strings. Because there are only k states in D and we've chosen k+1 strings from S, by the pigeonhole principle we know that at least two strings from S must end in the same state in D. Choose any two such strings and call them x and y. Because x and y are distinct strings in S, we know that x y. L Our earlier theorem therefore tells us that when we run D on inputs x and y, they must end up in different states. But this is impossible we chose x and y precisely because they end in the same state when run through D. We have reached a contradiction, so our assumption must have been wrong. Thus L is not a regular language.

Using the Myhill-Nerode Theorem

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite because it contains one string for each natural number. Now, consider any strings a n, a m S where a n a m. Then a n b n E and a m b n E. Consequently, a n E am. Therefore, by the Myhill- Nerode theorem, L is not regular.

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite To because it To use contains use the the Myhill-Nerode Myhill-Nerode theorem, one string theorem, we for each we need natural need to to find find an an infinite infinite set set of of strings strings that number. Now, consider any strings a n that are are pairwise pairwise distinguishable distinguishable relative relative, to ato m E. E. S where a n We a m. Then a n b n E and a m b n E. We know know that Consequently, n that any any two two strings strings of of the the form form aa n E am. Therefore, by the Myhill- n and and aa m m,, where where n m, m, are are distinguishable. Nerode theorem, distinguishable. L not regular. So So build build the the set set S = { aa n n n N }. }. Notice Notice that that S isn't isn't a subset subset of of E. E. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of E! E!

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite To because it To use contains use the the Myhill-Nerode Myhill-Nerode theorem, one string theorem, we for each we need natural need to to find find an an infinite infinite set set of of strings strings that number. Now, consider any strings a n that are are pairwise pairwise distinguishable distinguishable relative relative, to ato m E. E. S where a n We a m. Then a n b n E and a m b n E. We know know that Consequently, n that any any two two strings strings of of the the form form aa n E am. Therefore, by the Myhill- n and and aa m m,, where where n m, m, are are distinguishable. Nerode theorem, distinguishable. L not regular. So So build build the the set set S = { aa n n n N }. }. Notice Notice that that S isn't isn't a subset subset of of E. E. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of E! E!

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite To because it To use contains use the the Myhill-Nerode Myhill-Nerode theorem, one string theorem, we for each we need natural need to to find find an an infinite infinite set set of of strings strings that number. Now, consider any strings a n that are are pairwise pairwise distinguishable distinguishable relative relative, to ato m E. E. S where a n We a m. Then a n b n E and a m b n E. We know know that Consequently, n that any any two two strings strings of of the the form form aa n E am. Therefore, by the Myhill- n and and aa m m,, where where n m, m, are are distinguishable. Nerode theorem, distinguishable. L not regular. So So pick pick the the set set S = { aa n n n N }. }. Notice Notice that that S isn't isn't a subset subset of of E. E. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of E! E!

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite To because it To use contains use the the Myhill-Nerode Myhill-Nerode theorem, one string theorem, we for each we need natural need to to find find an an infinite infinite set set of of strings strings that number. Now, consider any strings a n that are are pairwise pairwise distinguishable distinguishable relative relative, to ato m E. E. S where a n We a m. Then a n b n E and a m b n E. We know know that Consequently, n that any any two two strings strings of of the the form form aa n E am. Therefore, by the Myhill- n and and aa m m,, where where n m, m, are are distinguishable. Nerode theorem, distinguishable. L not regular. So So pick pick the the set set S = { aa n n n N }. }. Notice Notice that that S isn't isn't a subset subset of of E. E. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of E! E!

Theorem: The language E = { a n b n n N } is not regular. Proof: Let S = { a n n N }. This set is infinite because it contains one string for each natural number. Now, consider any strings a n, a m S where a n a m. Then a n b n E and a m b n E. Consequently, a n a m E. Therefore, by the Myhill- Nerode theorem, L is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set is infinite because it contains infinitely many strings of the form a n. Now, consider any x, y S where x y. Then x x EQ and y x EQ. Consequently, x EQ y. Therefore, by the Myhill-Nerode theorem, EQ is not regular.

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set is infinite because it To contains infinitely To use use the the Myhill-Nerode Myhill-Nerode theorem, many strings theorem, we of the we need form need a n. to Now, consider to find find an any infinite infinite set x, y set of S of strings where strings that x that are are pairwise y. Then pairwise distinguishable distinguishable relative relative to to EQ. EQ. x x EQ and y x EQ. Consequently, x We EQ y. We know know that that any any two two distinct distinct strings strings over over Therefore, the the by alphabet alphabet the Myhill-Nerode {a, {a, b} b} are are distinguishable. distinguishable. theorem, EQ is not regular. So So pick pick the the set set S = {a, {a, b}*. b}*. Notice Notice that that S isn't isn't a subset subset of of EQ. EQ. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of EQ! EQ!

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set is infinite because it To contains infinitely To use use the the Myhill-Nerode Myhill-Nerode theorem, many strings theorem, we of the we need form need a n. to Now, consider to find find an any infinite infinite set x, y set of S of strings where strings that x that are are pairwise y. Then pairwise distinguishable distinguishable relative relative to to EQ. EQ. x x EQ and y x EQ. Consequently, x We EQ y. We know know that that any any two two distinct distinct strings strings over over Therefore, the the by alphabet alphabet the Myhill-Nerode {a, {a, b} b} are are distinguishable. distinguishable. theorem, EQ is not regular. So So pick pick the the set set S = {a, {a, b}*. b}*. Notice Notice that that S isn't isn't a subset subset of of EQ. EQ. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of EQ! EQ!

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set is infinite because it To contains infinitely To use use the the Myhill-Nerode Myhill-Nerode theorem, many strings theorem, we of the we need form need a n. to Now, consider to find find an any infinite infinite set x, y set of S of strings where strings that x that are are pairwise y. Then pairwise distinguishable distinguishable relative relative to to EQ. EQ. x x EQ and y x EQ. Consequently, x We EQ y. We know know that that any any two two distinct distinct strings strings over over Therefore, the the by alphabet alphabet the Myhill-Nerode {a, {a, b} b} are are distinguishable. distinguishable. theorem, EQ is not regular. So So pick pick the the set set S = {a, {a, b}*. b}*. Notice Notice that that S isn't isn't a subset subset of of EQ. EQ. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of EQ! EQ!

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set is infinite because it To contains infinitely To use use the the Myhill-Nerode Myhill-Nerode theorem, many strings theorem, we of the we need form need a n. to Now, consider to find find an any infinite infinite set x, y set of S of strings where strings that x that are are pairwise y. Then pairwise distinguishable distinguishable relative relative to to EQ. EQ. x x EQ and y x EQ. Consequently, x We EQ y. We know know that that any any two two distinct distinct strings strings over over Therefore, the the by alphabet alphabet the Myhill-Nerode {a, {a, b} b} are are distinguishable. distinguishable. theorem, EQ is not regular. So So pick pick the the set set S = {a, {a, b}*. b}*. Notice Notice that that S isn't isn't a subset subset of of EQ. EQ. That's That's okay: okay: we we never never said said that that S needs needs to to be be a subset subset of of EQ! EQ!

Theorem: The language EQ = { w w w {a, b}*} is not regular. Proof: Let S = {a, b}*. This set contains infinitely many strings. Now, consider any x, y S where x y. Then x x EQ and y x EQ. Consequently, x EQ y. Therefore, by the Myhill-Nerode theorem, EQ is not regular.

Approaching Myhill-Nerode The challenge in using the Myhill-Nerode theorem is finding the right set of strings to use. General intuition: Start by thinking about what information a computer must remember in order to answer correctly. Choose a group of strings that all require different information. Prove that those strings are distinguishable relative to the language in question.

Tying Everything Together One of the intuitions we hope you develop for DFAs is to have each state in a DFA represent some key piece of information the automaton has to remember. If you only need to remember one of finitely many pieces of information, that gives you a DFA. You can formalize this! Take CS154 for details. If you need to remember one of infinitely many pieces of information, you can use the Myhill- Nerode theorem to prove that the language has no DFA.

Where We Stand

Where We Stand We've ended up where we are now by trying to answer the question what problems can you solve with a computer? We defined a computer to be DFA, which means that the problems we can solve are precisely the regular languages. We've discovered several equivalent ways to think about regular languages (DFAs, NFAs, and regular expressions) and used that to reason about the regular languages. We now have a powerful intuition for where we ended up: DFAs are finite-memory computers, and regular languages correspond to problems solvable with finite memory. Putting all of this together, we have a much deeper sense for what finite memory computation looks like and what it doesn't look like!

Where We're Going What does computation look like with unbounded memory? What problems can you solve with unbounded-memory computers? What does it even mean to solve such a problem? And how do we know the answers to any of these questions?

Next Time Context-Free Languages Context-Free Grammars Generating Languages from Scratch