The Pumping Lemma: limitations of regular languages

Similar documents
Inf2A: The Pumping Lemma

Proving languages to be nonregular

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

(Refer Slide Time: 0:21)

Turing machines and linear bounded automata

Deterministic finite Automata

Languages, regular languages, finite automata

Regular expressions and Kleene s theorem

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Unit 6. Non Regular Languages The Pumping Lemma. Reading: Sipser, chapter 1

Turing machines and linear bounded automata

Turing machines and linear bounded automata

Foundations of (Theoretical) Computer Science

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Introduction to Turing Machines. Reading: Chapters 8 & 9

We have seen many ways to specify Regular languages. Are all languages Regular languages?

1 More finite deterministic automata

11-711: Algorithms for NLP Homework Assignment #1: Formal Language Theory Solutions

Properties of Regular Languages. Wen-Guey Tzeng Department of Computer Science National Chiao Tung University

Notes on Pumping Lemma

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Computer Sciences Department

Regular expressions and Kleene s theorem

Mathematics for linguists

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

CSE 322, Fall 2010 Nonregular Languages

CSE 105 THEORY OF COMPUTATION

Chapter 6. Properties of Regular Languages

CSci 311, Models of Computation Chapter 4 Properties of Regular Languages

Undecidability COMS Ashley Montanaro 4 April Department of Computer Science, University of Bristol Bristol, UK

Fooling Sets and. Lecture 5

Formal Languages, Automata and Models of Computation

CpSc 421 Homework 9 Solution

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

CS 154, Lecture 4: Limitations on DFAs (I), Pumping Lemma, Minimizing DFAs

Nondeterministic finite automata

Deterministic Finite Automata (DFAs)

2. Syntactic Congruences and Monoids

Deterministic Finite Automata (DFAs)

Finite Automata (contd)

What we have done so far

CSE 105 THEORY OF COMPUTATION

Before we show how languages can be proven not regular, first, how would we show a language is regular?

Please give details of your calculation. A direct answer without explanation is not counted.

TURING MAHINES

Name: Student ID: Instructions:

Introduction to the Theory of Computing

CSCE 551 Final Exam, Spring 2004 Answer Key

Chapter 2 The Pumping Lemma

CS500 Homework #2 Solutions

Part 4 out of 5 DFA NFA REX. Automata & languages. A primer on the Theory of Computation. Last week, we showed the equivalence of DFA, NFA and REX

Lecture 2: Connecting the Three Models

Regular expressions and Kleene s theorem

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Deterministic Finite Automata (DFAs)

One-to-one functions and onto functions

More on Regular Languages and Non-Regular Languages

3515ICT: Theory of Computation. Regular languages

PGSS Discrete Math Solutions to Problem Set #4. Note: signifies the end of a problem, and signifies the end of a proof.

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Formal Language and Automata Theory (CS21004)

CSE 105 THEORY OF COMPUTATION

Constructions on Finite Automata

Warm-up Quantifiers and the harmonic series Sets Second warmup Induction Bijections. Writing more proofs. Misha Lavrov

Nondeterministic Finite Automata

Lecture 4: More on Regexps, Non-Regular Languages

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Constructions on Finite Automata

More Properties of Regular Languages

CS243, Logic and Computation Nondeterministic finite automata

Introduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions

Computational Models - Lecture 3

Computational Theory

CSE 105 Theory of Computation Professor Jeanne Ferrante

Computational Models - Lecture 3 1

Computational Models #1

ALGEBRA. 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers

Non-regular languages

Tasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering

Chapter 2. Mathematical Reasoning. 2.1 Mathematical Models

CS5371 Theory of Computation. Lecture 12: Computability III (Decidable Languages relating to DFA, NFA, and CFG)

Homework 3: Solutions

Computational Models - Lecture 1 1

Finite Automata and Regular languages

Theory of Computation 3 Deterministic Finite Automata

Adam Blank Spring 2017 CSE 311. Foundations of Computing I. * All slides are a combined effort between previous instructors of the course

Deterministic Finite Automata (DFAs)

2.2 Some Consequences of the Completeness Axiom

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

CS154. Non-Regular Languages, Minimizing DFAs

CS5371 Theory of Computation. Lecture 5: Automata Theory III (Non-regular Language, Pumping Lemma, Regular Expression)

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

Turing Machines Part III

Research Methods in Mathematics Homework 4 solutions

Computational Models: Class 3

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

1 Showing Recognizability

1. Induction on Strings

Transcription:

The Pumping Lemma: limitations of regular languages Informatics 2A: Lecture 8 John Longley School of Informatics University of Edinburgh jrl@inf.ed.ac.uk 6 October, 2016 1 / 16

Recap of Lecture 7 Showing a language isn t regular Lexical classes in programming languages may typically be specified via regular languages. The lexing algorithm runs a parallel NFA in order to find the next lexeme using the principle of longest match. Regular language theory can also be used in verifying subtle properties of large finite-state systems (e.g. those arising from interactions of several simpler systems). 2 / 16

Non-regular languages We ve hinted before that not all languages are regular. E.g. Java (or any other general-purpose programming language). The language {a n b n n 0}. The language of all well-matched sequences of brackets (, ). N.B. A sequence x is well-matched if it contains the same number of opening brackets ( and closing brackets ), and no initial subsequence y of x contains more ) s than ( s. But how do we know these languages aren t regular? Can we come up with a general technique for proving the non-regularity of languages? 3 / 16

The basic intuition: DFAs can t count! Consider L = {a n b n n 0}. Just suppose, hypothetically, there were some DFA M with L(M) = L. Suppose furthermore that M had just processed a n, and some continuation b m was to follow. Intuition: M would need to have counted the number of a s, in order to know how many b s to require. More precisely, let q n denote the state of M after processing a n. Then for any m n, the states q m, q n must be different, since b m takes us to an accepting state from q m, but not from q n. In other words, M would need infinitely many states, one for each natural number. Contradiction! 4 / 16

Exercises Showing a language isn t regular Which of the following languages are regular? 1 Strings with an odd number of a s and an even number of b s. 2 Strings containing strictly more a s than b s. 3 Strings such that (no. of a s) (no. of b s) 6 mod 24. 4 Strings over {0,..., 9} representing integers divisible by 43. 5 / 16

Exercises Showing a language isn t regular Which of the following languages are regular? 1 Strings with an odd number of a s and an even number of b s. 2 Strings containing strictly more a s than b s. 3 Strings such that (no. of a s) (no. of b s) 6 mod 24. 4 Strings over {0,..., 9} representing integers divisible by 43. Answer: 1 is regular (see similar example in Lecture 4). 2 isn t regular: intuitively, we d need to keep track of difference between no. of a s and noȯf b s, which could be any integer. 3 is regular: we only need to keep track of no. of a s mod 24 and no. of b s mod 24, which we can do with 24 24 = 576 states. 4 is regular: we can keep track of the number read so far mod 43. On reading a new digit d, we go from state i to state (10i + d) mod 43. 5 / 16

Loops in DFAs Showing a language isn t regular Let M be a DFA with k states. Suppose, starting from some state of M, we process a string y of length y k. We then pass through a sequence of y + 1 states. So there must be some state q that s visited at least twice: u q w v (Note that u and w might be ɛ, but v definitely isn t.) So any string y with y k can be decomposed as uvw, where u is the prefix of y that leads to the first visit of q v takes us once round the loop from q to q, w is whatever is left of y after uv. 6 / 16

A general consequence If L is any regular language, we can pick some DFA M for L, and it will have some number of states say k. Suppose we run M on a string xyz L, where y k. There must be at least one state q visited twice in the course of processing y: x u q w z (There may be other revisited states not indicated here.) v 7 / 16

The idea of pumping x u w z q v So y can be decomposed as uvw, where xu takes M from the initial state to q, v ɛ takes M once round the loop from q to q, wz takes M from q to an accepting state. But now M will be oblivious to whether, or how many times, we go round the v-loop! So we can pump in as many copies of the substring v as we like, knowing that we ll still end in an accepting state. 8 / 16

: official form basically summarizes what we ve just said. Pumping Lemma. Suppose L is a regular language. Then L has the following property. (P) There exists k 0 such that, for all strings x, y, z with xyz L and y k, there exist strings u, v, w such that y = uvw, v ɛ, and for every i 0 we have xuv i wz L. 9 / 16

: contrapositive form Since we want to use the pumping lemma to show a language isn t regular, we usually apply it in the following equivalent but back-to-front form. Suppose L is a language for which the following property holds: ( P) For all k 0, there exist strings x, y, z with xyz L and y k such that, for every decomposition of y as y = uvw where v ɛ, there is some i 0 for which xuv i wz L. Then L is not a regular language. N.B. can only be used to show a language isn t regular. Showing L satisfies (P) doesn t prove L is regular! To show that a language is regular, give some DFA or NFA or regular expression that defines it. 10 / 16

: a user s guide So to show some language L is not regular, it s enough to show that L satisfies ( P). Note that ( P) is quite a complex statement: k x, y, z u, v, w i Helpful intuition: Values for the variables quantified by are chosen by an imaginary opponent who is claiming that P is true. Values for the variables quantified by are chosen by you. We ll look at a simple example first, then offer some advice on the general pattern of argument. 11 / 16

Example 1 Showing a language isn t regular Consider L = {a n b n n 0}. We show that L satisfies ( P). Suppose k 0. (Opponent chooses the value of k. Our argument has to work for all k.) Consider the strings x = ɛ, y = a k, z = b k. Note that xyz L and y k as required. (We make a cunning choice of x, y, z.) Suppose now we re given a decomposition of y as uvw with v ɛ. (Opponent chooses this decomposition. Our argument has to work for all such u, v, w.) Let i = 0. (We make a cunning choice of i.) Then uv i w = uw = a l for some l < k. So xuv i wz = a l b k L. (We win!) Thus L satisfies ( P), so L isn t regular. 12 / 16

Use of pumping lemma: general pattern On the previous slide, the full argument is in black, whereas the parenthetical comments in blue are for pedagogical purposes only. The comments emphasise the care that is needed in dealing with the quantifiers in the property ( P). In general: You are not allowed to choose the number k 0. Your argument has to work for every possible value of k. You have to choose the strings x, y, z, which might depend on k. You must choose these to satisfy xyz L and y k. Also, y should be chosen cunningly to disallow pumping... You are not allowed to choose the strings u, v, w. Your argument has to work for every possible decomposition of y as uvw with v ɛ. You have to choose the number i ( 1) such that xuv i wz L. Here i might depend on all the previous data. 13 / 16

Example 2 Showing a language isn t regular Consider L = {a n2 n 0}. We show that L satisfies ( P): Suppose k 0. Let x = a k2 k, y = a k, z = ɛ, so xyz = a k2 L. Given any splitting of y as uvw with v ɛ, we have 1 v k. Take i = 2. Since xuvwz = a k2, we have that xuv 2 wz = a k2 + v. And 1 v k means that k 2 + 1 k 2 + v k 2 + k. However, there are no perfect squares between k 2 and k 2 + 2k + 1. So the length of xuv 2 wz isn t a perfect square. Thus xuv 2 wz L. Thus L satisfies ( P), so L isn t regular. 14 / 16

Subtle point: what are the x and z for? All the action seems to happen within y = uvw. Do we really need x and z? Often, we can get away with taking x = z = ɛ. But other choices may of x, z may give us more control over where the loop occurs. Example: L is the set of strings containing more a s than b s. First approach: Given k, take x = ɛ, y = a k+1 b k, z = ɛ. If y = uvw, we have three cases to consider, according to where v begins and ends. Second approach: Given k, take x = a k+1, y = b k, z = ɛ. Only one case to consider. Taking i = 2 always works. 15 / 16

Reading and prospectus Relevant reading: Kozen chapters 11, 12. That concludes the course material on (formal) regular languages. Next time, we start on the next level up in the Chomsky hierarchy: context-free languages. 16 / 16