TDDD65 Introduction to the Theory of Computation

TDDD65 Introduction to the Theory of Computation Lecture 2 Gustav Nordh, IDA gustav.nordh@liu.se 2012-08-31

Outline - Lecture 2 Closure properties of regular languages Regular expressions Equivalence of regular expressions and finite automata Pumping lemma for regular languages

Closure properties of regular languages The natural numbers N = {0, 1, 2, 3,...} are closed under multiplication in the sense that for any natural numbers x and y, x y is again a natural number The natural numbers are not closed under subtraction (3 5 = 2 which is not a natural number) Definition We say that a class of languagesc is closed under an operation op if applying op to any languages from C results in a language in C

Closure properties of regular languages Understanding under which operations a class of languagesc is closed is important!

Closure properties of regular languages Theorem The class of regular languages is closed under union (if L 1 and L 2 are regular languages, then so is L 1 L 2 ) Proof.

Closure properties of regular languages Theorem The class of regular languages is closed under concatenation (if L 1 and L 2 are regular languages, then so is L 1 L 2 ) Proof.

Closure properties of regular languages Theorem The class of regular languages is closed under star (if L 1 is a regular language, then so is L 1 ) Proof.

Regular expressions

Definition of regular expressions Definition L(R) denotes the language described by the regular expression R. R is a regular expression if R is 1 a for a Σ, L(a) = {a} 2, L() = {} 3, L( ) = 4 R 1 + R 2 where R 1 and R 2 are regular expressions, L(R 1 + R 2 ) = L(R 1 ) L(R 2 ) 5 R 1 R 2 where R 1 and R 2 are regular expressions, L(R 1 R 2 ) = L(R 1 )L(R 2 ) 6 R 1 where R 1 is a regular expression, L(R 1 ) = L(R 1) has higher precedence than concatenation and +, concatenation has higher precedence than +

Examples of regular expressions Example (0+1) 0 binary strings ending with 0 (0+1) 00(0+1) binary strings with at least two consecutive 0 s (0+1+2+3+4+5+6+7+8+9) 1234(0+ 1+2+ 3+4+5+6+7+8+9)

Equivalence with finite automata Theorem A language is regular if and only if some regular expression describes it

Equivalence with finite automata If a language is described by a regular expression then it is recognized by a NFA

Equivalence with finite automata If a language is described by a regular expression then it is recognized by a NFA Proof. R = a for a Σ, L(R) = {a} R =, L(R) = {} a R =, L(R) =

Equivalence with finite automata If a language is described by a regular expression then it is recognized by a NFA Proof. R = R 1 + R 2, L(R) = L(R 1 ) L(R 2 )

Equivalence with finite automata If a language is described by a regular expression then it is recognized by a NFA Proof. R = R 1 R 2, L(R) = L(R 1 )L(R 2 )

Equivalence with finite automata If a language is described by a regular expression then it is recognized by a NFA Proof. R = R 1, L(R) = L(R 1)

Equivalence with finite automata: Example Example (0+1) to NFA 0 0 1 1 0+1 0 1 (0+1) 0 1

Equivalence with finite automata RegExp Closure Properties NFA Subset Construction DFA

Equivalence with finite automata RegExp Closure Properties GNFA Construction NFA Subset Construction DFA

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Idea: Use a generalized NFA (GNFA) where the transition arrows can be labeled by regular expressions Given a DFA Add a new start state with an transition to the old start state Add a new accept state with transitions from all old accept states Replace transitions of the form a, b, c by a+b + c

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Add a new start state with an transition to the old start state Add a new accept state with transitions from all old accept states Replace transitions of the form a, b, c by a+b + c 0 1 1 q 1 q 2 0

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate a state different from the start and accept state (reducing the number of states by 1) R 2 R 1 R 3 q 1 q e q 2 R 4

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 2 0 1 q s 1 q 1 q 2 q F 0

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 2 0 1 q s 1 q 1 q 2 q F 0 Using the rule R 1 R 2 R 3 + R 4 the new transition from q 1 to q F is labeled 11 +

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 2 0 1 q s 1 q 1 q 2 q F 0 11 0+0 q s q 1 11 + q F

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 1 11 0+0 q s q 1 11 + q F

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 1 11 0+0 q s q 1 11 + q F Using the rule R 1 R 2 R 3 + R 4 the new transition from q s to q F is labeled (11 0+0) (11 + )+

Equivalence with finite automata If a language is recognized by a DFA then it is described by a regular expression Eliminate q 1 11 0+0 q s q 1 11 + q F q s (11 0+0) (11 + )+ q F

Equivalence with finite automata RegExp Closure Properties GNFA Construction NFA Subset Construction DFA

Nonregular languages

Nonregular languages Example L = {a n b n n 0} is not a regular language Why?

Nonregular languages Example L = {a n b n n 0} is not a regular language Why? Imagine a DFA that recognize L When reading a string the DFA needs to keep track of the number of a s seen so far so that it can check that the same number of b s follow The number of a s is unbounded so the DFA needs to remember an unbounded number This is not possible since the memory of the DFA is bounded (finite number of states)

Nonregular languages, pumping Given a sufficiently long string s in a regular language L Then the DFA that recognizes L will be in the same state q t several times s = s 1 s j 1 s j s k 1 s k s n L }{{}}{{}}{{} q t q t q accept

Nonregular languages, pumping Given a sufficiently long string s in a regular language L Then the DFA that recognizes L will be in the same state q t several times So for any i 0 s = s 1 s j 1 s j s k 1 s k s n L }{{}}{{}}{{} q t q t q accept s = s 1 s j 1 (s j s k 1 ) i s k s n L }{{}}{{}}{{} q t q t q accept

The Pumping for Regular Languages If L is a regular language, then there exists a positive integer p (the pumping length) such that every string s L, s p, can be partitioned into three pieces, s = xyz, such that the following conditions hold: y > 0, xy p, and for each i 0, xy i z L

Proof of the Pumping If L is a regular language, then there exists a positive integer p (the pumping length) such that every string s L, s p, can be partitioned into three pieces, s = xyz, such that the following conditions hold: y > 0, xy p, and for each i 0, xy i z L Proof. If L is regular, let M be a DFA recognizing L. Let p be the number of states of M and let s = s 1 s 2 s n L with n p. Let r 1 r n+1 be the sequence of states that M enters while reading s. Among the first p+1 states in this sequence, two must be the same. Let r j = r k (j < k) be such two. Let x = s 1 s j 1, y = s j s k 1, and z = s k s n.

Strategy for showing that a language L is not regular 1 Assume that L is regular (with the aim of reaching a contradiction). 2 Choose a string s L, such that s p where p is the pumping length given by the Pumping. The Pumping then says that s can be partitioned into three pieces s = xyz where y > 0 and xy p, such that xy i z L for all i 0. 3 Show that for ALL POSSIBLE partitions of s = xyz satisfying y > 0 and xy p there exists an i such that xy i z / L. If we succeed to show this, then we have a contradiction with the Pumping and our assumption that L is regular is wrong.

The standard example Example L = {a n b n n 0} is not regular. Proof. 1 Assume that L is regular 2 Choose s = a p b p where p is the pumping length given by the Pumping. s L and s p, so the Pumping says that s can be partitioned into three pieces s = xyz where y > 0 and xy p, such that xy i z L for all i 0. 3 s = a p b p = xyz where y > 0 and xy p. For all such partitions of s, y is a string of a s of length at least 1. Choose i = 2, xy 2 z contains more a s than b s, thus, xy 2 z / L. This contradicts the Pumping, hence, our assumption that L is regular is wrong.

Regular languages: Summary Regular languages are the languages recognized by DFAs For every NFA there is an equivalent DFA The subset construction A language can be described by a regular expression if and only if it can be recognized by a DFA GNFA construction, closure properties There are simple non-regular languages Pumping lemma