Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany
Overview Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma
Finite Automata An intuitive example : supermarket door controller Figures 1,2,3 Probabilistic counterparts exist Markov chains, bayesian nets, etc. Not in this course
A finite automaton Figure 1.4 States : q, q, q 1 2 3 Startstate : q Acceptstate : q Transitions Output : accept or 1 2 reject Formally A finite automaton is a 5-tuple ( Q, Σ, δ, q, F) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet 3. δ : Q Σ Q is the transition function 4. q Q is the start stat e o 5. F Q is the set of accept state s 0
Ais the language of machine M we write LM ( ) = A A= { w wcontains at least one 1 and an even number of 0s follow the last 1 } Describe Σ= {0,1} 1 1 1 2 3 δ defined by 0 1 1 1 2 2 3 2 3 2 2 start state = { q } 2 M Q = { q, q, q } q F q q q q q q q q q
7,8,9 Other examples
Another example A generalisation : A i is the language of all strings where th e sum of the numbers is a multiple of iexcept that the sum is reset to 0 whenever the symbol reset appears Automaton B = 1. Q = { q,..., q} i 0 2. Σ ={0,1,2, reset } 3. δ ( q,0) = q j i i δ ( q,1) = q where k= ( j+ 1) mod i j δ ( q,2) = q where k= ( j+ 2) mod i j δ ( q, reset ) = q j k j k 4. q Q is start and accept state o 0
Formal definition of computation Let M be a finite automaton ( Q, Σ, δ, q, F) Let w= w... w be a string over Σ 1 n 0 M accepts wif a sequence of states r,..., r exists i n Qsuch that 1. r 0 0 δ i i+ 1 i+ 1 2. ( r, w ) = r for all i = 0,..., n 1 3. r n = q F 0 n M recognie z s language Aif A = { w M accepts w} A language is regular if some finite automaton r ecognizes it.
Designing finite automata Design automaton for language consisting of binary strings with an odd number of 1s Design first states Then transitions Accept and reject states Fig. 12
Another example Design an automaton to recognize the language of binary strings containing the string 001 as substring Fig 13
The regular operations Let A and B be languages We define : Union : A B = { x x A or x B} Concatenation : A B = { xy x Aand y B} Star * : { 1,..., n 0 and A = x x n each x A} note: always ε A * i Example A = { good, bad} B = { boy, girl }
Regular languages are closed under A set S is closed under an operation oif applying oon elements of S yields elements of S. Example: multiplication on natural numbe rs Counterexample :division of nat ural numbers Theorem 1.12 The class of the regular languages is cl osed under the union operation. In other words, if A and A are regular lan guages, so is A A Proof 1 2 1 2
M = ( Q, Σ, δ, q, F) constructed from M = ( Q, Σ, δ, q, F) and M = ( Q, Σ, δ, q, F) Define 1. Q= {( r, r ) r Q and r Q } 2. Σ=Σ Σ 1 1 1 1 1 1 2 2 2 2 2 2 1 2 1 1 2 2 1 2 3. δ(( r, r ), a) = ( δ ( r, a), δ ( r, a)) 1 2 1 1 2 2 4. q = ( q, q ) 1 2 5. F = {( r, r ) r F or r F } 1 2 1 1 2 2
Theorem 1.13 The class of the regular languages is cl osed under the concatenation operation. In other words, if A and A are regular lan guages, so is A A 1 2 1 2
Non deterministic finite automata Deterministic One successor state ε transitions not allowed Non deterministic Several successor states possible ε transitions possible Figure 14
Deterministic versus non deterministic Figure 15 computation
Example
Every NFA has an equivalent DFA Figures 17-18
Another NFA
Nondeterministic finite automaton A nondeterministic finite automaton is a 5-tuple ( Q, Σ, δ, q, F) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet 3. δ : Q Σ P( Q) is the transition function 4. q Q is the start state o ε 5. F Qis the set of accept states 0 Σ ε includes ε P( Q) the powerset o f Q
Example 18 Example
Formal definition of computation Let M be a nondeterministic finite automaton ( Q, Σ, δ, q, F) Let w= w... w be a string over Σ 1 n 0 M accepts wif a sequence of states r,..., r exists i n Qsuch that 1. r = q 0 0 2. r δ ( r, w ) for all i = 0,..., n 1 i+ 1 i i+ 1 3. r n F 0 n
Equivalence NFA and DFA Two machines are equivalent if they recognize the same language Theorem 1.19 Every nondeterministic finite automaton has an equivalent finite automaton Corollary 1.20 A language is regular if and onl yi f automaton recognizes it. some nondeterministic finite
Proof
An example
Closure under the regular operations Theorem 1.12/1.22 The class of the regular languages is cl osed under the union operation. In other words, if A and A are regular lan guages, so is A A 1 2 1 2 Theorem 1.23 The class of the regular languages is cl osed under the concatenation operation. Theorem 1.24 The class of the regular languages is cl osed under the star operation.
Proof idea Theorem 1.12/1.22 The class of the regular languages is cl osed under the union operation. In other words, if A and A are regular lan guages, so is A A 1 2 1 2 INSERT FIG 1.24
PROOF P 60!
Proof idea Theorem 1.23 The class of the regular languages is cl osed under the concatenation operation.
Proof idea Theorem 1.24 The class of the regular languages is cl osed under the star operation.
Regular expressions
R R ε R ε R
Applications Design of compilers + ε DD DD D D DD * * * * {,, }(.. ) where D = {0,..., 9} awk, grep, vi in unix (search for strings) Perl programming language Bioinformatics So called motifs (patterns occurring in sequences, e.g. proteins)
Theorem1.28 A language is regular if and only if som e regular expression describes it Proof through : Lemma1.29 If a language is described by some regul ar expression, then it is regular Lemma 1.32 If a language is regular, then it is des cribed by some regular expression
Lemma1.29 If a language is described by some regul ar expression, then it is regular
Lemma 1.32 If a language is regular, then it is des cribed by some regular expression Two steps DFA into GNFA (generalized nondeterministic finite automaton) GNFA into regular expression
Two states q and r are connected in both directions Exception : Start state Accept state One direction only Labels are regular expressions GNFAs
A generalized nondeterministic finite au tomaton is a 5-tuple ( Q, Σ, δ, q, q ) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet Formally 3. δ : ( Q { q }) ( Q { q }) R is the transitio n function accept 4. q Q is the start state start 5. q Q the accept state accept 0 start A GNFA =... where each Σ accepts w w1 wk wi if a sequence of states r,..., r exists in Qsuch that 1. r 2. r k = q = q start accept 3. for all i = 0,..., n 1, we ha ve that w L( R ) where R = δ ( r, r) i i 1 i 0 n i i * start accept
Convert DFA into GNFA Add new start state, with εarrow to old s tart state Add new accept state, with εarrows from ol d accept states If any arrows have multiple labels aand b, replace by a b Add arrows with label between sta tes where necessary
Convert GNFA into regular expression
Claim 1.34 Induction Proof For any GNFA G, Convert( G) is equivalent t o G Induction proof Basis k = 2; straightforward Induction Assume claim holds for Suppose k 1 states
Nonregular Languages Finite Automata have a finite memory Are the following languages regular? n n B = {0 1 n 0} C = { w whas an equal number of 0s and 1s} D= { w w has an equal number of occurences of 0 1 and 10} Mathematical proof necessary
The pumping lemma The pumping lemma If Ais regular, then there is a number p, for which if sis any string in Aof length at least p then s may be divided into three pieces s = xyz such that i 1. for each i 0, xy z A 2. y > 0 3. xy p Note y ε
Proof Idea
Nonregular languages B n n = {0 1 n 0} p Choose s = 0 1 p 1. string y consists only of 0s 2. string yconsists only of 1s 3. string yconsists of both 0s and 1s
C = { w whas an equal number of 0s and 1s} p Choose s = 0 1 p Would seem possible without condition 3! However, condition 3 of lemma states xy p Thus y consists of 0s only Then xyyz C Choice of scrucial. Consider s= (01) Alternative proof : B is nonregular * * If were regular, then 0 1 regular C C = B Regular languages closed under intersection p
F * = { ww w {0,1}} p p Choose s = 0 10 1 Condition 3 of lemma states xy p Thus y consists of 0s only Then xyyz F p p 0 0 would not work!
Choose s = 1 2 D i i+ 1 xy z xy z n 2 = {1 n> 0} Perfect squares : 0,1,4,9,16,25,36,49,... Consider and i i+ 1 Prove that for large : and cannot both be squares Let m which should be true according to p umping lemma n 2 = = p ixyz xy z 2 2 Then ( 1) 2 1 2 1 4 <2 1 i xy z i Choose y < 2 m+ 1= 2 xy z + 1 by setting i= n+ n = n+ = m+ p Indeed, observe y s = p = p p 4 2 4 + i <2 xy z + 1
i j E = { w 0 1 where i > j} p+ p Choose s = 0 1 Condition 3 of lemma states xy p Thus y consists of 0s only Then 0 xy z F
Summary Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma