FORMAL LANGUAGES & AUTOMATA

VICKY G

We are going to study the relationship between a special kind of machine (automata), languages and a special kind of algebra (monoids).

[Diagram relating Automata, Regular Languages and Monoids.]

1. Monoids and Languages

A monoid is a set $M$ together with an associative binary operation, which has an identity. Thus a monoid has the following algebraic relations:

$a, b \in M \Rightarrow ab \in M$,
$a(bc) = (ab)c$ for all $a, b, c \in M$,
there exists $1 \in M$ such that $1a = a = a1$ for all $a \in M$.

Note. The identity of $M$ is unique.

1.1. Alphabets, Words and Languages

We study (sets of) sequences of symbols.

Definition: An alphabet is a finite non-empty set $A$. A letter is an element of $A$ and a word (or string) over $A$ is a finite sequence of elements of $A$.

Example 1.1. If we have the alphabet $A = \{0, 1\}$ then the following are words over $A$: $0$, $10$, $01011$. If $A = \{a, b\}$ then $a, b, ab, ba, aa, aab, \dots$ are all words over $A$. If $A$ is the standard alphabet $\{a, b, \dots, z\}$ then $cat$ and $tz$ are words over $A$.

Note. If $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_m \in A$ then $a_1 a_2 \dots a_n = b_1 b_2 \dots b_m \Leftrightarrow n = m$ and $a_i = b_i$ for $1 \le i \le n$.

We define $A^+ = \{a_1 a_2 \dots a_n \mid n \in \mathbb{N},\ a_i \in A,\ 1 \le i \le n\}$ to be the set of all non-empty words over $A$. We also introduce the empty word $\varepsilon$ (in some books denoted $1$ or $\lambda$). Now $A^* = A^+ \cup \{\varepsilon\}$ is the set of all words over $A$. A language (over $A$) is a subset of $A^*$.

Definition: A language $L$ is finite if $|L| < \infty$. A language $L$ is cofinite if $L^c = A^* \setminus L$ is finite.

Example 1.2. $\emptyset$, $\{\varepsilon\}$, $\{a, ab, ba\}$ are finite languages.

Length of Words

The empty word has no letters and so $|\varepsilon| = 0$. Also, $|a_1 a_2 \dots a_n| = n$.

Note. $|xy| = |x| + |y|$.

Example 1.3. $|abba| = 4$, $|a| = 1$ and $|aa| = |ab| = 2$. Let us take the language $L = \{w \in A^* \mid |w| \ge 2\}$; then this is cofinite. This is because $L^c = \{\varepsilon\} \cup A$ is finite.

Concatenation of Words

Take $x, y \in A^*$; then we form a new word $xy$ by putting $x$ and $y$ together, end to end.

Example 1.4. Let $x = ab$ and $y = bc$; then by $xy$ we refer to $abbc$.

Note. $\varepsilon x = x = x\varepsilon$ for all $x \in A^*$. Also $(xy)z = x(yz)$ for all $x, y, z \in A^*$. Hence $A^*$ is a monoid with identity element $\varepsilon$, called the free monoid on $A$. We note that $A^*$ is not a group, as only $\varepsilon$ has an inverse element. This is because given any $x \ne \varepsilon$ there can never be $y$ such that $xy = \varepsilon$ (since $|xy| = |x| + |y| > 0$).

For $a \in A$, $a^n$ ($n \ge 0$) is the word consisting of $n$ $a$'s, i.e. $a^0 = \varepsilon$, $a^1 = a$, $a^2 = aa$, $a^3 = aaa$, etc.

Note. $\{a\}^* = \{\varepsilon, a, aa, aaa, \dots\} = \{\varepsilon, a, a^2, a^3, \dots\} = \{a^n \mid n \ge 0\}$.

For any $x \in A^*$, $x^n = \underbrace{xx \dots x}_{n \text{ times}}$.

Example 1.5. $x^0 = \varepsilon$; if $x = ab$ then $x^3 = ababab$.

Clearly $x^n x^m = x^{n+m}$ and $(x^n)^m = x^{nm}$ for all $n, m \ge 0$, i.e. the index laws hold.

Letter Count

If $a \in A$ and $x \in A^*$, then $|x|_a$ is the number of occurrences of $a$ in $x$.

Example 1.6. If $A = \{a, b, c\}$ then $|abca|_a = 2$, $|\varepsilon|_b = 0$, $|ccc|_b = |c^2 c|_b = 0$ and $|c^2 c|_c = 3$.

Prefix

$y$ is a prefix of a word $x \in A^*$ if $x = yz$ for some $z \in A^*$. We note that $\varepsilon$ is a prefix of $x$ for any $x \in A^*$, as $x = \varepsilon x$. Also, any word $x \in A^*$ is a prefix of itself because $x = x\varepsilon$. If $x = a^2 b$, then the prefixes of $x$ are $\varepsilon$, $a$, $a^2$, $a^2 b$.
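The notions of this subsection (length, concatenation, powers, letter count and prefixes) translate directly into operations on strings. The following is a minimal sketch, assuming words are modelled as Python strings with one character per letter; the function names `power`, `letter_count` and `prefixes` are our own and are not part of the notes.

```python
def power(x: str, n: int) -> str:
    """Return x^n: the word x written n times (x^0 is the empty word)."""
    return x * n


def letter_count(x: str, a: str) -> int:
    """Return |x|_a: the number of occurrences of the letter a in x."""
    return sum(1 for letter in x if letter == a)


def prefixes(x: str) -> list[str]:
    """Return every prefix y of x, i.e. every y with x = yz for some word z."""
    return [x[:i] for i in range(len(x) + 1)]


# Concatenation is string addition, and length is len().
xy = "ab" + "bc"
print(xy, len(xy))          # the word abbc and its length
print(power("ab", 3))       # (ab)^3
print(prefixes("aab"))      # prefixes of a^2 b, shortest first
```

The empty word is the empty string `""`, so the monoid laws for $A^*$ correspond to `"" + x == x == x + ""` and the associativity of `+` on strings.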

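1.2. Operations on Languages

Recall that a language over $A$ is a subset of $A^*$. We have that $\emptyset$ and $A^*$ are languages over $A$, and $\emptyset \subseteq L \subseteq A^*$ for any language $L$.

Boolean Operations

If $L, K$ are languages then $L \cup K$, $L \cap K$, $L \setminus K$ and $L^c = A^* \setminus L$ are also languages.

Product: Let $L, K \subseteq A^*$; then we define $LK = \{xy \mid x \in L,\ y \in K\}$.

Example 1.7. If $\{a, ab\}$ and $\{b, bc\}$ are languages then $\{a, ab\}\{b, bc\} = \{ab, abc, abb, abbc\}$.

As an exercise show that $(KL)M = K(LM)$.

For $w \in A^*$ and $L \subseteq A^*$, we usually write $wL$ for $\{w\}L$ and $Lw$ for $L\{w\}$, etc. So $wL = \{wv \mid v \in L\}$ and $KwL = K\{w\}L = \{uwv \mid u \in K,\ v \in L\}$.

As usual, $L^n$ denotes $\underbrace{L \dots L}_{n \text{ times}}$. So $L^1 = L$, $L^2 = LL$, $L^3 = LLL$, ..., $L^{n+1} = L^n L$, $L^0 = \{\varepsilon\}$.

The (Kleene) Star of $L \subseteq A^*$ is

$L^* = \{x_1 x_2 \dots x_n \mid n \ge 0 \text{ and } x_i \in L,\ 1 \le i \le n\} = L^0 \cup L^1 \cup L^2 \cup \dots = \bigcup_{n \ge 0} L^n$.

Example 1.8. For $a \in A$ and $L = \{a^2\}$ we have $L^* = \{\varepsilon, a^2, a^4, a^6, \dots\} = \{a^{2n} \mid n \ge 0\}$.

Example 1.9. For $a, b \in A$ and $L = \{ab, ba\}$ we have $L^* = \{\varepsilon, ab, ba, abab, abba, baab, baba, \dots\}$.

Example 1.10. $\{\varepsilon\}^* = \{\varepsilon\} = \emptyset^*$.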
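The product and star of finite languages can be explored by direct computation. The sketch below assumes languages are finite Python sets of strings; since $L^*$ is usually infinite, the hypothetical helper `star_up_to` only lists the words of $L^*$ up to a chosen length. None of these function names come from the notes.

```python
def product(L: set[str], K: set[str]) -> set[str]:
    """Return the product LK = {xy : x in L, y in K}."""
    return {x + y for x in L for y in K}


def lang_power(L: set[str], n: int) -> set[str]:
    """Return L^n, with L^0 = {empty word}."""
    result = {""}
    for _ in range(n):
        result = product(result, L)
    return result


def star_up_to(L: set[str], max_len: int) -> set[str]:
    """Return the words of L* whose length is at most max_len."""
    found = {""}          # L^0 contributes the empty word
    frontier = {""}       # words discovered in the previous round
    while frontier:
        longer = {w for w in product(frontier, L) if len(w) <= max_len}
        frontier = longer - found   # keep only genuinely new words
        found |= frontier
    return found


print(sorted(product({"a", "ab"}, {"b", "bc"})))
print(sorted(star_up_to({"ab", "ba"}, 4), key=lambda w: (len(w), w)))
```

The loop in `star_up_to` stops because only finitely many words of bounded length can be built from a finite language, and each round keeps only words not seen before.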

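If $L = \{w\}$ we sometimes write $w^*$ for $\{w\}^*$, but be careful:

$ab^* = \{a\}\{b\}^* = \{a\}\{b^n : n \ge 0\} = \{ab^n \mid n \ge 0\}$;

the star is only attributed to the $b$. So $\{ab\}^*$ is written as

$(ab)^* = \{(ab)^n \mid n \ge 0\} = \{\varepsilon, ab, abab, ababab, \dots\}$.

Thus $A^* a b^* a$ means $A^*\{a\}\{b\}^*\{a\} = \{w a b^n a \mid w \in A^*,\ n \ge 0\}$.

2. Automata

A point of grammar: the singular form of automata is automaton.

2.1. Various kinds

We concentrate on two kinds of finite state automata.

DFA: complete, deterministic, finite state automata.
NDA: non-deterministic finite state automata (usually not complete either).

Formally a DFA is a 5-tuple $\mathcal{A} = (A, Q, \delta, q_0, F)$ where

$A$ is an alphabet (so $|A| < \infty$),
$Q$ is a finite set of states,
$q_0 \in Q$ is the initial state,
$F \subseteq Q$ is the set of final (or accepting, or terminal) states,
$\delta : Q \times A \to Q$ is the state transition function or next state function.

2.2. State Transition Diagrams

There are various different kinds of objects in a State Transition Diagram. States are represented by circles: a state $q$ is drawn as a circle containing $q$, final states and the initial state each carry their own marking, and $\delta(q_1, a) = q_2$ is indicated by an arrow labelled $a$ from $q_1$ to $q_2$.

Example 2.1. Let $A = \{a, b\}$; then the following

[State transition diagram: states $q_0$ and $q_1$, with arrows labelled $a, b$ from $q_0$ to $q_1$ and from $q_1$ to $q_0$.]

is the state transition diagram of the DFA

$\mathcal{A} = (\{a, b\}, \{q_0, q_1\}, \delta, q_0, \{q_1\})$.

Now we describe $\delta$ as

$\delta(q_0, a) = q_1 = \delta(q_0, b)$, $\delta(q_1, a) = q_0 = \delta(q_1, b)$.

We can describe $\delta$ by a table:

        a     b
q_0    q_1   q_1
q_1    q_0   q_0

For a DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$ we extend $\delta$ to give a function $\delta : Q \times A^* \to Q$ as follows:

$\delta(q, \varepsilon) = q$ for all $q \in Q$,
$\delta(q, wa) = \delta(\delta(q, w), a)$ for all $w \in A^*$, $a \in A$, $q \in Q$.

Returning to the example above we have

$\delta(q_0, aba) = \delta(\delta(q_0, ab), a) = \delta(\delta(\delta(q_0, a), b), a) = \delta(\delta(q_1, b), a) = \delta(q_0, a) = q_1$.

A DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$ has $\delta : Q \times A \to Q$ a function. Because $\delta$ is a function, $\delta(q, a)$ is defined for all $(q, a) \in Q \times A$, thus $\mathcal{A}$ is complete. Also, for all $(q, a) \in Q \times A$ there is exactly one $\delta(q, a)$, which means $\mathcal{A}$ is deterministic.

Recall: $\delta : Q \times A^* \to Q$ is given by $\delta(q, \varepsilon) = q$ and $\delta(q, wa) = \delta(\delta(q, w), a)$ where $w \in A^*$, $a \in A$.

Fact: For all $u, v \in A^*$ we have $\delta(q, uv) = \delta(\delta(q, u), v)$. The proof of this is done by induction on $|v|$, i.e. the number of letters in $v$.

Definition: A word $w \in A^*$ is accepted by $\mathcal{A}$ if $\delta(q_0, w) \in F$, and $w \in A^*$ is rejected by $\mathcal{A}$ if $\delta(q_0, w) \notin F$. The language recognised by $\mathcal{A}$ is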

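$L(\mathcal{A}) = \{w \in A^* \mid \delta(q_0, w) \in F\}$,

i.e. the set of words that $\mathcal{A}$ accepts. A language $L \subseteq A^*$ is recognisable if there exists a DFA $\mathcal{A}$ with $L = L(\mathcal{A})$.

Example 2.2. Find a DFA over $A = \{a, b\}$ which recognises

$L = \{w \in A^* \mid w \text{ begins with } ab\} = abA^*$.

[State transition diagram on the states $q_0, q_1, q_2, q_3$.]

Thus we have that $L(\mathcal{A}) = L$.

Example 2.3. Find a DFA $\mathcal{A}$ which recognises $L = \{w \in A^* \mid |w|_b \le 2\}$.

[State transition diagram on the states $q_0, q_1, q_2, q_3$.]

Note. Using different notation we can express $L$ as

$L = \{a\}^* \cup \{a\}^*\{b\}\{a\}^* \cup \{a\}^*\{b\}\{a\}^*\{b\}\{a\}^* = a^* \cup a^* b a^* \cup a^* b a^* b a^*$.

Example 2.4. Given the DFA $\mathcal{A}$

[State transition diagram on the states $q_0, q_1, q_2$.]

find the language that is recognised by $\mathcal{A}$. This is

$L(\mathcal{A}) = a^* b = \{a\}^*\{b\} = \{a^n b \mid n \in \mathbb{N}^0\}$.

Example 2.5. Given the DFA $\mathcal{A}$

[State transition diagram on the states $q_0, q_1, q_2$.]

find the language that is recognised by $\mathcal{A}$. We can see that $\mathcal{A}$ accepts words of the form (for $n, m, h, k \in \mathbb{N}^0$)

$a^{n+1} b$, $\quad b^m a^{n+1} b$, $\quad b^m a^{n+1} b^{h+2} a^{k+1} b$.

We now guess that

$L(\mathcal{A}) = A^* ab = \{wab \mid w \in A^*\}$.

Suppose that $v \in L(\mathcal{A})$; then $\delta(q_0, v) = q_2$. For this to happen we must have $v = v'b$ where $\delta(q_0, v') = q_1$. For this to happen we must have $v' = v''a$ and hence $v = v'b = v''ab \in A^* ab$, so $L(\mathcal{A}) \subseteq A^* ab$.

Conversely let $w \in A^* ab$, so $w = vab$ for some $v \in A^*$. Notice that $\delta(q_i, ab) = q_2$ for any $i = 0, 1, 2$. Hence

$\delta(q_0, w) = \delta(q_0, vab) = \delta(\delta(q_0, v), ab) = q_2 \in F$.

Hence $A^* ab \subseteq L(\mathcal{A})$ and so $A^* ab = L(\mathcal{A})$.

Example 2.6 (A Basic Automaton). The following automaton represents a vending machine. The cost of goods is 20p and it has states $\{0, 5, 10, 15, 20, X\}$. The DFA $\mathcal{A}$ consists of

$A = \{5, 10, 20\}$, $q_0 = 0$, $F = \{20\}$, $\delta(X, a) = X$, $\delta(u, v) = u + v$ when $u + v \le 20$, and $\delta(u, v) = X$ otherwise.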

[State transition diagram of the vending machine on the states $0, 5, 10, 15, 20$ and $X$.]

The language recognised by $\mathcal{A}$ is $L(\mathcal{A}) = \{5\,5\,5\,5,\ 5\,5\,10,\ 20,\ 10\,5\,5, \dots\}$.

Notation: for an alphabet $A$ write $\mathrm{Rec}\,A^*$ for the class of recognisable languages over $A$. So $L \in \mathrm{Rec}\,A^*$ means $L$ is recognisable, i.e. there exists a DFA $\mathcal{A}$ with $L = L(\mathcal{A})$.

To show $L \in \mathrm{Rec}\,A^*$ we must find a DFA $\mathcal{A}$ with $L = L(\mathcal{A})$. How do we show that $L \notin \mathrm{Rec}\,A^*$?

Lemma 2.1 (Pumping Lemma - PL). Let $L \in \mathrm{Rec}\,A^*$. Then there exists $N \in \mathbb{N}$ such that for all $w \in L$ with $|w| \ge N$ there exists a factorisation $w = uvx$ ($u, v, x \in A^*$) with $v \ne \varepsilon$ and $|uv| \le N$ and $uv^k x \in L$ for all $k \ge 0$ (i.e. $ux, uvx, uv^2 x, \dots$ all lie in $L$).

Note.
(1) $u, v, x \in A^*$; usually not in $L$; $u, x$ can be empty; we must have $v \ne \varepsilon$.
(2) $N$ is a pumping length for $L$; if $M \ge N$, then $M$ is also a pumping length.
(3) The conditions of the pumping lemma are necessary for $L \in \mathrm{Rec}\,A^*$ but not sufficient.

Examples of the use of the Pumping Lemma

(1) $A = \{a, b\}$; $L = \{a^n b^n \mid n \ge 0\}$ is not recognisable.

Proof. Suppose $L \in \mathrm{Rec}\,A^*$. Let $N$ be a pumping length for $L$. Choose $w = a^N b^N$, so $w \in L$ and $|w| = 2N \ge N$. So there exists a factorisation $w = uvx$ where $|uv| \le N$ and $v \ne \varepsilon$. So $u = a^r$, $v = a^s$ and $x = a^t b^N$ where $r + s + t = N$ (as $w = uvx = a^N b^N$) and $s \ne 0$. By the pumping lemma, $uv^2 x \in L$, i.e. $a^r a^s a^s a^t b^N = a^{N+s} b^N \in L$, but this is a contradiction as $N + s \ne N$. Hence $L \notin \mathrm{Rec}\,A^*$.

(2) $A = \{a, b\}$, $L = \{w \in A^* \mid |w|_a = |w|_b\}$. We claim that $L \notin \mathrm{Rec}\,A^*$.

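Proof. If $L \in \mathrm{Rec}\,A^*$, we pick a pumping length $N$. Choose $w = a^N b^N$; then $w \in L$, $|w| \ge N$, and we proceed as in (1).

General strategy for use of PL

Given $L \subseteq A^*$, suppose we want to show $L \notin \mathrm{Rec}\,A^*$. Assume $L \in \mathrm{Rec}\,A^*$ and aim for a contradiction. Let $N$ be a pumping length for $L$. Choose $w \in L$ with $|w| \ge N$. By the pumping lemma $w$ has a factorisation satisfying the conditions of PL. Use this to get a contradiction. Therefore $L \notin \mathrm{Rec}\,A^*$ (Note: we need only choose one $w$ - choose an easy one!).

(3) $A = \{a\}$, $L = \{a^p \mid p \text{ is prime}\}$. Claim: $L \notin \mathrm{Rec}\,A^*$.

Proof. Suppose $L \in \mathrm{Rec}\,A^*$. Let $N$ be a pumping length for $L$. Let $p$ be a prime with $p \ge N$. Then $w = a^p \in L$ and $|w| \ge N$. By PL there exists a factorisation $w = uvx$ where $|uv| \le N$ and $v \ne \varepsilon$. Then $u = a^r$, $v = a^s$, $x = a^t$ where $r + s \le N$, $s \ne 0$ and $r + s + t = p$ (as $w = a^p = uvx$). By PL, the words $uv^k x \in L$ for all $k \ge 0$. We have

$uv^k x = a^r a^{sk} a^t = a^{r + sk + t} = a^{p + (k-1)s}$.

Choose $k = p + 1$; then $uv^k x \in L$, but $uv^k x = a^{p + ps} = a^{p(1+s)}$ and $p(1 + s)$ is not prime as $s \ne 0$. Contradiction, and hence $L \notin \mathrm{Rec}\,A^*$.

Proof of PL. Let $L \in \mathrm{Rec}\,A^*$. Then $L = L(\mathcal{A})$ for some DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$. Let $N = |Q|$, the number of states of $\mathcal{A}$. If $w \in L$ and $|w| \ge N$, then $\delta(q_0, w) \in F$. Let $w = a_1 a_2 \dots a_N \dots a_m$ where $a_i \in A$ and $m = |w| \ge N$. As $w \in L$ we have

$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \xrightarrow{a_3} \cdots \xrightarrow{a_N} q_N \xrightarrow{a_{N+1}} \cdots \xrightarrow{a_m} q_m$

where $q_i \in Q$, $q_0$ is the initial state, $q_m \in F$ and $\delta(q_i, a_{i+1}) = q_{i+1}$ for $0 \le i \le m - 1$.

Since $N + 1 > N = |Q|$, the list $q_0, q_1, \dots, q_N$ must contain repeats; say $q_i = q_j$ where $0 \le i < j \le N \le m$. Then we have

[Diagram: the path $q_0 \xrightarrow{a_1} \cdots \to q_i = q_j \xrightarrow{a_{j+1}} \cdots \xrightarrow{a_m} q_m$, with a loop $q_i \xrightarrow{a_{i+1}} q_{i+1} \to \cdots \xrightarrow{a_j} q_j$ at $q_i = q_j$.]

Put $u = a_1 \dots a_i$, $v = a_{i+1} \dots a_j$, $x = a_{j+1} \dots a_m$ ($u = \varepsilon$ if $i = 0$, $v \ne \varepsilon$ as $i < j$, $x = \varepsilon$ if $j = N = m$). We have $|uv| = j \le N$, $v \ne \varepsilon$, $w = uvx$. For any $k \ge 0$,

$\delta(q_0, uv^k x) = \delta(\delta(q_0, u), v^k x) = \delta(q_i, v^k x) = \delta(\delta(q_i, v^k), x) = \delta(q_i, x) = \delta(q_j, x) = q_m \in F$.

Therefore $uv^k x \in L$ for all $k \ge 0$.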
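The proof of the Pumping Lemma is constructive: following the run of the DFA on $w$ and stopping at the first repeated state gives the factorisation $w = uvx$. The sketch below assumes a DFA is stored as a dictionary `delta` from (state, letter) pairs to states; the helper names `run` and `pump_split` are hypothetical. It uses the two-state DFA of Example 2.1 as test data.

```python
def run(delta, q, w):
    """Return the extended transition delta(q, w), reading w letter by letter."""
    for a in w:
        q = delta[(q, a)]
    return q


def pump_split(delta, q0, w):
    """Split w as (u, v, x) at the first repeated state q_i = q_j of the run.

    Returns None if no state repeats, which can only happen when w is
    shorter than the number of states.
    """
    first_seen = {q0: 0}            # state -> position where it first occurs
    q = q0
    for pos, a in enumerate(w, start=1):
        q = delta[(q, a)]
        if q in first_seen:
            i, j = first_seen[q], pos
            return w[:i], w[i:j], w[j:]
        first_seen[q] = pos
    return None


# DFA of Example 2.1: every letter swaps q0 and q1; q1 is final.
delta = {("q0", "a"): "q1", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q0"}
finals = {"q1"}

u, v, x = pump_split(delta, "q0", "aba")
print((u, v, x))
# Every pumped word u v^k x is still accepted.
print(all(run(delta, "q0", u + v * k + x) in finals for k in range(6)))
```

Because the split happens at the first repeat, $j$ is at most the number of states, which is exactly the bound $|uv| \le N$ in the lemma.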

2.3. NDAs: Non-Deterministic finite state Automata

Example 2.7. In Exercises 2, you are asked to find a DFA which accepts $L = \{abwab \mid w \in A^*\}$ where $A = \{a, b\}$. We want to write

[State transition diagram: $q_0 \xrightarrow{a} q_1 \xrightarrow{b} q_2 \xrightarrow{a} q_3 \xrightarrow{b} q_4$, with loops labelled $a, b$ at $q_2$.]

but this is not a DFA (it is neither complete nor deterministic). It is an example of an NDA.

Definition: An NDA $\mathcal{A}$ is a 5-tuple $(A, Q, E, I, F)$ where

$A$ is a finite alphabet,
$Q$ is a finite set of states,
$E$ is a subset of $Q \times A \times Q$,
$I \subseteq Q$ is a set of initial states,
$F \subseteq Q$ is a set of final states.

Elements of $E$ have the form $(p, a, q)$ where $p, q \in Q$ and $a \in A$. These are called edges. In the above example we can see that our edges are

$(q_0, a, q_1)$, $(q_1, b, q_2)$, $(q_2, a, q_2)$, $(q_2, b, q_2)$, $(q_2, a, q_3)$, $(q_3, b, q_4)$.

A path in $\mathcal{A}$ (of length $n \ge 1$) is a finite sequence of edges of the form

$(q_0, a_1, q_1), (q_1, a_2, q_2), \dots, (q_{n-1}, a_n, q_n)$.

The label of the path is $a_1 a_2 \dots a_n$; in the state transition diagram we have

$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \to \cdots \xrightarrow{a_n} q_n$,

often abbreviated by removing the circles from around the states. For each $q \in Q$ there exists a path $\varepsilon_q$ of length 0 at $q$, with label $\varepsilon$. We do not (usually) draw $\varepsilon_q$ at $q$ in our state transition diagram.

$p \xrightarrow{w} q$ ($w \in A^*$) means that there exists a path from $p$ to $q$ in $\mathcal{A}$ with label $w$. Note that there exists $p \xrightarrow{\varepsilon} p$ for any $p \in Q$.

Definition: $w \in A^*$ is accepted by the NDA $\mathcal{A}$ if there exists a path $q_0 \xrightarrow{w} q$ for some $q_0 \in I$ and $q \in F$.

Definition: The language recognised by the NDA $\mathcal{A}$ is $L(\mathcal{A}) = \{w \in A^* \mid w \text{ is accepted by } \mathcal{A}\}$.

Note that in the example the language recognised by the NDA is

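$\{abwab \mid w \in A^*\}$, where $A = \{a, b\}$.

We claim that for a language $L \subseteq A^*$ we have that

$L$ is recognised by a DFA $\Leftrightarrow$ $L$ is recognised by an NDA.

Proposition. We start by showing that $L \in \mathrm{Rec}\,A^*$ $\Rightarrow$ $L$ is recognised by an NDA.

Proof. Let $L = L(\mathcal{A})$ where $\mathcal{A} = (A, Q, \delta, q_0, F)$ is a DFA. Put

$E = \{(q, a, \delta(q, a)) \mid q \in Q,\ a \in A\} \subseteq Q \times A \times Q$

and $I = \{q_0\}$. Now we have an NDA $\mathcal{A}' = (A, Q, E, I, F)$ and $L(\mathcal{A}) = L(\mathcal{A}')$.

We can think of a DFA as a special kind of NDA, one in which there exists one initial state and, for all $q \in Q$, $a \in A$, there exists exactly one triple $(q, a, p) \in E$.

Now we need to prove the converse. First we define some notation. Let $\mathcal{A} = (A, Q, E, I, F)$ be an NDA. For $S \subseteq Q$, $w \in A^*$, we define

$Sw = \{q \in Q \mid p \xrightarrow{w} q \text{ for some } p \in S\}$.

Note that $Sw \subseteq Q$ (so there exist only finitely many sets of the form $Sw$).

Example 2.8. Given an NDA $\mathcal{A}$

[State transition diagram of an NDA on the states $1, 2, 3, 4, 5$; the calculations in this example and in Example 2.9 are consistent with the edges $1 \xrightarrow{a} 2$, $2 \xrightarrow{b} 3$, $1 \xrightarrow{a} 4$, $4 \xrightarrow{a} 5$.]

we have that

$\{1, 2\}b = \{3\} = \{1\}ab$, $\quad \{2\}ba = \emptyset$, $\quad \{1\}aa = \{5\}$, $\quad \{1\}a = \{2, 4\}$, $\quad \emptyset w = \emptyset$ for every word $w$.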
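The sets $Sw$ can be computed one letter at a time, and since there are only finitely many of them, the collection of all sets reachable from a given starting set can be found by a simple search. The sketch below assumes an NDA is given by a set of edge triples; the function names are hypothetical, and the edges are those listed for the NDA of Example 2.7 (with $q_i$ written as the integer $i$, initial state $0$ and final state $4$).

```python
def step(edges, S, a):
    """Return Sa = {p : (q, a, p) is an edge for some q in S}."""
    return frozenset(p for (q, letter, p) in edges if q in S and letter == a)


def apply_word(edges, S, w):
    """Return Sw, computed as (...((S a_1) a_2)...) a_n."""
    S = frozenset(S)
    for a in w:
        S = step(edges, S, a)
    return S


def reachable_sets(edges, start, alphabet):
    """Return every set of the form (start)w, for w ranging over all words."""
    start = frozenset(start)
    seen = {start}
    todo = [start]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = step(edges, S, a)
            if T not in seen:       # a new set: record it and explore later
                seen.add(T)
                todo.append(T)
    return seen


# Edges of the NDA in Example 2.7.
edges = {(0, "a", 1), (1, "b", 2), (2, "a", 2), (2, "b", 2),
         (2, "a", 3), (3, "b", 4)}

print(sorted(apply_word(edges, {0}, "abab")))
print(len(reachable_sets(edges, {0}, "ab")))
```

A word $w$ is accepted exactly when `apply_word(edges, I, w)` meets the set of final states, so this also gives a direct membership test for the language of an NDA.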

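In general, for $a \in A$ we have that

$S\varepsilon = S$,
$Sa = \{p \in Q \mid (q, a, p) \in E \text{ for some } q \in S\}$,
$S a_1 a_2 \dots a_n = (\dots((S a_1) a_2) \dots) a_n$,
$\emptyset w = \emptyset$ for all $w \in A^*$.

We now construct a DFA from our NDA $\mathcal{A} = (A, Q, E, I, F)$ as follows. Put $Q' = \{Iw \mid w \in A^*\}$.

Note. We have $Q' \subseteq \mathcal{P}(Q)$ (the set of all subsets of $Q$). As $Q$, and hence $\mathcal{P}(Q)$, is finite, $Q'$ is finite.

How do we find $Q'$? Say $A = \{a_1, a_2, \dots, a_n\}$. Write down $I = I\varepsilon$ and calculate $I a_i$ for each $i$. Then for each $I a_i$ calculate $I a_i a_j$ for all $i$ and for all $j$. We continue this process until we have a list $I w_1, I w_2, \dots, I w_k$ such that $I w_h a_i$ is already in the list for all $h$ and for all $i$.

Now, given an NDA $\mathcal{A} = (A, Q, E, I, F)$, we define a DFA $\mathcal{B} = (A, Q', \delta', I, F')$ where $\delta'(S, a) = Sa$ and $F' = \{S \in Q' \mid S \cap F \ne \emptyset\}$.

Note. $\delta'(S, a_1 a_2 \dots a_n) = \delta'(\dots\delta'(\delta'(S, a_1), a_2) \dots, a_n) = (\dots((S a_1) a_2) \dots) a_n = S a_1 \dots a_n$.

Example 2.9 (Construction of a DFA from an NDA). Let our NDA $\mathcal{A}$ be

[State transition diagram of the NDA on the states $1, 2, 3, 4, 5$ from Example 2.8.]

Then the language accepted by $\mathcal{A}$ is $L(\mathcal{A}) = \{\varepsilon, ab, aa\}$. We now calculate our set $Q'$. We note that the set of initial states is $I = \{1, 3\}$. Then

$I = \{1, 3\}$,
$Ia = \{2, 4\}$, $\quad Ib = \emptyset$,
$Iaa = \{2, 4\}a = \{5\}$, $\quad Iab = \{2, 4\}b = \{3\}$,
$Iaaa = \{5\}a = \emptyset$, $\quad Iaab = \{5\}b = \emptyset$,
$Iaba = \{3\}a = \emptyset$, $\quad Iabb = \{3\}b = \emptyset$.

We have a DFA $\mathcal{B} = (A, Q', \delta', q_0', F')$. For this we have

$Q' = \{I, \{2, 4\}, \emptyset, \{3\}, \{5\}\}$,
$q_0' = I = \{1, 3\}$,
$F' = \{S \in Q' \mid S \cap F \ne \emptyset\} = \{S \in Q' \mid S \cap \{3, 5\} \ne \emptyset\} = \{I, \{3\}, \{5\}\}$,

and $\delta'$ is given as in the following state transition diagram.

[State transition diagram of $\mathcal{B}$. By the calculations above: $I \xrightarrow{a} \{2, 4\}$, $I \xrightarrow{b} \emptyset$, $\{2, 4\} \xrightarrow{a} \{5\}$, $\{2, 4\} \xrightarrow{b} \{3\}$, and every arrow from $\{3\}$, $\{5\}$ and $\emptyset$ goes to $\emptyset$.]

Then we have $L(\mathcal{B}) = \{\varepsilon, ab, aa\}$.

Proposition. $L \subseteq A^*$ recognised by an NDA $\Rightarrow$ $L \in \mathrm{Rec}\,A^*$.

Proof. Let $L = L(\mathcal{A})$ where $\mathcal{A} = (A, Q, E, I, F)$. Define a DFA $\mathcal{B} = (A, Q', \delta', q_0', F')$ as above. Now recall that

$Q' = \{Iw \mid w \in A^*\}$, $\quad \delta'(S, a) = Sa$, $\quad q_0' = I$, $\quad F' = \{S \in Q' \mid S \cap F \ne \emptyset\}$.

Claim. $L(\mathcal{A}) = L(\mathcal{B})$.

We have that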

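$w \in L(\mathcal{B})$
$\Leftrightarrow \delta'(q_0', w) \in F'$
$\Leftrightarrow \delta'(I, w) \in F'$
$\Leftrightarrow Iw \in F'$
$\Leftrightarrow Iw \cap F \ne \emptyset$
$\Leftrightarrow$ there exists a path $p \xrightarrow{w} q$ for some $p \in I$, $q \in F$
$\Leftrightarrow w \in L(\mathcal{A})$.

Hence this gives us our theorem.

Theorem 2.1. $L \in \mathrm{Rec}\,A^* \Leftrightarrow L = L(\mathcal{A})$ for some NDA $\mathcal{A}$.

Example 2.10. For any alphabet $A$ we have $A^* \in \mathrm{Rec}\,A^*$, as the DFA

[State transition diagram: a single state $q_0$ with a loop labelled $a$ for all $a \in A$.]

recognises $A^*$.

Example 2.11. $\emptyset \in \mathrm{Rec}\,A^*$, as $\emptyset$ is recognised by the NDA

[State transition diagram: a single state $q_0$.]

Example 2.12. $\{\varepsilon\} \in \mathrm{Rec}\,A^*$, as $\{\varepsilon\}$ is recognised by the NDA

[State transition diagram: a single state $q_0$.]

Example 2.13. For $w = a_1 a_2 \dots a_n \in A^+$ ($a_i \in A$), $\{w\}$ is recognised by the NDA

$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \to \cdots \xrightarrow{a_n} q_n$.

So all singleton languages lie in $\mathrm{Rec}\,A^*$.

3. Rational Languages and Kleene's Theorem

3.1. Closure Properties of Rec A*

Proposition (1). $L \in \mathrm{Rec}\,A^* \Rightarrow L^c \in \mathrm{Rec}\,A^*$.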

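Proof. If $L \in \mathrm{Rec}\,A^*$ then $L = L(\mathcal{A})$ where $\mathcal{A} = (A, Q, \delta, q_0, F)$ is a DFA. Let $\mathcal{A}^c = (A, Q, \delta, q_0, F^c)$, where $F^c = Q \setminus F$. Then

$w \in L(\mathcal{A}^c) \Leftrightarrow \delta(q_0, w) \in F^c \Leftrightarrow \delta(q_0, w) \notin F \Leftrightarrow w \notin L(\mathcal{A}) = L \Leftrightarrow w \in L^c$.

Therefore $L(\mathcal{A}^c) = L^c$ and $L^c \in \mathrm{Rec}\,A^*$.

Proposition (2). $L, K \in \mathrm{Rec}\,A^* \Rightarrow L \cup K \in \mathrm{Rec}\,A^*$.

Proof. Let $L = L(\mathcal{A})$ and $K = L(\mathcal{B})$ where $\mathcal{A} = (A, Q, E, I, F)$ and $\mathcal{B} = (A, P, E', I', F')$ are NDAs. Assume $Q \cap P = \emptyset$. Put $\mathcal{C} = (A, Q \cup P, E \cup E', I \cup I', F \cup F')$. Then

$w \in L \cup K$
$\Leftrightarrow w \in L$ or $w \in K$
$\Leftrightarrow$ there is a path $q_0 \xrightarrow{w} q$ in $\mathcal{A}$ with $q_0 \in I$ and $q \in F$, or a path $p_0 \xrightarrow{w} p$ in $\mathcal{B}$ with $p_0 \in I'$ and $p \in F'$
$\Leftrightarrow$ there is a path $r_0 \xrightarrow{w} r$ in $\mathcal{C}$ with $r_0 \in I \cup I'$ and $r \in F \cup F'$ (since $P \cap Q = \emptyset$)
$\Leftrightarrow w \in L(\mathcal{C})$.

Therefore $L \cup K \in \mathrm{Rec}\,A^*$.

Corollary 3.1. $L_1, L_2, \dots, L_m \in \mathrm{Rec}\,A^* \Rightarrow L_1 \cup L_2 \cup \dots \cup L_m \in \mathrm{Rec}\,A^*$.

Proof. Proposition 2 and induction.

Corollary 3.2. $L, K \in \mathrm{Rec}\,A^* \Rightarrow L \cap K \in \mathrm{Rec}\,A^*$.

Proof. $L \cap K = (L^c \cup K^c)^c$; hence the result follows by Propositions 1 and 2.

Corollary 3.3. $L_1, L_2, \dots, L_m \in \mathrm{Rec}\,A^* \Rightarrow L_1 \cap L_2 \cap \dots \cap L_m \in \mathrm{Rec}\,A^*$.

Proof. Corollary 3.2 and induction.

Corollary 3.4. $L, K \in \mathrm{Rec}\,A^* \Rightarrow L \setminus K \in \mathrm{Rec}\,A^*$.

Proof. Exercise Sheet 4.

Note. $\mathrm{Rec}\,A^*$ is not closed under infinite $\cup$ and $\cap$ (Exercise Sheet 4).

Proposition (3). Let $L, K \in \mathrm{Rec}\,A^*$. Then $LK \in \mathrm{Rec}\,A^*$ (recall $LK = \{wv \mid w \in L,\ v \in K\}$).

Proof. First assume $\varepsilon \notin K$. Let $L = L(\mathcal{A})$ and $K = L(\mathcal{B})$ where $\mathcal{A} = (A, Q, E, I, F)$ and $\mathcal{B} = (A, P, E', I', F')$ are NDAs and $P \cap Q = \emptyset$. [We would like

[Diagram: a path $q_0 \xrightarrow{w} q$ from $I$ to $F$ in $\mathcal{A}$ joined directly to a path $p_0 \xrightarrow{v} p$ from $I'$ to $F'$ in $\mathcal{B}$.]

but this would not separate $\mathcal{A}$ and $\mathcal{B}$ adequately.] Put $\mathcal{C} = (A, Q \cup P, \tilde{E}, I, F')$ where

$\tilde{E} = E \cup E' \cup \{(q, a, r) \mid q \in F \text{ and } (p_0, a, r) \in E' \text{ for some } p_0 \in I'\}$.

[Diagram: $\mathcal{A}$ and $\mathcal{B}$ side by side, with paths $q_0 \xrightarrow{w} q$ and $p_0 \xrightarrow{a} r \xrightarrow{v} p$, and a new edge from $q$ to $r$ labelled $a$.]

$(p_0, a, r) \in E' \Rightarrow (q, a, r) \in \tilde{E}$.

Now we have

$w \in LK$
$\Leftrightarrow w = uv$, some $u \in L$, $v \in K$
$\Leftrightarrow w = uv$, some $u \in L$, $v = av' \in K$, $a \in A$ (as $\varepsilon \notin K$)
$\Leftrightarrow$ there exist $q_0 \in I$, $q \in F$ with $q_0 \xrightarrow{u} q$ in $\mathcal{A}$, and $p_0 \in I'$, $p \in F'$ with $p_0 \xrightarrow{av'} p$ in $\mathcal{B}$
$\Leftrightarrow$ there exist $q_0 \in I$, $q \in F$ with $q_0 \xrightarrow{u} q$ in $\mathcal{A}$, and $p_0 \in I'$, $r \in P$, $p \in F'$ with $p_0 \xrightarrow{a} r \xrightarrow{v'} p$ in $\mathcal{B}$
$\Leftrightarrow$ there exist $q_0 \in I$, $p \in F'$ with $q_0 \xrightarrow{uav'} p$ in $\mathcal{C}$
$\Leftrightarrow uv = uav' \in L(\mathcal{C})$.

Hence $L(\mathcal{C}) = LK$ and so $LK \in \mathrm{Rec}\,A^*$. Hence, if $\varepsilon \notin K$, then $LK \in \mathrm{Rec}\,A^*$.

Finally, if $\varepsilon \in K$, then $K' = K \setminus \{\varepsilon\}$ is recognisable by Corollary 3.4. We have

$LK = L(K' \cup \{\varepsilon\}) = LK' \cup L\{\varepsilon\}$ (Exercise 1) $= LK' \cup L$,

and $LK' \in \mathrm{Rec}\,A^*$ by the first part of the proof, so $LK \in \mathrm{Rec}\,A^*$ by Proposition 2.

Proposition. $L \in \mathrm{Rec}\,A^* \Rightarrow L^* \in \mathrm{Rec}\,A^*$.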

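Proof. Recall that

$L^* = \bigcup_{n \ge 0} L^n = L^0 \cup L^1 \cup L^2 \cup \dots = \{\varepsilon\} \cup L \cup L^2 \cup L^3 \cup \dots$

Since $L$ is recognisable, $L = L(\mathcal{A})$ for some DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$.

Claim. $L = L(\mathcal{B})$ where $\mathcal{B} = (A, P, \sigma, p_0, G)$ is a DFA with $\sigma(p, a) \ne p_0$ for any $p \in P$, $a \in A$.

Proof. Put $P = Q \cup \{p_0\}$ where $p_0 \notin Q$, and

$\sigma(q, a) = \delta(q, a)$ for all $q \in Q$, $a \in A$,
$\sigma(p_0, a) = \delta(q_0, a)$.

[Diagram: the new state $p_0$ copies the outgoing edges of $q_0$, so that $q_0 \xrightarrow{a} q$ gives $p_0 \xrightarrow{a} q$.]

Note. $\sigma(p, a) \ne p_0$ for all $p \in P$, $a \in A$.

Now put

$G = F \cup \{p_0\}$ if $\varepsilon \in L(\mathcal{A})$ (i.e. $q_0 \in F$),
$G = F$ if $\varepsilon \notin L(\mathcal{A})$ (i.e. $q_0 \notin F$).

Now check that $L(\mathcal{A}) = L(\mathcal{B})$.

Let $L = L(\mathcal{B})$ where $\mathcal{B} = (A, P, \sigma, p_0, G)$ is a DFA with $\sigma(p, a) \ne p_0$ for all $p \in P$, $a \in A$. Put $\mathcal{C} = (A, P, E, \{p_0\}, \{p_0\})$ where

$E = \{(p, a, \sigma(p, a)) \mid p \in P,\ a \in A\} \cup \{(p, a, p_0) \mid p \in P,\ \sigma(p, a) \in G\}$.

[Diagram: whenever $p \xrightarrow{a} p'$ in $\mathcal{B}$ with $p' \in G$, the NDA $\mathcal{C}$ has an additional edge $p \xrightarrow{a} p_0$.]

Note. $\varepsilon \in L^*$ and $\varepsilon \in L(\mathcal{C})$.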

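Suppose $w \ne \varepsilon$. Then

$w \in L^*$
$\Rightarrow w = w_1 w_2 \dots w_t$ with $t \ge 1$, $w_i \in L \setminus \{\varepsilon\}$ for all $i$,
$\Rightarrow w = w_1 w_2 \dots w_t$, $t \ge 1$, $\sigma(p_0, w_i) \in G$ for all $i$,
$\Rightarrow w = w_1 w_2 \dots w_t$, $t \ge 1$, $p_0 \xrightarrow{w_i} p$ in $\mathcal{B}$ for all $i$, with $p \in G$,
$\Rightarrow w = w_1 \dots w_t$, $t \ge 1$, $p_0 \xrightarrow{w_i} p_0$ in $\mathcal{C}$ for all $i$,
$\Rightarrow p_0 \xrightarrow{w} p_0$ in $\mathcal{C}$,
$\Rightarrow w \in L(\mathcal{C})$.

Hence we have $L^* \subseteq L(\mathcal{C})$.

Conversely let $w \in L(\mathcal{C})$, $w \ne \varepsilon$; then $p_0 \xrightarrow{w} p_0$ in $\mathcal{C}$. Let $w = a_1 a_2 \dots a_n$ ($a_i \in A$) and

$p_0 \xrightarrow{a_1} p_1 \xrightarrow{a_2} p_2 \to \cdots \xrightarrow{a_n} p_n = p_0$.

Let $i_1, i_2, \dots, i_t = n$ be the indices with $p_{i_j} = p_0$ amongst $p_1, \dots, p_n$, so that $0 < i_1 < i_2 < \dots < i_t$. Put

$w_1 = a_1 a_2 \dots a_{i_1}$,
$w_2 = a_{i_1 + 1} \dots a_{i_2}$,
$\vdots$
$w_t = a_{i_{t-1} + 1} \dots a_{i_t}$ (where $i_t = n$).

Then $w = w_1 w_2 \dots w_t$ and $p_0 \xrightarrow{w_j} p_0$ in $\mathcal{C}$ for all $j$. (Writing $w_j = v_j a_{i_j}$, we have $p_0 \xrightarrow{v_j} p \xrightarrow{a_{i_j}} p_0$ in $\mathcal{C}$, so in $\mathcal{B}$ we have $p_0 \xrightarrow{v_j} p \xrightarrow{a_{i_j}} p'$ with $p' \in G$.) So $w = w_1 w_2 \dots w_t$ and $p_0 \xrightarrow{w_j} p' \in G$ in $\mathcal{B}$, i.e. $w = w_1 w_2 \dots w_t$ where $w_j \in L(\mathcal{B}) = L$ for all $j$, so $w \in L^*$. Therefore $L(\mathcal{C}) \subseteq L^*$ and so $L(\mathcal{C}) = L^*$.

Examples of using Closure Properties

Example 3.1. $L$ finite $\Rightarrow L \in \mathrm{Rec}\,A^*$.

Proof. $L$ finite $\Rightarrow$ $L = \emptyset$ or $L = \{w_1, w_2, \dots, w_n\}$ for some $w_i \in A^*$. We know $\emptyset \in \mathrm{Rec}\,A^*$ and $\{w_i\} \in \mathrm{Rec}\,A^*$ for all $i$. Therefore $L = \{w_1\} \cup \{w_2\} \cup \dots \cup \{w_n\}$ is recognisable by Corollary 3.1.

Example 3.2. $L$ cofinite $\Rightarrow L \in \mathrm{Rec}\,A^*$.

Proof. $L$ cofinite $\Rightarrow$ $L^c$ is finite $\Rightarrow$ $L^c \in \mathrm{Rec}\,A^*$ by the above example. Hence $L = (L^c)^c \in \mathrm{Rec}\,A^*$ by Proposition 1.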
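The step from a finite language to its cofinite complement is just the construction of Proposition 1: keep the DFA and swap final and non-final states. As a minimal sketch, the code below builds a three-state DFA over $\{a, b\}$ that counts the length of its input up to 2 (this particular automaton, and the names `accepts` and `all_words`, are our own illustration), and checks that swapping the final states turns the finite language $\{w : |w| \le 1\}$ into the cofinite language $\{w : |w| \ge 2\}$ of Example 1.3.

```python
def accepts(delta, q0, finals, w):
    """Return True if the DFA (delta, q0, finals) accepts the word w."""
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals


def all_words(alphabet, max_len):
    """Return every word over the alphabet of length at most max_len."""
    words = [""]
    layer = [""]
    for _ in range(max_len):
        layer = [w + a for w in layer for a in alphabet]
        words.extend(layer)
    return words


# States 0, 1, 2 record the length of the input read so far, capped at 2.
states = {0, 1, 2}
delta = {(q, a): min(q + 1, 2) for q in states for a in "ab"}

short_finals = {0, 1}                  # recognises {w : |w| <= 1}, a finite language
long_finals = states - short_finals    # complement DFA: recognises {w : |w| >= 2}

for w in ["", "a", "ab", "bab"]:
    print(repr(w), accepts(delta, 0, short_finals, w), accepts(delta, 0, long_finals, w))
```

Every word is accepted by exactly one of the two automata, which is the equivalence $w \in L(\mathcal{A}^c) \Leftrightarrow w \notin L(\mathcal{A})$ from the proof.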

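Example 3.3. $A = \{a, b\}$. Then $L = A^* a A^* \cup A^* bb A^* \in \mathrm{Rec}\,A^*$.

Proof. $A^*, \{a\}, \{bb\} \in \mathrm{Rec}\,A^*$, so $A^* a A^*, A^* bb A^* \in \mathrm{Rec}\,A^*$ by Proposition 3 (twice). Hence $L = A^* a A^* \cup A^* bb A^* \in \mathrm{Rec}\,A^*$.

Example 3.4. $L = \{a^n \mid n \text{ is not prime}\} \notin \mathrm{Rec}\,A^*$.

Proof. $L \in \mathrm{Rec}\,A^* \Rightarrow L^c \in \mathrm{Rec}\,A^*$ (by Proposition 1). But $L^c = \{a^p \mid p \text{ is prime}\}$ is not in $\mathrm{Rec}\,A^*$. Contradiction. Hence $L \notin \mathrm{Rec}\,A^*$.

Example 3.5. $L = \{w \in \{a, b\}^* \mid |w|_a = |w|_b\}$: an alternate argument for $L \notin \mathrm{Rec}\,A^*$. Let $K = \{a^m b^n \mid m \ge 0,\ n \ge 0\} \in \mathrm{Rec}\,A^*$ (see Exercises 4). Suppose $L \in \mathrm{Rec}\,A^*$; then $L \cap K \in \mathrm{Rec}\,A^*$ by Corollary 3.2. But $L \cap K = \{a^n b^n \mid n \ge 0\}$, and we know this is not recognisable by the Pumping Lemma. This is a contradiction and hence $L$ is not recognisable.

Note. If $B \subseteq A$ then for $L \subseteq B^*$ we have $L \in \mathrm{Rec}\,B^* \Leftrightarrow L \in \mathrm{Rec}\,A^*$ (check).

Example 3.6.
(a) $L = \{a^n b^p \mid n \ge 0,\ p \text{ prime}\} \notin \mathrm{Rec}\,A^*$, $A = \{a, b\}$. With $B = \{b\}$, $L \in \mathrm{Rec}\,A^* \Rightarrow L \cap B^* \in \mathrm{Rec}\,A^* \Rightarrow \{b^p \mid p \text{ is prime}\} \in \mathrm{Rec}\,A^*$, a contradiction, and hence $L$ is not recognisable. In fact, $L' = \{a^n b^p \mid n \ge 1,\ p \text{ prime}\}$ is not recognisable (see later for why).
(b) $L' \cup b^* \notin \mathrm{Rec}\,A^*$.
Proof. $L' \cup b^* \in \mathrm{Rec}\,A^* \Rightarrow (L' \cup b^*) \cap a^+ b^* \in \mathrm{Rec}\,A^*$. Recall that for an alphabet $A$, $A^+ = A^* \setminus \{\varepsilon\}$, so $a^+ = \{a^n \mid n \ge 1\}$; note $a^+ b^* = (a^* \setminus \{\varepsilon\}) b^* \in \mathrm{Rec}\,A^*$. But $(L' \cup b^*) \cap a^+ b^* = L'$ and $L'$ is not recognisable, a contradiction. Hence $L' \cup b^* \notin \mathrm{Rec}\,A^*$.
(c) $L' \cup b^*$ satisfies the conditions of the Pumping Lemma (exercise).

3.2. Rational Operations

Let $A$ be an alphabet. The rational operations (on languages over $A$) are union, product and star, i.e. $L, K \mapsto L \cup K$, $L, K \mapsto LK$ and $L \mapsto L^*$. The Boolean operations are union, intersection and complement, i.e. $L, K \mapsto L \cup K$, $L, K \mapsto L \cap K$ and $L \mapsto L^c$. We have seen that $\mathrm{Rec}\,A^*$ is closed under the rational operations and the Boolean operations.

Definition: $L \subseteq A^*$ is rational if
(i) $L$ is finite, or
(ii) $L$ can be obtained from finite languages by applying rational operations a finite number of times.
$\mathrm{Rat}\,A^*$ is the set of all rational languages over $A$.

Observation: We have already proved that any finite language lies in $\mathrm{Rec}\,A^*$ and that if $L, K \in \mathrm{Rec}\,A^*$ then $L \cup K, LK, L^* \in \mathrm{Rec}\,A^*$; consequently $\mathrm{Rat}\,A^* \subseteq \mathrm{Rec}\,A^*$.

Example 3.7.
(a) $\emptyset$, $\{w\}$, $\{ab, ba, a^6 bc\}$ are finite and so rational.
(b) $\{ab, ba, a^6 bc\}^*$, $a b^* a = \{a\}\{b\}^*\{a\} \in \mathrm{Rat}\,A^*$.
(c) $L = \{abwab \mid w \in A^*\} = \{ab\}\{a, b\}^*\{ab\} \in \mathrm{Rat}\,A^*$.
(d) $L = \{x \in \{a, b\}^* \mid |x|_a \le 1\} = b^* \cup b^* a b^* \in \mathrm{Rat}\,A^*$.

Theorem 3.1 (Kleene's Theorem). $\mathrm{Rat}\,A^* = \mathrm{Rec}\,A^*$.

Proof. We have already observed that $\mathrm{Rat}\,A^* \subseteq \mathrm{Rec}\,A^*$. Let $L \in \mathrm{Rec}\,A^*$. Then $L = L(\mathcal{A})$ for some NDA $\mathcal{A} = (A, Q, E, I, F)$. We prove by induction on $|E|$ that $L \in \mathrm{Rat}\,A^*$.

If $|E| = 0$ then $L = \{\varepsilon\}$ if $I \cap F \ne \emptyset$ and $L = \emptyset$ if $I \cap F = \emptyset$. So $L$ is finite, hence $L \in \mathrm{Rat}\,A^*$.

Now let $|E| = n > 0$ and suppose $L(\mathcal{B}) \in \mathrm{Rat}\,A^*$ for all NDAs $\mathcal{B}$ with fewer than $n$ edges. Let $e \in E$, so $e = (p, a, q)$, and define 4 new NDAs as follows:

$\mathcal{A}_0 = (A, Q, E \setminus \{e\}, I, F)$,
$\mathcal{A}_1 = (A, Q, E \setminus \{e\}, I, \{p\})$,
$\mathcal{A}_2 = (A, Q, E \setminus \{e\}, \{q\}, \{p\})$,
$\mathcal{A}_3 = (A, Q, E \setminus \{e\}, \{q\}, F)$.

Let $L_i = L(\mathcal{A}_i)$. By our induction hypothesis each $L_i \in \mathrm{Rat}\,A^*$ (as each $\mathcal{A}_i$ has $n - 1$ edges). Hence

$L_4 = L_0 \cup L_1 \{a\} (L_2 \{a\})^* L_3 \in \mathrm{Rat}\,A^*$.

We claim that $L = L_4$. First we note that

$L_0 = L(\mathcal{A}_0) = \{w \in L \mid \text{there is a path } q_0 \xrightarrow{w} r,\ q_0 \in I,\ r \in F, \text{ not involving the edge } e\} \subseteq L = L(\mathcal{A})$.

Let $w \in L_1 \{a\} (L_2 \{a\})^* L_3$. Then $w = ua(v_1 a\, v_2 a \dots v_m a)x$, where $u \in L_1$, $v_i \in L_2$ ($1 \le i \le m$) and $x \in L_3$, and there exists a path in $\mathcal{A}$

[Diagram: $q_0 \xrightarrow{u} p \xrightarrow{a} q \xrightarrow{x} r$, with loops $q \xrightarrow{v_i} p \xrightarrow{a} q$.]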

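Therefore $w \in L(\mathcal{A}) = L$. We have shown that $L_4 \subseteq L$.

Conversely suppose $w \in L(\mathcal{A})$. Then there exists a path $q_0 \xrightarrow{w} r$ in $\mathcal{A}$ with $q_0 \in I$, $r \in F$. If the edge $e$ is not used in this path, we have $q_0 \xrightarrow{w} r$ in $\mathcal{A}_0$, so $w \in L(\mathcal{A}_0) = L_0 \subseteq L_4$.

Suppose now that $w = a_1 a_2 \dots a_n$ and

$q_0 \xrightarrow{a_1} q_1 \xrightarrow{a_2} q_2 \to \cdots \xrightarrow{a_n} q_n = r$

where the edge $e$ occurs. Suppose that $(q_{i_1 - 1}, a_{i_1}, q_{i_1}), \dots, (q_{i_t - 1}, a_{i_t}, q_{i_t})$ are all the occurrences of $e = (p, a, q)$ (each $a_{i_j} = a$). Then $w = w_0 a w_1 a \dots a w_t$ where

$q_0 \xrightarrow{w_0} p \xrightarrow{a} q \xrightarrow{w_1} p \xrightarrow{a} q \to \cdots \to p \xrightarrow{a} q \xrightarrow{w_t} r$.

Hence $w_0 \in L(\mathcal{A}_1) = L_1$, $w_i \in L(\mathcal{A}_2) = L_2$ ($1 \le i < t$), $w_t \in L(\mathcal{A}_3) = L_3$. Hence $w = w_0 a w_1 a \dots w_{t-1} a w_t \in L_1 a (L_2 a)^* L_3 \subseteq L_4$. Therefore $L \subseteq L_4$. Hence $L = L_4$ and $L \in \mathrm{Rat}\,A^*$.

Hence for $L \subseteq A^*$ we know that the following are equivalent:
(i) $L = L(\mathcal{A})$ for some DFA $\mathcal{A}$ ($L \in \mathrm{Rec}\,A^*$),
(ii) $L = L(\mathcal{A})$ for some NDA $\mathcal{A}$,
(iii) $L$ is rational ($L \in \mathrm{Rat}\,A^*$).

3.3. Specialisation to A = {a}

Let $\mathcal{A} = (A, Q, \delta, q_0, F)$ be a DFA. Then $q \in Q$ is accessible if $\delta(q_0, w) = q$ for some $w \in A^*$. $\mathcal{A}$ is accessible if every state of $\mathcal{A}$ is accessible. Clearly if $\mathcal{A}$ has inaccessible states, these can be removed to give a DFA $\mathcal{A}'$ with $L(\mathcal{A}') = L(\mathcal{A})$, so we lose nothing by assuming our DFAs are accessible. We assume from now on that our DFAs are accessible.

Proposition. Let $L \subseteq \{a\}^*$. Then $L \in \mathrm{Rec}\,A^* \Leftrightarrow L = K \cup J(a^p)^*$ for some finite $K, J \subseteq A^*$.

Proof. ($\Leftarrow$) Kleene's theorem.
($\Rightarrow$) Let $L = L(\mathcal{A})$ where $\mathcal{A} = (\{a\}, Q, \delta, q_0, F)$ is an accessible DFA. Put $q_k = \delta(q_0, a^k)$; as $\mathcal{A}$ is accessible, $Q = \{q_0, q_1, \dots\}$. $Q$ is a finite set, so let $m \ge 0$ be the least number such that $q_m = q_{m+r}$ for some $r \ge 1$, and let $p \ge 1$ be least such that $q_m = q_{m+p}$.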

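[State transition diagram: $q_0 \to q_1 \to \cdots \to q_m \to q_{m+1} \to \cdots \to q_{m+p-1} \to q_m$, every arrow labelled $a$.]

Let $F' = \{q_0, q_1, \dots, q_{m-1}\} \cap F$ and $F'' = \{q_m, q_{m+1}, \dots, q_{m+p-1}\} \cap F$, so $F = F' \cup F''$. Put

$K = \{a^i \mid q_i \in F'\} = \{a^i \mid \delta(q_0, a^i) \in F'\}$ $\quad (|K| < \infty)$,
$J = \{a^i \mid m \le i < m + p,\ q_i \in F''\} = \{a^i \mid m \le i < m + p,\ \delta(q_0, a^i) \in F''\}$ $\quad (|J| < \infty)$.

Then $K, J \subseteq L(\mathcal{A}) = L$ and $K \cup J = L \cap \{a^0, a^1, \dots, a^{m+p-1}\}$. For $n \ge m + p$ we have

$a^n \in L(\mathcal{A})$
$\Leftrightarrow \delta(q_0, a^n) \in F$
$\Leftrightarrow \delta(q_0, a^n) = q_i$ for some $q_i \in F''$
$\Leftrightarrow q_n = q_i$ for some $q_i \in F''$
$\Leftrightarrow n = i + tp$ for some $t \ge 1$
$\Leftrightarrow a^n = a^{i + tp} = a^i (a^p)^t$ for some $t \ge 1$, $a^i \in J$
$\Leftrightarrow a^n \in J(a^p)^+$.

We also have, for $n \le m + p - 1$, that $a^n \in L(\mathcal{A}) \Leftrightarrow a^n \in K \cup J$. Hence

$L(\mathcal{A}) = K \cup J \cup J(a^p)^+ = K \cup J(a^p)^*$.

3.4. Revision of Equivalence Relations

A relation $\sim$ on a set $A$ is an equivalence relation if
(1) $a \sim a$ for all $a \in A$ (Reflexive),
(2) $a \sim b \Rightarrow b \sim a$ for all $a, b \in A$ (Symmetric),
(3) $a \sim b$, $b \sim c \Rightarrow a \sim c$ for all $a, b, c \in A$ (Transitive).

The $\sim$-equivalence class (or just $\sim$-class) of $a \in A$ is the set $\{b \in A \mid a \sim b\}$. We often write $[a]$ for this set.

Note. $[a] = \{b \in A \mid a \sim b\} = \{b \in A \mid b \sim a\}$ as $\sim$ is symmetric, and $a \in [a]$ as $a \sim a$ (reflexive).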

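Facts:
(1) $[a] = [b] \Leftrightarrow [a] \cap [b] \ne \emptyset$, so the equivalence classes partition $A$, i.e. cut up $A$ into disjoint non-empty subsets.
(2) $[a] = [b] \Leftrightarrow a \sim b \Leftrightarrow b \in [a]$ ($\Leftrightarrow [a] \cap [b] \ne \emptyset$), or $[a] \ne [b] \Leftrightarrow a \not\sim b \Leftrightarrow b \notin [a]$ ($\Leftrightarrow [a] \cap [b] = \emptyset$).

4. Reduced DFAs

Given a DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$ with $L(\mathcal{A}) = L$ we find a DFA $\bar{\mathcal{A}} = (A, \bar{Q}, \bar{\delta}, \bar{q}_0, \bar{F})$ with $L(\bar{\mathcal{A}}) = L$ such that $\bar{\mathcal{A}}$ has the smallest number of states of any DFA accepting $L$.

Two DFAs $\mathcal{A}$ and $\mathcal{B}$ (with the same alphabet) are equivalent if $L(\mathcal{A}) = L(\mathcal{B})$. We have remarked that any DFA is equivalent to an accessible DFA. We assume that all DFAs are accessible.

Let $\mathcal{A} = (A, Q, \delta, q_0, F)$. Define $\sim$ on $Q$ by

$q \sim q' \Leftrightarrow \forall w \in A^*\ (\delta(q, w) \in F \Leftrightarrow \delta(q', w) \in F)$.

Note. $\sim$ is an equivalence relation on $Q$.

Definition: An (accessible) DFA $\mathcal{A}$ is reduced if $q \sim q' \Leftrightarrow q = q'$.

Theorem 4.1. Any DFA $\mathcal{A}$ is equivalent to a reduced DFA.

Proof. Let $\mathcal{A} = (A, Q, \delta, q_0, F)$ be an (accessible) DFA. Let $[q]$ be the $\sim$-class of $q$ and $\bar{Q} = \{[q] \mid q \in Q\}$. Define $\bar{\delta} : \bar{Q} \times A \to \bar{Q}$ by $\bar{\delta}([q], a) = [\delta(q, a)]$.

(1) $\bar{\delta}$ is well-defined.

Proof. We have that

$[q] = [q']$
$\Rightarrow q \sim q'$
$\Rightarrow \forall w \in A^*,\ \delta(q, w) \in F \Leftrightarrow \delta(q', w) \in F$
$\Rightarrow \forall a \in A,\ \forall w \in A^*,\ \delta(q, aw) \in F \Leftrightarrow \delta(q', aw) \in F$
$\Rightarrow \forall a \in A,\ \forall w \in A^*,\ \delta(\delta(q, a), w) \in F \Leftrightarrow \delta(\delta(q', a), w) \in F$
$\Rightarrow \forall a \in A,\ \delta(q, a) \sim \delta(q', a)$
$\Rightarrow \forall a \in A,\ [\delta(q, a)] = [\delta(q', a)]$.

Hence $\bar{\delta}([q], a) = \bar{\delta}([q'], a)$, so $\bar{\delta}$ is well-defined.

(2) For $q \sim q'$, $q \in F \Leftrightarrow \delta(q, \varepsilon) \in F \Leftrightarrow \delta(q', \varepsilon) \in F \Leftrightarrow q' \in F$. So in $[q]$ either all states are final or none are final. We put $\bar{F} = \{[q] \mid q \in F\}$, $\bar{q}_0 = [q_0]$. So $\bar{\mathcal{A}} = (A, \bar{Q}, \bar{\delta}, \bar{q}_0, \bar{F})$ is a DFA.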

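(3) For any $w \in A^*$ we have $\bar{\delta}([q], w) = [\delta(q, w)]$.

We have $\bar{\delta}([q], \varepsilon) = [q] = [\delta(q, \varepsilon)]$. For $w \in A$, the result is true by the definition of $\bar{\delta}$. Suppose the result is true for all $w \in A^*$ with $|w| = n$. Then

$\bar{\delta}([q], wa) = \bar{\delta}(\bar{\delta}([q], w), a)$ by the definition of the extended $\bar{\delta}$,
$= \bar{\delta}([\delta(q, w)], a)$ by the inductive assumption,
$= [\delta(\delta(q, w), a)]$ by the definition of $\bar{\delta}$,
$= [\delta(q, wa)]$ by the definition of the extended $\delta$.

(4) $\bar{\mathcal{A}}$ is reduced.

Proof. We have that

$[q] \sim [q']$
$\Leftrightarrow \forall w \in A^*,\ \bar{\delta}([q], w) \in \bar{F} \Leftrightarrow \bar{\delta}([q'], w) \in \bar{F}$
$\Leftrightarrow \forall w \in A^*,\ [\delta(q, w)] \in \bar{F} \Leftrightarrow [\delta(q', w)] \in \bar{F}$
$\Leftrightarrow \forall w \in A^*,\ \delta(q, w) \in F \Leftrightarrow \delta(q', w) \in F$, by the definition of $\bar{F}$
$\Leftrightarrow q \sim q'$
$\Leftrightarrow [q] = [q']$,

and so $\bar{\mathcal{A}}$ is reduced.

(5) $\bar{\mathcal{A}}$ is equivalent to $\mathcal{A}$:

$w \in L(\mathcal{A})$
$\Leftrightarrow \delta(q_0, w) \in F$
$\Leftrightarrow [\delta(q_0, w)] \in \bar{F}$
$\Leftrightarrow \bar{\delta}([q_0], w) \in \bar{F}$, by (3)
$\Leftrightarrow w \in L(\bar{\mathcal{A}})$.

Hence we have $L(\bar{\mathcal{A}}) = L(\mathcal{A})$.

Definition: Let $\mathcal{A} = (A, Q, \delta, q_0, F)$ and $\mathcal{B} = (A, P, \sigma, p_0, T)$ be DFAs. Then $\mathcal{A}$ is isomorphic to $\mathcal{B}$ if there exists a bijection $\theta : Q \to P$ such that $q_0 \theta = p_0$, $F\theta = T$ and $\delta(q, a)\theta = \sigma(q\theta, a)$ for all $q \in Q$, $a \in A$.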

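We write maps on the right of their arguments, with the exception of the next state function. So we write $af$ instead of $f(a)$.

Claim. If $\mathcal{A}$ and $\mathcal{B}$ are reduced and equivalent, then $\mathcal{A}$ is isomorphic to $\mathcal{B}$. (So, in the theorem above, $\bar{\mathcal{A}}$ is the unique (up to isomorphism) reduced DFA equivalent to $\mathcal{A}$.)

Proof. $\mathcal{A}$ and $\mathcal{B}$ are accessible. Define $\theta : Q \to P$ by $\delta(q_0, w)\theta = \sigma(p_0, w)$. Certainly $\theta$ is everywhere defined and onto. Is $\theta$ well-defined?

$\delta(q_0, w) = \delta(q_0, w')$
$\Leftrightarrow \delta(q_0, w) \sim \delta(q_0, w')$ (as $\mathcal{A}$ is reduced),
$\Leftrightarrow \forall v \in A^*,\ \delta(\delta(q_0, w), v) \in F \Leftrightarrow \delta(\delta(q_0, w'), v) \in F$,
$\Leftrightarrow \forall v \in A^*,\ \delta(q_0, wv) \in F \Leftrightarrow \delta(q_0, w'v) \in F$,
$\Leftrightarrow \forall v \in A^*,\ wv \in L(\mathcal{A}) \Leftrightarrow w'v \in L(\mathcal{A})$,
$\Leftrightarrow \forall v \in A^*,\ wv \in L(\mathcal{B}) \Leftrightarrow w'v \in L(\mathcal{B})$,
$\Leftrightarrow \forall v \in A^*,\ \sigma(p_0, wv) \in T \Leftrightarrow \sigma(p_0, w'v) \in T$,
$\Leftrightarrow \forall v \in A^*,\ \sigma(\sigma(p_0, w), v) \in T \Leftrightarrow \sigma(\sigma(p_0, w'), v) \in T$,
$\Leftrightarrow \sigma(p_0, w) \sim \sigma(p_0, w')$,
$\Leftrightarrow \sigma(p_0, w) = \sigma(p_0, w')$ (as $\mathcal{B}$ is reduced),
$\Leftrightarrow \delta(q_0, w)\theta = \delta(q_0, w')\theta$.

Now $\Rightarrow$ gives us that $\theta$ is well-defined and $\Leftarrow$ gives that $\theta$ is 1:1. Check: $q_0 \theta = p_0$, $F\theta = T$ and $\delta(q, a)\theta = \sigma(q\theta, a)$ for all $q \in Q$, $a \in A$.

Proposition. If $L = L(\mathcal{A})$ where $\mathcal{A}$ is a reduced DFA, then $\mathcal{A}$ has the smallest number of states of any DFA accepting $L$.

Proof. If $L = L(\mathcal{B})$ for some DFA $\mathcal{B}$, then there exists a reduced DFA $\bar{\mathcal{B}}$ with $L = L(\mathcal{A}) = L(\mathcal{B}) = L(\bar{\mathcal{B}})$. Since $\mathcal{A}$ and $\bar{\mathcal{B}}$ are reduced and equivalent, there exists a bijection $\theta : Q_{\mathcal{A}} \to Q_{\bar{\mathcal{B}}}$. Therefore we have $|Q_{\mathcal{A}}| = |Q_{\bar{\mathcal{B}}}| \le |Q_{\mathcal{B}}|$.

Given $\mathcal{A}$, how do we find $\bar{\mathcal{A}}$? We must calculate $\sim$. We find a sequence $\sim_0, \sim_1, \sim_2, \dots$ of equivalence relations on $Q$ such that there exists $k$ with $\sim_k = \sim$. Let $\mathcal{A} = (A, Q, \delta, q_0, F)$ and $k \ge 0$.

Definition: $q \sim_k q' \Leftrightarrow$ for all $w \in A^*$ with $|w| \le k$, $\delta(q, w) \in F \Leftrightarrow \delta(q', w) \in F$.

So $q \sim_k q' \Rightarrow q \sim_{k-1} q' \Rightarrow \dots \Rightarrow q \sim_0 q'$, and $q \sim q' \Rightarrow q \sim_k q'$ for all $k \ge 0$.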

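Facts:
(1) We have that $q \sim_0 q' \Leftrightarrow$ for all $w \in A^*$ with $|w| \le 0$, $\delta(q, w) \in F \Leftrightarrow \delta(q', w) \in F$; i.e. $q \sim_0 q' \Leftrightarrow q, q' \in F$ or $q, q' \notin F$. So the $\sim_0$-classes are $F$ and $Q \setminus F$.
(2) $q \sim_{k+1} q' \Leftrightarrow q \sim_k q'$ and, for all $a \in A$, $\delta(q, a) \sim_k \delta(q', a)$.

So we can find $\sim_0, \sim_1, \sim_2, \dots$ in turn.

Example 4.1.

[State transition diagram on the states $0, 1, 2, 3, 4, 5$ over $\{a, b\}$, with the transition table below.]

       a   b
0      1   2
1      4   5
2      3   5
3      5   5
4      5   5
5      5   5

We have that the classes are

$\sim_0$-classes: $\{0, 1, 2, 5\}$ $\{3, 4\}$
$\sim_1$-classes: $\{0, 5\}$ $\{1, 2\}$ $\{3, 4\}$
$\sim_2$-classes: $\{0\}$ $\{5\}$ $\{1, 2\}$ $\{3, 4\}$
$\sim_3$-classes: $\{0\}$ $\{5\}$ $\{1, 2\}$ $\{3, 4\}$

More Facts:
(1) $\sim_k = \sim_{k+1} \Rightarrow \sim_k = \sim_{k+1} = \sim_{k+2} = \dots$
(2) there exists $k$ such that $\sim_k = \sim_{k+1}$
(3) $\sim_k = \sim_{k+1} \Rightarrow \sim_k = \sim$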
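Facts (1) and (2) give a simple procedure for computing the relations $\sim_0, \sim_1, \dots$ until they stabilise. The following is a sketch of that refinement, assuming the transition table of Example 4.1 is read with columns $a$ and $b$ and final states $\{3, 4\}$ (the $\sim_0$-class of final states); the function names `classes` and `refinements` are hypothetical.

```python
def classes(label):
    """Group states by label and return the resulting set of classes."""
    groups = {}
    for q, value in label.items():
        groups.setdefault(value, set()).add(q)
    return {frozenset(group) for group in groups.values()}


def refinements(states, alphabet, delta, finals):
    """Return the list of partitions for ~_0, ~_1, ... up to the first stable one."""
    # ~_0: two states are related iff both are final or both are non-final.
    label = {q: int(q in finals) for q in states}
    history = []
    while True:
        history.append(classes(label))
        # ~_{k+1}: same ~_k class, and same ~_k class after reading each letter.
        signature = {q: (label[q],) + tuple(label[delta[(q, a)]] for a in alphabet)
                     for q in states}
        ids = {}
        new_label = {q: ids.setdefault(sig, len(ids)) for q, sig in signature.items()}
        if len(ids) == len(history[-1]):    # no class was split: ~_k = ~_{k+1}
            return history
        label = new_label


# Transition table of Example 4.1 (columns a, b).
table = {0: (1, 2), 1: (4, 5), 2: (3, 5), 3: (5, 5), 4: (5, 5), 5: (5, 5)}
delta = {(q, a): table[q][i] for q in table for i, a in enumerate("ab")}

history = refinements(set(table), "ab", delta, finals={3, 4})
for k, partition in enumerate(history):
    print(k, sorted(sorted(c) for c in partition))
```

Since each relation refines the previous one, comparing the number of classes is enough to detect that nothing was split; the last partition in `history` is then the set of $\sim$-classes, i.e. the states of the reduced DFA.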

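We note that (2) & (3) $\Rightarrow$ there exists $k$ such that $\sim_k = \sim$. We calculate $\sim_0, \sim_1, \dots$ until we find $k$ such that $\sim_k = \sim_{k+1}$, and then $\sim = \sim_k$. In our example we have $\sim_2 = \sim_3$, so $\sim = \sim_2$. The reduced DFA equivalent to our example has four states

$[0] = \{0\}$, $\quad [5] = \{5\}$, $\quad [1] = \{1, 2\}$, $\quad [4] = \{3, 4\}$,

with initial state $[0]$ and unique final state $[4]$. Then we have $\bar{\mathcal{A}}$:

[State transition diagram of $\bar{\mathcal{A}}$. Its transitions follow from $\bar{\delta}([q], a) = [\delta(q, a)]$, e.g. $\bar{\delta}([0], a) = [\delta(0, a)] = [1]$: $[0] \xrightarrow{a, b} [1]$, $[1] \xrightarrow{a} [4]$, $[1] \xrightarrow{b} [5]$, $[4] \xrightarrow{a, b} [5]$, $[5] \xrightarrow{a, b} [5]$.]

5. Monoids and Transition Monoids

5.1. Monoids

Definition: A monoid $M$ is a set together with a binary operation (so $M$ is closed under the operation) such that
(i) $(ab)c = a(bc)$ for all $a, b, c \in M$,
(ii) there exists $1 \in M$ such that $1a = a = a1$ for all $a \in M$.

Example 5.1.
(1) Groups are monoids. However $\mathbb{N}$ under $\times$ is a monoid which is not a group.
(2) Let $X$ be a set, $X \ne \emptyset$. $T_X$ is the set of all functions $X \to X$, and $T_X$ is a monoid under $\circ$ (usually omitted) with identity $I_X$, called the full transformation monoid on $X$.

New Convention: This applies to all functions except next state functions. If $\alpha : U \to V$ is a function we write $u\alpha$ for the image of $u \in U$ under $\alpha$ (instead of $\alpha(u)$). So $I_X : X \to X$ is defined by $x I_X = x$ for all $x \in X$. If $\alpha : U \to V$ and $\beta : V \to W$ then $(u\alpha)\beta$ is the image of $u \in U$ under first $\alpha$ and then $\beta$. Naturally, we write $(u\alpha)\beta = u(\alpha\beta)$, so $\alpha\beta$ now means "do $\alpha$, then do $\beta$".

If $X = \{1, 2, \dots, n\}$ we write $T_n$ for $T_X$ and $I_n$ for $I_X$. We may use two-row notation for elements of $T_n$. If $\alpha \in T_4$ is given by

$1\alpha = 1$, $\quad 2\alpha = 1$, $\quad 3\alpha = 2$, $\quad 4\alpha = 4$,

we can write

$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 1 & 2 & 4 \end{pmatrix}$

and for example

$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 3 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 1 & 2 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 1 & 2 & 2 \end{pmatrix}$.

Note that $|T_n| = n^n$, because for each of the $n$ elements of $\{1, 2, \dots, n\}$ there are $n$ choices for its image under a map in $T_n$.

5.2. Constant Functions in T_X

For $x \in X$, $c_x : X \to X$ is given by $y c_x = x$ for all $y \in X$ and is called the constant function on $x$. For example

$c_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 1 & 1 & 1 \end{pmatrix} \in T_4$.

Note that $\alpha c_x = c_x$ for all $\alpha \in T_X$, since for all $y \in X$ we have

$y(\alpha c_x) = (y\alpha) c_x = x = y c_x$.

[Diagram: $X \xrightarrow{\alpha} X \xrightarrow{c_x} X$, with $y \mapsto y\alpha \mapsto x$.]

Also, $c_x \alpha = c_{x\alpha}$, since for all $y \in X$

$y(c_x \alpha) = (y c_x)\alpha = x\alpha = y c_{x\alpha}$.

Definition: Let $M$ be a monoid and $T \subseteq M$. Then $T$ is a submonoid if
(1) $1 \in T$ and
(2) $a, b \in T \Rightarrow ab \in T$.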
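The right-hand convention for maps and the rules for constant functions can be checked mechanically. This is a small sketch, assuming a map in $T_n$ is stored as a Python dictionary from points to images; `compose`, `constant` and `all_maps` are names chosen here for illustration.

```python
from itertools import product


def compose(alpha, beta):
    """Return the map alpha beta: do alpha first, then beta (maps act on the right)."""
    return {x: beta[alpha[x]] for x in alpha}


def constant(X, x):
    """Return the constant function c_x on the set X."""
    return {y: x for y in X}


def all_maps(X):
    """Return every function X -> X, i.e. all elements of T_X."""
    points = sorted(X)
    return [dict(zip(points, images)) for images in product(points, repeat=len(points))]


X = {1, 2, 3, 4}
alpha = {1: 2, 2: 1, 3: 3, 4: 3}
beta = {1: 1, 2: 1, 3: 2, 4: 4}

print(compose(alpha, beta))                                        # bottom row of the product
print(compose(alpha, constant(X, 1)) == constant(X, 1))            # alpha c_x = c_x
print(compose(constant(X, 3), beta) == constant(X, beta[3]))       # c_x beta = c_{x beta}
print(len(all_maps({1, 2, 3})))                                    # size of T_3
```

The first line reproduces the two-row product computed in Section 5.1, and the count for $T_3$ agrees with $|T_n| = n^n$.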

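Definition: Let $M$ be a monoid and $X \subseteq M$. Then

$\langle X \rangle = \{x_1 x_2 \dots x_n \mid n \ge 0 \text{ and } x_i \in X\}$.

Notice that $1$ (the empty product) lies in $\langle X \rangle$, and if $x_1 x_2 \dots x_n, y_1 y_2 \dots y_m \in \langle X \rangle$ (so $x_i, y_i \in X$) then

$(x_1 x_2 \dots x_n)(y_1 y_2 \dots y_m) = x_1 x_2 \dots x_n y_1 y_2 \dots y_m \in \langle X \rangle$.

So $\langle X \rangle$ is a submonoid of $M$, the submonoid of $M$ generated by $X$. If $M = \langle X \rangle$, we say $M$ is generated by $X$. For example, under $\times$ we have $\mathbb{N} = \langle P \rangle$, where $P$ is the set of primes; and $A^* = \langle A \rangle$.

5.3. The Transition Monoid of a DFA

Let $\mathcal{A} = (A, Q, \delta, q_0, F)$ be a DFA. For each $w \in A^*$ let $\sigma_w \in T_Q$ be defined by

$q\sigma_w = \delta(q, w)$.

Claim. $\sigma_w \sigma_v = \sigma_{wv}$ for all $w, v \in A^*$.

Proof. We have that

$q(\sigma_w \sigma_v) = (q\sigma_w)\sigma_v = \delta(q, w)\sigma_v = \delta(\delta(q, w), v) = \delta(q, wv) = q\sigma_{wv}$.

Therefore $\sigma_w \sigma_v = \sigma_{wv}$.

Now we note that $q\sigma_\varepsilon = \delta(q, \varepsilon) = q = q I_Q$ and therefore $\sigma_\varepsilon = I_Q$. Therefore $M(\mathcal{A}) = \{\sigma_w \mid w \in A^*\}$ is a submonoid of $T_Q$. $M(\mathcal{A})$ is the transition monoid of the DFA $\mathcal{A}$. Note that the initial and final states do not matter for $M(\mathcal{A})$.

Let $w = a_1 a_2 \dots a_n \in A^*$ where $a_i \in A$. Then $\sigma_w = \sigma_{a_1 a_2 \dots a_n} = \sigma_{a_1} \sigma_{a_2} \dots \sigma_{a_n}$. Therefore $M(\mathcal{A}) = \langle \sigma_a \mid a \in A \rangle$. Now we note that $|M(\mathcal{A})| \le |T_Q| = |Q|^{|Q|} < \infty$.

Examples of Finding Transition Monoids

(1) $A = \{a, b\}$ and $Q = \{1, 2\}$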

30 VICKY G, b 1 2 b Clculte σ, σ b then clculte ll products until we don t obtin ny new elements 1 2 σ 2 2 σ b 2 1 Now we hve σ = c 2, σ 2 = σ σ = c 2 = σb = σ b σ, σ b 2 = σ b σ b = I Q, σ σ b = σ b = c 1. Hence we hve M(A) = {I Q, σ b, c 2, c 1 }, which will hve multipliction tble I σ b c 2 c 1 I I σ b c 2 c 1 σ b σ b I c 2 c 1 c 2 c 2 c 1 c 2 c 1 c 1 c 1 c 2 c 2 c 1 (2) A = {} nd Q = {1, 2, 3, 4, 5}. Now the STD of our DFA is 4 1 2 3 5 We hve tht M(A) = σ = {σ n n 0}. Clculte σ, σ 2 = σ 2, σ3,... So we hve tht M(A) = {i, σ, σ 2, σ3, σ4 }. Note. We hve tht T = {σ 2, σ3, σ4 } is 3 element subgroup of M(A). (3) A = {, b} nd Q = {1, 2, 3}. The STD of our DFA is

FORMAL LANGAUGES & AUTOMATA 31 1 2 3 4 5 σ 2 3 4 5 3 σ 2 4 4 5 3 4 σ 3 4 5 3 4 5 σ 4 5 3 4 5 3 σ 5 3 4 5 3 4 I σ σ 2 σ 3 σ 4 I I σ σ 2 σ 3 σ 4 σ σ σ 2 σ 3 σ 4 σ 2 σ 2 σ 2 σ 3 σ 4 σ 2 σ 3 σ 3 σ 3 σ 4 σ 2 σ 3 σ 4 σ 4 σ 4 σ 2 σ 3 σ 4 σ 2 1, b 2 b 3 b We now hve our tble of trnsitions to be 1 2 3 σ 2 3 1 σ b 2 2 2 σ 2 3 1 2 σ b σ 3 3 3 σ b σ 2 1 1 1 σ b = c 2 σ b σ = c 3 σ b σ 2 = c 1 Thus we hve M(A) = {I, σ, σ 2, c 1, c 2, c 3 }. This hs multipliction tble

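\[
\begin{array}{c|cccccc}
 & I & \sigma_a & \sigma_a^2 & c_1 & c_2 & c_3 \\ \hline
I & I & \sigma_a & \sigma_a^2 & c_1 & c_2 & c_3 \\
\sigma_a & \sigma_a & \sigma_a^2 & I & c_1 & c_2 & c_3 \\
\sigma_a^2 & \sigma_a^2 & I & \sigma_a & c_1 & c_2 & c_3 \\
c_1 & c_1 & c_2 & c_3 & c_1 & c_2 & c_3 \\
c_2 & c_2 & c_3 & c_1 & c_1 & c_2 & c_3 \\
c_3 & c_3 & c_1 & c_2 & c_1 & c_2 & c_3
\end{array}
\]
Now $\{I, \sigma_a, \sigma_a^2\}$ is a 3 element subgroup and $\{I\}$, $\{c_1\}$, $\{c_2\}$, $\{c_3\}$ are trivial groups.

5.4. Some Notation for Functions

Let $\theta : A \to B$ be a function and $R \subseteq A$, $S \subseteq B$. Then we define
\[ R\theta = \{ a\theta \mid a \in R \}, \qquad S\theta^{-1} = \{ a \in A \mid a\theta \in S \}, \]
where $S\theta^{-1}$ is the inverse image of $S$ under $\theta$. The notation $S\theta^{-1}$ does NOT imply that the function $\theta^{-1}$ exists.

Example 5.2. $A = \{1, 2, 3\}$, $B = \{a, b, c\}$ and $\theta : A \to B$ given by
\[ 1\theta = b, \qquad 2\theta = b, \qquad 3\theta = a. \]
It is clear that $\theta$ is not a bijection and hence $\theta^{-1}$ does not exist. Now we have
\[
\begin{aligned}
\{1, 2\}\theta &= \{1\theta, 2\theta\} = \{b\}, & \{b\}\theta^{-1} &= \{1, 2\} = \{b, c\}\theta^{-1}, \\
\{1\}\theta &= \{b\}, & \{a\}\theta^{-1} &= \{3\}, \\
\emptyset\theta &= \emptyset, & \{c\}\theta^{-1} &= \emptyset = \emptyset\theta^{-1}.
\end{aligned}
\]
Note. $\big(\{1, 2\}\theta\big)\theta^{-1} = \{b\}\theta^{-1} = \{1, 2\}$, and likewise $\big(\{1\}\theta\big)\theta^{-1} = \{b\}\theta^{-1} = \{1, 2\}$, so we have $\{1\}\theta\theta^{-1} \neq \{1\}$.

Let $\theta : A \to B$, $S_1, S_2 \subseteq B$ and $R_1, R_2 \subseteq A$. Some facts:
(1) $(S_1 \cup S_2)\theta^{-1} = S_1\theta^{-1} \cup S_2\theta^{-1}$,
(2) $(R_1 \cup R_2)\theta = R_1\theta \cup R_2\theta$,
(3) $(R_1 \cap R_2)\theta \subseteq R_1\theta \cap R_2\theta$ (inclusion may be strict),
(4) $(S_1 \cap S_2)\theta^{-1} = S_1\theta^{-1} \cap S_2\theta^{-1}$.

Proof. (1) We have that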

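\[
\begin{aligned}
x \in (S_1 \cup S_2)\theta^{-1} &\iff x\theta \in S_1 \cup S_2 \\
&\iff x\theta \in S_1 \text{ or } x\theta \in S_2 \\
&\iff x \in S_1\theta^{-1} \text{ or } x \in S_2\theta^{-1} \\
&\iff x \in S_1\theta^{-1} \cup S_2\theta^{-1}.
\end{aligned}
\]

5.5. The Syntactic Monoid of a Language

Let $L$ be a language over $A$. For $u \in A^*$ we have
\[ c_L(u) = \{ (w, z) \in A^* \times A^* \mid wuz \in L \}, \]
the context of $u$. Now define $\sim_L$ on $A^*$ by
\[ u \sim_L v \iff c_L(u) = c_L(v). \]
It is clear that $\sim_L$ is an equivalence relation on $A^*$.

Lemma 5.1. $u \sim_L v$ and $u' \sim_L v'$ $\Rightarrow$ $uu' \sim_L vv'$.

Proof. Suppose $u \sim_L v$ and $u' \sim_L v'$. Then
\[
\begin{aligned}
(w, z) \in c_L(uu') &\iff wuu'z \in L \\
&\iff wu(u'z) \in L \\
&\iff (w, u'z) \in c_L(u) \\
&\iff (w, u'z) \in c_L(v) \\
&\iff wvu'z \in L \\
&\iff (wv)u'z \in L \\
&\iff (wv, z) \in c_L(u') \\
&\iff (wv, z) \in c_L(v') \\
&\iff wvv'z \in L \\
&\iff (w, z) \in c_L(vv').
\end{aligned}
\]
Hence we have $uu' \sim_L vv'$.

Now set
\[ M(L) = \{ [w] \mid w \in A^* \}, \]
where $[w]$ denotes the $\sim_L$-class of $w$, and define a product on $M(L)$ by $[u][v] = [uv]$. If $[u] = [u']$ and $[v] = [v']$ then $u \sim_L u'$ and $v \sim_L v'$, so by the Lemma above $uv \sim_L u'v'$ and so $[uv] = [u'v']$. Hence our product above is a well-defined binary operation on $M(L)$.

Lemma 5.2. $M(L)$ is a monoid under this binary operation.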
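For a finite language the contexts defined above can be listed mechanically, since only factors of words of $L$ have a non-empty context. The following is a minimal sketch under stated assumptions, not part of the notes: the function names `contexts` and `syntactic_classes` and the sample language {aa, ab} are hypothetical, and the empty word is represented by the empty string.

```python
from collections import defaultdict


def contexts(language):
    """Map each factor u of a word of the finite language L to c_L(u)."""
    ctx = defaultdict(set)
    for x in language:
        for i in range(len(x) + 1):
            for j in range(i, len(x) + 1):
                # x = w u z with w = x[:i], u = x[i:j], z = x[j:]
                ctx[x[i:j]].add((x[:i], x[j:]))
    return dict(ctx)


def syntactic_classes(language):
    """Group the factors of L by context.

    Every word that is not a factor of a word of L has empty context;
    those words form one further class, which is not listed here.
    """
    by_context = defaultdict(set)
    for u, c in contexts(language).items():
        by_context[frozenset(c)].add(u)
    return sorted(sorted(cls) for cls in by_context.values())


if __name__ == "__main__":
    L = {"aa", "ab"}  # hypothetical sample language
    for cls in syntactic_classes(L):
        print(cls)
```

Two factors land in the same list exactly when their contexts coincide, which is the relation $\sim_L$ restricted to factors of $L$. The sketch does not apply to infinite languages, where contexts are infinite sets.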

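Proof. For all $[u], [v], [w] \in M(L)$,
\[ [u]\big([v][w]\big) = [u][vw] = \big[u(vw)\big] = \big[(uv)w\big] = [uv][w] = \big([u][v]\big)[w]. \]
Also we have that
\[ [\varepsilon][u] = [\varepsilon u] = [u] = [u\varepsilon] = [u][\varepsilon] \]
and hence $[\varepsilon]$ is the identity of $M(L)$. Thus $M(L)$ is a monoid.

Some terminology: $\sim_L$ is the syntactic congruence of $L$; $M(L)$ is the syntactic monoid of $L$.

Note. Suppose $u \in L$ and $u \sim_L v$. We have $(\varepsilon, \varepsilon) \in c_L(u) = c_L(v)$, so that $v = \varepsilon v \varepsilon \in L$. Therefore $L$ is a union of $\sim_L$-classes.

Calculation of $M(L)$

Example 5.3. Take $A = \{a, b\}$ and $L = A$. We have
\[ c_L(\varepsilon) = \{ (\varepsilon, a), (a, \varepsilon), (\varepsilon, b), (b, \varepsilon) \}, \qquad c_L(a) = \{ (\varepsilon, \varepsilon) \} = c_L(b), \]
and for $w \in A^*$ with $|w| > 1$ we have $c_L(w) = \emptyset$. So there exist three $\sim_L$-classes:
\[ \{\varepsilon\} = [\varepsilon] = 1, \qquad \{a, b\} = [a] = L, \qquad \{ w \in A^* \mid |w| \geq 2 \} = T. \]
So the multiplication table of our monoid is
\[
\begin{array}{c|ccc}
 & 1 & L & T \\ \hline
1 & 1 & L & T \\
L & L & T & T \\
T & T & T & T
\end{array}
\]
because we have $LL = [a][a] = [a^2] = T$ and $LT = [a][a^2] = [a^3] = T$. Note that $T$ is a zero for $M(L)$; had we known, we could have used $0$ for $T$.

Example 5.4. $A = \{a, b\}$ and $L = \{ab, ba\}$. Now the contexts are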

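\[
\begin{aligned}
c_L(\varepsilon) &= \{ (\varepsilon, ab), (a, b), (ab, \varepsilon), (\varepsilon, ba), (b, a), (ba, \varepsilon) \}, \\
c_L(a) &= \{ (b, \varepsilon), (\varepsilon, b) \}, \\
c_L(b) &= \{ (\varepsilon, a), (a, \varepsilon) \}, \\
c_L(ab) &= \{ (\varepsilon, \varepsilon) \} = c_L(ba), \\
c_L(a^2) &= \emptyset = c_L(b^2) = c_L(w) \text{ for all } w \text{ with } |w| \geq 3.
\end{aligned}
\]
So there exist 5 $\sim_L$-classes:
\[ [\varepsilon] = \{\varepsilon\} = 1, \quad [a] = \{a\} = P, \quad [b] = \{b\} = Q, \quad [ab] = \{ab, ba\} = L, \quad [a^2] = \{a^2, b^2\} \cup \{ w \mid |w| \geq 3 \} = 0. \]
So $M(L) = \{1, P, Q, L, 0\}$ and has multiplication table
\[
\begin{array}{c|ccccc}
 & 1 & P & Q & L & 0 \\ \hline
1 & 1 & P & Q & L & 0 \\
P & P & 0 & L & 0 & 0 \\
Q & Q & L & 0 & 0 & 0 \\
L & L & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}
\]
We know the above because, for example,
\[ P^2 = [a][a] = [a^2] = 0, \qquad PQ = [a][b] = [ab] = L, \qquad PL = [a][ab] = [a^2 b] = 0. \]

6. Recognition of a Monoid

Definition: Let $M, N$ be monoids. Then $\theta : M \to N$ is a (monoid) morphism if
(i) $(ab)\theta = a\theta\, b\theta$ for all $a, b \in M$,
(ii) $1_M \theta = 1_N$.

Why is the free monoid called free? Let $A$ be an alphabet, $M$ a monoid and $\phi : A \to M$ a function. Then there exists a unique morphism $\theta : A^* \to M$ such that $a\theta = a\phi$ for all $a \in A$.

Proof. Define $\theta : A^* \to M$ by
\[ \varepsilon\theta = 1, \qquad (a_1 \dots a_n)\theta = a_1\phi \dots a_n\phi \]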

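($a_i \in A$). Clearly $\theta$ is well-defined; it is easy to check that $\theta$ is a morphism. For any $a \in A$ we have $a\theta = a\phi$. If $\psi : A^* \to M$ is a morphism such that $a\psi = a\phi$ for all $a \in A$, then $\varepsilon\psi = 1 = \varepsilon\theta$. Now for all $w = a_1 a_2 \dots a_n$, $a_i \in A$, $n \geq 1$, we have
\[
\begin{aligned}
w\psi &= (a_1 \dots a_n)\psi \\
&= a_1\psi \dots a_n\psi && (\psi \text{ is a morphism}) \\
&= a_1\phi \dots a_n\phi && (a_i\psi = a_i\phi) \\
&= (a_1 \dots a_n)\theta && (\text{definition of } \theta) \\
&= w\theta.
\end{aligned}
\]
Therefore $\psi = \theta$ and $\theta : A^* \to M$ is the unique morphism such that $a\theta = a\phi$ for all $a \in A$.

Definition: Let $L \subseteq A^*$ and let $M$ be a monoid. Then $L$ is recognised by $M$ if there exists a morphism $\theta : A^* \to M$ such that $L = (L\theta)\theta^{-1}$.

Remark. We know $L \subseteq (L\theta)\theta^{-1}$. For $L = (L\theta)\theta^{-1}$, we need that
\[ w \in (L\theta)\theta^{-1} \Rightarrow w \in L, \]
i.e. $w\theta \in L\theta \Rightarrow w \in L$, i.e. $w\theta = v\theta$ for some $v \in L$ $\Rightarrow$ $w \in L$.

Theorem 6.1. Let $L \subseteq A^*$ be a language. Then $L$ is recognised by $M(L)$.

Proof. Define $\nu_L : A^* \to M(L)$ by $w\nu_L = [w]$. Then $\varepsilon\nu_L = [\varepsilon]$, which is the identity of $M(L)$, and
\[ (wv)\nu_L = [wv] = [w][v] = w\nu_L\, v\nu_L. \]
Hence $\nu_L$ is a morphism. Suppose $w \in (L\nu_L)\nu_L^{-1}$. Then $w\nu_L \in L\nu_L$, so $w\nu_L = v\nu_L$ for some $v \in L$. We have $[w] = [v]$ by definition of $\nu_L$, hence $w \sim_L v$. As $(\varepsilon, \varepsilon) \in c_L(v)$ we must have $(\varepsilon, \varepsilon) \in c_L(w)$, so that $w \in L$. Hence $(L\nu_L)\nu_L^{-1} \subseteq L$, so that $(L\nu_L)\nu_L^{-1} = L$ and hence $L$ is recognised by $M(L)$.

Theorem 6.2. The following are equivalent for a language $L \subseteq A^*$:
(i) $M(L)$ is finite;
(ii) $L$ is recognised by a finite monoid;
(iii) $L \in \operatorname{Rec} A^*$.

Proof. (i) $\Rightarrow$ (ii): from the above.

(ii) $\Rightarrow$ (iii): Let $M$ be a finite monoid and $\theta : A^* \to M$ a morphism such that $L = (L\theta)\theta^{-1}$. Let $\mathcal{A} = (A, M, \delta, 1_M, L\theta)$ where $\delta(m, a) = m(a\theta)$. Check $\delta(m, w) = m(w\theta)$ for all $w \in A^*$. Then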

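\[
\begin{aligned}
w \in L(\mathcal{A}) &\iff \delta(1, w) \in L\theta \\
&\iff 1(w\theta) \in L\theta \\
&\iff w\theta \in L\theta \\
&\iff w \in (L\theta)\theta^{-1} \\
&\iff w \in L && \text{as } (L\theta)\theta^{-1} = L.
\end{aligned}
\]
Hence $L(\mathcal{A}) = L$, so $L$ is recognised by $\mathcal{A}$.

(iii) $\Rightarrow$ (i): If $L \in \operatorname{Rec} A^*$ then $L = L(\mathcal{A})$ for some reduced (accessible) DFA $\mathcal{A} = (A, Q, \delta, q_0, F)$.

Claim. We claim that for $u, v \in A^*$,
\[ u \sim_L v \iff \sigma_u = \sigma_v. \]
So the number of $\sim_L$-classes $= |M(\mathcal{A})| \leq |T_Q| = |Q|^{|Q|} < \infty$.

Proof. We have that
\[
\begin{aligned}
u \sim_L v &\iff c_L(u) = c_L(v) \\
&\iff \big( (w, z) \in c_L(u) \Leftrightarrow (w, z) \in c_L(v) \big) \\
&\iff \big( wuz \in L \Leftrightarrow wvz \in L \big) \quad \forall\, w, z \in A^* \\
&\iff \forall\, w, z \in A^*,\ \delta(q_0, wuz) \in F \Leftrightarrow \delta(q_0, wvz) \in F \\
&\iff \forall\, w, z \in A^*,\ \delta\big(\delta(q_0, w), uz\big) \in F \Leftrightarrow \delta\big(\delta(q_0, w), vz\big) \in F \\
&\iff \forall\, q \in Q\ \forall\, z \in A^*,\ \delta(q, uz) \in F \Leftrightarrow \delta(q, vz) \in F && (\mathcal{A} \text{ is accessible}) \\
&\iff \forall\, q \in Q\ \forall\, z \in A^*,\ \delta\big(\delta(q, u), z\big) \in F \Leftrightarrow \delta\big(\delta(q, v), z\big) \in F \\
&\iff \forall\, q \in Q,\ \delta(q, u) \sim \delta(q, v) \\
&\iff \forall\, q \in Q,\ \delta(q, u) = \delta(q, v) && (\mathcal{A} \text{ is reduced}) \\
&\iff \forall\, q \in Q,\ q\sigma_u = q\sigma_v \\
&\iff \sigma_u = \sigma_v.
\end{aligned}
\]
Hence all statements are equivalent.

We have now proved the following.

Theorem 6.3. Let $L$ be a language over $A$. The following are equivalent:
(i) $L$ is recognisable ($L \in \operatorname{Rec} A^*$; $L = L(\mathcal{A})$ for some DFA $\mathcal{A}$);
(ii) $L = L(\mathcal{A})$ for some NDA $\mathcal{A}$;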

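(iii) $L$ is rational ($L \in \operatorname{Rat} A^*$);
(iv) $L$ is recognised by a finite monoid $M$ (i.e. there exists a morphism $\theta : A^* \to M$ such that $L = (L\theta)\theta^{-1}$);
(v) $M(L)$ is finite.

Common terminology for a language satisfying any of these equivalent conditions is regular.

Let $L \in \operatorname{Rec} A^*$; we know that $M(L)$ is finite. How do we calculate it? Either directly by finding contexts; or we find a DFA $\mathcal{A}$ with $L = L(\mathcal{A})$, reduce $\mathcal{A}$ to $\mathcal{A}'$ with $L = L(\mathcal{A}')$ and find $M(\mathcal{A}')$, then use the following.

Proposition. If $L = L(\mathcal{A})$ for a reduced DFA $\mathcal{A}$, then $M(L) \cong M(\mathcal{A})$, i.e. there exists a bijective morphism (an isomorphism) $\theta : M(L) \to M(\mathcal{A})$.

Proof. We have
\[ M(L) = \{ [u] \mid u \in A^* \} \text{ where } u \sim_L v \iff c_L(u) = c_L(v), \]
\[ M(\mathcal{A}) = \{ \sigma_u \mid u \in A^* \} \text{ where } q\sigma_u = \delta(q, u). \]
From an earlier result, $\theta : M(L) \to M(\mathcal{A})$ given by $[u]\theta = \sigma_u$ is a bijection. Let $[u], [v] \in M(L)$. Then
\[ \big([u][v]\big)\theta = [uv]\theta = \sigma_{uv} = \sigma_u \sigma_v = [u]\theta\,[v]\theta. \]
The identity of $M(L)$ is $[\varepsilon]$ and
\[ [\varepsilon]\theta = \sigma_\varepsilon = I_Q \quad (\text{the identity of } M(\mathcal{A})). \]
Therefore $\theta$ is a morphism and hence an isomorphism as required.

7. How do Monoids help us?

Let $L \subseteq A^*$, $w \in A^*$.

Definition: $w^{-1}L = \{ v \in A^* \mid wv \in L \}$.

Lemma 7.1. $L \in \operatorname{Rec} A^* \Rightarrow w^{-1}L \in \operatorname{Rec} A^*$ for any $w \in A^*$.

Proof. $L \in \operatorname{Rec} A^*$ $\Rightarrow$ $L$ is recognised by a finite monoid $M$. Hence there exists a morphism $\theta : A^* \to M$ such that
\[ L = (L\theta)\theta^{-1}. \]
We show $\big((w^{-1}L)\theta\big)\theta^{-1} = w^{-1}L$. We know $w^{-1}L \subseteq \big((w^{-1}L)\theta\big)\theta^{-1}$. Now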

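\[
\begin{aligned}
v \in \big((w^{-1}L)\theta\big)\theta^{-1} &\Rightarrow v\theta \in (w^{-1}L)\theta \\
&\Rightarrow v\theta = x\theta \text{ for some } x \in w^{-1}L \\
&\Rightarrow v\theta = x\theta \text{ for some } x \text{ with } wx \in L.
\end{aligned}
\]
Then
\[ (wv)\theta = w\theta\, v\theta = w\theta\, x\theta = (wx)\theta \in L\theta \Rightarrow wv \in (L\theta)\theta^{-1} = L. \]
Hence $v \in w^{-1}L$ and so $(w^{-1}L)\theta\theta^{-1} \subseteq w^{-1}L$ as required.

Recall: We needed that $L = \{ a^n b^p \mid n \geq 1,\ p \text{ prime} \} \notin \operatorname{Rec} A^*$. We argued that $K = \{ a^n b^p \mid n \geq 0,\ p \text{ prime} \} \notin \operatorname{Rec} A^*$. We have that
\[ u \in a^{-1}L \iff au \in L \iff u \in K. \]
Hence $a^{-1}L = K$. If $L \in \operatorname{Rec} A^*$, then we would have $a^{-1}L \in \operatorname{Rec} A^*$, i.e. $K \in \operatorname{Rec} A^*$, a contradiction. Hence $L \notin \operatorname{Rec} A^*$ as required.

Lemma 7.2. $L, K \in \operatorname{Rec} A^* \Rightarrow L \cap K \in \operatorname{Rec} A^*$.

Proof. There exist finite monoids $M, N$ and morphisms $\theta : A^* \to M$ and $\psi : A^* \to N$ such that
\[ L = (L\theta)\theta^{-1}, \qquad K = (K\psi)\psi^{-1}. \]
Now we have that $M \times N$ is a finite monoid under
\[ (m, n)(m', n') = (mm', nn') \]
with identity $(1_M, 1_N)$. Define $\phi : A^* \to M \times N$ by $w\phi = (w\theta, w\psi)$. Check $\phi$ is a morphism. We know $L \cap K \subseteq \big((L \cap K)\phi\big)\phi^{-1}$. Let $w \in \big((L \cap K)\phi\big)\phi^{-1}$. Then $w\phi \in (L \cap K)\phi$, so there exists $u \in L \cap K$ with $w\phi = u\phi$. Hence $(w\theta, w\psi) = (u\theta, u\psi)$, so $w\theta = u\theta$ and $w\psi = u\psi$. As $u \in L$, $w \in (L\theta)\theta^{-1} = L$, and as $u \in K$, $w \in (K\psi)\psi^{-1} = K$. Hence $w \in L \cap K$, so that $\big((L \cap K)\phi\big)\phi^{-1} \subseteq L \cap K$. Hence $L \cap K = \big((L \cap K)\phi\big)\phi^{-1}$ and $L \cap K$ is recognised by $M \times N$, hence $L \cap K \in \operatorname{Rec} A^*$.

8. Schützenberger's Theorem

Recall that a language $L \subseteq A^*$ is rational if
(i) $L$ is finite or
(ii) $L$ can be obtained from finite subsets of $A^*$ by applying rational operations ($\cup$, product, star) a finite number of times.

Note we can replace (ii) above with
(ii)$'$ $L$ can be obtained from finite subsets of $A^*$ by applying $\cup$, $\cap$, ${}^c$, product and star a finite number of times
(as $\operatorname{Rat} A^* = \operatorname{Rec} A^*$, it is closed under $\cap$ and ${}^c$).

Definition: $L \subseteq A^*$ is star-free if
(1) $L$ is finite or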

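(2) $L$ can be obtained from finite languages by applying product and the boolean operations of $\cup$, $\cap$, ${}^c$ a finite number of times.

We have that if $L$ is star-free then $L \in \operatorname{Rec} A^*$ (as $\operatorname{Rec} A^*$ contains the finite languages and is closed under Boolean operations and product). So $L$ star-free $\Rightarrow$ $L \in \operatorname{Rat} A^*$ (by Kleene's Theorem).

Example 8.1.
(a) $\{ab, a, bab\}$, $\emptyset$, $\{\varepsilon\}$ are finite, hence star-free.
(b) $\{ab, a\}^c \{ba, b\} \cap \big(\{a\}^c \{bab\}^c\big)$ is star-free.
(c) $A^* = \emptyset^c$ so $A^*$ is star-free.
(d) Let $A = \{a, b, c\}$; then $\{a\}^* = (A^* b A^* \cup A^* c A^*)^c = (\emptyset^c b\, \emptyset^c \cup \emptyset^c c\, \emptyset^c)^c$ is star-free.
(e) $L = \{ x \in A^* \mid |x|_a \geq 1 \} = A^* a A^* = \emptyset^c a\, \emptyset^c$ is star-free.
(f) $(ab)^* = (bA^* \cup A^* a \cup A^* aa A^* \cup A^* bb A^*)^c$ is star-free.
(g) $(aa)^*$ is not star-free.

Definition: Let $M$ be a monoid and let $G \subseteq M$. Then $G$ is a subgroup of $M$ if
(1) $G$ is closed: $a, b \in G \Rightarrow ab \in G$;
(2) there exists $e \in G$ such that $ea = a = ae$ for all $a \in G$;
(3) for all $a \in G$ there exists $b \in G$ such that $ab = e = ba$;
i.e. $G$ is a group under the restriction of the binary operation on $M$ to the subset $G$.

Definition: $e \in M$ is idempotent if $e = e^2$; we write $E(M)$ for the set of idempotents of $M$.

Example 8.2.
(i) $e \in E(M) \Rightarrow \{e\}$ is a subgroup, a trivial subgroup with identity $e$.
(ii) $S_X$ is a subgroup of $T_X$.
(iii) $GL_n(\mathbb{R})$ is a subgroup of $M_n(\mathbb{R})$.
(iv) We have the multiplication table
\[
\begin{array}{c|ccc}
 & I & \alpha & 0 \\ \hline
I & I & \alpha & 0 \\
\alpha & \alpha & I & 0 \\
0 & 0 & 0 & 0
\end{array}
\]
Here $\{0\}$ and $\{I\}$ are subgroups, and $\{I, \alpha\}$ is a subgroup.
(v) From the example of finding $M(\mathcal{A})$ number 2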

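\[
\begin{array}{c|ccccc}
 & I & \sigma_a & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 \\ \hline
I & I & \sigma_a & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 \\
\sigma_a & \sigma_a & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 & \sigma_a^2 \\
\sigma_a^2 & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 & \sigma_a^2 & \sigma_a^3 \\
\sigma_a^3 & \sigma_a^3 & \sigma_a^4 & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 \\
\sigma_a^4 & \sigma_a^4 & \sigma_a^2 & \sigma_a^3 & \sigma_a^4 & \sigma_a^2
\end{array}
\]
Let $T = \{\sigma_a^2, \sigma_a^3, \sigma_a^4\}$. By inspection $T$ is closed, $\sigma_a^3$ is the identity of $T$ with $(\sigma_a^3)^2 = \sigma_a^3$, and $\sigma_a^2$ and $\sigma_a^4$ are mutually inverse, since $\sigma_a^2 \sigma_a^4 = \sigma_a^3 = \sigma_a^4 \sigma_a^2$. So $T$ is a subgroup of $M(\mathcal{A})$.

Definition: A finite monoid $M$ is aperiodic if all of its subgroups are trivial.

Theorem 8.1 (Schützenberger's Theorem). A language $L$ is star-free $\iff$ $M(L)$ is finite and aperiodic.

Proof. No proof in this course.

Example 8.3. $L = (aa)^* \subseteq \{a, b\}^*$ with DFA

[State transition diagram of $\mathcal{A}$: states 0, 1 and 2, with the transitions on $a$ and $b$ given by the rows $\sigma_a$ and $\sigma_b$ of the transition table below.]

Now we have that $L(\mathcal{A}) = L$. The $\sim$-classes are found as follows:
$\sim_0$ classes: $\{0\}$, $\{1, 2\}$;
$\sim_1$ classes: $\{0\}$, $\{1\}$, $\{2\}$.
Hence $\sim\ =\ \sim_1$ and the $\sim$-classes are $\{0\}$, $\{1\}$, $\{2\}$, and so $\mathcal{A}$ is reduced. We have that $M(L) \cong M(\mathcal{A})$; clearly $M(L)$ is finite. The transition table for this is
\[
\begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
\sigma_a & 1 & 0 & 2 \\
\sigma_b & 2 & 2 & 2 \\
\sigma_a^2 & 0 & 1 & 2
\end{array}
\]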

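Hence $M(\mathcal{A}) = \{I, \sigma_a, c_2\}$, where $\sigma_b = c_2$ and $\sigma_a^2 = I$. Now $M(\mathcal{A})$ has table
\[
\begin{array}{c|ccc}
 & I & \sigma_a & c_2 \\ \hline
I & I & \sigma_a & c_2 \\
\sigma_a & \sigma_a & I & c_2 \\
c_2 & c_2 & c_2 & c_2
\end{array}
\]
Now $\{I, \sigma_a\}$ is a subgroup with two elements. So $M(L)$ is not aperiodic, hence $L$ is not star-free by Schützenberger's theorem.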
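The calculation in Example 8.3 can be checked mechanically. The sketch below is an illustration under stated assumptions rather than part of the notes: the names `transition_monoid` and `is_aperiodic` are hypothetical, each map is stored as the tuple of images of the states, and aperiodicity is tested with the standard characterisation (not proved in these notes) that a finite monoid has only trivial subgroups exactly when every element $x$ satisfies $x^n = x^{n+1}$ for some $n$.

```python
def transition_monoid(states, delta):
    """Close the letter maps sigma_a under composition, starting from I.

    states: list of states; delta: dict letter -> dict state -> state.
    A map sigma is stored as the tuple (q sigma for q in states).
    """
    index = {q: i for i, q in enumerate(states)}
    identity = tuple(states)
    gens = [tuple(delta[a][q] for q in states) for a in delta]

    def compose(s, t):
        # Maps act on the right, as in the notes: q(s t) = (q s) t.
        return tuple(t[index[s[i]]] for i in range(len(states)))

    monoid = {identity}
    frontier = [identity]
    while frontier:
        s = frontier.pop()
        for g in gens:
            st = compose(s, g)
            if st not in monoid:
                monoid.add(st)
                frontier.append(st)
    return monoid, compose


def is_aperiodic(monoid, compose):
    """Return True if every x has x^n = x^(n+1) for some n."""
    for x in monoid:
        seen = []
        p = x
        while p not in seen:
            seen.append(p)
            p = compose(p, x)
        # p is the first repeated power of x; the cycle of powers has
        # length 1 exactly when p equals the last new power found.
        if p != seen[-1]:
            return False
    return True


if __name__ == "__main__":
    # DFA of Example 8.3, read off from its transition table.
    delta = {"a": {0: 1, 1: 0, 2: 2}, "b": {0: 2, 1: 2, 2: 2}}
    M, mul = transition_monoid([0, 1, 2], delta)
    print(len(M), is_aperiodic(M, mul))  # 3 False
```

The closure loop mirrors the recipe "calculate all products until we don't obtain any new elements"; multiplying only by generators on the right is enough because every element of the monoid is a product of the letter maps.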