Introduction to Compilers, Chapter 3: Regular Languages

Contents
3.1 Regular Grammars and Regular Languages
3.2 Regular Expressions
3.3 Finite Automata
3.4 Properties of Regular Languages
Regular Language Page 2

Regular Grammars and Regular Languages
The theory of regular languages is usually motivated by the fact that regular languages model the lexical analysis stage of a compiler.
Type 3 grammar (N. Chomsky):
  RLG (right-linear grammar): $A \to tB$, $A \to t$
  LLG (left-linear grammar): $A \to Bt$, $A \to t$
  where $A, B \in V_N$ and $t \in V_T^*$.
Note that a grammar in which left-linear productions are intermixed with right-linear productions is not regular. For example,
  $G$: $S \to aR$, $S \to c$, $R \to Sb$
generates $L(G) = \{a^n c b^n \mid n \ge 0\}$, which is a context-free language (cfl) that is not regular.

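Definition
(1) A grammar is regular if
  i) each rule has the form $A \to aB$ or $A \to a$, where $a \in V_T$ and $A, B \in V_N$;
  ii) if $S \to \varepsilon \in P$, then $S$ does not appear on the right-hand side of any rule.
This is the special case of the right-linear forms $A \to tB$, $A \to t$ in which $t$ is a single terminal. It is a convenient form for developing the properties of regular grammars systematically.
(2) A language is said to be a regular language (rl) if it can be generated by a regular grammar.
ex) $L = \{a^n b^m \mid n, m \ge 1\}$ is rl: $S \to aS \mid aA$, $A \to bA \mid b$.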

[Theorem] The production forms of a regular grammar can be derived from those of an RLG (RLG => RG). (Text p.69)
(proof) Consider $A \to tB$, where $t \in V_T^*$. Let $t = a_1 a_2 \cdots a_n$, $a_i \in V_T$. The rule is replaced by
  $A \to a_1 A_1$, $A_1 \to a_2 A_2$, ..., $A_{n-1} \to a_n B$.
If $t = \varepsilon$, the rule is $A \to B$ (single production) or $A \to \varepsilon$ (epsilon production). Productions of these forms can be removed easily. (Text pp.175-181)
ex)
  $S \to abcA$ becomes $S \to aS_1$, $S_1 \to bS_2$, $S_2 \to cA$
  $A \to bcA$ becomes $A \to bA_1$, $A_1 \to cA$
  $A \to cd$ becomes $A \to cA_1'$, $A_1' \to d$
Right-linear grammar: $A \to tB$ or $A \to t$, where $A, B \in V_N$ and $t \in V_T^*$.
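The chain construction used in this proof can be sketched in Python. This is only an illustration: the function name `split_production` and the naming scheme for the fresh nonterminals (`S_1`, `S_2`, ...) are assumptions, and the case $t = \varepsilon$ is deliberately left to the separate removal of single and epsilon productions.

```python
def split_production(lhs, terminals, rhs_nonterminal=None):
    """Split A -> tB (or A -> t) with t = a1 a2 ... an, n >= 1, into rules
    whose right-hand sides contain exactly one terminal.

    Returns a list of (lhs, terminal, nonterminal_or_None) triples.
    Fresh nonterminals are named lhs_1, lhs_2, ... (a hypothetical scheme).
    """
    if not terminals:
        raise ValueError("t = epsilon is handled by single/epsilon-production removal")
    rules = []
    current = lhs
    # Every terminal except the last one introduces a fresh nonterminal.
    for i, a in enumerate(terminals[:-1], start=1):
        fresh = f"{lhs}_{i}"
        rules.append((current, a, fresh))
        current = fresh
    # The last terminal keeps the original tail (B, or nothing for A -> t).
    rules.append((current, terminals[-1], rhs_nonterminal))
    return rules


# S -> abcA from the example on the slide
for rule in split_production("S", "abc", "A"):
    print(rule)
```

A rule that already has a single terminal, such as $A \to aB$, is returned unchanged as one triple.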

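Equivalence
The following statements are equivalent, and a language satisfying them is a regular language.
1. The language $L$ is generated by a right-linear grammar.
2. The language $L$ is generated by a left-linear grammar.
3. The language $L$ is generated by a regular grammar.
[Example] $L = \{a^n b^m \mid n, m \ge 1\}$ is rl: $S \to aS \mid aA$, $A \to bA \mid b$. (Text p. 70)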

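Why regular languages are used to define the structure of tokens
(1) The structure of tokens is simple enough to be expressed by a regular grammar.
(2) A more efficient recognizer can be built from a regular grammar than from a context-free grammar.
(3) The front end of the compiler can be divided into modules (Scanner + Parser).
When a grammar is regular, the form of the language it denotes can be derived systematically and written as a regular expression: $G \Rightarrow L$ by derivation, and if $G$ is an rg, then $L$ can be written as an re.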

Regular Expressions
A regular expression is a notation for describing the structure of the sentences of a regular language.
Methods for specifying regular languages:
(1) regular grammar (rg)
(2) regular expression (re)
(3) finite automata (fa)
[Figure: diagram relating rg, fa, and re.]

Definition (Text p. 71)
A regular expression over the alphabet $T$ and the language denoted by that expression are defined recursively as follows:
I. Basis: $\phi$, $\varepsilon$, $a \in T$.
  (1) $\phi$ is a regular expression denoting the empty set.
  (2) $\varepsilon$ is a regular expression denoting $\{\varepsilon\}$.
  (3) $a$, where $a \in T$, is a regular expression denoting $\{a\}$.
II. Recursion: $+$, $\cdot$, $*$.
  If $P$ and $Q$ are regular expressions denoting $L_p$ and $L_q$ respectively, then
  (1) $(P + Q)$ is a regular expression denoting $L_p \cup L_q$ (union).
  (2) $(P \cdot Q)$ is a regular expression denoting $L_p \cdot L_q$ (concatenation).
  (3) $(P^*)$ is a regular expression denoting $\{\varepsilon\} \cup L_p \cup L_p^2 \cup \cdots \cup L_p^n \cup \cdots$ (closure).
  Note: precedence: $+ < \cdot < *$.
III. Nothing else is a regular expression.

ex) $(0+1)^*$ denotes $\{0,1\}^*$.
    $(0+1)^*011$ denotes the set of all strings of 0s and 1s ending in 011.
Definition: if $\alpha$ is a regular expression, $L(\alpha)$ denotes the language associated with $\alpha$. (Text p.72)
Let $\alpha$ and $\beta$ be regular expressions. Then
  (1) $L(\alpha + \beta) = L(\alpha) \cup L(\beta)$
  (2) $L(\alpha \cdot \beta) = L(\alpha) \cdot L(\beta)$
  (3) $L(\alpha^*) = L(\alpha)^*$
examples:
  (1) $L(a^*) = \{\varepsilon, a, aa, aaa, \ldots\} = \{a^n \mid n \ge 0\}$
  (2) $L((aa)^*(bb)^*b) = \{a^{2n} b^{2m+1} \mid n, m \ge 0\}$
  (3) $L((a+b)^*b(a+ab)^*) = \{b, ba, bab, ab, bb, aab, bbb, \ldots\}$ (Exercise 3.2 (3), text p.115)
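The example languages on this slide can be spot-checked with Python's standard `re` module. This is a sketch under two assumptions: the slide's union operator `+` is written as `|` in Python's syntax, and the helper name `denotes` is made up for this illustration. Python's engine is a backtracking matcher rather than a finite automaton, so it is used here only to test membership.

```python
import re


def denotes(pattern, s):
    """True if the whole string s belongs to the language of the pattern."""
    return re.fullmatch(pattern, s) is not None


# (0+1)*011 : all strings of 0s and 1s ending in 011
print(denotes(r"(0|1)*011", "1011"))   # membership of 1011
print(denotes(r"(0|1)*011", "0110"))   # membership of 0110

# (a+b)*b(a+ab)* : the sample strings listed on the slide
samples = ["b", "ba", "bab", "ab", "bb", "aab", "bbb"]
print(all(denotes(r"(a|b)*b(a|ab)*", s) for s in samples))
```

`re.fullmatch` is used instead of `re.match` so that the whole string, not just a prefix, must be in the language.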

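Definition: Two regular expressions are equal if and only if they denote the same language: $\alpha = \beta$ if $L(\alpha) = L(\beta)$.
Axioms: some algebraic properties of regular expressions. Let $\alpha$, $\beta$ and $\gamma$ be regular expressions. Then (Text p.73)
  A1. $\alpha + \beta = \beta + \alpha$
  A2. $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$
  A3. $(\alpha\beta)\gamma = \alpha(\beta\gamma)$
  A4. $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$
  A5. $(\beta + \gamma)\alpha = \beta\alpha + \gamma\alpha$
  A6. $\alpha + \alpha = \alpha$
  A7. $\alpha + \phi = \alpha$
  A8. $\alpha\phi = \phi = \phi\alpha$
  A9. $\varepsilon\alpha = \alpha = \alpha\varepsilon$
  A10. $\alpha^* = \varepsilon + \alpha\alpha^*$
  A11. $\alpha^* = (\varepsilon + \alpha)^*$
  A12. $(\alpha^*)^* = \alpha^*$
  A13. $\alpha^* + \alpha = \alpha^*$
  A14. $\alpha^* + \alpha^+ = \alpha^*$
  A15. $(\alpha + \beta)^* = (\alpha^*\beta^*)^*$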

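All of these identities (axioms) are easily proved from the definition of regular expressions.
A8. $\alpha\phi = \phi = \phi\alpha$
(proof) $\alpha\phi = \{xy \mid x \in L_\alpha \text{ and } y \in L_\phi\}$. Since $y \in L_\phi$ is always false, ($x \in L_\alpha$ and $y \in L_\phi$) is false. Thus $\alpha\phi = \phi$.
Definition: regular expression equations ::= a set of equations whose coefficients are regular expressions.
ex) If $\alpha$ and $\beta$ are regular expressions, then $X = \alpha X + \beta$ is a regular expression equation. Here $X$ stands for a nonterminal symbol, and the right-hand side gives the form of the language generated by that nonterminal.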

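The solution of the regular expression equation $X = \alpha X + \beta$
When we substitute $X = \alpha^*\beta$ into both sides of the equation, each side represents the same language:
  $X = \alpha X + \beta = \alpha(\alpha^*\beta) + \beta = \alpha\alpha^*\beta + \beta = (\alpha\alpha^* + \varepsilon)\beta = \alpha^*\beta$.
Fixed point iteration:
  $X = \alpha X + \beta$
    $= \alpha(\alpha X + \beta) + \beta$
    $= \alpha^2 X + \alpha\beta + \beta$
    $= \alpha^2 X + (\varepsilon + \alpha)\beta$
    ...
    $= \alpha^{k+1} X + (\varepsilon + \alpha + \alpha^2 + \cdots + \alpha^k)\beta$
    $= (\varepsilon + \alpha + \alpha^2 + \cdots + \alpha^k + \cdots)\beta$
    $= \alpha^*\beta$.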

Not every regular expression equation has a unique solution. For $X = \alpha X + \beta$:
(a) If $\varepsilon \notin L(\alpha)$, then $X = \alpha^*\beta$ is the unique solution.
(b) If $\varepsilon \in L(\alpha)$, then $X = \alpha^*(\beta + L)$ is a solution for any language $L$, so the equation has infinitely many solutions.
Smallest solution: $X = \alpha^*\beta$.
ex) $X = X + a$ does not have a unique solution: $X = a + b$, $X = b^*a$, $X = (a + b)^*$, etc. all satisfy it.
  For $X = a + b$: $X + a = a + b + a = a + a + b = a + b$.
  For $X = b^*a$: $X + a = b^*a + a = (b^* + \varepsilon)a = b^*a$.
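Case (b) can be checked by direct substitution. The derivation below is a sketch that uses only the identity $\alpha^+ = \alpha\alpha^*$ and the observation that $\varepsilon \in L(\alpha)$ implies $\alpha^+ = \alpha^*$. Here $L$ stands for an arbitrary language, as in the statement of (b).

```latex
\begin{align*}
\alpha X + \beta
  &= \alpha\,\alpha^{*}(\beta + L) + \beta
     && \text{substituting } X = \alpha^{*}(\beta + L) \\
  &= \alpha^{+}(\beta + L) + \beta \\
  &= \alpha^{*}(\beta + L) + \beta
     && \text{since } \varepsilon \in L(\alpha) \text{ gives } \alpha^{+} = \alpha^{*} \\
  &= \alpha^{*}(\beta + L)
     && \text{since } \beta \subseteq \alpha^{*}\beta \subseteq \alpha^{*}(\beta + L) \\
  &= X .
\end{align*}
```

Taking $L = \phi$ gives the smallest solution $\alpha^*\beta$.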

Finding a regular expression denoting $L(G)$ for a given rg $G$
$G \Rightarrow L$ by derivation: if $G$ is an rg, then $L$ can be written as an re.
$L(A)$, where $A \in V_N$, denotes the language generated by $A$. By definition, if $S$ is the start symbol, then $L(G) = L(S)$.
Two steps:
1. Construct a set of simultaneous equations from $G$.
   $A \to aB$, $A \to a$ gives $L(A) = \{a\}L(B) \cup \{a\}$, written $A = aB + a$.
   In general, $X \to \alpha \mid \beta \mid \gamma$ gives $X = \alpha + \beta + \gamma$.
2. Solve these equations, using the rule that $X = \alpha X + \beta$ has the solution $X = \alpha^*\beta$.

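ex1) $S \to aS$, $S \to bR$, $S \to \varepsilon$, $R \to aS$
  $L(S) = \{a\}L(S) \cup \{b\}L(R) \cup \{\varepsilon\}$
  $L(R) = \{a\}L(S)$
  Regular expression equations:
    $S = aS + bR + \varepsilon$
    $R = aS$
  $S = aS + baS + \varepsilon = (a + ba)S + \varepsilon = (a + ba)^*\varepsilon = (a + ba)^*$
ex2) $S \to aA \mid bB \mid b$, $A \to bA \mid \varepsilon$, $B \to bS$
  Regular expression equations:
    $S = aA + bB + b$
    $A = bA + \varepsilon$, so $A = b^*\varepsilon = b^*$
    $B = bS$
  $S = ab^* + bbS + b = bbS + ab^* + b = (bb)^*(ab^* + b)$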
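The two solutions above can be sanity-checked by brute force: enumerate every sentence of the grammar up to some length and compare with the strings matched by the resulting regular expression. The sketch below is not a proof, only a bounded check. The helper names `generate` and `regex_language`, the dictionary encoding of the grammar (lowercase terminals, at most one trailing uppercase nonterminal per alternative), and the length bound 6 are all assumptions made for this illustration. The slide's `+` is written `|` for Python's `re`.

```python
import re
from itertools import product


def generate(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`
    in a right-linear grammar given as {nonterminal: [alternatives]}."""
    results, seen = set(), set()
    frontier = [("", start)]          # (terminal prefix, pending nonterminal)
    while frontier:
        prefix, nt = frontier.pop()
        for alt in grammar[nt]:
            if alt and alt[-1].isupper():
                item = (prefix + alt[:-1], alt[-1])
                if len(item[0]) <= max_len and item not in seen:
                    seen.add(item)
                    frontier.append(item)
            elif len(prefix + alt) <= max_len:
                results.add(prefix + alt)
    return results


def regex_language(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len matched by `pattern`."""
    words = ("".join(w) for n in range(max_len + 1)
             for w in product(alphabet, repeat=n))
    return {w for w in words if re.fullmatch(pattern, w)}


EX1 = {"S": ["aS", "bR", ""], "R": ["aS"]}
EX2 = {"S": ["aA", "bB", "b"], "A": ["bA", ""], "B": ["bS"]}

print(generate(EX1, "S", 6) == regex_language(r"(a|ba)*", "ab", 6))
print(generate(EX2, "S", 6) == regex_language(r"(bb)*(ab*|b)", "ab", 6))
```

Agreement up to a fixed length does not establish equality of the languages, but a mismatch would immediately expose an algebra mistake in solving the equations.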

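ex3) $A \to 0B \mid 1A$, $B \to 1A \mid 0C$, $C \to 0C \mid 1C \mid \varepsilon$
ex4) $S \to aA \mid bS$, $A \to aS \mid bB$, $B \to aB \mid bB \mid \varepsilon$
ex5) $S \to 0A \mid 1B \mid 0$, $A \to 0A \mid 0S \mid 1B$, $B \to 1B \mid 1 \mid 0$
ex6) $X_1 = 0X_2 + 1X_1 + \varepsilon$, $X_2 = 0X_3 + 1X_2$, $X_3 = 0X_1 + 1X_3$
(Text p.116 3.5(5))
ex7) $A_1 = (01^* + 1)A_1 + A_2$, $A_2 = 11 + 1A_1 + 00A_3$, $A_3 = A_1 + A_2 + \varepsilon$
ex8) $A \to aB \mid bA$, $B \to aB \mid bC$, $C \to bD \mid aB$, $D \to bA \mid aB \mid \varepsilon$ (worked solution provided)
ex9) $X \to \alpha_1 X + \alpha_2 Y + \alpha_3$, $Y \to \beta_1 X + \beta_2 Y + \beta_3$
ex10) $PR \to b\ DL\ SL\ e$, $DL \to d\ ;\ DL \mid \varepsilon$, $SL \to SL\ ;\ s \mid s$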

Recognizer
A recognizer for a language $L$ is a program that takes a string $x$ as input and answers "yes" if $x$ is a sentence of $L$ and "no" otherwise.
Recognizers, from most to least powerful:
  Turing Machine
  Linear Bounded Automata
  Pushdown Automata
  Finite Automata

Finite Automata
Definition: fa (Text p. 78)
A finite automaton $M$ over an alphabet $\Sigma$ is a system $(Q, \Sigma, \delta, q_0, F)$, where
  $Q$: finite, non-empty set of states
  $\Sigma$: finite input alphabet
  $\delta$: mapping function
  $q_0 \in Q$: start (or initial) state
  $F \subseteq Q$: set of final states
The mapping is $\delta : Q \times \Sigma \to 2^Q$, i.e. $\delta(q, a) = \{p_1, p_2, \ldots, p_n\}$.
Summary of the three formalisms:
  rg: $G = (V_N, V_T, P, S)$
  re: $\phi$, $\varepsilon$, $a$, $+$, $\cdot$, $*$
  fa: $M = (Q, \Sigma, \delta, q_0, F)$ (DFA, NFA)

Contents - FA
1. DFA
2. NFA
3. Converting NFA into DFA
4. Minimization of FA
5. Closure Properties of FA

1. Deterministic Finite Automata (DFA)
An fa is deterministic if $\delta(q, a)$ consists of one state. In that case we write $\delta(q, a) = p$ instead of $\delta(q, a) = \{p\}$.
If $\delta(q, a)$ always has exactly one member, we say that $M$ is completely specified.
Extension of $\delta$ from $Q \times \Sigma \to Q$ to $Q \times \Sigma^* \to Q$:
  $\delta(q, \varepsilon) = q$
  $\delta(q, xa) = \delta(\delta(q, x), a)$, where $x \in \Sigma^*$ and $a \in \Sigma$.
A sentence $x$ is said to be accepted by $M$ if $\delta(q_0, x) = p$ for some $p \in F$.
The language accepted by $M$: $L(M) = \{x \mid \delta(q_0, x) \in F\}$.

ex) $M = (\{p, q, r\}, \{0, 1\}, \delta, p, \{r\})$
  $\delta(p, 0) = q$, $\delta(q, 0) = r$, $\delta(r, 0) = r$
  $\delta(p, 1) = p$, $\delta(q, 1) = p$, $\delta(r, 1) = r$
$1001 \in L(M)$? $\delta(p, 1001) = \delta(p, 001) = \delta(q, 01) = \delta(r, 1) = r \in F$. Therefore $1001 \in L(M)$.
$1010 \in L(M)$? $\delta(p, 1010) = \delta(p, 010) = \delta(q, 10) = \delta(p, 0) = q \notin F$. Therefore $1010 \notin L(M)$.
$\delta$ can also be written in matrix form as a transition table:
  δ    0    1
  p    q    p
  q    r    p
  r    r    r

Definition: State (or transition) diagram for an automaton.
The state diagram consists of a node for every state and a directed arc from state $q$ to state $p$ with label $a$ if $\delta(q, a) = p$. Final states are indicated by a double circle, and the initial state is marked by an arrow labeled start.
[Figure: state diagram with start state p; p loops on 1, p -0-> q, q -1-> p, q -0-> r, r loops on 0, 1. It accepts $(1+01)^*00(0+1)^*$.]
[Figure: state diagram for identifiers; start state S, S -letter-> A, A loops on letter, digit.]

Algorithm: decide whether $w \in L(M)$. (Text p. 82)
  assume M = (Q, Σ, δ, q0, F);
  begin
    currentstate := q0;   (* start state *)
    get(nextsymbol);
    while not eof do
    begin
      currentstate := δ(currentstate, nextsymbol);
      get(nextsymbol)
    end;
    if currentstate in F then write('Valid String')
    else write('Invalid String');
  end.
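The recognition algorithm translates almost line for line into Python. The sketch below assumes the transition function is stored as a dictionary keyed by (state, symbol); the names `accepts` and `DELTA` are made up for this illustration, and the automaton is the three-state example $M = (\{p, q, r\}, \{0, 1\}, \delta, p, \{r\})$ used earlier in this chapter.

```python
def accepts(delta, start, finals, w):
    """Run a completely specified DFA on w and report whether it ends in F."""
    state = start
    for symbol in w:
        # One table lookup per input symbol, as in the while loop above.
        state = delta[(state, symbol)]
    return state in finals


# Transition table of the example DFA
DELTA = {
    ("p", "0"): "q", ("p", "1"): "p",
    ("q", "0"): "r", ("q", "1"): "p",
    ("r", "0"): "r", ("r", "1"): "r",
}

for w in ("1001", "1010"):
    verdict = "Valid String" if accepts(DELTA, "p", {"r"}, w) else "Invalid String"
    print(w, verdict)
```

A missing dictionary key raises `KeyError`, which corresponds to a $\delta$ that is not completely specified.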

2. Nondeterministic Finite Automata (NFA)
An fa is nondeterministic if $\delta(q, a) = \{p_1, p_2, \ldots, p_n\}$: in state $q$, scanning input symbol $a$, it moves the input head one symbol to the right and chooses any one of $p_1, p_2, \ldots, p_n$ as the next state.
ex) NFA $M = (\{q_0, q_1, q_2, q_3, q_f\}, \{0, 1\}, \delta, q_0, \{q_f\})$
  δ     0           1
  q0    {q1, q2}    {q1, q3}
  q1    {q1, q2}    {q1, q3}
  q2    {qf}        φ
  q3    φ           {qf}
  qf    {qf}        {qf}
If $\delta(q, a) = \phi$, then $\delta(q, a)$ is undefined.

To define the language recognized by an NFA, we must extend $\delta$.
(i) $\delta : Q \times \Sigma^* \to 2^Q$
  $\delta(q, \varepsilon) = \{q\}$
  $\delta(q, xa) = \bigcup_{p \in \delta(q, x)} \delta(p, a)$, where $a \in \Sigma$ and $x \in \Sigma^*$.
(ii) $\delta : 2^Q \times \Sigma^* \to 2^Q$
  $\delta(\{p_1, p_2, \ldots, p_k\}, x) = \bigcup_{i=1}^{k} \delta(p_i, x)$
Definition: A sentence $x$ is accepted by $M$ if there is a state $p$ in both $F$ and $\delta(q_0, x)$.
ex) $1011 \in L(M)$?
  $\delta(q_0, 1011) = \delta(\{q_1, q_3\}, 011) = \delta(\{q_1, q_2\}, 11) = \delta(\{q_1, q_3\}, 1) = \{q_1, q_3, q_f\}$
  Therefore $1011 \in L(M)$, because $\{q_1, q_3, q_f\} \cap \{q_f\} \ne \phi$.
ex) $0100 \in L(M)$?
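The extended transition function can be sketched in Python by tracking the set of currently possible states. The names `step`, `delta_hat`, `nfa_accepts` and `NFA` are assumptions for this illustration; a missing dictionary entry plays the role of $\delta(q, a) = \phi$. The table encodes the five-state example NFA of this section.

```python
def step(delta, states, symbol):
    """Union of delta(p, symbol) over all p in states (extension (ii))."""
    out = set()
    for p in states:
        out |= delta.get((p, symbol), set())
    return out


def delta_hat(delta, states, w):
    """Extended transition function on a set of states and a string."""
    current = set(states)
    for symbol in w:
        current = step(delta, current, symbol)
    return current


def nfa_accepts(delta, start, finals, w):
    """x is accepted if some state is in both F and delta(q0, x)."""
    return bool(delta_hat(delta, {start}, w) & finals)


NFA = {
    ("q0", "0"): {"q1", "q2"}, ("q0", "1"): {"q1", "q3"},
    ("q1", "0"): {"q1", "q2"}, ("q1", "1"): {"q1", "q3"},
    ("q2", "0"): {"qf"},
    ("q3", "1"): {"qf"},
    ("qf", "0"): {"qf"}, ("qf", "1"): {"qf"},
}

print(sorted(delta_hat(NFA, {"q0"}, "1011")))
print(nfa_accepts(NFA, "q0", {"qf"}, "0100"))
```

Because only a set of states is carried along, the cost per input symbol is bounded by the number of states, regardless of how many individual computation paths exist.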

Nondeterministic behavior
[Figure: computation tree of the NFA on input 1011. Level by level: $q_0$; then $q_1, q_3$; then $q_1, q_2, \phi$; then $q_1, q_3, \phi$; then $q_1, q_3, q_f$.]
If the number of states is $|Q| = m$ and the input length is $|x| = n$, the tree can have up to $m^n$ nodes at its last level.
In general, an NFA cannot be simulated easily by a simple program, whereas a DFA can. We shall therefore see that a DFA can be constructed from the NFA.

3. Converting NFA into DFA (Text p. 86)
NFA: easily describes the real world.
DFA: easily simulated by a simple program.
=> Fortunately, for each NFA we can find a DFA accepting the same language.
Accepting sequence (NFA):
  $\delta(q_0, a_1 a_2 \cdots a_n) = \delta(\{q_1, q_2, \ldots, q_i\}, a_2 a_3 \cdots a_n)$
    $= \cdots$
    $= \delta(\{p_1, p_2, \ldots, p_j\}, a_i \cdots a_n)$
    $= \cdots$
    $= \{r_1, r_2, \ldots, r_k\}$
Since the states of the DFA represent subsets of the set of all states of the NFA, this algorithm is often called the subset construction.

[Theorem] Let $L$ be a language accepted by an NFA. Then there exists a DFA which accepts $L$. (Text p.86)
(proof) Let $M = (Q, \Sigma, \delta, q_0, F)$ be an NFA accepting $L$. Define the DFA $M' = (Q', \Sigma, \delta', q_0', F')$ such that
  (1) $Q' = 2^Q$; that is, $\{q_1, q_2, \ldots, q_i\} \in Q'$ where each $q_j \in Q$. An element of $Q'$ is denoted by $[q_1, q_2, \ldots, q_i]$.
  (2) $q_0' = \{q_0\} = [q_0]$
  (3) $F' = \{[r_1, r_2, \ldots, r_k] \mid r_i \in F \text{ for some } i\}$
  (4) $\delta'([q_1, q_2, \ldots, q_i], a) = [p_1, p_2, \ldots, p_j]$ if $\delta(\{q_1, q_2, \ldots, q_i\}, a) = \{p_1, p_2, \ldots, p_j\}$.
Now we must prove that $L(M) = L(M')$, i.e. $\delta'(q_0', x) \in F' \iff \delta(q_0, x) \cap F \ne \phi$. This is easily shown by induction on the length of the input string $x$.

ex1) $M = (\{q_0, q_1\}, \{0, 1\}, \delta, q_0, \{q_1\})$
  δ     0           1
  q0    {q0, q1}    {q0}
  q1    φ           {q0, q1}
DFA $M' = (Q', \Sigma, \delta', q_0', F')$, where
  $Q' = 2^Q = \{\phi, [q_0], [q_1], [q_0, q_1]\}$
  $q_0' = [q_0]$
  $F' = \{[q_1], [q_0, q_1]\}$
  $\delta'([q_0], 0) = \delta(\{q_0\}, 0) = \{q_0, q_1\} = [q_0, q_1]$
  $\delta'([q_0], 1) = \{q_0\} = [q_0]$
  $\delta'([q_1], 0) = \delta(q_1, 0) = \phi$
  $\delta'([q_1], 1) = \delta(q_1, 1) = \{q_0, q_1\} = [q_0, q_1]$
  $\delta'([q_0, q_1], 0) = \delta(\{q_0, q_1\}, 0) = \{q_0, q_1\} = [q_0, q_1]$
  $\delta'([q_0, q_1], 1) = \delta(\{q_0, q_1\}, 1) = \{q_0, q_1\} = [q_0, q_1]$

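State renaming: $[q_0] = A$, $[q_1] = B$, $[q_0, q_1] = C$.
  δ    0    1
  A    C    A
  B    φ    C
  C    C    C
[Figure: state diagram with start state A; A loops on 1, A -0-> C, B -1-> C, C loops on 0, 1.]
Since B is an inaccessible state, it can be removed.
[Figure: reduced state diagram with start state A; A loops on 1, A -0-> C, C loops on 0, 1.]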

Definition: we call a state $p$ accessible if there is a $w$ such that $(q_0, w) \vdash^* (p, \varepsilon)$, where $q_0$ is the initial state.
ex2) NFA => DFA
NFA:
  δ     0           1
  q0    {q1, q2}    {q1, q3}
  q1    {q1, q2}    {q1, q3}
  q2    {qf}        φ
  q3    φ           {qf}
  qf    {qf}        {qf}
DFA:
  δ           0           1
  q0          q1q2        q1q3
  q1q2        q1q2qf      q1q3
  q1q3        q1q2        q1q3qf
  q1q2qf      q1q2qf      q1q3qf
  q1q3qf      q1q2qf      q1q3qf
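The NFA-to-DFA conversion of ex2 can be reproduced with a short subset construction that builds only the accessible subsets. This is a sketch: the function name `subset_construction`, the dictionary encoding of $\delta$, and the use of `frozenset` to represent a DFA state $[q_1, \ldots, q_i]$ are assumptions of this illustration.

```python
from collections import deque


def subset_construction(delta, start, finals, alphabet):
    """Build the accessible part of the DFA equivalent to an NFA.

    delta maps (state, symbol) to a set of states; a missing key means phi.
    Returns (dfa_states, dfa_delta, dfa_start, dfa_finals).
    """
    start_set = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    queue, seen = deque([start_set]), {start_set}
    while queue:
        S = queue.popleft()
        if S & finals:                      # contains some final NFA state
            dfa_finals.add(S)
        for a in alphabet:
            T = frozenset(q for p in S for q in delta.get((p, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return seen, dfa_delta, start_set, dfa_finals


NFA = {
    ("q0", "0"): {"q1", "q2"}, ("q0", "1"): {"q1", "q3"},
    ("q1", "0"): {"q1", "q2"}, ("q1", "1"): {"q1", "q3"},
    ("q2", "0"): {"qf"},
    ("q3", "1"): {"qf"},
    ("qf", "0"): {"qf"}, ("qf", "1"): {"qf"},
}

states, table, start, finals = subset_construction(NFA, "q0", {"qf"}, "01")
for (S, a), T in sorted(table.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(S), a, "->", sorted(T))
```

Starting from $[q_0]$ and expanding only newly discovered subsets avoids enumerating all of $2^Q$, so inaccessible states never appear in the result.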

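Definition: ε-NFA $M = (Q, \Sigma, \delta, q_0, F)$ with $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$.
ε-CLOSURE: the set of states that can be reached by following ε-arcs.
  If $s$ is a single state: $\varepsilon\text{-closure}(s) = \{s\} \cup \{q \mid q \in \delta(p, \varepsilon),\ p \in \varepsilon\text{-closure}(s)\}$
  If $T$ is a set of one or more states: $\varepsilon\text{-closure}(T) = \bigcup_{q \in T} \varepsilon\text{-closure}(q)$
ex) Computing CLOSURE in an ε-NFA
[Figure: ε-NFA with start state A and states B, C, D, with arcs labeled a and ε.]
  CLOSURE(A) = {A, B, D}
  CLOSURE({A, C}) = CLOSURE(A) $\cup$ CLOSURE(C) = {A, B, C, D}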

Ex) ε-NFA => DFA
[Figure: ε-NFA with start state 1; 1 -a-> 2, 2 -b-> 4, 1 -ε-> 3, 3 loops on c, 3 -ε-> 4.]
  CLOSURE(1) = {1, 3, 4}
  CLOSURE(2) = {2}
  CLOSURE(3) = {3, 4}
  CLOSURE(4) = {4}
  δ          a      b      c
  [1,3,4]    [2]    φ      [3,4]
  [2]        φ      [4]    φ
  [3,4]      φ      φ      [3,4]
  [4]        φ      φ      φ
A = [1,3,4], B = [2], C = [3,4], D = [4]
[Figure: resulting DFA with start state A; A -a-> B, B -b-> D, A -c-> C, C loops on c.]
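The ε-closure computation and the resulting ε-NFA-to-DFA conversion can be sketched as follows. The arcs in `DELTA` and `EPS` are an assumption: they were read off the example's diagram and chosen to be consistent with the CLOSURE values and the transition table of this example. The names `eps_closure` and `to_dfa` are likewise made up for this illustration.

```python
def eps_closure(eps, states):
    """All states reachable from `states` by following only epsilon-arcs."""
    closure, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for q in eps.get(p, ()):
            if q not in closure:
                closure.add(q)
                stack.append(q)
    return closure


def to_dfa(delta, eps, start, alphabet):
    """Subset construction in which every DFA state is an epsilon-closure."""
    start_set = frozenset(eps_closure(eps, {start}))
    table, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            moved = {q for p in S for q in delta.get((p, a), ())}
            T = frozenset(eps_closure(eps, moved))
            table[(S, a)] = T
            if T and T not in seen:       # the empty set (phi) is not expanded
                seen.add(T)
                todo.append(T)
    return start_set, table


# Assumed arcs of the example: 1 -a-> 2, 2 -b-> 4, 3 -c-> 3, 1 -eps-> 3, 3 -eps-> 4
DELTA = {(1, "a"): {2}, (2, "b"): {4}, (3, "c"): {3}}
EPS = {1: {3}, 3: {4}}

A, table = to_dfa(DELTA, EPS, 1, "abc")
print(sorted(A))
print(sorted(table[(A, "c")]))
```

Taking the closure after every move means ε-arcs never appear in the resulting transition table.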

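4. Minimization of FA (Text p. 95)
State minimization => state merging.
Definition: $\omega \in \Sigma^*$ distinguishes $q_1$ from $q_2$ if $\delta(q_1, \omega) = q_3$, $\delta(q_2, \omega) = q_4$, and exactly one of $q_3$, $q_4$ is in $F$.
Algorithm: partition the states by an equivalence relation ($\equiv$).
(1) Initial partition: separate the states according to whether or not they are final states.
(2) Refinement: within a class, do two states go to different equivalence classes on some input symbol? If so, they are said to be distinguished by that symbol and are placed in different classes.
(3) Repeat until no further partitioning occurs.
The states that cannot be distinguished are merged into a single state.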

Ex) (Text p. 119 3.11)
[Figure: state diagram with states A, B, C, D, E, F over the input symbols a and b.]
  Initial partition: {A, F}, {B, C, D, E} (first split into final and non-final states)
  Next partition: {A, F}, {B, E}, {C, D} ({B, C, D, E} is partitioned by input symbol b)
  Next partition: {A, F}, {B, E}, {C, D} (no change, so the process stops)
  δ    [AF]    [BE]    [CD]
  a    [AF]    [BE]    [CD]
  b    [BE]    [CD]    [AF]

How to minimize the number of states in an fa
<step 1> Delete all inaccessible states.
<step 2> Construct the equivalence relation $\equiv$.
<step 3> Construct the fa $M' = (Q', \Sigma, \delta', q_0', F')$:
  (a) $Q'$: the set of equivalence classes under $\equiv$. Let $[p]$ be the equivalence class of state $p$ under $\equiv$.
  (b) $\delta'([p], a) = [q]$ if $\delta(p, a) = q$.
  (c) $q_0'$ is $[q_0]$.
  (d) $F' = \{[q] \mid q \in F\}$.
Definition: $M$ is said to be reduced if
  (1) no state in $Q$ is inaccessible, and
  (2) no two distinct states of $Q$ are indistinguishable.

ex) Find the minimum state finite automaton for the language specified by the finite automaton $M = (\{A, B, C, D, E, F\}, \{0, 1\}, \delta, A, \{E, F\})$, where $\delta$ is given by (Text p. 119 3.11(2))
  δ    0    1
  A    B    C
  B    E    F
  C    A    A
  D    F    E
  E    D    F
  F    D    E
  Initial partition: {A, B, C, D}, {E, F}
  Final partition: {A}, {C}, {B, D}, {E, F}
  δ            0    1
  [A] = p      r    q
  [C] = q      p    p
  [B,D] = r    s    s
  [E,F] = s    r    s
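The partitioning procedure can be sketched as iterative refinement: states stay together only while they agree, symbol by symbol, on which block they move to. The sketch assumes a completely specified DFA whose inaccessible states have already been deleted. The function name `minimize` is made up, and the transition dictionary encodes the six-state example, reading the slide's table as A: (B, C), B: (E, F), C: (A, A), D: (F, E), E: (D, F), F: (D, E) for inputs (0, 1).

```python
def minimize(states, alphabet, delta, finals):
    """Return the blocks of indistinguishable states of a complete DFA."""
    # Initial partition: final versus non-final states.
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    while True:
        block_of = {q: i for i, block in enumerate(partition) for q in block}
        refined = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: the block reached on each input symbol.
                signature = tuple(block_of[delta[(q, a)]] for a in alphabet)
                groups.setdefault(signature, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(partition):   # nothing was split: stable
            return refined
        partition = refined


DELTA = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "E", ("B", "1"): "F",
    ("C", "0"): "A", ("C", "1"): "A",
    ("D", "0"): "F", ("D", "1"): "E",
    ("E", "0"): "D", ("E", "1"): "F",
    ("F", "0"): "D", ("F", "1"): "E",
}

for block in minimize("ABCDEF", "01", DELTA, {"E", "F"}):
    print(sorted(block))
```

Since refinement can only split blocks, comparing block counts is enough to detect that the partition has stopped changing.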

Programming <Exercise 3.20> (textbook p. 121)
NFA => [NFA to DFA] => DFA => [Minimization of DFA] => Reduced DFA
Points to design: input format, data structure.

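5. Closure properties of FA
[Theorem] If $L_1$ and $L_2$ are finite automaton languages (FAL), then so are (i) $L_1 \cup L_2$, (ii) $L_1 \cdot L_2$, (iii) $L_1^*$.
(proof) Let $M_1 = (Q_1, \Sigma, \delta_1, q_1, F_1)$ and $M_2 = (Q_2, \Sigma, \delta_2, q_2, F_2)$ with $Q_1 \cap Q_2 = \phi$ (rename states if necessary).
(i) $M = (Q_1 \cup Q_2 \cup \{q_0\}, \Sigma, \delta, q_0, F)$, where
  (1) $q_0$ is a new state.
  (2) $F = F_1 \cup F_2$ if $\varepsilon \notin L_1 \cup L_2$;
      $F = F_1 \cup F_2 \cup \{q_0\}$ if $\varepsilon \in L_1 \cup L_2$.
  (3) (a) $\delta(q_0, a) = \delta_1(q_1, a) \cup \delta_2(q_2, a)$ for all $a \in \Sigma$.
      (b) $\delta(q, a) = \delta_1(q, a)$ for all $q \in Q_1$, $a \in \Sigma$.
      (c) $\delta(q, a) = \delta_2(q, a)$ for all $q \in Q_2$, $a \in \Sigma$.
A new start state is created and connected to each fa as if its arcs left from that fa's own start state. If $\varepsilon$ is accepted, the new start state is also made a final state.
ex) p.98 [Example 28]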

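(ii) $M = (Q_1 \cup Q_2, \Sigma, \delta, q_1, F)$
  (1) $F = F_2$ if $q_2 \notin F_2$;
      $F = F_1 \cup F_2$ if $q_2 \in F_2$.
  (2) (a) $\delta(q, a) = \delta_1(q, a)$ for all $q \in Q_1 - F_1$.
      (b) $\delta(q, a) = \delta_1(q, a) \cup \delta_2(q_2, a)$ for all $q \in F_1$.
      (c) $\delta(q, a) = \delta_2(q, a)$ for all $q \in Q_2$.
The final states of $M_1$ are connected as if their arcs left from the start state of $M_2$. The start state of $M_1$ becomes the start state of the concatenated automaton.
ex)
  $M_1$: start A, A -0-> B, B loops on 1 => $01^*$
  $M_2$: start X, X -0-> Y, Y loops on 1 => $01^*$
  $M_1 M_2$: start A, A -0-> B, B loops on 1, B -0-> Y, Y loops on 1 => $01^*01^*$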

Properties of Regular Languages
[Figure: diagram relating regular grammar (rg), finite automata (fa), and regular expression (re).]
re => fa: scanner generator

Contents
1. RG & FA
2. FA & RE
3. Closure Properties of Regular Language
4. The Pumping Lemma for Regular Language

1. RG & FA
Given an rg, there exists an fa that accepts the language generated by the rg, and vice versa.
rg => fa
Given an rg $G = (V_N, V_T, P, S)$, construct $M = (Q, \Sigma, \delta, q_0, F)$:
  (1) $Q = V_N \cup \{f\}$, where $f$ is a new final state.
  (2) $\Sigma = V_T$.
  (3) $q_0 = S$.
  (4) $F = \{f\}$ if $\varepsilon \notin L(G)$; $F = \{S, f\}$ otherwise.
  (5) $\delta$: if $A \to aB \in P$ then $\delta(A, a) \ni B$;
      if $A \to a \in P$ then $\delta(A, a) \ni f$.
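The rg => fa construction can be sketched directly from rules (1) to (5). In this illustration productions are given as triples (A, a, B) for $A \to aB$ and (A, a, None) for $A \to a$. The function names, the state name `"f"` for the new final state, and the `generates_epsilon` flag (standing in for the test $\varepsilon \in L(G)$) are all assumptions. The sample grammar is $S \to aS \mid aA$, $A \to bA \mid b$, which generates $\{a^n b^m \mid n, m \ge 1\}$.

```python
def grammar_to_nfa(productions, start, generates_epsilon=False):
    """Build (delta, q0, F) from regular-grammar productions."""
    FINAL = "f"   # new final state; assumed not to clash with a nonterminal
    delta = {}
    for lhs, a, rhs in productions:
        target = rhs if rhs is not None else FINAL
        delta.setdefault((lhs, a), set()).add(target)
    finals = {FINAL, start} if generates_epsilon else {FINAL}
    return delta, start, finals


def nfa_accepts(delta, start, finals, w):
    """Simulate the NFA by tracking the set of possible states."""
    current = {start}
    for symbol in w:
        current = {q for p in current for q in delta.get((p, symbol), ())}
    return bool(current & finals)


PRODUCTIONS = [
    ("S", "a", "S"), ("S", "a", "A"),
    ("A", "b", "A"), ("A", "b", None),
]

delta, q0, F = grammar_to_nfa(PRODUCTIONS, "S")
for w in ("aabbb", "ab", "a", "ba"):
    print(w, nfa_accepts(delta, q0, F, w))
```

The result is in general an NFA, because a grammar may contain both $A \to aB$ and $A \to aC$, or both $A \to aB$ and $A \to a$.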

(proof) If $\omega$ is accepted by the fa, then it is accepted by some sequence of moves through states, ending in $f$. But if $\delta(A, a) \ni B$ and $B \ne f$, then $A \to aB$ is a production. Also, if $\delta(A, a) \ni f$, then $A \to a$ is a production. So we can use the corresponding series of productions to generate $\omega$ in $G$. Thus $S \Rightarrow^* \omega$.
ex) p.101 [Example 29]

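fa => rg
Given $M = (Q, \Sigma, \delta, q_0, F)$, construct $G = (V_N, V_T, P, S)$:
  (1) $V_N = Q$
  (2) $V_T = \Sigma$
  (3) $S = q_0$
  (4) $P$: if $\delta(q, a) = r$ then $q \to ar$;
      if $p \in F$ then $p \to \varepsilon$.
ex)
[Figure: state diagram with start state p; p loops on 1, p -0-> q, q -1-> p, q -0-> r, r loops on 0, 1; r is final.]
  $p \to 1p \mid 0q$
  $q \to 1p \mid 0r$
  $r \to 0r \mid 1r \mid \varepsilon$
  $L(p) = (1+01)^*00(0+1)^*$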

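2. FA & RE
fa => rg => re
ex) p.118 3.10 (1)
[Figure: state diagram with start state A and states B, C, D over the input symbols a and b; its transitions correspond to the equations below.]
  $A = bA + aB$
  $B = aB + bC$
  $C = aB + bD$
  $D = aB + bA + \varepsilon = A + \varepsilon$
  $A = (a+b)^*abb$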

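re => fa (scanner generator)
For each component of the regular expression, we construct an fa inductively:
1. Basis
  $\varepsilon$: i -ε-> f
  $a \in \Sigma$: i -a-> f
2. Induction: combine the components.
  (1) $N_1 + N_2$
  [Figure: a start state i with ε-arcs into $N_1$ and $N_2$, and ε-arcs from $N_1$ and $N_2$ into a final state f.]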

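  (2) $N_1 \cdot N_2$
  [Figure: start state i, then $N_1$ followed by $N_2$ joined by an ε-arc, ending in the final state f.]
  (3) $N^*$
  [Figure: start state i, an ε-arc into $N$, an ε-arc from $N$ to the final state f, and further ε-arcs that allow $N$ to be repeated or skipped.]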

Definition: The size of a regular expression is the number of operators and operands in the expression.
ex) size($ab + c^*$) = 6
  decomposition: $R_6 = R_3 + R_5$, $R_3 = R_1 \cdot R_2$, $R_5 = R_4^*$, $R_1 = a$, $R_2 = b$, $R_4 = c$
The number of states is at most twice the size of the expression (each operand introduces two states and each operator introduces at most two states).
The number of arcs is at most four times the size of the expression.

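Simplifications (p.106)
Two states connected by an ε-arc can be treated as the same state if the source state has no other outgoing arc.
[Figure: states A and B joined by an ε-arc, followed by an a-arc, merged into a single state with the a-arc.]
ex) p.105 [Example 31]
re => ε-NFA => (simplification) => DFA
ex) p.109 [Example 33]
The following statements are equivalent:
1. $L$ is generated by some regular grammar.
2. $L$ is recognized by some finite automaton.
3. $L$ is described by some regular expression.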

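p.120 3.14
(1) (b + a(aa*b)*b)*
[Figure: state diagram with states X, Y, Z and arcs labeled a and b.]
(2) (b + aa + ac + aaa + aac)*
[Figure: state diagram with states X, Y, Z and arcs labeled a, b, and "a, c".]
(3) a(a+b)*b(a+b)*a(a+b)*b(a+b)*
[Figure: state diagram with states S, W, X, Y, Z; arcs labeled a, b, a, b between successive states and loops labeled "a, b".]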

3. Closure Properties of Regular Language
[Theorem] If $L_1$ and $L_2$ are regular languages, then so are (i) $L_1 \cup L_2$, (ii) $L_1 \cdot L_2$, and (iii) $L_1^*$.
(proof) (ii) Since $L_1$ and $L_2$ are rl, there exist rg $G_1 = (V_{N1}, V_{T1}, P_1, S_1)$ and rg $G_2 = (V_{N2}, V_{T2}, P_2, S_2)$ such that $L(G_1) = L_1$ and $L(G_2) = L_2$.
Construct $G = (V_{N1} \cup V_{N2}, V_{T1} \cup V_{T2}, P, S_1)$ in which $P$ is defined as follows:
  (1) If $A \to aB \in P_1$, then $A \to aB \in P$.
  (2) If $A \to a \in P_1$, then $A \to aS_2 \in P$.
  (3) All productions in $P_2$ are in $P$.
We must prove that $L(G) = L(G_1) \cdot L(G_2)$. Since $G$ is an rg, $L(G)$ is rl. Therefore $L(G_1) \cdot L(G_2)$ is rl.
ex)
  $P_1$: $S \to aS \mid bA$, $A \to aA \mid a$
  $P_2$: $X \to 0X \mid 1Y$, $Y \to 0Y \mid 1$
  $P$: $S \to aS \mid bA$, $A \to aA \mid aX$, $X \to 0X \mid 1Y$, $Y \to 0Y \mid 1$

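(iii) Let $L$ be rl, with rg $G = (V_N, V_T, P, S)$ such that $L(G) = L$.
Let $G' = (V_N \cup \{S'\}, V_T, P', S')$ with $P'$ defined by:
  (1) If $A \to aB \in P$, then $A \to aB \in P'$.
  (2) If $A \to a \in P$, then $A \to a$, $A \to aS' \in P'$.
  (3) $S' \to S \mid \varepsilon \in P'$.
We must prove that $L(G') = (L(G))^*$.
  For $\omega \in L(G)$ we have $S \Rightarrow^* \omega$. In $G'$, $S' \Rightarrow S \Rightarrow^* \omega S'$, and repeating this from $S'$ gives $S' \Rightarrow^* \omega^*$. Therefore $(L(G))^* = L(G')$.
ex) $P$: $S \to aS$, $S \to b$
    $P'$: $S \to aS$, $S \to b$, $S \to bS'$, $S' \to S$, $S' \to \varepsilon$
note
  $P$: $S = aS + b = a^*b$
  $P'$: $S = aS + b + bS' = a^*(b + bS') = a^*b + a^*bS'$
        $S' = S + \varepsilon = a^*bS' + a^*b + \varepsilon = (a^*b)^*(a^*b + \varepsilon) = (a^*b)^*(a^*b) + (a^*b)^* = (a^*b)^*$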

4. The Pumping Lemma for Regular Language
The pumping lemma is useful for proving that certain languages are not regular.
[Theorem] Let $L$ be a regular language. There exists a constant $p$ such that if a string $w$ is in $L$ and $|w| \ge p$, then $w$ can be written as $xyz$, where $0 < |y| \le p$ and $xy^iz \in L$ for all $i \ge 0$.
(proof) Let $M = (Q, \Sigma, \delta, q_0, F)$ be an fa with $n$ states such that $L(M) = L$. Let $p = n$. If $w \in L$ and $|w| \ge n$, then consider the sequence of configurations entered by $M$ in accepting $w$. Since there are at least $n+1$ configurations in the sequence, there must be two with the same state among the first $n+1$ configurations. Thus we have a sequence of moves such that
  $\delta(q_0, xyz) = \delta(q_1, yz) = \delta(q_1, z) = q_f \in F$ for some $q_1$.
[Figure: $q_0$ -x-> $q_1$, a loop labeled y on $q_1$, $q_1$ -z-> $q_f$.]
But then
  $\delta(q_0, xy^iz) = \delta(q_1, y^iz) = \delta(q_1, y^{i-1}z) = \cdots = \delta(q_1, z) = q_f \in F$.
Since $w = xyz \in L$, $xy^iz \in L$ for all $i \ge 0$.

Consequently, we say that finite automata cannot count, meaning that they cannot accept a language which requires an unbounded number of symbols to be counted exactly.
ex) $L = \{0^n 1^n \mid n \ge 1\}$ is not type 3.
(Proof) Suppose that $L$ is regular. Then for a sufficiently large $n$, $0^n 1^n$ can be written as $xyz$ such that $y \ne \varepsilon$ and $xy^iz \in L$ for all $i \ge 0$.
  If $y \in 0^+$ or $y \in 1^+$, then $xz = xy^0z \notin L$.
  If $y \in 0^+1^+$, then $xyyz \notin L$.
We have a contradiction, so $L$ cannot be regular.
Likewise, $\{a^n c b^n\}$ is not rl, whereas $\{a^n c b^m\}$ is rl.
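The case analysis in this proof can be illustrated by exhaustive search for small $p$: for $w = 0^p 1^p$, every split $w = xyz$ with $0 < |y| \le p$ breaks membership for $i = 0$ or $i = 2$. The sketch below is a finite sanity check, not a proof; the function names `in_L` and `pumping_fails` and the tested range of $p$ are assumptions for this illustration.

```python
def in_L(s):
    """Membership in L = { 0^n 1^n | n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "0" * n + "1" * n


def pumping_fails(w, p):
    """True if every split w = xyz with 0 < |y| <= p has some i in {0, 2}
    for which x y^i z is not in L."""
    for start in range(len(w)):
        for end in range(start + 1, min(start + p, len(w)) + 1):
            x, y, z = w[:start], w[start:end], w[end:]
            if all(in_L(x + y * i + z) for i in (0, 2)):
                return False      # this split survives pumping down and up
    return True


for p in range(1, 6):
    print(p, pumping_fails("0" * p + "1" * p, p))
```

Checking only $i = 0$ and $i = 2$ is enough here, mirroring the two cases of the proof: deleting $y$ handles $y \in 0^+$ or $y \in 1^+$, and doubling $y$ handles $y \in 0^+1^+$.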

Solution to Exercise 3.5 (textbook p. 116)
  $A = aB + bA$ (1)
  $B = aB + bC$ (2)
  $C = bD + aB$ (3)
  $D = bA + aB + \varepsilon$ (4)
In equation (4), $bA + aB = aB + bA = A$, so
  $D = A + \varepsilon$ (5)
Substituting (5) into (3):
  $C = b(A + \varepsilon) + aB = bA + aB + b = A + b$ (6)
Substituting (6) into (2):
  $B = aB + b(A + b) = aB + bA + bb = A + bb$ (7)
Substituting (7) into (1):
  $A = aB + bA = a(A + bb) + bA = aA + abb + bA = (a + b)A + abb = (a+b)^*abb$
  $L(G) = (a+b)^*abb$