TAFL (ECS-403)
Unit II

2.1 Regular Expression
    2.1.1 The Operators of Regular Expressions
    2.1.2 Building Regular Expressions
    2.1.3 Precedence of Regular-Expression Operators
    2.1.4 Algebraic laws for Regular expressions
2.2 Conversions
    2.2.1 Regular Expression to F.A.
    2.2.2 F.A. to Regular Expression
        2.2.2.1 using formula
        2.2.2.2 using Arden's Theorem
2.3 Pumping Lemma
2.4 Closure Properties of Regular Languages
2.5 Decision Properties of Regular Languages
2.6 F.A. with Output
    2.6.1 Introduction to Moore and Mealy machine (Some Examples)
    2.6.2 Conversion from Moore to Mealy machine
    2.6.3 Conversion from Mealy to Moore machine
2.7 Applications and Limitations of F.A.
2.1 Regular Expression:

Regular expressions are another type of language-defining notation. They may also be thought of as a "programming language" in which we express some important applications, such as text-search applications or compiler components. Regular expressions are closely related to nondeterministic finite automata and can be thought of as a "user-friendly" alternative to the NFA notation for describing software components.

Example: 01* + 10* denotes the language consisting of all strings that are either a single 0 followed by any number of 1's or a single 1 followed by any number of 0's.

2.1.1 The Operators of Regular Expressions:

a) The union of two languages L and M, denoted L ∪ M, is the set of strings that are in either L or M, or both. For example, if L = {001, 10, 111} and M = {ε, 001}, then L ∪ M = {ε, 10, 001, 111}.

b) The concatenation of languages L and M is the set of strings that can be formed by taking any string in L and concatenating it with any string in M. For example, if L = {001, 10, 111} and M = {ε, 001}, then L.M, or just LM, is {001, 10, 111, 001001, 10001, 111001}.

c) The closure (or star, or Kleene closure) of a language L is denoted L* and represents the set of those strings that can be formed by taking any number of strings from L, possibly with repetitions (i.e., the same string may be selected more than once), and concatenating all of them. For instance, if L = {0, 1}, then L* is all strings of 0's and 1's. If L = {0, 11}, then L* consists of those strings of 0's and 1's such that the 1's come in pairs, e.g., 011, 11110, and ε, but not 01011 or 101.

More formally, L* is the infinite union $\bigcup_{i \ge 0} L^i$, where $L^0 = \{\varepsilon\}$, $L^1 = L$, and $L^i$, for i > 1, is LL...L (the concatenation of i copies of L). For L = {0, 11}:
$L^2$ = {00, 011, 110, 1111}
$L^3$ = {000, 0011, 0110, 1100, 01111, 11011, 11110, 111111}
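The three operators of Section 2.1.1 can be tried out on finite languages with a short Python sketch. The function names (union, concat, power, star_up_to) are illustrative choices, not part of the notes, and since L* is infinite the sketch only enumerates its strings up to a chosen length.

```python
def union(L, M):
    """Strings that are in L or in M (or both)."""
    return L | M

def concat(L, M):
    """Every string of L followed by every string of M."""
    return {x + y for x in L for y in M}

def power(L, i):
    """L^i: concatenation of i copies of L, with L^0 = {epsilon}."""
    result = {""}  # the empty string plays the role of epsilon
    for _ in range(i):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """The strings of L* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        longer = {s for s in concat(frontier, L) if len(s) <= max_len}
        frontier = longer - result  # keep only strings not seen before
        result |= frontier
    return result

L = {"001", "10", "111"}
M = {"", "001"}
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(power({"0", "11"}, 2)))
```

Membership in L* for L = {0, 11} can then be checked with, for example, `"011" in star_up_to({"0", "11"}, 5)`.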
2.1.2 Building Regular Expressions

Basis:
a) The constants ε and ∅ are regular expressions, denoting the languages {ε} and ∅, respectively. That is, L(ε) = {ε}, and L(∅) = ∅.
b) If a is any symbol, then a is a regular expression. This expression denotes the language {a}. That is, L(a) = {a}.
c) A variable, usually capitalized and italic such as L, is a regular expression representing any language.

Induction:
a) If E and F are regular expressions, then E + F is a regular expression denoting the union of L(E) and L(F). That is, L(E + F) = L(E) ∪ L(F).
b) If E and F are regular expressions, then EF is a regular expression denoting the concatenation of L(E) and L(F). That is, L(EF) = L(E)L(F).
c) If E is a regular expression, then E* is a regular expression, denoting the closure of L(E). That is, L(E*) = (L(E))*.
d) If E is a regular expression, then (E), a parenthesized E, is also a regular expression, denoting the same language as E. Formally, L((E)) = L(E).

2.1.3 Precedence of Regular-Expression Operators

a) The star operator is of highest precedence. It applies only to the smallest sequence of symbols to its left that is a well-formed regular expression.
b) Next in precedence comes the concatenation or "dot" operator. After grouping all stars to their operands, we group concatenation operators to their operands. Since concatenation is associative, the order in which consecutive concatenations are grouped does not matter.
c) Finally, all unions (+ operators) are grouped with their operands. Since union is also associative, it again matters little in which order consecutive unions are grouped, but we shall assume grouping from the left.

For example, 01* + 1 is grouped as (0(1*)) + 1.
2.1.4 Algebraic laws for Regular expressions:

a) ∅ + R = R
b) ∅R = R∅ = ∅
c) εR = Rε = R
d) ε* = ε and ∅* = ε
e) R + R = R
f) R*R* = R*
g) RR* = R*R
h) (R*)* = R*
i) ε + RR* = R* = ε + R*R
j) (PQ)*P = P(QP)*
k) (P + Q)* = (P*Q*)* = (P* + Q*)*
l) (P + Q)R = PR + QR
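Some of the laws above can be sanity-checked by brute force. The sketch below uses Python's re module, where union is written | instead of +, and compares both sides of a law on every string over {a, b} up to a fixed length. The sample expressions chosen for P, Q and R and the helper name language are assumptions made for illustration. Agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """All strings over `alphabet` of length <= max_len that `pattern` matches in full."""
    compiled = re.compile(pattern)
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(chars))
    }

# Arbitrary sample expressions, wrapped in non-capturing groups so that
# a following * applies to the whole expression.
P, Q, R = "(?:ab)", "(?:b)", "(?:a)"

LAWS = {
    "R + R = R":          (f"{R}|{R}", R),
    "R*R* = R*":          (f"{R}*{R}*", f"{R}*"),
    "RR* = R*R":          (f"{R}{R}*", f"{R}*{R}"),
    "(R*)* = R*":         (f"(?:{R}*)*", f"{R}*"),
    "eps + RR* = R*":     (f"(?:{R}{R}*)?", f"{R}*"),  # X? means eps + X
    "(PQ)*P = P(QP)*":    (f"(?:{P}{Q})*{P}", f"{P}(?:{Q}{P})*"),
    "(P+Q)* = (P*Q*)*":   (f"(?:{P}|{Q})*", f"(?:{P}*{Q}*)*"),
    "(P+Q)* = (P*+Q*)*":  (f"(?:{P}|{Q})*", f"(?:{P}*|{Q}*)*"),
    "(P+Q)R = PR+QR":     (f"(?:{P}|{Q}){R}", f"{P}{R}|{Q}{R}"),
}

results = {name: language(lhs) == language(rhs) for name, (lhs, rhs) in LAWS.items()}
for name, ok in results.items():
    print(f"{name}: {'agree' if ok else 'DIFFER'} on all strings up to length 6")
```

The same helper can be used to compare a candidate answer to an exercise against a reference expression.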
EXERCISE

1. Write regular expressions for the following languages:
a) The set of all strings over {a, b} of even length.
b) The set of all strings over {a, b} of length 4, starting with an a.
c) $\{a^{2n} \mid n \ge 1\}$
d) The set of all strings over {a, b} having abab as a substring.
e) {0, 1, 2}
f) $\{1^{2n+1} \mid n > 0\}$
g) $\{w \in \{a,b\}^* \mid w$ has only one a$\}$
h) The set of all strings over {0, 1} which have at most two zeros.
i) $\{a^2, a^5, a^8, \dots\}$
j) $\{a^n \mid n$ is divisible by 2 or 3, or n = 5$\}$
k) The set of all strings over {a, b} beginning and ending with a.
l) The set of all strings over {a, b} with three consecutive b's.
m) The set of all strings over {0, 1} beginning with 00.
n) The set of all strings over {0, 1} ending with 00 and beginning with 1.
o) The set of all strings over {a, b} containing exactly 2 a's.
p) The set of all strings over {a, b} containing at least 2 a's.
q) The set of all strings over {a, b} containing at most 2 a's.
r) The set of all strings over {a, b} containing the substring aa.

2. Give English descriptions of the languages of the following regular expressions:
a) ab* + b*
b) a(a+b)*ab
c) a*b + b*a
d) (aa + b)*(bb + a)*
e) (a+b)*(aa+bb+ab+ba)*
f) (aa)* + (aaa)*
g) a + b(a+b)*

3. Show that
a) (1+00*1) + (1+00*1)(0+10*1)*(0+10*1) = 0*1(0+10*1)*
b) ε + 1*(011)*(1*(011)*)* = (1+011)*
SOLUTIONS

1. Regular Expressions:
a) (ab+aa+bb+ba)*
b) a(a+b)(a+b)(a+b)
c) $(aa)^+$
d) (a+b)*abab(a+b)*
e) 0+1+2
f) $1(11)^+$
g) b*ab*
h) 1* + 1*01* + 1*01*01*
i) aa(aaa)*
j) (aa)* + (aaa)* + aaaaa
k) a + a(a+b)*a
l) (a+b)*bbb(a+b)*
m) 00(0+1)*
n) 1(0+1)*00
o) b*ab*ab*
p) (a+b)*a(a+b)*a(a+b)*
q) b*ab*ab* + b*ab* + b*
r) (a+b)*aa(a+b)*

2. English Descriptions:
a) Strings that start with a and contain no other a's, or strings that contain no a's but only b's.
b) The set of all strings of length at least 3 that start with a and end with ab.
c) Strings of a's followed by one b, or strings of b's followed by one a.
d) Any concatenation of aa's and b's, followed by any concatenation of bb's and a's.
e) Any string over {a, b} followed by any even-length string over {a, b}; this is simply the set of all strings over {a, b}.
f) $\{x \in \{a\}^* \mid |x|$ is divisible by 2 or 3$\}$
g) The string a alone, or any string starting with b.
3. Proof:

a) (1+00*1) + (1+00*1)(0+10*1)*(0+10*1) = 0*1(0+10*1)*

L.H.S. = (1+00*1) + (1+00*1)(0+10*1)*(0+10*1)
       = (1+00*1)(ε + (0+10*1)*(0+10*1))
       = (1+00*1)(0+10*1)*          [by law (i), ε + R*R = R*]
       = (ε+00*)1(0+10*1)*
       = 0*1(0+10*1)*               [by law (i), ε + RR* = R*]
       = R.H.S.

b) ε + 1*(011)*(1*(011)*)* = (1+011)*

L.H.S. = ε + 1*(011)*(1*(011)*)*
       = (1*(011)*)*                [by law (i), with R = 1*(011)*]
       = (1+011)*                   [by law (k), (P*Q*)* = (P+Q)*]
       = R.H.S.

2.2 Conversions:
2.2.2 F.A. to Regular Expression
2.2.2.1 using formula
2.2.2.2 using Arden's Theorem
NOTE: Arden's Theorem
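2.3 Pumping Lemma:

Regular languages are exactly the languages that can be represented by regular expressions. There are, however, languages that cannot be represented by any regular expression; such languages are called non-regular languages. The pumping lemma is the standard theorem used to prove that a language is not regular.

Theorem: Let L be a regular language. Then there exists a constant n (which depends on L) such that for every string z in L with $|z| \ge n$, we can break z into three strings, z = uvw, such that:
1. $v \ne \varepsilon$
2. $|uv| \le n$
3. For all $i \ge 0$, the string $uv^iw$ is also in L.

Proof: Suppose L is regular. Then L = L(M) for some DFA M. Suppose M has n states. Now consider any string z of length n or more, say $z = a_1 a_2 \dots a_m$ where $m \ge n$ and each $a_i$ is an input symbol. For i = 0, 1, ..., m write $q_i = \delta(q_0, a_1 a_2 \dots a_i)$ for the state M is in after reading the first i symbols. Since M has only n states, the n + 1 states $q_0, q_1, \dots, q_n$ cannot all be distinct, so there are integers j and k with $0 \le j < k \le n$ such that $q_j = q_k$. Take $u = a_1 \dots a_j$, $v = a_{j+1} \dots a_k$ and $w = a_{k+1} \dots a_m$. Then $v \ne \varepsilon$ because j < k, and $|uv| = k \le n$.

If $q_m$ is in F, i.e. $a_1 a_2 \dots a_m$ is in L(M), then $a_1 a_2 \dots a_j a_{k+1} a_{k+2} \dots a_m$ is also in L(M), since there is a path from $q_0$ to $q_m$ that goes through $q_j$ but not around the loop labeled $a_{j+1} \dots a_k$. Thus:

$$\delta(q_0, a_1 \dots a_j a_{k+1} \dots a_m) = \delta(\delta(q_0, a_1 \dots a_j), a_{k+1} \dots a_m) = \delta(q_j, a_{k+1} \dots a_m) = \delta(q_k, a_{k+1} \dots a_m) = q_m$$

In the same way the loop may be traversed any number of times, so $uv^iw$ is in L(M) for every $i \ge 0$.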
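So for any sufficiently long string accepted by an FA, we can find a non-empty substring near the beginning of the string that may be pumped, i.e. repeated as many times as we like, and the resulting string is still accepted by the FA. The converse is not true: a language that satisfies the pumping condition need not be regular.

Example 1: Show that $L = \{0^n 1^{n+1} \mid n > 0\}$ is not regular.

For n = 1: 011
For n = 2: 00111
For n = 3: 0001111
and so on.

Now consider z = 00111 with u = 0, v = 01 and w = 11:
$uv^2w$ = 0010111
$uv^3w$ = 001010111
We can see that $uv^iw$ is not in L. In general, suppose L were regular with pumping constant n, and take $z = 0^n 1^{n+1}$. Since $|uv| \le n$, the substring v consists only of 0's, say $v = 0^k$ with $k \ge 1$. Then $uv^2w = 0^{n+k} 1^{n+1}$, which is not in L. No decomposition of z can be pumped, so $L = \{0^n 1^{n+1} \mid n > 0\}$ is not regular.

Exercise:
1. Show that $L = \{0^{2n} \mid n > 0\}$ is regular.
2. Show that $L = \{a^p \mid p$ is a prime$\}$ is not regular.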
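The argument of Example 1 can be illustrated by exhaustive search. The sketch below tries every decomposition z = uvw of the string 0^n 1^(n+1) with |uv| <= n and v non-empty, and keeps those for which uv^iw stays in L for i = 0, 1, 2, 3. The names in_L and surviving_splits and the bound max_i are illustrative assumptions. Running it for a few small values of n does not replace the proof; it only shows the proof's claim in action.

```python
import re

def in_L(s):
    """Membership test for L = {0^n 1^(n+1) | n > 0}."""
    m = re.fullmatch(r"(0+)(1+)", s)
    return bool(m) and len(m.group(2)) == len(m.group(1)) + 1

def surviving_splits(z, n, max_i=3):
    """Decompositions z = uvw with |uv| <= n and v != '' such that
    u v^i w is in L for every i in 0..max_i."""
    good = []
    for j in range(0, n + 1):          # j = |u|
        for k in range(j + 1, n + 1):  # k = |uv|, so v = z[j:k] is non-empty
            u, v, w = z[:j], z[j:k], z[k:]
            if all(in_L(u + v * i + w) for i in range(max_i + 1)):
                good.append((u, v, w))
    return good

for n in range(1, 6):
    z = "0" * n + "1" * (n + 1)
    print(n, z, surviving_splits(z, n))
```

For every n tried, the list of surviving decompositions is empty, which is what the pumping lemma argument predicts.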
2.4 Closure Properties of Regular Languages.

1. The union of two regular languages is regular.
2. The intersection of two regular languages is regular.
3. The complement of a regular language is regular.
4. The difference of two regular languages is regular.
5. The reversal of a regular language is regular.
6. The closure (star) of a regular language is regular.
7. The concatenation of regular languages is regular.
8. A homomorphism (substitution of strings for symbols) of a regular language is regular.
9. The inverse homomorphism of a regular language is regular.

2.5 Decision Properties of Regular Languages.

1. Membership Property: does a particular string w belong to a given language L?
2. Emptiness Property: is the given language empty?
3. Equivalence Property: do two descriptions of a language represent the same language?

2.6 F.A. with Output:

There are two types of FA with output:
1. Moore Machine
2. Mealy Machine
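Returning briefly to Sections 2.4 and 2.5: closure under intersection and the membership and emptiness questions can be made concrete with DFAs. The sketch below is one possible encoding, with a DFA stored as a dictionary holding a start state, a set of final states and a transition table. The two sample machines EVEN0 and END1, the dictionary layout and the function names are assumptions for illustration. Intersection uses the usual product construction, and emptiness is decided by checking whether any final state is reachable from the start state.

```python
from collections import deque

def accepts(dfa, w):
    """Membership: run the DFA on w and test the state reached."""
    state = dfa["start"]
    for symbol in w:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["final"]

def intersect(d1, d2):
    """Product construction: states are pairs, final when both components are final."""
    delta = {
        ((p, q), a): (d1["delta"][(p, a)], d2["delta"][(q, b)])
        for (p, a) in d1["delta"]
        for (q, b) in d2["delta"]
        if a == b
    }
    final = {(p, q) for p in d1["final"] for q in d2["final"]}
    return {"start": (d1["start"], d2["start"]), "final": final, "delta": delta}

def is_empty(dfa):
    """Emptiness: breadth-first search for a reachable final state."""
    seen, queue = {dfa["start"]}, deque([dfa["start"]])
    while queue:
        state = queue.popleft()
        if state in dfa["final"]:
            return False
        for (p, _symbol), nxt in dfa["delta"].items():
            if p == state and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Hypothetical sample machines over {0, 1}.
EVEN0 = {"start": "e", "final": {"e"},   # even number of 0's
         "delta": {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"}}
END1 = {"start": "n", "final": {"y"},    # strings ending in 1
        "delta": {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"}}
ODD0 = dict(EVEN0, final={"o"})          # complement of EVEN0: swap final and non-final states

both = intersect(EVEN0, END1)
print(accepts(both, "001"), accepts(both, "01"))
print(is_empty(both), is_empty(intersect(EVEN0, ODD0)))
```

Complementation appears here as swapping final and non-final states of a complete DFA, which is why ODD0 is obtained from EVEN0 by changing only the final set.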
2.6.1 Introduction to Moore and Mealy machine

Moore Machine

Six-tuple (Q, Σ, Δ, δ, λ, q0):
Q: finite set of states
Σ: finite set of input symbols
Δ: an output alphabet
δ: a transition function, Q × Σ → Q
λ: output function, Q → Δ
q0: initial state of the machine

Example:

Current State    Next State        Output
                 a=0      a=1      λ
q0               q1       q2       1
q1               q2       q1       0
q2               q2       q0       0

Mealy Machine

Six-tuple (Q, Σ, Δ, δ, λ, q0):
Q: finite set of states
Σ: finite set of input symbols
Δ: an output alphabet
δ: a transition function, Q × Σ → Q
λ: output function, Q × Σ → Δ
q0: initial state of the machine

Example:

Current State    a=0                 a=1
                 State    Output     State    Output
q1               q2       A          q3       A
q2               q2       B          q3       A
q3               q2       A          q3       B
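The difference between the two machines shows up when they are run on an input. The sketch below encodes the two example tables as Python dictionaries and simulates them. The function names are illustrative, and the choice of q1 as the start state of the Mealy example is an assumption, since the table does not mark one. A Moore machine emits the output of its current state, including the start state, so an input of length n yields n + 1 output symbols. A Mealy machine emits one symbol per transition, so it yields n.

```python
# Moore example: transition table and per-state output.
MOORE_DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q2", ("q2", "1"): "q0",
}
MOORE_OUT = {"q0": "1", "q1": "0", "q2": "0"}

# Mealy example: (state, input) -> (next state, output).
MEALY = {
    ("q1", "0"): ("q2", "A"), ("q1", "1"): ("q3", "A"),
    ("q2", "0"): ("q2", "B"), ("q2", "1"): ("q3", "A"),
    ("q3", "0"): ("q2", "A"), ("q3", "1"): ("q3", "B"),
}

def run_moore(delta, out, start, word):
    """Output string of a Moore machine: one symbol per state visited."""
    state = start
    outputs = [out[state]]  # the start state already produces output
    for a in word:
        state = delta[(state, a)]
        outputs.append(out[state])
    return "".join(outputs)

def run_mealy(trans, start, word):
    """Output string of a Mealy machine: one symbol per transition taken."""
    state, outputs = start, []
    for a in word:
        state, o = trans[(state, a)]
        outputs.append(o)
    return "".join(outputs)

print(run_moore(MOORE_DELTA, MOORE_OUT, "q0", "101"))
print(run_mealy(MEALY, "q1", "0011"))  # start state q1 is an assumption
```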
2.6.2 Conversion from Moore to Mealy machine

Given Moore machine:

Current State    Next State        Output
                 a=0      a=1      λ
q0               q0       q1       0
q1               q2       q0       1
q2               q1       q2       2

The output function λ' of the Mealy machine is defined by λ'(q, a) = λ(δ(q, a)):

λ'(q0, 0) = λ(δ(q0, 0)) = λ(q0) = 0
λ'(q0, 1) = λ(δ(q0, 1)) = λ(q1) = 1
λ'(q1, 0) = λ(δ(q1, 0)) = λ(q2) = 2
λ'(q1, 1) = λ(δ(q1, 1)) = λ(q0) = 0
λ'(q2, 0) = λ(δ(q2, 0)) = λ(q1) = 1
λ'(q2, 1) = λ(δ(q2, 1)) = λ(q2) = 2

Resulting Mealy machine:

Current State    a=0                 a=1
                 State    Output     State    Output
q0               q0       0          q1       1
q1               q2       2          q0       0
q2               q1       1          q2       2
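The rule λ'(q, a) = λ(δ(q, a)) translates almost directly into code. In the sketch below the Moore machine from the table is stored as two dictionaries, and the function name moore_to_mealy is an illustrative choice. Each Mealy transition carries the output of the Moore state it leads to.

```python
# Moore machine from the table above.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q1", ("q2", "1"): "q2",
}
OUT = {"q0": 0, "q1": 1, "q2": 2}

def moore_to_mealy(delta, out):
    """Build (state, input) -> (next state, output) with output = out[next state]."""
    return {(q, a): (nxt, out[nxt]) for (q, a), nxt in delta.items()}

mealy = moore_to_mealy(DELTA, OUT)
for (q, a), (nxt, o) in sorted(mealy.items()):
    print(f"{q} --{a}/{o}--> {nxt}")
```

The state set and the transitions are unchanged; only the place where the output is attached moves from the states to the transitions.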
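2.6.3 Conversion from Mealy to Moore machine

Given Mealy machine:

Current State    a=0                 a=1
                 State    Output     State    Output
q1               q3       0          q2       0
q2               q1       1          q4       0
q3               q2       1          q1       1
q4               q4       1          q3       0

State q1 is always entered with output 1 and state q3 with output 0, but q2 and q4 are each entered with both outputs 0 and 1. We therefore split q2 into q20 and q21, and q4 into q40 and q41, one copy per incoming output:

Current State    a=0                 a=1
                 State    Output     State    Output
q1               q3       0          q20      0
q20              q1       1          q40      0
q21              q1       1          q40      0
q3               q21      1          q1       1
q40              q41      1          q3       0
q41              q41      1          q3       0

Each state now has a single incoming output, which becomes its output in the Moore machine:

Current State    Next State        Output
                 a=0      a=1      λ
q1               q3       q20      1
q20              q1       q40      0
q21              q1       q40      1
q3               q21      q1       0
q40              q41      q3       0
q41              q41      q3       1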
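The state-splitting step used in the tables above can be sketched in Python. A Moore state is represented as a pair (state, output), so the pair ("q2", 0) plays the role of q20. The function name mealy_to_moore and the default_output parameter, used for a state that no transition enters, are assumptions of this sketch. Start-state handling is left out.

```python
# Mealy machine from the first table: (state, input) -> (next state, output).
MEALY = {
    ("q1", "0"): ("q3", 0), ("q1", "1"): ("q2", 0),
    ("q2", "0"): ("q1", 1), ("q2", "1"): ("q4", 0),
    ("q3", "0"): ("q2", 1), ("q3", "1"): ("q1", 1),
    ("q4", "0"): ("q4", 1), ("q4", "1"): ("q3", 0),
}

def mealy_to_moore(trans, default_output=None):
    """Split every state into one copy per distinct incoming output."""
    states = {q for q, _ in trans} | {nxt for nxt, _ in trans.values()}
    incoming = {q: set() for q in states}
    for nxt, o in trans.values():
        incoming[nxt].add(o)
    # A state with no incoming transition keeps a single copy (assumed output).
    copies = {q: sorted(outs) if outs else [default_output]
              for q, outs in incoming.items()}
    # Every copy of q behaves like q; the target pair (next state, output)
    # is exactly the Moore state that the transition enters.
    delta = {((q, o), a): trans[(q, a)] for (q, a) in trans for o in copies[q]}
    out = {(q, o): o for q in copies for o in copies[q]}
    return delta, out

delta, out = mealy_to_moore(MEALY)
for state in sorted(out):
    print(state, "-> on 0:", delta[(state, "0")], " on 1:", delta[(state, "1")],
          " output:", out[state])
```

On this input the sketch produces six Moore states, matching the rows q1, q20, q21, q3, q40 and q41 of the final table.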
With q1 as its start state, this Moore machine produces the output 1 for a zero-length input sequence, whereas the Mealy machine produces no output for it. To remove this discrepancy we must add a new starting state q0, which has the same transitions as q1 and output 0:

Current State    Next State        Output
                 a=0      a=1      λ
q0               q3       q20      0
q1               q3       q20      1
q20              q1       q40      0
q21              q1       q40      1
q3               q21      q1       0
q40              q41      q3       0
q41              q41      q3       1

2.7 Applications and Limitations of F.A.

Applications:
1. Text editors: used for processing text, e.g. searching for patterns.
2. Lexical analyzers: used to scan input programs and split them into tokens.

Limitations:
An FA has only a fixed, finite number of states and no additional memory, so it cannot solve problems that require unbounded counting or storage, such as:
1. Checking the well-formedness of parentheses.
2. Checking whether a given string is a palindrome.