Regular Expressions and Regular Grammars Laura Heinrich-Heine-Universität Düsseldorf Sommersemester 2011 Regular Expressions (1) Let Σ be an alphabet The set of regular expressions over Σ is recursively defined: is a regular expression denoting the set is a regular expression denoting the set {} For each a Σ: a is a regular expression denoting {a} If r and s are regular expressions denoting R and S, then (r + s), (rs) and (r ) are also regular expressions denoting R S, RS and R respectively Regular Grammars 1 12-13 April 2011 Regular Grammars 3 12-13 April 2011 Overview 1 Regular expressions 2 NFA with empty transitions 3 FSA and regular expressions 4 Regular Grammars 5 Regular Grammars and NFAs Regular Expressions (2) Assume that has a higher priority than concatenation which in turn has a higher priority than + a + is an abbreviation for aa For a regular expression x, we define L(x) as the set denoted by x Regular Grammars 2 12-13 April 2011 Regular Grammars 4 12-13 April 2011
Regular Expressions (3) The following holds: For each language L: there is a regular expression x with L = L(x) iff there is a FSA M with L = L(M) In order to show this, we define NFAs with empty transitions and show their equivalence to NFAs; show how to construct an equivalent NFA with empty transitions for a given regular expression; NFA with empty transitions (2) Let Q, Σ, δ, q 0, F be an NFA with empty transitions Definition of ˆδ: -closure(q) := {q there is a path from q to q containing only -transitions} {q} ˆδ(q, ) := -closure(q) ˆδ(q, wa) := p P -closure(p) where P = {p there is an r ˆδ(q, w) such that p δ(r, a)} show how to construct an equivalent regular expression for a given DFA Regular Grammars 5 12-13 April 2011 Regular Grammars 7 12-13 April 2011 NFA with empty transitions (1) A NFA with empty transitions is defined like an NFA except that δ allows for ε as second argument: A NFA with empty transitions M is a quintuple Q, Σ, δ, q 0, F such that Q, Σ, q 0 and F are like in a NFA δ : Q (Σ {ε}) P(Q) is the transition function NFA with empty transitions (3) Construction of an equivalent NFA Q, Σ, δ, q 0, F (without empty transitions) for a given NFA Q, Σ, δ, q 0, F with empty transitions: F := F {q 0 } if -closure(q 0 ) F Otherwise, F := F δ (q, a) := ˆδ(q, a) DFAs, NFAs and NFAs with empty transitions are equivalent Regular Grammars 6 12-13 April 2011 Regular Grammars 8 12-13 April 2011
FSA and regular expressions (1) To show: 1 For a given regular expression r there exists an NFA with empty transitions that accepts L(r) The set of languages denoted by regular expressions is a subset of the set of languages accepted by FSAs 2 For a given DFA, there exists a regular expression r such that L(r) is exactly the language acctepted by the DFA The set of languages accepted by FSAs is a subset of the set of languages denoted by regular expressions FSA and regular expressions (3) r = r 1 + r 2 : 0 r = r 1 r 2 : 1 2 1 f1 r = r 1 : q 0 qf1 qf2 2 1 q f1 qf2 qf0 Regular Grammars 9 12-13 April 2011 Regular Grammars 11 12-13 April 2011 FSA and regular expressions (2) Construction of an equivalent NFA with empty transitions for a given regular expression r: r = : q0 r = : 0 qf r = a, a Σ: a 0 qf FSA and regular expressions (4) To show: For each DFA, there is an equivalent regular expression Assume that {q 1,, q n } it the set of states Define Ri,j k as the set of sequences of input symbols that can be obtained from following all paths from q i to q j that do not pass through states q l with l > k Question: What do the Ri,j k look like and what are the corresponding regular expressions ri,j k? k = 0: Only paths of length 0 or 1 can be considered, since, between q i and q j, no other states q l are possible Each r 0 i,j has the form a 1 + a 2 + + a m or (if i = j) a 1 + a 2 + + a m + or (if i j and there is no edge from q i to q j, ) Regular Grammars 10 12-13 April 2011 Regular Grammars 12 12-13 April 2011
FSA and regular expressions (5) In general, R k ij contains all elements from R k 1 ij, all concatenation of a) an element from R k 1 ik, b) any number (including 0) of elements from R k 1 kk and c) an element from R k 1 Consequently, r k i,j = rk 1 ij + r k 1 ik (rk 1 kk ) r k 1 Finally, the regular expression for the language accepted by the automaton is r n 1f1 + rn 1f2 + + rn 1fm if F = {q f1,, q fm } FSA and regular expressions (7) The JFLAP conversion uses a different definition: Rij k is the set obtained by using no states q l with l < k Then ri,j k = rk+1 ij + r k+1 r 1 i,j ik (rk+1 kk ) r k+1 is the regular expression for the language accepted by the DFA With this, we obtain (ab c) ab for the sample automaton Regular Grammars 13 12-13 April 2011 Regular Grammars 15 12-13 April 2011 FSA and regular expressions (6) Regular Grammars (1) Example: abcsjff A type 2 grammar is called r 2 12 = r 1 12 + r 1 12(r 1 22) r 1 22 right-linear iff for all productions A β: β T or β = β X with β T, X N r12 1 = r0 12 + r0 11 (r0 11 ) r12 0 = a + ( )a = a left-linear iff for all productions A β: β T or β = Xβ with β T, X N r 1 22 = r0 22 + r0 21 (r0 11 ) r 0 12 regular if it is left-linear or right-linear = (b + ) + c() a = + b + ca For each left-linear grammar there is an equivalent right-linear grammar and vice versa r 2 12 = a + a( + b + ca)+ = a( + b + ca) = a(b + ca) Regular Grammars 14 12-13 April 2011 Regular Grammars 16 12-13 April 2011
Regular Grammars (2) The following holds: L = L(G) for a regular grammar G iff L is a regular set (ie, can be described by a regular expression) To show this, we show that for each right-linear grammar, there exists an equivalent NFA; and for each DFA, there is an equivalent right-linear grammar Regular Grammars (4) Construction of a right-linear grammar for given DFA: The states are the nonterminals The start state is the start symbol For each transition a Q Q add a production Q aq and, if Q F, a production Q a If we construct the right-linear grammar for L R (L reversed) and we mirror all right-hand sides of productions, we obtain a left-linear grammar for L Regular Grammars 17 12-13 April 2011 Regular Grammars 19 12-13 April 2011 Regular Grammars (3) Construction of equivalent NFA (with empty transitions) for given right-linear G: Start state [S] For all productions A β 1 β 2 there is a state [β 2 ] Transitions simulate top-down left-to-right traversal of derivation tree: For every A β: [A] [β] For every state [aβ] with a T: a [aβ] [β] Final state is [] Regular Grammars 18 12-13 April 2011