Closure Properties of Regular Languages

Regular languages are closed under many set operations. Let L1 and L2 be regular languages, and let L̄ denote the complement of a language L.
(1) L1 ∪ L2 (the union) is regular.
(2) L1L2 (the concatenation) is regular.
(3) L1* (the Kleene star) and L1+ (the Kleene plus) are regular.
(4) L1^R (the reversed language) is regular.
(5) L̄1 (the complement) is regular.
(6) L1 ∩ L2 (the intersection) is regular.
(7) L1 - L2 (the set subtraction) is regular.
Proof of the Closure Properties

We can use regular grammars, FA, or regular expressions, whichever makes the proof simplest. Let r1 and r2 be regular expressions that express the languages L1 and L2, respectively.
(1) Clearly, r1 + r2 is a regular expression which denotes the union of the languages L1 and L2. Since every regular expression denotes a regular language, L1 ∪ L2 is regular. We can also prove this property constructively as follows. Let G1 = (VN1, VT1, P1, S1) and G2 = (VN2, VT2, P2, S2) be regular grammars that generate L1 and L2, respectively. Without loss of generality, assume that VN1 and VN2 are disjoint, i.e., VN1 ∩ VN2 = φ. (Otherwise, we can always rename nonterminals so that the grammars satisfy this property.) Construct a regular grammar G with the production rules S → S1 | S2, where S is a new start symbol, and all the rules in P1 and P2. Clearly, L(G) = L1 ∪ L2.
(2) Clearly, r1r2 is a regular expression which denotes the language L1L2, which means L1L2 is regular.
(3) Let r1 be a regular expression for L1. Clearly, (r1)* is a regular expression for L1*. Since L1+ = L1L1*, L1+ is regular by property (2). (The identity L1+ = L1* - {ε} holds only when ε is not in L1.)
Proof of the Closure Properties (cont'ed)

(4) Suppose that an FA M1 accepts L1. We modify M1 as shown below. Clearly, the resulting automaton recognizes the reversed language of L1.
Add a new accepting state, reached by an ε-transition from each old accepting state. Let the new accepting state be the start state, reverse the direction of all the edges, and let the old start state be the only accepting state.
[Figure: M1; M1 with the new accepting state added through ε-transitions; the reversed automaton.]
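The reversal construction can be tried out with a minimal sketch. The representation is an assumption made here, not part of the slides: an NFA is a triple (transitions, start states, accepting states), and because a set of start states is allowed, the extra ε-linked accepting state is not needed. The names `reverse_nfa` and `nfa_accepts` are hypothetical.

```python
def reverse_nfa(nfa):
    """Flip every edge and swap the roles of start and accepting states."""
    transitions, starts, accepting = nfa
    flipped = {(q, a, p) for (p, a, q) in transitions}
    return (flipped, set(accepting), set(starts))


def nfa_accepts(nfa, word):
    """Simulate the NFA (no epsilon-moves) on word by tracking the set of current states."""
    transitions, starts, accepting = nfa
    current = set(starts)
    for a in word:
        current = {q for (p, sym, q) in transitions if p in current and sym == a}
    return bool(current & set(accepting))


# Illustrative automaton for a b* (one 'a' followed by any number of 'b's).
M1 = ({(0, "a", 1), (1, "b", 1)}, {0}, {1})
M1_reversed = reverse_nfa(M1)  # should accept b* a
```

Allowing several start states is a design shortcut; with a single start state, the new accepting state and its ε-edges from the slide play the same role.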
Proof of the Closure Properties (cont'ed)

You can also prove part (4) with a regular grammar G1, using the other form of regular grammars, in which every production rule has the form A → Bx or A → x, where A and B are arbitrary nonterminal symbols and x is a string of terminal symbols or ε. (Recall that we chose to restrict the rules to A → xB or A → x.) If we reverse the right side of each production rule of G1, then the resulting grammar G' is of this other form and generates L1^R.
(5) As for part (4), we modify the transition graph of a deterministic FA M1 that recognizes L1 as follows. Add the dead state, if it is not shown in the transition graph. (Recall that we usually do not show the dead state for convenience.) Change accepting states to non-accepting states and non-accepting states to accepting states.
(6) Let L̄ denote the complement of L. Since L1 ∩ L2 is the complement of L̄1 ∪ L̄2 (De Morgan's law), and regular languages are closed under union and complementation (properties (1) and (5) above), L1 ∩ L2 is regular.
(7) Since L1 - L2 = L1 ∩ L̄2, it is regular by properties (5) and (6) above.
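Parts (5) to (7) can be sketched in code under a few assumptions: a DFA is a tuple (states, delta, start, accepting) with delta a dictionary that may be partial, the name "dead" is free to use for the dead state, and union is realized with the standard product automaton (a technique swapped in here, since the slides prove union with regular expressions and grammars). All function names are hypothetical.

```python
from itertools import product


def complete(dfa, alphabet):
    """Add the dead state so that every (state, symbol) pair has a transition."""
    states, delta, start, accepting = dfa
    dead = "dead"  # assumed not to be the name of an accepting state
    full = dict(delta)
    for q in list(states) + [dead]:
        for a in alphabet:
            full.setdefault((q, a), dead)
    return (set(states) | {dead}, full, start, set(accepting))


def complement(dfa, alphabet):
    """Property (5): complete the DFA, then swap accepting and non-accepting states."""
    states, delta, start, accepting = complete(dfa, alphabet)
    return (states, delta, start, states - accepting)


def union(d1, d2, alphabet):
    """Product automaton that accepts when either component accepts."""
    s1, t1, q1, f1 = complete(d1, alphabet)
    s2, t2, q2, f2 = complete(d2, alphabet)
    states = set(product(s1, s2))
    delta = {((p, q), a): (t1[(p, a)], t2[(q, a)]) for (p, q) in states for a in alphabet}
    accepting = {(p, q) for (p, q) in states if p in f1 or q in f2}
    return (states, delta, (q1, q2), accepting)


def intersection(d1, d2, alphabet):
    """Property (6) via De Morgan: complement of (complement d1 union complement d2)."""
    return complement(union(complement(d1, alphabet), complement(d2, alphabet), alphabet), alphabet)


def difference(d1, d2, alphabet):
    """Property (7): L1 - L2 = L1 intersected with the complement of L2."""
    return intersection(d1, complement(d2, alphabet), alphabet)


def accepts(dfa, word):
    _, delta, q, accepting = dfa
    for a in word:
        if (q, a) not in delta:
            return False  # missing transition in a partial DFA means rejection
        q = delta[(q, a)]
    return q in accepting


# Illustrative DFAs over {a, b}: an even number of a's, and strings ending in b.
EVEN_A = ({"e", "o"}, {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"}, "e", {"e"})
ENDS_B = ({0, 1}, {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})
BOTH = intersection(EVEN_A, ENDS_B, "ab")
ONLY_EVEN = difference(EVEN_A, ENDS_B, "ab")
```

The complement step is only sound for deterministic, complete automata, which is why `complete` runs before the accepting set is flipped.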
Properties of Context-free Languages

Let L1 and L2 be CFLs, and let L̄ denote the complement of a language L.
(1) L1 ∪ L2 (the union) is a CFL.
(2) L1L2 (the concatenation) is a CFL.
(3) L1* (the Kleene star) and L1+ (the Kleene plus) are CFLs.
(4) L1^R (the reversed language) is a CFL.
(5) L1 ∩ L2 (the intersection) is not necessarily a CFL.
(6) L̄1 (the complement) is not necessarily a CFL.
Proof of the Context-free Language Properties

Let G1 = (VN1, VT1, P1, S1) and G2 = (VN2, VT2, P2, S2) be CF grammars that generate L1 and L2, respectively. Without loss of generality, assume that VN1 and VN2 are disjoint, i.e., VN1 ∩ VN2 = φ. (Otherwise, we can modify them.) In each construction below, S is a new start symbol.
(1) Construct a CFG G by merging the rules of grammars G1 and G2 and adding the new rules S → S1 | S2. (This is the same technique as for regular languages.)
(2) Construct a CFG G by merging the rules of G1 and G2 and adding the new rule S → S1S2.
(3) For L1*, add the rules S → S1S | ε to grammar G1. For L1+, add the rules S → S1S | S1.
(4) Construct a CFG from G1 by changing each rule A → β to A → β^R, i.e., reverse the right side of each production rule.
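The four grammar constructions can be sketched as follows, assuming a grammar is a dictionary from nonterminal to a list of right sides (tuples of symbols, with the empty tuple for ε) and that the new start symbol "S" is unused in G1 and G2. The helper `strings_up_to` is a hypothetical brute-force enumerator added only to check small cases; it is not a general parser.

```python
def union(g1, s1, g2, s2, start="S"):
    return {**g1, **g2, start: [(s1,), (s2,)]}       # S -> S1 | S2


def concat(g1, s1, g2, s2, start="S"):
    return {**g1, **g2, start: [(s1, s2)]}           # S -> S1 S2


def star(g1, s1, start="S"):
    return {**g1, start: [(s1, start), ()]}          # S -> S1 S | epsilon


def plus(g1, s1, start="S"):
    return {**g1, start: [(s1, start), (s1,)]}       # S -> S1 S | S1


def reverse(g1):
    return {a: [tuple(reversed(rhs)) for rhs in rhss] for a, rhss in g1.items()}


def strings_up_to(grammar, start, max_len):
    """Enumerate terminal strings of length <= max_len using leftmost derivations.

    Sentential forms longer than max_len + 2 symbols are pruned. That bound is
    enough for the small grammars below, but not for arbitrary grammars.
    """
    seen, stack, result = set(), [(start,)], set()
    while stack:
        form = stack.pop()
        if form in seen or len(form) > max_len + 2:
            continue
        seen.add(form)
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:                               # no nonterminal left
            if len(form) <= max_len:
                result.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:
            stack.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return result


# Illustrative grammars: G1 generates a^n b^n (n >= 0), G2 generates c^n (n >= 1).
G1 = {"S1": [("a", "S1", "b"), ()]}
G2 = {"S2": [("c", "S2"), ("c",)]}
```

For instance, `strings_up_to(concat(G1, "S1", G2, "S2"), "S", 3)` enumerates the strings of L1L2 of length at most 3.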
Proof of the Context-free Language Properties (cont'ed)

(5) We know that L1 = { a^i b^i c^j | i, j ≥ 0 } and L2 = { a^k b^n c^n | k, n ≥ 0 } are CFLs. But L1 ∩ L2 = { a^i b^i c^i | i ≥ 0 } is not a CFL.
(6) Suppose that CFLs are closed under complementation, and let L̄ denote the complement of L. Since CFLs are closed under union (property (1)) and L1 ∩ L2 is the complement of L̄1 ∪ L̄2, CFLs would also be closed under intersection. This contradicts property (5).
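The claim in (5) about the intersection can be checked by brute force on short strings. This is only a sanity check of L1 ∩ L2 = { a^i b^i c^i }, not a proof that the intersection is not context-free; the function names are hypothetical.

```python
import re
from itertools import product


def blocks(w):
    """Return (#a, #b, #c) if w has the shape a*b*c*, else None."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return tuple(len(g) for g in m.groups()) if m else None


def in_L1(w):
    """L1 = { a^i b^i c^j | i, j >= 0 }"""
    b = blocks(w)
    return b is not None and b[0] == b[1]


def in_L2(w):
    """L2 = { a^k b^n c^n | k, n >= 0 }"""
    b = blocks(w)
    return b is not None and b[1] == b[2]


def intersection_up_to(n):
    """All strings over {a, b, c} of length <= n that lie in both L1 and L2."""
    words = ("".join(t) for k in range(n + 1) for t in product("abc", repeat=k))
    return [w for w in words if in_L1(w) and in_L2(w)]
```

Calling `intersection_up_to(6)` lists exactly the strings with equally many a's, b's, and c's in that order, up to length 6.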
Minimizing the Number of ε-production Rules

Theorem. Given an arbitrary CFG G, we can construct a CFG G' such that L(G) = L(G') and, if ε is not in L(G), then G' does not have any ε-production rule. If ε ∈ L(G), then S → ε is the only ε-production rule of G'.
Proof (an algorithm). Let G = (VT, VN, P, S), and let A, B ∈ VN. We construct a CFG G' = (VT, VN, P', S) from G by the following steps.
(1) Find the set W of all nonterminals of G which derive ε as follows:
    W0 = { A | A ∈ VN and A → ε is in P };
    do
        Wi+1 = Wi ∪ { A | A ∈ VN and A → α is in P, for some α ∈ Wi+ };
    until (Wi+1 = Wi);
    W = Wi;    // W contains all nonterminal symbols from which ε can be derived.
(2) Delete all ε-productions from P. Call this new set of productions P1.
(3) Modify P1 to P' as follows: if a production A → α is in P1, then put the rules A → α and A → β into P', for all β (≠ ε) which are obtained from α by deleting one or more nonterminals in the set W constructed in step (1).
(4) If S is in W, then add S → ε to P'.
Minimizing the Number of ε-production Rules (example)

Convert the following CFG G to another CFG G' such that L(G) = L(G') and G' has the smallest possible number of ε-production rules.
G:  S → ADC | EFg
    A → aA | ε
    D → FGH | b
    C → c | ε
    E → a
    F → f | ε
    G → Gg | H
    H → h | ε
Computing W:
    W0 = {A, C, F, H}
    W1 = W0 ∪ {G} = {A, C, F, G, H}
    W2 = W1 ∪ {D} = {A, C, D, F, G, H}
    W3 = W2 ∪ {S} = {A, C, D, F, G, H, S}
    W4 = W3 ∪ {} = {A, C, D, F, G, H, S}
P1: S → ADC | EFg
    A → aA
    D → FGH | b
    C → c
    E → a
    F → f
    G → Gg | H
    H → h
P': S → ADC | AD | AC | DC | A | D | C | ε | EFg | Eg
    A → aA | a
    D → FGH | FG | FH | GH | F | G | H | b
    C → c
    E → a
    F → f
    G → Gg | g | H
    H → h
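The four steps of the algorithm (compute W, delete the ε-productions, add the variants with nullable nonterminals removed, restore S → ε when S is in W) can be sketched as below. The dictionary representation, the function names, and the small grammar `G` are assumptions made for illustration; `G` is not the grammar of the example slide.

```python
from itertools import product


def nullable_set(rules):
    """Step (1): the set W of nonterminals that derive epsilon."""
    w = {a for a, rhss in rules.items() if () in rhss}
    changed = True
    while changed:
        changed = False
        for a, rhss in rules.items():
            # A joins W if some non-empty right side consists only of symbols in W.
            if a not in w and any(rhs and all(x in w for x in rhs) for rhs in rhss):
                w.add(a)
                changed = True
    return w


def minimize_epsilon(rules, start):
    """Steps (2)-(4): build P' from P."""
    w = nullable_set(rules)
    new_rules = {}
    for a, rhss in rules.items():
        out = set()
        for rhs in rhss:
            if not rhs:
                continue  # step (2): drop the epsilon-production itself
            # step (3): every symbol in W is either kept or deleted
            choices = [((x,), ()) if x in w else ((x,),) for x in rhs]
            for pick in product(*choices):
                beta = tuple(s for part in pick for s in part)
                if beta:  # never add a new epsilon-production here
                    out.add(beta)
        new_rules[a] = out
    if start in w:
        new_rules[start].add(())  # step (4): S -> epsilon
    return new_rules


# Illustrative grammar: S -> ABc | A,  A -> aA | epsilon,  B -> b | epsilon
G = {
    "S": [("A", "B", "c"), ("A",)],
    "A": [("a", "A"), ()],
    "B": [("b",), ()],
}
G_PRIME = minimize_epsilon(G, "S")
```

Here W = {A, B, S}, so the result keeps S → ε as its only ε-production, as the theorem states.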
Eliminating Useless Symbols from a CFG

Lemma 1. Given a CFG G = (VT, VN, P, S), we can construct an equivalent CFG G' = (VT, V'N, P', S) such that every nonterminal symbol in V'N derives a string x ∈ (VT)*.
Proof. Let OLDV and NEWV be sets of nonterminals, and let A be an arbitrary nonterminal. We construct V'N and P' as follows.
    OLDV = φ;
    NEWV = { A | A → w is in P for some w ∈ (VT)* };
    while (OLDV ≠ NEWV) do {
        OLDV = NEWV;
        NEWV = OLDV ∪ { A | A → α is in P for some α in (VT ∪ OLDV)* };
    }
    V'N = NEWV;
    P' = { A → α | A → α is in P and α ∈ (V'N ∪ VT)* };
Eliminating Useless Symbols from a CFG (cont'ed)

Lemma 2. Given a CFG G = (VT, VN, P, S), we can construct an equivalent CFG G' = (V'T, V'N, P', S) such that, for each symbol X ∈ V'T ∪ V'N, the start symbol S derives αXβ for some α, β ∈ (V'T ∪ V'N)*, i.e., S can derive a sentential form (a string of terminals and nonterminals) which contains the symbol X.
Proof. The following algorithm computes V'T, V'N and P'.
(1) Let V'T and V'N be the empty sets.
(2) Put S into V'N.
(3) If A ∈ VN is put into V'N and A → α1 | α2 | ... | αn, then all nonterminals in αi, 1 ≤ i ≤ n, are put into V'N and all terminals in αi are put into V'T.
(4) Repeat (3) until there is no symbol to be added to V'N.
(5) Let P' contain all the productions in P except for the ones which have a symbol not in V'T ∪ V'N.
Eliminating Useless Symbols from a CFG (cont'ed)

Theorem. Given an arbitrary CFG G = (VT, VN, P, S), we can construct an equivalent CFG G' = (V'T, V'N, P', S) such that
(1) for each A ∈ V'N, A derives some w ∈ (V'T)* (i.e., A derives a terminal string or ε), and
(2) for each X ∈ V'T ∪ V'N, S derives αXβ for some α, β ∈ (V'N ∪ V'T)* (i.e., the start symbol can derive a sentential form which contains X).
Proof. Apply Lemma 1 first and then Lemma 2.
Eliminating Useless Symbols from a CFG (example)

Example. Eliminate useless symbols from the following CFG G.
G:  S → AD | EFg
    A → aGD
    D → FGd
    C → cCEc
    E → Ee
    F → Ff | ε
    G → Gg | g
    H → hH | h
Step 1: Apply Lemma 1 to find the set of nonterminals V'N such that every nonterminal symbol in V'N derives a string x ∈ (VT)*.
    OLDV = {}; NEWV = {F, G, H}
    OLDV = NEWV; NEWV = OLDV ∪ {D} = {D, F, G, H};
    OLDV = NEWV; NEWV = OLDV ∪ {A} = {A, D, F, G, H};
    OLDV = NEWV; NEWV = OLDV ∪ {S} = {A, D, F, G, H, S};
    OLDV = NEWV; NEWV = OLDV ∪ {} = {A, D, F, G, H, S};
    V'N = NEWV = {A, D, F, G, H, S}
Find the set of rules P'.
P': S → AD
    A → aGD
    D → FGd
    F → Ff | ε
    G → Gg | g
    H → hH | h
P' (from Step 1):
    S → AD
    A → aGD
    D → FGd
    F → Ff | ε
    G → Gg | g
    H → hH | h
Step 2: Apply Lemma 2 to find the set of symbols V' = V'T ∪ V'N such that each symbol in V' can be derived starting from S.
    1. V'T = V'N = {};    // initialize with the empty set
    2. V'N = V'N ∪ {S} = {S};                  V'T = V'T ∪ {} = {}
    3. V'N = V'N ∪ {A, D} = {S, A, D};         V'T = V'T ∪ {} = {}
    4. V'N = V'N ∪ {G, F} = {S, A, D, G, F};   V'T = V'T ∪ {a, d} = {a, d}
    5. V'N = V'N ∪ {} = {S, A, D, F, G};       V'T = V'T ∪ {g, f} = {a, d, g, f}
    6. V'N = V'N ∪ {} = {S, A, D, F, G};       V'T = V'T ∪ {} = {a, d, g, f}
Cleaned set of rules:
P'': S → AD
     A → aGD
     D → FGd
     F → Ff | ε
     G → Gg | g
Eliminating Useless Symbols from a CFG (cont'ed)

Remark: Notice that applying Lemma 2 first and then Lemma 1 may fail to eliminate all useless productions.
Example. Consider a grammar with the rules P = { S → AB | a, A → a }.
By applying Lemma 1 first, we have P' = { S → a, A → a }; then, applying Lemma 2, we have P'' = { S → a }.
However, if we apply Lemma 2 first, we have P' = { S → AB | a, A → a }. Then, applying Lemma 1, we have P'' = { S → a, A → a }, which still has a useless production.
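The two lemmas and the effect of their order can be sketched as follows. The representation is an assumption: a grammar is a list of (left side, right side) pairs plus an explicit set of nonterminals, so that a nonterminal without any rule (such as B below) is still recognized as a nonterminal. The function names are hypothetical.

```python
def lemma1(rules, nonterminals):
    """Keep only the rules whose nonterminals all derive a terminal string."""
    generating, changed = set(), True
    while changed:
        changed = False
        for a, rhs in rules:
            if a not in generating and all(
                x in generating or x not in nonterminals for x in rhs
            ):
                generating.add(a)
                changed = True
    return [
        (a, rhs) for a, rhs in rules
        if a in generating and all(x in generating or x not in nonterminals for x in rhs)
    ]


def lemma2(rules, nonterminals, start):
    """Keep only the rules whose left side is reachable from the start symbol."""
    reachable, frontier = {start}, [start]
    while frontier:
        a = frontier.pop()
        for lhs, rhs in rules:
            if lhs != a:
                continue
            for x in rhs:
                if x not in reachable:
                    reachable.add(x)
                    if x in nonterminals:
                        frontier.append(x)
    return [(a, rhs) for a, rhs in rules if a in reachable]


# The grammar of the remark: S -> AB | a,  A -> a  (B has no rule at all).
N = {"S", "A", "B"}
P = [("S", ("A", "B")), ("S", ("a",)), ("A", ("a",))]

lemma1_then_2 = lemma2(lemma1(P, N), N, "S")
lemma2_then_1 = lemma1(lemma2(P, N, "S"), N)
```

With Lemma 1 first, the rule S → AB disappears before reachability is computed, so A is no longer reachable; in the other order A is still reachable when Lemma 2 runs and its rule survives.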
Ambiguous Context-free Grammars

There are two kinds of ambiguity in language.
Lexical ambiguity (or semantic ambiguity): a symbol or an expression has more than one meaning (e.g., story, saw).
Syntactic ambiguity (or structural ambiguity): an expression can be parsed in two different ways.
A CFG G is ambiguous if L(G) contains a string that has more than one parse tree. For a given context-free grammar G and a string x, a parse tree shows how x is derived with the rules of G (see the example on the next slide). In a programming language, different parse trees give different object code. In this course we will only study syntactic ambiguity of context-free grammars.
Example 1 (in natural language). "A man entered a room with a picture" can be interpreted in two different ways.
[Figure: two parse trees for the sentence, one attaching "with a picture" to "room", the other attaching it to "entered".]
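Ambiguous Context-free Grammars (cont'ed)

Example 2 (in formal language). The following context-free grammar is ambiguous, because it has the two parse trees shown in Figures (a) and (b) below for the string p ∨ q ∧ r.
G: S → S ∨ S | S ∧ S | ¬S | p | q | r
[Figure (a): parse tree grouping the string as p ∨ (q ∧ r). Figure (b): parse tree grouping it as (p ∨ q) ∧ r.]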
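Ambiguity can be made concrete by counting parse trees. The sketch below assumes the binary fragment S → S ∨ S | S ∧ S | p | q | r (the operator symbols ∨ and ∧ are a reading of the slide, and negation is left out); `count_parse_trees` is a hypothetical name.

```python
from functools import lru_cache


def count_parse_trees(tokens, operators=("∨", "∧"), atoms=("p", "q", "r")):
    """Number of parse trees of the token sequence under S -> S op S | atom."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways S derives tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] in atoms else 0
        total = 0
        # Choose the operator at the root; both sides must be non-empty.
        for k in range(i + 1, j - 1):
            if tokens[k] in operators:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))
```

For the string of Example 2, `count_parse_trees("p ∨ q ∧ r".split())` counts the two trees of Figures (a) and (b); any count above 1 for some string shows that the grammar is ambiguous.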
Some Techniques for Designing Unambiguous CFG

(1) Use parentheses such that each derivation tree generates a unique string. Notice that this technique changes the language by introducing new terminal symbols, the parentheses.
Example:
Ambiguous G1:   S → S ∨ S | S ∧ S | ¬S | p | q | r
Unambiguous G2: S → (S ∨ S) | (S ∧ S) | (¬S) | p | q | r
[Figure (a): parse tree of (p ∨ (q ∧ r)). Figure (b): parse tree of ((p ∨ q) ∧ r).]
Some Techniques for Designing Unambiguous CFG (cont'ed)

(2) Modify the production rules that cause the ambiguity.
Examples:
(a) Grammar G3 below is clearly an ambiguous grammar, because for the string acb it can either generate the left side a first and then the right side b, or vice versa. Grammar G4 does not have this possibility, because it generates the left side a's first, if any.
Ambiguous G3:   S → aS | Sb | c
Unambiguous G4: S → aS | A
                A → Ab | c
[Figure (a): the two parse trees of acb in G3 (ambiguity of G3). Figure (b): the single parse tree of acb in G4.]
Some Techniques for Designing Unambiguous CFG (cont'ed)

(b) The following grammar G5 is ambiguous, since it can generate ε in two ways. We eliminate this possibility by applying the technique of reducing ε-production rules. Grammar G6 is the result.
G5: S → B | D
    B → bBc | ε
    D → dDe | ε
G6: S → B | D | ε
    B → bBc | bc
    D → dDe | de
(c) Grammar G1 (S → S ∨ S | S ∧ S | ¬S | p | q | r) can be modified in two different ways to make it unambiguous. Notice that for G7 we used the same technique as for Example (a) above.
G7: [production rules not recoverable from the source; terminals p, q, r]
G8: S → S ∨ D | D
    D → D ∧ C | C
    C → ¬C | p | q | r
For G8 we set up a precedence rule such that, if any, ∨ is derived (by S) first, then ∧ (by D), and then ¬, in that order from the top of the parse tree. The later an operator is derived, the higher precedence it has over the others.
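The precedence layering can be illustrated with a small parser. It assumes the three-level grammar S → S ∨ D | D, D → D ∧ C | C, C → ¬C | p | q | r (the operator symbols are a reading of the slide); the left-recursive rules are handled with loops, which yields left-associative trees. The function `parse` and the nested-tuple tree format are hypothetical.

```python
def parse(tokens):
    """Parse a token list into a nested tuple; raise ValueError on bad input."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_c():                      # C -> ¬C | p | q | r
        nonlocal pos
        tok = peek()
        if tok == "¬":
            pos += 1
            return ("¬", parse_c())
        if tok in ("p", "q", "r"):
            pos += 1
            return tok
        raise ValueError(f"unexpected token {tok!r} at position {pos}")

    def parse_d():                      # D -> D ∧ C | C
        nonlocal pos
        left = parse_c()
        while peek() == "∧":
            pos += 1
            left = ("∧", left, parse_c())
        return left

    def parse_s():                      # S -> S ∨ D | D
        nonlocal pos
        left = parse_d()
        while peek() == "∨":
            pos += 1
            left = ("∨", left, parse_d())
        return left

    tree = parse_s()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return tree
```

Because ∧ is derived one level below ∨, `parse("p ∨ q ∧ r".split())` can only group the string as p ∨ (q ∧ r), so each string has a single tree.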
Known facts about ambiguous context-free grammars

There is no algorithm that can tell whether an arbitrary CFG is ambiguous or not.
There are so-called inherently ambiguous context-free languages, for which every CFG is ambiguous. Here is an example:
    { a^n b^n c^m d^m | n, m ≥ 1 } ∪ { a^n b^m c^m d^n | n, m ≥ 1 }
There is no algorithm that can convert an arbitrary ambiguous CFG, whose language is not inherently ambiguous, to an unambiguous one.
Normal Forms of Context-free Grammars

When we investigate context-free grammars and their languages, it is sometimes convenient to make the right side of each production rule meet a certain form. Such a form is called a normal form. There are two normal forms for context-free grammars: Chomsky Normal Form (CNF) and Greibach Normal Form (GNF).
Let G = (VN, VT, P, S) be a context-free grammar. Grammar G is in CNF if all the production rules of the grammar are of the form A → BC or A → a, where A, B, C ∈ VN and a ∈ VT.
A context-free grammar is in GNF if every production rule of the grammar is of the form A → aα, where A ∈ VN, a ∈ VT, and α ∈ (VN)*. Notice that α is a string of nonterminal symbols or the null string.
We can show that every context-free grammar whose language does not contain ε can be converted to CNF and to GNF. (Recall that we can eliminate all ε-production rules from a given context-free grammar if its language does not contain ε.) The following example shows how to convert a context-free grammar to CNF. We can easily generalize the idea. Converting a context-free grammar to GNF is quite involved (see the text, Chapter 6). We shall not study the proof.
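The two definitions translate directly into checks on the shape of each rule. The sketch assumes a grammar given as a list of (left side, right side) pairs and an explicit set of nonterminals; every other symbol is treated as a terminal. The names `is_cnf` and `is_gnf` are hypothetical.

```python
def is_cnf(rules, nonterminals):
    """True if every rule is A -> BC (two nonterminals) or A -> a (one terminal)."""
    return all(
        (len(rhs) == 2 and all(x in nonterminals for x in rhs))
        or (len(rhs) == 1 and rhs[0] not in nonterminals)
        for _, rhs in rules
    )


def is_gnf(rules, nonterminals):
    """True if every rule is A -> a alpha: one terminal, then only nonterminals."""
    return all(
        len(rhs) >= 1
        and rhs[0] not in nonterminals
        and all(x in nonterminals for x in rhs[1:])
        for _, rhs in rules
    )


# Two illustrative grammars for { a^n b^n | n >= 1 }.
CNF_RULES = [("S", ("A", "T")), ("S", ("A", "B")), ("T", ("S", "B")),
             ("A", ("a",)), ("B", ("b",))]
GNF_RULES = [("S", ("a", "S", "B")), ("S", ("a", "B")), ("B", ("b",))]
```

Both checks reject ε-productions, in line with the restriction to languages that do not contain ε.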
Converting Context-free Grammar to CNF (example)

Suppose that a context-free grammar has a production rule A → aBCDbE, which is not in CNF. We introduce new nonterminal symbols and production rules in CNF such that A can derive the right side string aBCDbE as follows:
    A → A1B1,  A1 → a     // and we let B1 derive BCDbE as follows;
    B1 → BC1              // and we let C1 derive CDbE as follows;
    C1 → CD1              // and we let D1 derive DbE as follows;
    D1 → DE1              // and we let E1 derive bE as follows;
    E1 → F1E,  F1 → b
Example. Convert the following context-free grammar to CNF.
[The grammar and its CNF answer are not recoverable from the source.]
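The splitting of one long rule can be sketched as a small function. It assumes lowercase symbols are terminals, that the right side is non-empty and is not a unit rule A → B, and that new nonterminals named X1, X2, ... are unused elsewhere; these names differ from the A1, B1, C1, ... of the slide, and `binarize` is a hypothetical name.

```python
from itertools import count


def binarize(lhs, rhs, is_terminal=str.islower, prefix="X"):
    """Split lhs -> rhs into CNF rules, returned as (left side, right side) pairs."""
    if len(rhs) == 1:
        return [(lhs, tuple(rhs))]      # A -> a is already in CNF
    counter = count(1)

    def fresh():
        return f"{prefix}{next(counter)}"

    rules, symbols = [], []
    # First give every terminal its own nonterminal (like A1 -> a, F1 -> b).
    for x in rhs:
        if is_terminal(x):
            t = fresh()
            rules.append((t, (x,)))
            symbols.append(t)
        else:
            symbols.append(x)
    # Then peel off one symbol at a time, chaining through new nonterminals.
    current = lhs
    while len(symbols) > 2:
        nxt = fresh()
        rules.append((current, (symbols[0], nxt)))
        current, symbols = nxt, symbols[1:]
    rules.append((current, tuple(symbols)))
    return rules


CNF_FOR_A = binarize("A", "aBCDbE")
```

For the rule A → aBCDbE this produces two terminal rules and a chain of five binary rules, the same shape as the hand construction above.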