Context Free Grammars: Introduction

CFGs are more powerful than RGs because of the following 2 properties:

1. Recursion
   A rule is recursive if it is of the form X → w1 Y w2, where Y ⇒* w3 X w4 and w1, w2, w3, w4 ∈ V*
2. Self-embedment
   A rule is self-embedding if it is of the form X → w1 Y w2, where Y ⇒* w3 X w4 and w1, w2, w3, w4 ∈ Σ+

Context Free Grammars: Simplifying CFGs

Simplifying grammars

1. Eliminating symbols that do not terminate

   CFG removeunproductive (CFG G)
     marked = G.Sigma;   //productive symbols
     do
       oldmarked = marked;
       for (each X -> alpha in G.R)
         productive = TRUE;
         for (each s in alpha)
           if (s not in marked) productive = FALSE;
         if (productive) marked = marked + X;   //every symbol of alpha is productive
     while (marked != oldmarked);
     G'.R = NULL;
     for (each X -> alpha in G.R)
       keep = TRUE;
       for (each s in alpha)
         if (s not in marked) keep = FALSE;
       if (keep) G'.R = G'.R + (X -> alpha);    //drop rules that use an unproductive symbol
     G'.V = marked;
     G'.Sigma = G.Sigma;
     G'.S = G.S;
     return G';
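The marking loop in removeunproductive can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes rules are stored as (lhs, rhs) pairs with rhs a tuple of symbols, and the example grammar (S → aA | B, A → a, B → bB) is hypothetical.

```python
def productive_symbols(rules, terminals):
    """Return every symbol that can derive a string of terminals."""
    marked = set(terminals)          # terminals are productive by definition
    changed = True
    while changed:                   # iterate to a fixed point
        changed = False
        for lhs, rhs in rules:
            if lhs not in marked and all(s in marked for s in rhs):
                marked.add(lhs)
                changed = True
    return marked


def remove_unproductive(rules, terminals):
    """Keep only the rules whose symbols are all productive."""
    marked = productive_symbols(rules, terminals)
    return {(lhs, rhs) for lhs, rhs in rules
            if lhs in marked and all(s in marked for s in rhs)}


# Hypothetical grammar: B can never terminate, so S -> B and B -> bB are dropped.
rules = {("S", ("a", "A")), ("S", ("B",)), ("A", ("a",)), ("B", ("b", "B"))}
kept = remove_unproductive(rules, {"a", "b"})
```

A left-hand side is marked only after every symbol on some right-hand side is already marked, so a rule such as B → bB cannot make B productive on its own.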
2. Eliminating unreachable symbols

   CFG removeunreachable (CFG G)
     marked = G.S;   //reachable symbols
     do
       oldmarked = marked;
       for (each X -> alpha in G.R)
         for (each s in alpha)
           if ((X in marked) AND (s not in marked)) marked = marked + s;
     while (marked != oldmarked);
     G'.V = G.V;
     for (each X in G.V)
       if (X not in marked) G'.V = G'.V - X;
     G'.Sigma = G.Sigma;
     for (each s in G.Sigma)
       if (s not in marked) G'.Sigma = G'.Sigma - s;
     G'.R = G.R;
     for (each (X -> alpha) in G.R)
       if (X not in marked) G'.R = G'.R - (X -> alpha);
     G'.S = G.S;
     return G';

Context Free Grammars: Proving Grammar Is Correct

Proving grammar correctness

   string generate (grammar G)
     w = G.S;
     do
       apply some rule r = X -> alpha from G.R to an occurrence of X in w;
     while (w contains some X in (G.V - G.Sigma));
     return w;

Proof constructed as follows:

1. Construct loop invariant I for above algorithm
2. Show that
   (a) I is true when loop starts
   (b) I holds on each iteration of loop
   (c) At termination, when w contains no nonterminals, I implies w ∈ L(G)
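The reachability marking in removeunreachable can be sketched the same way. This is an illustrative sketch that assumes rules are (lhs, rhs) pairs with rhs a tuple of symbols; the three-rule grammar is hypothetical.

```python
def reachable_symbols(rules, start):
    """Return every symbol that appears in some sentential form derived from start."""
    marked = {start}
    changed = True
    while changed:                   # iterate to a fixed point
        changed = False
        for lhs, rhs in rules:
            if lhs in marked:
                for s in rhs:
                    if s not in marked:
                        marked.add(s)
                        changed = True
    return marked


def remove_unreachable(rules, start):
    """Drop every rule whose left-hand side cannot be reached from start."""
    marked = reachable_symbols(rules, start)
    return {(lhs, rhs) for lhs, rhs in rules if lhs in marked}


# Hypothetical grammar: C and c are never reached from S.
rules = {("S", ("a", "A")), ("A", ("a",)), ("C", ("c",))}
reach = reachable_symbols(rules, "S")
kept_rules = remove_unreachable(rules, "S")
```

Running the unproductive-symbol pass before this one is the usual order, since removing unproductive rules can make further symbols unreachable.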
Context Free Grammars: Ambiguity

Issues arise with CFGs that are not inherent in RGs

1. CFG derivation order not obvious from parse tree
   Not an issue for RGs: there can only be a single NT on the RHS

   Example: Consider grammar

     <S>  → <NP><VP>
     <NP> → <A><N><PP> | <A><N>
     <VP> → <V><NP><PP> | <V><NP>
     <PP> → <P><NP>
     <N>  → boy
     <N>  → man
     <N>  → telescope
     <A>  → the
     <A>  → a
     <V>  → saw
     <P>  → with

   Derivations of "the man saw a boy":

     <S> ⇒ <NP><VP>
         ⇒ <A><N><VP>
         ⇒ the <N><VP>
         ⇒ the <N><V><NP>
         ⇒ the <N> saw <NP>
         ⇒ the <N> saw <A><N>
         ⇒ the <N> saw <A> boy
         ⇒ the <N> saw a boy
         ⇒ the man saw a boy

   Vs

     <S> ⇒ <NP><VP>
         ⇒ <A><N><VP>
         ⇒ the <N><VP>
         ⇒ the man <VP>
         ⇒ the man <V><NP>
         ⇒ the man saw <NP>
         ⇒ the man saw <A><N>
         ⇒ the man saw a <N>
         ⇒ the man saw a boy

   Both derivations correspond to the same parse tree; only the order in which nonterminals are expanded differs.

2. CFG may be ambiguous
   Example: with the grammar above, "the man saw a boy with a telescope" has two parse trees, because <PP> can attach either to <NP> (<NP> → <A><N><PP>) or to <VP> (<VP> → <V><NP><PP>)
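Ambiguity of a particular sentence can be checked mechanically by counting its parse trees. The sketch below does this for the example grammar with a memoized recursive counter; the dictionary encoding and the function name count_parses are assumptions made for illustration, and the approach relies on the grammar having no ɛ rules and no left recursion.

```python
from functools import lru_cache

# The example grammar, keyed by nonterminal; each RHS is a tuple of symbols.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("A", "N", "PP"), ("A", "N")],
    "VP": [("V", "NP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("boy",), ("man",), ("telescope",)],
    "A": [("the",), ("a",)],
    "V": [("saw",)],
    "P": [("with",)],
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees of sentence rooted at start."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways symbol derives words[i:j].
        if symbol not in GRAMMAR:                      # terminal symbol
            return 1 if j == i + 1 and words[i] == symbol else 0
        return sum(derive_seq(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def derive_seq(rhs, i, j):
        # Number of ways the symbol sequence rhs derives words[i:j].
        if not rhs:
            return 1 if i == j else 0
        # No epsilon rules, so the first symbol covers at least one word.
        return sum(derive(rhs[0], i, k) * derive_seq(rhs[1:], k, j)
                   for k in range(i + 1, j + 1))

    return derive(start, 0, len(words))


unambiguous = count_parses("the man saw a boy")
ambiguous = count_parses("the man saw a boy with a telescope")
```

A count greater than 1 shows that the sentence, and therefore the grammar, is ambiguous; a count of 1 for one sentence says nothing about other sentences.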
To reduce ambiguity

1. Eliminate ɛ rules

   CFG removeeps (CFG G)
     nullable = NULL;
     for (each rule (X -> alpha) in G.R)
       if (alpha == epsilon) nullable = nullable + X;
     do
       oldnullable = nullable;
       for (each rule (X -> alpha) in G.R)
         if (alpha contains only NTs)
           flag = TRUE;
           for (each Y in alpha)
             if (Y not in nullable) flag = FALSE;
           if (flag) nullable = nullable + X;
     while (oldnullable != nullable);
     R' = G.R;
     do
       oldr = R';
       for (each (X -> alpha Q beta) in R')
         if (Q in nullable)
           if (((X -> alpha beta) not in R') AND (alpha beta != epsilon) AND (X != alpha beta))
             R' = R' + (X -> alpha beta);
     while (oldr != R');
     for (each X -> alpha in R')
       if (alpha == epsilon) R' = R' - (X -> alpha);
     G' = (G.V, G.Sigma, R', G.S);
     return G';

   Need to check special case in above algorithm: the language may include ɛ
   Add special rule for this case at end of algorithm, using a new start symbol S':

     if (G.S in nullable)
       G'.V = G.V + S';
       G'.R = G'.R - (S -> epsilon);
       G'.R = G'.R + (S' -> epsilon);
       G'.R = G'.R + (S' -> S);
       G'.S = S';

2. Eliminating symmetric recursive rules
3. Ambiguous attachment
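The two fixed-point loops of removeeps can be sketched in Python as below. This is a sketch under stated assumptions: rules are (lhs, rhs) pairs, an ɛ rule has the empty tuple as rhs, the special case for a nullable start symbol is left out, and the grammar S → AbA, A → a | ɛ is a hypothetical example.

```python
def nullable_nonterminals(rules):
    """Return the nonterminals that can derive the empty string."""
    nullable = {lhs for lhs, rhs in rules if rhs == ()}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


def remove_eps(rules):
    """Add the shortened variants of each rule, then delete all epsilon rules."""
    nullable = nullable_nonterminals(rules)
    new_rules = set(rules)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in list(new_rules):
            for i, sym in enumerate(rhs):
                if sym in nullable:
                    shorter = rhs[:i] + rhs[i + 1:]     # drop one nullable occurrence
                    if (shorter and shorter != (lhs,)
                            and (lhs, shorter) not in new_rules):
                        new_rules.add((lhs, shorter))
                        changed = True
    return {(lhs, rhs) for lhs, rhs in new_rules if rhs}


# Hypothetical grammar: S -> A b A, A -> a | epsilon
rules = {("S", ("A", "b", "A")), ("A", ("a",)), ("A", ())}
result = remove_eps(rules)
```

Dropping one nullable occurrence at a time and repeating until nothing changes yields every combination of omissions, which is why S → b appears in the result even though it requires omitting both occurrences of A.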
Context Free Grammars: Normal Forms

1. Chomsky Normal Form
   All rules are in one of forms
   (a) X → a, where a ∈ Σ
   (b) X → BC, where B, C ∈ V − Σ
2. Greibach Normal Form
   All rules are in form
   (a) X → aβ, where a ∈ Σ, β ∈ (V − Σ)*

Context Free Grammars: Converting to Normal Forms

Theorem 11.3: Rule Substitution

Statement:
   Let G = (V, Σ, R, S) be a CFG with a rule r_i of form X → αYβ, where α, β ∈ V*, Y ∈ V − Σ
   Let Y → γ1 | γ2 | ... | γn be all the rules in R with Y as LHS
   Let R' = (R − r_i) ∪ {X → αγ1β, X → αγ2β, ..., X → αγnβ}
   G' = (V, Σ, R', S)
   Then, L(G') = L(G)

Context Free Grammars: Converting to Normal Forms - CNF

Algorithm uses 4 basic steps

   CFG converttocnf (CFG G)
     Eliminate epsilon rules;
     Eliminate unit productions;
     Eliminate rules where length of RHS > 1 and RHS contains a terminal symbol;
     Eliminate rules where length of RHS > 2;
     return (G.V, G.Sigma, modified rules, G.S);

1. ɛ productions: See above
2. Unit productions

   CFG removeunits (CFG G)
     R' = G.R;
     visited = NULL;
     while (unit productions remain in R')
       r = some (X -> Y) in R';
       R' = R' - r;
       visited = visited + r;
       for (each r' = (Y -> beta) in R')
         if ((X -> beta) not in visited)
           R' = R' + (X -> beta);
           visited = visited + (X -> beta);
     return (G.V, G.Sigma, R', G.S);
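The removeunits loop can be sketched in Python as follows. The sketch assumes rules are (lhs, rhs) pairs and that the set of nonterminals is passed in explicitly; the grammar S → A, A → B | a, B → b is a hypothetical example. Picking the next unit production in sorted order is only there to make the run deterministic.

```python
def remove_units(rules, nonterminals):
    """Replace each unit production X -> Y by copies of Y's rules."""
    rules = set(rules)
    visited = set()      # rules already removed or added, so nothing is re-added
    while True:
        units = sorted(r for r in rules
                       if len(r[1]) == 1 and r[1][0] in nonterminals)
        if not units:
            return rules
        x, (y,) = units[0]
        rules.remove((x, (y,)))
        visited.add((x, (y,)))
        for lhs, beta in list(rules):      # iterate over a snapshot while adding
            if lhs == y and (x, beta) not in visited:
                rules.add((x, beta))
                visited.add((x, beta))


# Hypothetical grammar: S -> A, A -> B | a, B -> b
rules = {("S", ("A",)), ("A", ("B",)), ("A", ("a",)), ("B", ("b",))}
result = remove_units(rules, {"S", "A", "B"})
```

Recording removed unit productions in visited is what guarantees termination when the grammar contains a cycle such as A → B, B → A.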
3. Replacing rules where length of RHS > 1 and RHS contains a terminal

   CFG removemixed (CFG G)
     R' = G.R;
     for (each a in G.Sigma)
       create new nonterminal symbol T_a;
       G.V = G.V + T_a;
       R' = R' + (T_a -> a);
     for (each r in R')
       if (length(r.rhs) > 1)
         for (each s in r.rhs)
           if (s in G.Sigma) replace s in r with T_s;
     return (G.V, G.Sigma, R', G.S);

4. Replace rules where length of RHS > 2 with rules where length of RHS = 2

   CFG removelong (CFG G)
     R' = NULL;
     for (each r in G.R)
       if (length(r.rhs) > 2)
         n = length(r.rhs);
         r = (X -> Y_1 Y_2 ... Y_n);
         //R_1 ... R_(n-2) are new nonterminals, created fresh for each rule r
         R' = R' + (X -> Y_1 R_1);
         G.V = G.V + R_1;
         for (i = 1; i < n - 2; i++)
           G.V = G.V + R_(i+1);
           R' = R' + (R_i -> Y_(i+1) R_(i+1));
         G.V = G.V + R_(n-2);
         R' = R' + (R_(n-2) -> Y_(n-1) Y_n);
       else
         R' = R' + r;
     return (G.V, G.Sigma, R', G.S);
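The chaining done by removelong can be sketched in Python as below. It is a minimal sketch that assumes rules are (lhs, rhs) pairs and that the generated helper names R1, R2, ... do not collide with existing symbols; the rule X → A B C D is a hypothetical example.

```python
def remove_long(rules):
    """Split every rule with more than two RHS symbols into a chain of binary rules."""
    new_rules = set()
    fresh = 0                                 # counter for helper nonterminals
    for lhs, rhs in sorted(rules):            # sorted only for reproducible names
        while len(rhs) > 2:
            fresh += 1
            helper = f"R{fresh}"              # hypothetical fresh nonterminal name
            new_rules.add((lhs, (rhs[0], helper)))
            lhs, rhs = helper, rhs[1:]        # continue with the remaining symbols
        new_rules.add((lhs, rhs))
    return new_rules


# Hypothetical rule: X -> A B C D
result = remove_long({("X", ("A", "B", "C", "D"))})
```

A rule with n symbols on its right-hand side produces n - 1 binary rules and n - 2 helper nonterminals, matching the loop bounds in the pseudocode.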
Context Free Grammars: Converting to Normal Form - GNF

Algorithm uses 3 basic steps

   CFG converttognf (CFG G)
     G' = converttocnf(G);
     Modify rules so that they are of the form
       S -> epsilon
       A -> c alpha, where c in Sigma, alpha in (V - Sigma)*
       A -> B alpha, where B in (V - Sigma), alpha in (V - Sigma)*;
     Modify rules so that they are in GNF;
     return (G.V, G.Sigma, modified rules, G.S);

1. Convert G to CNF (See CNF algorithm)
2. Modify the rules so that they are of the form
     S → ɛ
     A → cα, where c ∈ Σ, α ∈ (V − Σ)*
     A → Bα, where B ∈ V − Σ, α ∈ (V − Σ)*
   (a) Number the NTs
   (b) Left recursion is eliminated
       To eliminate direct left recursion:
       Consider rules A → Au1 | Au2 | ... | Auj and A → v1 | v2 | ... | vk, where the first symbol of each v_i ≠ A
       Replace these rules with the following:
         Z → u1 | u2 | ... | uj | u1 Z | u2 Z | ... | uj Z, where Z is a new symbol
         A → v1 | v2 | ... | vk | v1 Z | v2 Z | ... | vk Z
3. Modify rules to GNF
   For rules of the form X → Yγ, where Y ∈ V − Σ, replace Y with the RHSs of rules with Y on the LHS
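The replacement used to eliminate direct left recursion can be sketched in Python as follows. The function name and the list-of-tuples encoding are assumptions for illustration, the new symbol Z is supplied by the caller, and the sketch assumes there is no rule A → A and no ɛ rule, which holds once the grammar is in CNF. The grammar A → Ab | a is a hypothetical example.

```python
def eliminate_direct_left_recursion(a, alternatives, z):
    """Rewrite A -> A u_i | v_i as A -> v_i | v_i Z and Z -> u_i | u_i Z.

    a            -- the left-recursive nonterminal A
    alternatives -- list of RHS tuples of all rules with A on the LHS
    z            -- the new symbol Z (assumed not to occur in the grammar)
    """
    us = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == a]   # tails u_i
    vs = [rhs for rhs in alternatives if not rhs or rhs[0] != a]    # non-recursive v_i
    if not us:
        return {a: list(alternatives)}        # nothing to eliminate
    return {
        a: vs + [v + (z,) for v in vs],
        z: us + [u + (z,) for u in us],
    }


# Hypothetical grammar: A -> A b | a, which generates a, ab, abb, ...
result = eliminate_direct_left_recursion("A", [("A", "b"), ("a",)], "Z")
```

The rewritten rules generate the same strings with right recursion on Z instead of left recursion on A, so every A rule now starts with the first symbol of some v_i.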