Closure Properties of Context-Free Languages
Foundations of Computer Science Theory
Closure Properties of CFLs
CFLs are closed under union, concatenation, Kleene closure, and reversal. CFLs are not closed under intersection, difference, or complement.
Closure Under Union
Let $L$ and $M$ be CFLs with grammars $G$ and $H$, respectively. Assume $G$ and $H$ have no variables in common; rename the variables if necessary, since the names of variables do not affect the language. Let $S_1$ and $S_2$ be the start symbols of $G$ and $H$. Form a new grammar for $L \cup M$ by combining all the symbols and productions of $G$ and $H$.
Closure Under Union (continued)
Then add a new start symbol $S$ and the productions $S \to S_1 \mid S_2$. In the new grammar, all derivations start with $S$. The first step replaces $S$ with either $S_1$ or $S_2$. In the first case, the result must be a string in $L(G) = L$, and in the second case, the result must be a string in $L(H) = M$.
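The union construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary representation of a grammar, the helper names `union_grammar` and `generate`, and the two example grammars are hypothetical, and the enumerator assumes grammars without ε-productions.

```python
def union_grammar(g, g_start, h, h_start, new_start="S"):
    """Return a grammar for L(g) ∪ L(h): all productions of g and h plus S -> S1 | S2."""
    if set(g) & set(h) or new_start in g or new_start in h:
        raise ValueError("rename variables so the grammars share none")
    combined = {**g, **h}
    combined[new_start] = [(g_start,), (h_start,)]
    return combined


def generate(grammar, start, max_len):
    """Enumerate every terminal string of length <= max_len derivable from start.

    Assumes no epsilon-productions, so a sentential form never gets shorter
    and forms longer than max_len can be discarded.
    """
    words, seen, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        # Expand the leftmost variable, if there is one.
        pos = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if pos is None:
            words.add("".join(form))
            continue
        for body in grammar[form[pos]]:
            todo.append(form[:pos] + tuple(body) + form[pos + 1:])
    return words


# Hypothetical example grammars:
# G generates {0^n 1^n : n >= 1}, H generates {2^i : i >= 1}.
G = {"S1": [("0", "S1", "1"), ("0", "1")]}
H = {"S2": [("2", "S2"), ("2",)]}
U = union_grammar(G, "S1", H, "S2")

print(sorted(generate(U, "S", 4)))
```

Up to the length bound, the strings generated from the new start symbol are exactly those generated by either original grammar.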
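Warning: Be Careful Using Union
If $L_1$ and $L_2$ are context-free, then so is $L_3 = L_1 \cup L_2$. But what if $L_3$ and $L_1$ are context-free? What can we say about $L_2$? $L_2$ may or may not be context-free. For example, consider the following instance of $L_3 = L_1 \cup L_2$:
$a^n b^n c^* = a^n b^n c^* \cup a^n b^n c^n$
Here $L_3 = a^n b^n c^*$ and $L_1 = a^n b^n c^*$ are context-free, but $L_2 = a^n b^n c^n$ is not.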
Closure Under Concatenation
Let $L$ and $M$ be CFLs with grammars $G$ and $H$, respectively. Assume $G$ and $H$ have no variables in common. Let $S_1$ and $S_2$ be the start symbols of $G$ and $H$. Form a new grammar for $LM$ by starting with all symbols and productions of $G$ and $H$. Add a new start symbol $S$ and the production $S \to S_1 S_2$. Every derivation from $S$ results in a string in $L$ followed by one in $M$.
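As a small worked example, assume (hypothetically) that $G$ has productions $S_1 \to 0S_11 \mid 01$ and $H$ has productions $S_2 \to 2S_2 \mid 2$. A derivation of $001122$ in the combined grammar then looks like this:

```latex
\begin{align*}
S &\Rightarrow S_1 S_2       && \text{new production } S \to S_1 S_2 \\
  &\Rightarrow 0 S_1 1\, S_2 && S_1 \to 0 S_1 1 \\
  &\Rightarrow 0011\, S_2    && S_1 \to 01 \\
  &\Rightarrow 0011\, 2 S_2  && S_2 \to 2 S_2 \\
  &\Rightarrow 001122        && S_2 \to 2
\end{align*}
```

The prefix $0011$ comes from $L(G)$ and the suffix $22$ from $L(H)$, as the construction promises.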
Closure Under Star
Let $L$ have grammar $G$, with start symbol $S_1$. Form a new grammar for $L^*$ by introducing to $G$ a new start symbol $S$ and the productions $S \to S_1 S \mid \varepsilon$. A rightmost derivation from $S$ generates a sequence of zero or more $S_1$'s, each of which generates some string in $L$.
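To see the star construction at work, assume (hypothetically) that $G$ has productions $S_1 \to 0S_11 \mid 01$. A rightmost derivation of $001101$, which is the concatenation of the two strings $0011$ and $01$ from $L$, could run as follows:

```latex
\begin{align*}
S &\Rightarrow S_1 S          && S \to S_1 S \\
  &\Rightarrow S_1 S_1 S      && S \to S_1 S \\
  &\Rightarrow S_1 S_1        && S \to \varepsilon \\
  &\Rightarrow S_1\, 01       && S_1 \to 01 \\
  &\Rightarrow 0 S_1 1\, 01   && S_1 \to 0 S_1 1 \\
  &\Rightarrow 0011\, 01      && S_1 \to 01
\end{align*}
```

Applying $S \to \varepsilon$ immediately gives the empty string, which corresponds to zero copies of $S_1$.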
Closure Under Reversal
If $L$ is a CFL with grammar $G$, form a grammar for $L^R$ by reversing the body of every production. Example: let $G$ have $S \to 0S1 \mid 01$. The reversal of $L(G)$ has grammar $S \to 1S0 \mid 10$.
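The reversal construction is mechanical enough to sketch in Python. The grammar representation and the helper names `reverse_grammar` and `generate` are assumptions made for illustration, and the enumerator assumes a grammar without ε-productions.

```python
def reverse_grammar(grammar):
    """Reverse the body of every production."""
    return {
        var: [tuple(reversed(body)) for body in bodies]
        for var, bodies in grammar.items()
    }


def generate(grammar, start, max_len):
    """Enumerate terminal strings of length <= max_len (no epsilon-productions assumed)."""
    words, seen, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        # Expand the leftmost variable, if there is one.
        pos = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if pos is None:
            words.add("".join(form))
            continue
        for body in grammar[form[pos]]:
            todo.append(form[:pos] + tuple(body) + form[pos + 1:])
    return words


# The example grammar from the slide: S -> 0S1 | 01.
G = {"S": [("0", "S", "1"), ("0", "1")]}
GR = reverse_grammar(G)

forward = generate(G, "S", 6)
backward = generate(GR, "S", 6)
print(sorted(forward), sorted(backward))
```

Every string generated by the reversed grammar is the mirror image of a string generated by the original one, at least up to the length bound checked here.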
CFLs are Not Closed Under Intersection
Unlike the regular languages, the class of CFLs is not closed under intersection. For example, we know that $L_1 = \{0^n 1^n 2^n : n \ge 1\}$ is not context-free (use the pumping lemma). However, $L_2 = \{0^n 1^n 2^i : n \ge 1, i \ge 1\}$ is context-free, with CFG $S \to AB$, $A \to 0A1 \mid 01$, $B \to 2B \mid 2$. So is $L_3 = \{0^i 1^n 2^n : n \ge 1, i \ge 1\}$. But $L_1 = L_2 \cap L_3$ is not context-free.
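The identity $L_1 = L_2 \cap L_3$ can be sanity-checked by brute force on short strings. The sketch below is not a proof; the membership predicates and the length bound of 7 are choices made purely for illustration.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(0+)(1+)(2+)")  # nonempty blocks of 0s, then 1s, then 2s


def block_lengths(s):
    """Return (#0s, #1s, #2s) if s has the shape 0+1+2+, else None."""
    m = BLOCKS.fullmatch(s)
    return tuple(len(g) for g in m.groups()) if m else None


def in_L1(s):  # 0^n 1^n 2^n
    c = block_lengths(s)
    return c is not None and c[0] == c[1] == c[2]


def in_L2(s):  # 0^n 1^n 2^i
    c = block_lengths(s)
    return c is not None and c[0] == c[1]


def in_L3(s):  # 0^i 1^n 2^n
    c = block_lengths(s)
    return c is not None and c[1] == c[2]


# All strings over {0, 1, 2} of length at most 7.
strings = ["".join(t) for n in range(8) for t in product("012", repeat=n)]

intersection = {s for s in strings if in_L2(s) and in_L3(s)}
l1 = {s for s in strings if in_L1(s)}
print(intersection == l1, sorted(l1))
```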
CFLs are Not Closed Under Difference
We can prove something more general: any class of languages that is closed under difference is also closed under intersection. Proof: $L \cap M = L - (L - M)$. Thus, if CFLs were closed under difference, they would also be closed under intersection, but they are not.
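For completeness, the set identity used in the proof can be verified element by element. The following is a routine expansion of the definitions, included only as a reminder:

```latex
\begin{align*}
x \in L - (L - M)
  &\iff x \in L \text{ and } x \notin (L - M) \\
  &\iff x \in L \text{ and not } (x \in L \text{ and } x \notin M) \\
  &\iff x \in L \text{ and } (x \notin L \text{ or } x \in M) \\
  &\iff x \in L \text{ and } x \in M \\
  &\iff x \in L \cap M
\end{align*}
```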
CFLs are Not Closed Under Complement
By De Morgan's law, $L_1 \cap L_2 = \neg(\neg L_1 \cup \neg L_2)$. Recall that the context-free languages are closed under union. So if they were closed under complement, they would also be closed under intersection (which they are not).
CFLs are Not Closed Under Complement
Recall $A^nB^nC^n = \{a^n b^n c^n : n \ge 0\}$. Now consider $L = \neg(A^nB^nC^n)$, which is $L_1 \cup L_2$, where:
$L_1 = \{w \in \{a, b, c\}^* : \text{the letters are out of order}\}$
$L_2 = \{a^i b^j c^k : i, j, k \ge 0 \text{ and } (i \ne j \text{ or } j \ne k)\}$ (in other words, unequal numbers of $a$'s, $b$'s, and $c$'s)
A simple DFA can recognize $L_1$. $L_2$ can be handled with a construction similar to the one we created for accepting strings with an unequal number of $a$'s and $b$'s. Thus, $\neg(A^nB^nC^n)$ is context-free, whereas $A^nB^nC^n$ is not context-free (as we already proved).
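The decomposition of the complement into $L_1 \cup L_2$ can be checked exhaustively on short strings. This is a sanity check rather than a proof, and the predicate names and the length bound of 6 are assumptions of the sketch.

```python
import re
from itertools import product

ORDERED = re.compile(r"(a*)(b*)(c*)")  # letters in the order a, b, c


def in_anbncn(s):
    """True if s = a^n b^n c^n for some n >= 0."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_L1(s):
    """True if the letters are out of order."""
    return ORDERED.fullmatch(s) is None


def in_L2(s):
    """True if s = a^i b^j c^k with i != j or j != k."""
    m = ORDERED.fullmatch(s)
    if m is None:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i != j or j != k


# All strings over {a, b, c} of length at most 6.
strings = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]

# A string is outside A^nB^nC^n exactly when it lies in L1 or L2.
agrees = all((not in_anbncn(s)) == (in_L1(s) or in_L2(s)) for s in strings)
print(agrees)
```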
Intersection With a Regular Language
The intersection of two context-free languages may or may not be context-free. Closure would mean the result is guaranteed to be context-free. But the intersection of a CFL with a regular language is always context-free. The proof involves running an NFA in parallel with a PDA, and noting that the combination is a PDA that accepts by final state.
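One common way to write down the parallel machine is as a product construction. The sketch below assumes, for simplicity, that the regular language is given by a DFA $D = (Q_D, \Sigma, \delta_D, p_0, F_D)$ rather than an NFA, and that the PDA is $P = (Q_P, \Sigma, \Gamma, \delta_P, q_0, Z_0, F_P)$; these names are notation chosen here, not taken from the slides.

```latex
\begin{align*}
P' &= (Q_P \times Q_D,\ \Sigma,\ \Gamma,\ \delta,\ (q_0, p_0),\ Z_0,\ F_P \times F_D) \\[4pt]
\delta\big((q, p), a, X\big)
   &= \big\{ \big((q', \delta_D(p, a)), \gamma\big) : (q', \gamma) \in \delta_P(q, a, X) \big\}
   && a \in \Sigma \\
\delta\big((q, p), \varepsilon, X\big)
   &= \big\{ \big((q', p), \gamma\big) : (q', \gamma) \in \delta_P(q, \varepsilon, X) \big\}
\end{align*}
```

On an input symbol both machines move; on an ε-move only the PDA component changes. The stack is used exactly as in $P$, and $P'$ accepts when both components are in accepting states.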
Difference With a Regular Language
The difference $L_C - L_R$ between a context-free language $L_C$ and a regular language $L_R$ is always context-free. Proof: $L_C - L_R = L_C \cap \neg L_R$. If $L_R$ is regular then so is $\neg L_R$. If $L_C$ is context-free then so is $L_C \cap \neg L_R$. However, the difference $L_R - L_C$ between a regular language $L_R$ and a context-free language $L_C$ may or may not be context-free.
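Difference With a Regular Language (continued)
Example: let $L = \{a^n b^n : n \ge 0 \text{ and } n \ne 1776\}$. Alternatively, $L = \{a^n b^n : n \ge 0\} - \{a^{1776} b^{1776}\}$. We know that $\{a^n b^n : n \ge 0\}$ is not regular (but it is context-free), and $\{a^{1776} b^{1776}\}$ is regular (because it is finite). Therefore $L$, being the difference between a context-free language and a regular language, must also be context-free.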
Using Closures with Pumping
$L = \{ww : w \in \{a, b\}^*\}$
$L$ is not context-free, but using the pumping lemma for CFLs to prove that it is not can be tricky. Suppose we choose $w = (ab)^k$. Then we could break up string $w$ into $uvxyz$ by letting $v = ab$ and $y = ab$, and this pumps fine.
[Figure: sample strings of repeated $ab$, labeled $w$.]
Using Closures with Pumping (continued)
$L = \{ww : w \in \{a, b\}^*\}$
Suppose we choose $w = a^k b a^k b$. This also pumps fine if $v$ is an $a$ in the first group of $a$'s, and $y$ is an $a$ in the second group of $a$'s.
[Figure: sample strings of $a$'s and $b$'s, labeled $w$.]
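The two unlucky choices of string can be replayed in Python to see that pumping really does stay inside $L$. This is a sketch under assumptions: the value `k = 4` stands in for the pumping-lemma constant, $k$ is taken to be even so that $(ab)^k$ is itself of the form $ww$, and the particular splits into $u, v, x, y, z$ are one possible choice consistent with the description above.

```python
def in_ww(s):
    """True if s = ww for some w, i.e. the two halves of s are identical."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]


def pump(u, v, x, y, z, i):
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z


k = 4  # hypothetical pumping constant, assumed even

# Case 1: the string (ab)^k with v = ab and y = ab.
case1 = ("", "ab", "", "ab", "ab" * (k - 2))

# Case 2: the string a^k b a^k b with v the last a of the first group
# and y the first a of the second group.
case2 = ("a" * (k - 1), "a", "b", "a", "a" * (k - 1) + "b")


def pumps_fine(parts, upto=6):
    """True if u v^i x y^i z stays in L for every i in range(upto)."""
    return all(in_ww(pump(*parts, i)) for i in range(upto))


print(pumps_fine(case1), pumps_fine(case2))
```

Neither choice yields a contradiction, which is why a different string, or a closure argument, is needed.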
Using Closures with Pumping (continued)
$L = \{ww : w \in \{a, b\}^*\}$
Consider $L_2 = L \cap a^*b^*a^*b^* = \{a^n b^m a^n b^m : n, m \ge 0\}$. If $L$ were context-free, then $L_2$ would also be context-free, because context-free languages are closed under intersection with regular languages. But $L_2$ is not context-free, which we have already proven using the pumping lemma for context-free languages (choose $a^k b^k a^k b^k$ for pumping). Therefore $L$ cannot be context-free either.
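The claimed shape of the intersection, $L \cap a^*b^*a^*b^* = \{a^n b^m a^n b^m\}$, can be checked by brute force on short strings. The helper names and the length bound of 10 are assumptions of this sketch, and an exhaustive check on short strings is evidence rather than a proof.

```python
import re
from itertools import product

REGULAR = re.compile(r"a*b*a*b*")


def in_ww(s):
    """True if s = ww for some w."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]


def in_target(s):
    """True if s = a^n b^m a^n b^m for some n, m >= 0."""
    bound = len(s) // 2 + 1
    return any(
        s == "a" * n + "b" * m + "a" * n + "b" * m
        for n in range(bound)
        for m in range(bound)
    )


# All strings over {a, b} of length at most 10.
strings = ["".join(t) for n in range(11) for t in product("ab", repeat=n)]

intersection = {s for s in strings if in_ww(s) and REGULAR.fullmatch(s)}
target = {s for s in strings if in_target(s)}
print(intersection == target, len(target))
```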
Using Closures with Pumping
Let $L = \{w \in \{a, b, c\}^* : \#_a(w) = \#_b(w) = \#_c(w)\}$. Consider $L_2 = L \cap a^*b^*c^*$, which is exactly $A^nB^nC^n$. If $L$ were context-free, then $L_2$ would also be context-free, because it is the intersection of a context-free language with a regular language. But $L_2$ is not context-free, as we have already proven using the pumping lemma for context-free languages. Therefore $L$ cannot be context-free either.