EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

EXAM Please read all instructions, including these, carefully There are 7 questions on the exam, with multiple parts. You have 3 hours to work on the exam. The exam is open book, open notes. Please write your answers in the space provided on the exam and clearly mark your solutions. Write only the final answer on the exam. Scratch paper is available at the front of the room. Each problem has a straightforward solution. Solutions will be graded on correctness and clarity. Partial solutions will be given partial credit. NAME : Problem Max points Points 1 15 2 20 3 15 4 10 5 15 6 10 7 15 TOTAL 100

1. (a) Explain why an LL(1) parser cannot handle left-recursive grammars. An LL(1) parser works by replacing non-terminals that appear on the top of the parse stack with one of the non-terminal s right hand sides. If one of these productions is leftrecursive, i.e. if the non-terminal of the left-hand side occurs first on the right-hand side, when that right had side is pushed the same non-terminal will again appear on the stack top. Since the lookahead symbol that caused the left-recursive right hand side to be pushed on the stack is still in the input, the action for the non-terminal A on the stack top given the next symbol a in the input (i.e., the parse table entry for [A,a]) will be to replace A on the stack top with the same left-recursive right hand side for A. Thus, the parser will go into an infinite loop, since it will continue to replace A with the left-recursive right hand side forever. (b) The following production, part of a grammar for arithmetic expressions, is immediately left-recursive. Transform it into an equivalent production or equivalent productions that are not left-recursive in order to provide a suitable basis for the implementation of an LL(1) parser. Exp Prim Exp Exp Exp + Prim Exp * Prim Prim Exp + Prim Exp * Prim Exp ε (c) Based on your transformed grammar fragment from question (b) above, write a parsing function for a recursive descent parser for the non-terminal Exp. parseexp: t = getnexttoken(); Prim(t); Exp (t); 1

2. (a) Construct the DFA that recognizes the viable prefixes for the following CFG: S aab cs B A eab Af g B d S, A, and B are non-terminal symbols, S is the start symbol, and a, b, c, d, e, f, and g are terminal symbols. 2

(b) Construct the SLR(1) parse table for the grammar. ACTION a b c d e f g EOF 0 S1 S7 S4 1 S6 S2 2 R:A g R:A g R:A g 3 R: S B 4 R:B d R:B d R:B d R:B d 5 S8 S9 6 S6 S2 7 S1 S7 S4 8 R:S aab 9 R:A Af R:A Af R:A Af 10 S4 S9 11 R: S cs 12 R:A eab R:A eab R:A eab GO TO S A B 0 3 1 5 2 3 4 5 6 10 7 11 3 8 9 10 12 11 12 3

(c) Show the steps of the SLR(1) parser for the string caegdffb. (d) Which entries in the parse table would differ if you had instead constructed an LR(1) parser from this grammar, and what would they be? The LR(1) parse table would indicate a reduce for explicit lookaheads, rather than for any terminal in the Follow set for the non-terminal on the left of the rule. Because both S and B appear only on the far right of any rules, EOF would be the lookahead for every case where they appear; since EOF is the only symbol in the Follow set for both S and B the table entries would not be affected. The reductions for non-terminal A would be affected, since the lookahead will differ depending on the state we are in. In fact, there would be separate states for reducing by the A productions for different lookaheads. For example, state 1 above would contain the following: S a Ab A eab, b A Af, b A g, b A eab, f A Af, f A g, f 4

State 6 would then contain: A e AB, b A eab, d A Af, d A g, d A eab, f A Af, f A g, f The GOTO from state 6 on e would be a new state containing: A e AB, d A Af, d A g, d A eab, f A Af, f A g, f GOTO from state 6 on A would go to state 10, containing: A ea B, b A A f, d B d, $ Then GOTO from state 10 on B would be state 12: A eab, b Therefore, in state 12, the reduce would be indicated in the column for input b only. The reduce on d that would eventually result from the new state created above (from state 6 on e) would appear in another new state. 5

3. Consider the following grammar: S as Ab A XYZ ε X cs ε Y ds ε Z es (a) Give an LL (1) parse table for this grammar. Even though it was not required, it is best to first calculate the First and Follow sets for the nonterminals: a b c d e EOF S S as S Ab S Ab S Ab S Ab ACCEPT A A ε A XYZ A XYZ A XYZ X X cs X ε X ε Y Y ds Y ε Z Z es 6

(b) Show the steps of the parser for the string acbebb. Stack Input Action S acbebb[eof] S as as acbebb[eof] Match a S cbebb[eof] S Ab AbS cbebb[eof] A XYZ XYZbS cbebb[eof] X cs csyzbs cbebb[eof] Match c SYZbS bebb[eof] S Ab AbYZbS bebb[eof] A ε byzbs bebb[eof] Match b YZbS ebb[eof] Y ε ZbS ebb[eof] Z es esbs ebb[eof] Match e SbS bb[eof] S Ab AbbS bb[eof] A ε bbs bb[eof] Match b bs b[eof] Match b S [EOF] Accept 7

CS331 Compiler Design 4. Consider the following context free grammar: S d a A e A d a A E E E b P P P d p A p (a) Is the grammar SLR(1)? Why or why not? The grammar is SLR(1). There are no shift/reduce or reduce/reduce conflicts. This can be seen by looking at the grammar carefully and examining rules that can cause conflicts. The biggest possibility appears to be in the following rules: S d a A e A d a A They are very similar, so they could cause a shift/reduce conflict. However, in order for them to cause a shift/reduce conflict, they must appear in the same state, with a dot at the end of the A rule and a dot before the final e of the S rule. This never occurs, because the state that includes S d a A e will (by closure) include all of the A rules with the dot at the far left; the GOTO from this state on A will include S d aa e but none of the A rules. Another possibility arises in the following state: A d a A A d a A A E A P P d P pap However, there are no possibilities of shift/reduce or reduce/reduce here either because. is not at the end of any of the items. On seeing a d, we shift to the following state: A d a A P d Shift/reduce error could occur here if Follow(P) includes a, which it does not. There are no other possible states where conflict situations can arise. (b) Is the grammar LR(1)? Why or why not? The grammar is LR(1) because SLR(1) grammars are a subset of LR(1) grammars. Since we showed the grammar is SLR(1) in part (a), the grammar is by definition also LR(1). 8

5. Consider the following three grammars: (1) A BC B Ax x C yc y (2) A BC B Ax x ε C yc y (3) A BC B Ax x ε C yc y ε For each of the following statements about First and Follow sets, indicate for which grammar(s) (1, 2, or 3) the statement is correct. Each statement is correct for one or more grammars. Be sure to list all the grammars for which a statement is correct. First(A) = {x,y} 2 Follow(A) = {$,x} 1, 2, 3 Follow(B) = {$,x,y} 3 First(C) = {y} 1,2 Follow(C) = {$,x} 1,2,3 9

6. For each grammar explain why, or why not, the grammar is LL (1): (a) S ABBA A a ε B b ε This grammar is not LL(1) because the First sets of the right hand sides for B are not disjoint: B : First( b ) = {b} First( ε )= Follow(B) = {b, a, EOF} (b) S a S e B B b B e C C c C e d This grammar is LL(1) because the First sets of the right hand sides for all non-terminals are disjoint: S : First( ase ) = {a} First( B )= {b, c, d} B : First( bbe ) = {b} First( C )= {c, d} C : First( cce ) = {c} First( d )= {d} 10

7. Consider the following grammar for specifying binary trees (in linearized form): BinTree (num BinTree 1 BinTree 2 ) ε (a) Extend the above grammar by defining a translation scheme such that a depth-first left-toright traversal of the parse tree with semantic actions will entail checking if the binary tree is ordered. A binary tree is ordered if the values of the numbers of the first subtree (BinTree1) are less than the value of the number of the node currently being visited and the values of the numbers of the second subtree (BinTree2) are greater than the value of the number of the node currently being visited. For example, (2 (1 nil nil) (3 nil nil)) is ordered but (1 (2 nil nil) (3 nil nil)) is not. BinTree (num { BinTree.num = num; } BinTree 1 { if (BinTree 1.tree == null) {BinTree.min = num; BinTree.ordered = true; else BinTree.ordered = BinTree 1.ordered && (BinTree.num > BinTree 1.max); BinTree.min = BinTree 1.min; BinTree.tree = nonnull; } BinTree 2 ) { if (BinTree 1.tree = null) BinTree.max = num; else BinTree.ordered = BinTree 2.ordered && (BinTree.num < BinTree 2.min); BinTree.max = BinTree 2.max ; BinTree.tree = nonnull; } BinTree ε { BinTree.tree = null} (b) Show the translation of the input string (2 (1 nil nil) (3 nil nil)) by decorating its parse tree. Make sure your tree is drawn clearly. BinTree BinTree.num = 2 BinTree.min = 1 BinTree.max = 3 BinTree.ordered = true BinTree.tree = nonnull ( 2 BinTree BinTree.num = 1 BinTree.min = 1 BinTree.ordered = true BinTree.tree = nonnull BinTree BinTree.num = 3 BinTree.max = 3 ) BinTree.ordered = true BinTree.tree = nonnull ( 1 BinTree BinTree ) ( 3 BinTree BinTree ) BinTree.tree = null BinTree.tree = null BinTree.tree = null BinTree.tree = null ε ε ε ε 11