I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X }

Similar documents
Compiler Design Spring 2017

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

Syntax Analysis (Part 2)

Parsing -3. A View During TD Parsing

Compiler Design 1. LR Parsing. Goutam Biswas. Lect 7

CS 406: Bottom-Up Parsing

Bottom-Up Parsing. Ÿ rm E + F *idÿ rm E +id*idÿ rm T +id*id. Ÿ rm F +id*id Ÿ rm id + id * id

CA Compiler Construction

Compiler Construction Lectures 13 16

Computer Science 160 Translation of Programming Languages

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100

Translator Design Lecture 16 Constructing SLR Parsing Tables

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing

Bottom up parsing. General idea LR(0) SLR LR(1) LALR To best exploit JavaCUP, should understand the theoretical basis (LR parsing);

Introduction to Bottom-Up Parsing

Why augment the grammar?

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

LR2: LR(0) Parsing. LR Parsing. CMPT 379: Compilers Instructor: Anoop Sarkar. anoopsarkar.github.io/compilers-class

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking)

Creating a Recursive Descent Parse Table

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών. Lecture 7a Syntax Analysis Elias Athanasopoulos

Compiler Construction Lent Term 2015 Lectures (of 16)

Curs 8. LR(k) parsing. S.Motogna - FL&CD

Compiler Construction Lent Term 2015 Lectures (of 16)

Top-Down Parsing and Intro to Bottom-Up Parsing

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

THEORY OF COMPILATION

Syntax Analysis Part I

Parsing VI LR(1) Parsers

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Announcements. H6 posted 2 days ago (due on Tue) Midterm went well: Very nice curve. Average 65% High score 74/75

Compilerconstructie. najaar Rudy van Vliet kamer 124 Snellius, tel rvvliet(at)liacs.

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Bottom-Up Syntax Analysis

Lecture 11 Sections 4.5, 4.7. Wed, Feb 18, 2009

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Bottom-up Analysis. Theorem: Proof: Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.

Syntax Analysis Part I

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Syntactic Analysis. Top-Down Parsing

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

LR(1) Parsers Part III Last Parsing Lecture. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Bottom-Up Syntax Analysis

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

INF5110 Compiler Construction

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

1. For the following sub-problems, consider the following context-free grammar: S AA$ (1) A xa (2) A B (3) B yb (4)

Chapter 4: Bottom-up Analysis 106 / 338

Context free languages

LR(1) Parsers Part II. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Computing if a token can follow

Syntax Analysis, VII The Canonical LR(1) Table Construction. Comp 412 COMP 412 FALL Chapter 3 in EaC2e. source code. IR IR target.

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Compiler Construction

Predictive parsing as a specific subclass of recursive descent parsing complexity comparisons with general parsing

CS153: Compilers Lecture 5: LL Parsing

Briefly on Bottom-up

Syntax Analysis, VI Examples from LR Parsing. Comp 412

1. For the following sub-problems, consider the following context-free grammar: S A$ (1) A xbc (2) A CB (3) B yb (4) C x (6)

CSE302: Compiler Design

Syntax Analysis Part III

Syntax Analysis, VII The Canonical LR(1) Table Construction. Comp 412

Bottom-Up Syntax Analysis

Compiler Design. Spring Syntactic Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Bottom-Up Syntax Analysis

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 3. Y.N. Srikant

CS20a: summary (Oct 24, 2002)

CS415 Compilers Syntax Analysis Bottom-up Parsing

Compiling Techniques

SLR(1) and LALR(1) Parsing for Unrestricted Grammars. Lawrence A. Harris

Compiler Principles, PS7

UNIT-VIII COMPUTABILITY THEORY

Compiler Principles, PS4

LR(1) Parsers Part II. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

CS415 Compilers Syntax Analysis Top-down Parsing

2.4 Parsing. Computer Science 332. Compiler Construction. Chapter 2: A Simple One-Pass Compiler : Parsing. Top-Down Parsing

Everything You Always Wanted to Know About Parsing

Talen en Compilers. Johan Jeuring , period 2. October 29, Department of Information and Computing Sciences Utrecht University

Generalized Bottom Up Parsers With Reduced Stack Activity

Computer Science 160 Translation of Programming Languages

1. For the following sub-problems, consider the following context-free grammar: S AB$ (1) A xax (2) A B (3) B yby (5) B A (6)

Exercises. Exercise: Grammar Rewriting

Theory of computation: initial remarks (Chapter 11)

Parsing. Left-Corner Parsing. Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 17

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) October,

Part 3 out of 5. Automata & languages. A primer on the Theory of Computation. Last week, we learned about closure and equivalence of regular languages

Follow sets. LL(1) Parsing Table

LL conflict resolution using the embedded left LR parser

Theory of computation: initial remarks (Chapter 11)

UNIT-VI PUSHDOWN AUTOMATA

Compiling Techniques

CS Rewriting System - grammars, fa, and PDA

8. LL(k) Parsing. Canonical LL(k) parser dual of the canonical LR(k) parser generalization of strong LL(k) parser

Compiling Techniques

INF5110 Compiler Construction

INF5110 Compiler Construction

Transcription:

Let s try building an SLR parsing table for another simple grammar: S XaY Y X by c Y X S XaY Y X by c Y X Canonical collection: I 0 : {S S, S XaY, S Y, X by, X c, Y X} I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X } I 9 : {S XaY } 1 2

I 0 : {S S, S XaY, S Y, X by, X c, Y X} I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X } I 9 : {S XaY } FOLLOW(S) = {$} FOLLOW(X) = {a, $} FOLLOW(Y ) = {a, $} action goto STATE a b c $ S X Y 0 1 2 3 4 5 6 7 8 9 3 I 0 : {S S, S XaY, S Y, X by, X c, Y X} I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X } I 9 : {S XaY } FOLLOW(S) = {$} FOLLOW(X) = {a, $} FOLLOW(Y ) = {a, $} action goto STATE a b c $ S X Y 0 s4 s5 1 2 3 1 acc 2 s6,r5? r5 3 r2 4 s4 s5 8 7 5 r4 r4 6 s4 s5 8 9 7 r3 r3 8 r5 r5 9 r1 4

action goto STATE a b c $ S X Y 0 s4 s5 1 2 3 1 acc 2 s6,r5? r5 3 r2 r2 4 s4 s5 8 7 5 r4 r4 6 s4 s5 8 9 7 r3 r3 8 r5 r5 9 r1 Let s try a parse: STACK INPUT ACTION 0 cac$ s5 0c5 ac$ r4 0X2 ac$ s6, not r5! 0X2a6 c$ s5 0X2a6c5 $ r4 0X2a6X8 $ r5 0X2a6Y9 $ r1 0S1 $ acc 5 action goto STATE a b c $ S X Y 0 s4 s5 1 2 3 1 acc 2 s6,r5? r5 3 r2 r2 4 s4 s5 8 7 5 r4 r4 6 s4 s5 8 9 7 r3 r3 8 r5 r5 9 r1 Let s try the alternative... STACK INPUT ACTION 0 cac$ s5 0c5 ac$ r4 0X2 ac$ try r5, instead of s6 0Y3 ac$ r2 0S1 ac$ error In LR parsing, the stack should always contain a viable prefix, and the stack plus lookahead (unless it s $) should be a prefix of a right-sentential form. Otherwise the parse will fail... 6

If we have X on the stack, with lookahead a, we had better not reduce X to Y : Y a is not a prefix of a right-sentential form, despite the fact that a FOLLOW(Y ). FOLLOW(Y ) = {a, $} I 0 : {S S, S XaY, S Y, X by, X c, Y X} I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X } I 9 : {S XaY } S XaY by ay So, what viable prefixes take the parser to state 2? X only! To get an a we must use production S XaY. Then to get a Y in front of that a, we must produce a Y from the X, which we can do only with production X by. (So we can t get substring Y a without a leading b.) (Notice that in this case we showed that Ba is not a prefix of any sentential form of course it follows that it is not a prefix of any right-sentential form.) Whenever the parser is in state 2 with lookahead a, the stack contains X (that is, 0X2), and reduction by 5 leaves us with stack Y (that is, 0Y 3) in which case the stack plus lookahead is not a prefix of any right-sentential form. So in state 2, with lookahead a, we should always shift and go to state 6. The SLR parsing table isn t adequate... But maybe there are still cases in which, in state 2 on lookahed a, the parser should reduce with production 5... The problem is with the use of FOLLOW sets to decide when to reduce. 7 8

Canonical LR parsing tables, LR(1) items The grammar in the previous example is not ambiguous, and can be parsed by the LR method, if only we can construct a more adequate parsing table. We ll look first at canonical LR parsing tables. Then LALR parsing tables, which are smaller and weaker (and are what yacc builds!). Both methods can handle the previous examples. We begin by enriching the notion of an item, to better take into account lookahead tokens. Definition An LR(1) item is a pair of an LR(0) item and a token or $, typically written [X α β, a]. LR(1) closure function Definition Given a set I of LR(1) items from an augmented grammar, closure(i) is the least set of items satisfying the following two conditions: 1. I closure(i) 2. If [X α Y β, a] closure(i) and there is an LR(1) item [Y γ, b] s.t. b FIRST(βa) then [Y γ, b] closure(i). Intuitively, the lookahead a has no effect if β ǫ, but if β = ǫ, then [X α β, a] calls for reduction by X αβ if the lookahead token is a. Thus, we will reduce by a production X α only on those lookaheads a for which there is an item [X α, a]. Claim: FIRST(βa) FOLLOW(Y ). (As elsewhere, I suspect this is not true, but it is suggestive. Why not true? Because the definition of FOLLOW is about existence of sentential forms, but this definition is not.) The set of such a s is always a subset of FOLLOW(X). 9 10

1. I closure(i) 2. If [X α Y β, a] closure(i) and there is an LR(1) item [Y γ, b] s.t. b FIRST(βa), then [Y γ, b] closure(i). Example What is closure({[s S, $]})? So we consider LR(1) items with first components S XaY and S Y. In this case, β = ǫ, so FIRST(β$) = {$}. So we add LR(1) items [S XaY, $] and [S Y, $]. Now consider X by and X c. Here β = ay, so FIRST(β$) = {a}. Thus we add items [X by, a] and [X c, a]. Now consider Y X, with β = ǫ. Since FIRST(β$) = {$}, we add item [Y X, $]. Now consider X by and X c, with β = ǫ. Since FIRST(β$) = {$}, we add items [X by, $] and [X c, $]. 11 1. I closure(i) 2. If [X α Y β, a] closure(i) and there is an LR(1) item [Y γ, b] s.t. b FIRST(βa), then [Y γ, b] closure(i). Example What is closure({[s X ay, $], [Y X, $]})? What is closure({[x b Y, a $]})? (stands for 2 LR(1) items) Consider LR(1) items with first component Y X. In this case, β = ǫ. Since FIRST(β$) = {$}, we add [Y X, $], and since FIRST(βa) = {a}, we add [Y X, a]. Now consider X by and X c. Here again β = ǫ, so FIRST(β$) = {$} and FIRST(βa) = {a}. Thus we add [X by, a $] and [X c, a $]. What is closure({[s Xa Y, $]})? 12

LR(1) goto function Construction of sets of LR(1) items Definition For any set I of LR(1) items and grammar symbol X, goto(i, X) is the closure of the set of all LR(1) items [A αx β, a] s.t. [A α Xβ, a] I. Example I 0 : {[S S, $], [S XaY, $], [S Y, $] [X by, a $], [X c, a $], [Y X, $]} goto(i 0, S) = closure({[s S, $]}) = {[S S, $]} goto(i 0, X) = closure({[s X ay, $], [Y X, $]}) = {[S X ay, $], [Y X, $]}) goto(i 0, Y ) = closure({[s Y, $]}) = {[S Y, $]} goto(i 0, b) = closure({[x b Y, a $]}) = {[X b Y, a $], [Y X, a $], [X by, a $], [X c, a $]} goto(i 0, c) = closure({[x c, a $]}) = {[X c, a $]} Here is the algorithm for construction of the canonical collection of sets of LR(1) items for an augmented grammar: C := { closure({[s S, $]}) }; repeat for each set of LR(1) items I C and each grammar symbol X s.t. goto(i, X) is not empty and is not already in C do add goto(i, X) to C until no more sets of items can be added to C in this way So this is the same construction we used for LR(0) items but now for LR(1) items, with the LR(1) versions of closure and goto, and the initial LR(1) item. 13 14

Construction of canonical LR(1) parsing table 1. Construct the canonical collection C = {I 0,...,I n } of sets of LR(0) items. Each element of C is a state, and we ll often write k to refer to the state I k. 2. The applicable parsing actions for state k and lookahead a are determined as follows: a) If goto(k, a) = j, then shift and go to state j is applicable. b) If [B α, a] k and B is not S, then reduce by production B α is applicable. c) If [S S, $] k and a = $, then accept is applicable. If any conflicting actions are generated in step 2, the grammar is not LR(1). 3. For all states k and variables A, if goto(k, A) is nonempty, then goto(k, A) is the corresponding goto entry. 4. All remaining undefined entries have action error. 5. If I k = closure({[s S, $]}), then k is the initial state of the parser. 15