Parsing -3. A View During TD Parsing

Similar documents
CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Bottom-Up Parsing. Ÿ rm E + F *idÿ rm E +id*idÿ rm T +id*id. Ÿ rm F +id*id Ÿ rm id + id * id

LR2: LR(0) Parsing. LR Parsing. CMPT 379: Compilers Instructor: Anoop Sarkar. anoopsarkar.github.io/compilers-class

Syntax Analysis (Part 2)

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

CA Compiler Construction

I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X }

Compiler Design Spring 2017

CS 406: Bottom-Up Parsing

Computer Science 160 Translation of Programming Languages

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking)

Syntax Analysis Part I

Top-Down Parsing and Intro to Bottom-Up Parsing

Parsing VI LR(1) Parsers

Introduction to Bottom-Up Parsing

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100

THEORY OF COMPILATION

Bottom up parsing. General idea LR(0) SLR LR(1) LALR To best exploit JavaCUP, should understand the theoretical basis (LR parsing);

CSE302: Compiler Design

Introduction to Bottom-Up Parsing

Compiler Design 1. LR Parsing. Goutam Biswas. Lect 7

Introduction to Bottom-Up Parsing

Syntax Analysis Part I

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Introduction to Bottom-Up Parsing

Compiler Construction Lectures 13 16

Compiler Construction Lent Term 2015 Lectures (of 16)

Compiler Construction Lent Term 2015 Lectures (of 16)

Chapter 4: Bottom-up Analysis 106 / 338

Why augment the grammar?

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών. Lecture 7a Syntax Analysis Elias Athanasopoulos

Announcements. H6 posted 2 days ago (due on Tue) Midterm went well: Very nice curve. Average 65% High score 74/75

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

Translator Design Lecture 16 Constructing SLR Parsing Tables

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Lecture 11 Sections 4.5, 4.7. Wed, Feb 18, 2009

Bottom-Up Syntax Analysis

Context free languages

Briefly on Bottom-up

Syntactic Analysis. Top-Down Parsing

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

CS20a: summary (Oct 24, 2002)

Bottom-Up Syntax Analysis

LR(1) Parsers Part III Last Parsing Lecture. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Syntax Analysis Part III

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

CS153: Compilers Lecture 5: LL Parsing

INF5110 Compiler Construction

Syntax Analysis, VI Examples from LR Parsing. Comp 412

Bottom-up Analysis. Theorem: Proof: Let a grammar G be reduced and left-recursive, then G is not LL(k) for any k.

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 3. Y.N. Srikant

1. For the following sub-problems, consider the following context-free grammar: S A$ (1) A xbc (2) A CB (3) B yb (4) C x (6)

Creating a Recursive Descent Parse Table

CS415 Compilers Syntax Analysis Bottom-up Parsing

Compiler Construction

LR(1) Parsers Part II. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Syntax Analysis, VII The Canonical LR(1) Table Construction. Comp 412

Predictive parsing as a specific subclass of recursive descent parsing complexity comparisons with general parsing

The Parser. CISC 5920: Compiler Construction Chapter 3 Syntactic Analysis (I) Grammars (cont d) Grammars

LR(1) Parsers Part II. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Syntax Analysis, VII The Canonical LR(1) Table Construction. Comp 412 COMP 412 FALL Chapter 3 in EaC2e. source code. IR IR target.

Bottom-Up Syntax Analysis

1. For the following sub-problems, consider the following context-free grammar: S AA$ (1) A xa (2) A B (3) B yb (4)

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Everything You Always Wanted to Know About Parsing

Compiling Techniques

Bottom-Up Syntax Analysis

CS415 Compilers Syntax Analysis Top-down Parsing

SLR(1) and LALR(1) Parsing for Unrestricted Grammars. Lawrence A. Harris

Compilerconstructie. najaar Rudy van Vliet kamer 124 Snellius, tel rvvliet(at)liacs.

Compiler Principles, PS4

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Type 3 languages. Type 2 languages. Regular grammars Finite automata. Regular expressions. Type 2 grammars. Deterministic Nondeterministic.

CPS 220 Theory of Computation

1. For the following sub-problems, consider the following context-free grammar: S AB$ (1) A xax (2) A B (3) B yby (5) B A (6)

Chapter 1. Formal Definition and View. Lecture Formal Pushdown Automata on the 28th April 2009

UNIT-VIII COMPUTABILITY THEORY

On LR(k)-parsers of polynomial size

Computer Science 160 Translation of Programming Languages

CONTEXT FREE GRAMMAR AND

MA/CSSE 474 Theory of Computation

Einführung in die Computerlinguistik

CSC 4181Compiler Construction. Context-Free Grammars Using grammars in parsers. Parsing Process. Context-Free Grammar

Theory of Computation (IV) Yijia Chen Fudan University

Pushdown Automata (2015/11/23)

Generalized Bottom Up Parsers With Reduced Stack Activity

Compiler Principles, PS7

Compiler Design. Spring Syntactic Analysis. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Context-Free Grammars (and Languages) Lecture 7

Exercises. Exercise: Grammar Rewriting

Syntax Analysis - Part 1. Syntax Analysis

Computing if a token can follow

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Curs 8. LR(k) parsing. S.Motogna - FL&CD

Transcription:

Parsing -3 Deterministic table-driven parsing techniques Pictorial view of TD and BU parsing BU (shift-reduce) Parsing Handle, viable prefix, items, closures, goto s LR(k): SLR(1), LR(1) Problems with SLR LALR(k) an optimization Using ambiguity to an advantage Aho, Sethi, Ullman, Compilers : Principles, Techniques and Tools Aho + Ullman, Theory of Parsing and Compiling, vol II. 1 A View During TD Parsing S derives a γ, a string of terminals; X is nonterminal at top of stack, X derives γ; Initially X==S, a == e, γ is input S X partially constructed parse tree α γ 2 1

* S α A w α β w, rm rm so β is the handle. A View During BU Parsing S α partially constructed parse tree A β w, a string of terminals 3 Intuitive Comparison S A w x z LR(k) can recognize A α knowing w, x, and First k (z). LL(k) can recognize A α knowing only w and First k (x). Therefore, the set of languages recognizable by LR(k) contain those recognizable by LL(k). α 4 2

BU Parsing (Shift-Reduce) Handle - part of sentential form last added in a rightmost derivation. BU parsing is handle hunting (1) S E (2) E E + T (3) E T (4) T id Rightmost derivation of a+b+c, handles in red S E E + T E + id E + T + id E + id + id T + id + id id + id + id 5 Shift-Reduce Parser, Example Actions: shift, reduce, accept, error Stack Input Action $ id1 + id2 + id3 $ shift $ id1 + id2 + id3 $ reduce (4) $ T + id2 + id3 $ reduce (3) $ E + id2 + id3 $ shift $ E + id2 + id3 $ shift $ E + id2 + id3 $ reduce(4) $ E + T + id3 $ reduce (2) $ E + id3 $ shift $ E + id3 $ shift $ E + id3 $ reduce (4) $ E + T $ reduce(2) $ E $ reduce (1) $ S $ accept (1) S E (2) E E + T (3) E T (4) T id 6 3

Problems Shift-reduce conflicts S if E then S if E then S else S other On stack: if E then S Input: else Should shift trying for 2nd alternative or reduce by first rule? Reduce-reduce conflicts if A α and B α both in grammar When α on stack, how do we know which production to choose? 7 Predictive Parsing Top Down: LL(k), Bottom Up: LR(k) Avoids backtracking while parsing by using lookahead into input NO cases where more than 1 action possible LR parsing algorithms developed in the mid-1970s; powerful enablers of table-driven compilation 8 4

LR(k) Left to right scan parsing does a rightmost derivation in reverse, using k symbols of lookahead into input Examples Simple LR - SLR(1) Cheap but doesn t always work LR(k) Most powerful and most expensive All SLR(1) languages are also LR(1), but parsers generated by corresponding grammars for the same language will differ in size. LR(k) catches syntax errors as early as possible in a left-to-right scan of the input and works for most modern PLs 9 LR Parsing FSA is embedded in parser which is a Pushdown automaton (top stack, input_symbol) accesses a particular entry in the parser table Shift to state s Reduce by A β Accept Error Goto: (state, top stack ) state 10 5

LR Parser symbol state input state\input Ac9on/ goto table stack 11 LR Parsing Idea: continue to stack inputs until have handle on top of stack and then reduce to its nonterminal symbol Viable prefix - set of prefixes of right sentential forms which can appear on a stack of a shift/reduce parser Goto function is transition function of DFA that recognizes viable prefixes of the grammar Idea is that while a viable prefix is on top of the stack the goto function continues the parse towards getting a handle on top of the stack that can be reduced 12 6

Building an SLR Parser Need states, goto s, Follow sets Item - rule with embedded dot S. E Closure of item I I {B.γ, if A α. B β in I} States built from items and their closures 13 SLR(1) Example - States S E I 0 : S. E E E + T E. E + T E T E. T T id Τ. id Closure of S. E I 1 : S E. I 2 : E T. E E. + T I 3 : T id. I 4 : E E +. T T. id I 5 : E E + T. 14 7

Example - Goto s + Follow sets goto (0, E) = 1 goto (0, id) = 3 goto (0, T) = 2 goto (1, +) = 4 goto (4, T) = 5 goto (4, id) = 3 goto ({set of items}, X) = closure {[A α X. β] [A α. X β] in {set of items}} where X is a terminal or nonterminal Follow(S) = {$} Follow(E) = Follow(T) = { +, $} S E E E + T E T T id 15 Rules for forming Follow Sets ASU p 189 1. Follow(S) contains $ 2. If A α X β then everything in First(β) except ε, is put into Follow (X) 3. If A α X or A α X β where First(β) contains ε, then Follow(A) is contained in Follow(X) 16 8

Example - Parser Table si, shift to state I; r(j) reduce by rule j States\ inputs: goto s id + $ E T 0 s3 1 2 1 s4 accept 2 r(3) r(3) 3 r(4) r(4) 4 s3 5 5 r(2) r(2) 17 Example Stack input action 0 id1 + id2 $ s3 0 id1 3 + id2 $ r(4), goto on T 0 T 2 + id2 $ r(3), goto on E 0 E 1 + id2 $ s4 0 E 1 + 4 id2 $ s3 0 E 1 + 4 id2 3 $ r(4), goto on T 0 E 1 + 4 T 5 $ r(2), goto on E 0 E 1 $ accept 18 9

SLR(1) Parser Rules If A α. a β is in state I j and goto(i j, a) is I r then (I j,, a) transitions by shift r (sr) If A α. is in state I j, set action [j,a] to reduce A α for all a in Follow(A) Note: A!= S If S E. in I j, action (j,$) is accept Any table entry not defined is error. 19 Problems Shift-reduce conflicts happen when Ab can occur in some sentential form and b Follow(A). S L = R I 0 : S. L = R S R S. R L * R R. L L id L. * R R L L. id I 1 : S L. = R (1) R L. (2) In state I 1 1 st choice: shift when see = in input(item 1); 2 nd choice: reduce on = because = in Follow(R) (item 2); Note: S L = R * R = R, but this is not a rightmost derivation! 20 10

Problems, cont. Can see that a rightmost derivation is: S L = R L = L L = id * R = id *L = id * id = id Therefore, should reduce *R to L when see =, not shift in order to get *R onto the stack. Problem is that we can t distinguish those Follow elements corresponding to a rightmost derivation in a specific context. 21 Nomenclature in ASU An item [ A β. γ ] is valid for viable prefix α β if S * α A w α β γ w. rm rm Means can continue towards accumulating an handle on the stack by shifting Previously, shift would have changed viable prefix *R to nonviable prefix *R= If I is set of items valid for viable prefix β then goto(i, X) is set of items valid for viable prefix βx where X is terminal or nonterminal 22 11

LR(1) Parsing LR items include a lookahead symbol, (into the input) which helps in conflict resolution Need new closure rule: For [ A α. B γ, a ] item add [B. δ, b] for every b in First(γ a). 23 Example I 0 : S. E, $ - initial item E. E + T, $ - closure initial item E. T, $ E. E + T, + - closure 1st red item E. T, + T.id, $ - closure 2nd red item T.id, + - closure 2nd blue item Will write these in more compact form by combining lookaheads. For [ A a. B g,a ] item add [B. d,b] for every b in First(g a). 24 12

Example, LR(1) Parser I 0 :S.E, $ I 1 : [goto (I 0, E)] E.E + T, $/+ S E., $ E. T, $/+ S E. + T, $/+ T.id, +/$ I 2 :[goto (I 0, T)] I 4 :[goto(i 1, +)] E T., $/+ E E +. T, $/+ I 3 : [goto (I 0, id)] T. id, $/+ T id., $/+ I 5 : [goto (I 4, T)] E E + T., $/+ 25 LR(1) Parser Reduce based on lookaheads in item which are a subset of Follow set Rules similar to SLR(1) Shift in I k, [A α. a β, b], goto (I k, a) = I j Reduce [A α., b] reduce α to A on b Accept [S E., $], accept on $ 26 13

LALR Parsing Idea: merge all states with common first components in their LR(1) items Implementation problem: need to reduce number of states to get smaller parser table Reduced size parser will perform Same as LR on correct inputs (will be parsed by LALR) On incorrect inputs, LR may find error faster; LALR will never do an incorrect shift but may do some wrong reductions 27 LALR Parsing Conceptually, build LALR(1) parser from LR(1) parser Merge all states with same first components Union all goto s of these merged states (goto s are independent of second components) Correctness of conceptual derivation Can never produce a shift-reduce conflict or else [A α., a] and [B β. a γ, b] existed in some LR(1) state 28 14

Useful Ambiguous Grammars Used to build compact parse trees Get rid of useless nonterminal to nonterminal productions (e.g., S-->E-->T) Conflicts resolvable through desired properties of operators (e.g., precedence) Generate smaller parsers Example of expression grammar 29 Example S E I 0 : S. E I 1 : goto (I 0,E) E E + E E. E + E S E. E id E. id E E. + E I 2 : goto (I 1,+) I 3 : goto (I 2,E) I 4 : goto(i 0,id) E E +. E E E + E. E id. E. E + E E E. + E E. id (reduce on + in Follow(E), shift on +) Choose reduce action making + left associative; can resolve operator precedence clashes the same way (e.g., + versus *) 30 15

Grammar Classification Context-free Langs {0 n 1 n n >= 1} union {0 n 1 2n n >= 1} LR(k) ~ LR(1) SLR(k) LL(k) LALR(k) S L = R R L *R a R L 31 16