Top-Down Parsing and Intro to Bottom-Up Parsing

Similar documents
Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

CSE302: Compiler Design

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking)

Syntax Analysis Part III

Syntactic Analysis. Top-Down Parsing

Syntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 3. Y.N. Srikant

CS415 Compilers Syntax Analysis Top-down Parsing

Syntax Analysis Part I

Compiler Principles, PS4

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

CS 314 Principles of Programming Languages

Creating a Recursive Descent Parse Table

Syntax Analysis Part I

Parsing -3. A View During TD Parsing

Computer Science 160 Translation of Programming Languages

Predictive parsing as a specific subclass of recursive descent parsing complexity comparisons with general parsing

Compiling Techniques

CS153: Compilers Lecture 5: LL Parsing

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Compiling Techniques

Top-Down Parsing, Part II

CONTEXT FREE GRAMMAR AND

Bottom-Up Parsing. Ÿ rm E + F *idÿ rm E +id*idÿ rm T +id*id. Ÿ rm F +id*id Ÿ rm id + id * id

Compiler Principles, PS7

Announcement: Midterm Prep

Follow sets. LL(1) Parsing Table

Context free languages

Exercises. Exercise: Grammar Rewriting

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Computing if a token can follow

Syntax Analysis (Part 2)

CS 406: Bottom-Up Parsing

The Parser. CISC 5920: Compiler Construction Chapter 3 Syntactic Analysis (I) Grammars (cont d) Grammars

LL(1) Grammar and parser

LR2: LR(0) Parsing. LR Parsing. CMPT 379: Compilers Instructor: Anoop Sarkar. anoopsarkar.github.io/compilers-class

Compiling Techniques

Pushdown Automata (Pre Lecture)

I 1 : {S S } I 2 : {S X ay, Y X } I 3 : {S Y } I 4 : {X b Y, Y X, X by, X c} I 5 : {X c } I 6 : {S Xa Y, Y X, X by, X c} I 7 : {X by } I 8 : {Y X }

Talen en Compilers. Johan Jeuring , period 2. October 29, Department of Information and Computing Sciences Utrecht University

Computability Theory

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100

Bottom up parsing. General idea LR(0) SLR LR(1) LALR To best exploit JavaCUP, should understand the theoretical basis (LR parsing);

Compilerconstructie. najaar Rudy van Vliet kamer 124 Snellius, tel rvvliet(at)liacs.

Compiler Construction Lent Term 2015 Lectures (of 16)

Compiler Construction Lent Term 2015 Lectures (of 16)

CA Compiler Construction

Computational Models - Lecture 4

CISC4090: Theory of Computation

Compiler Construction Lectures 13 16

THEORY OF COMPILATION

Syntax Analysis - Part 1. Syntax Analysis

Compiler Design Spring 2017

CSC 4181Compiler Construction. Context-Free Grammars Using grammars in parsers. Parsing Process. Context-Free Grammar

Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

Lecture 11 Sections 4.5, 4.7. Wed, Feb 18, 2009

Recursive descent for grammars with contexts

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Harvard CS 121 and CSCI E-207 Lecture 10: Ambiguity, Pushdown Automata

Parsing VI LR(1) Parsers

CSE 105 THEORY OF COMPUTATION

Announcements. H6 posted 2 days ago (due on Tue) Midterm went well: Very nice curve. Average 65% High score 74/75

CS Rewriting System - grammars, fa, and PDA

CSE 105 THEORY OF COMPUTATION

INF5110 Compiler Construction

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Section 1 (closed-book) Total points 30

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

Bottom-up syntactic parsing. LR(k) grammars. LR(0) grammars. Bottom-up parser. Definition Properties

CS20a: summary (Oct 24, 2002)

Course Script INF 5110: Compiler construction

Compiler construction

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Computer Science 160 Translation of Programming Languages

Theory of Computation (VI) Yijia Chen Fudan University

LR(1) Parsers Part III Last Parsing Lecture. Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.

From EBNF to PEG. Roman R. Redziejowski. Concurrency, Specification and Programming Berlin 2012

Introduction to Theory of Computing

Lecture 11 Context-Free Languages

Bottom-Up Syntax Analysis

Fundamentele Informatica II

Context-Free Grammars (and Languages) Lecture 7

Pushdown Automata: Introduction (2)

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation

Harvard CS 121 and CSCI E-207 Lecture 10: CFLs: PDAs, Closure Properties, and Non-CFLs

Compiler Design 1. LR Parsing. Goutam Biswas. Lect 7

Syntax Analysis, VI Examples from LR Parsing. Comp 412

CPS 220 Theory of Computation Pushdown Automata (PDA)

Transcription:

Predictive Parsers op-down Parsing and Intro to Bottom-Up Parsing Lecture 7 Like recursive-descent but parser can predict which production to use By looking at the next few tokens No backtracking Predictive parsers accept LL(k) grammars L means left-to-right scan of input L means leftmost derivation k means predict based on k tokens of lookahead In practice, LL(1) is used Prof. Aiken CS 143 Lecture 7 1 Prof. Aiken CS 143 Lecture 7 2 LL(1) vs. Recursive Descent Predictive Parsing and Left Factoring In recursive-descent, At each step, many choices of production to use Backtracking used to undo bad choices In LL(1), At each step, only one choice of production hat is When a non-terminal A is leftmost in a derivation he next input symbol is t here is a unique production A α to use Or no production to use (an error state) LL(1) is a recursive descent variant without backtracking Prof. Aiken CS 143 Lecture 7 3 Recall the grammar + ( ) Hard to predict because For two productions start with For it is not clear how to predict We need to left-factor the grammar Prof. Aiken CS 143 Lecture 7 4 Left-Factoring xample LL(1) Parsing able xample Recall the grammar + ( ) Factor out common prefixes of productions X X + ε ( ) Y Y ε Prof. Aiken CS 143 Lecture 7 5 Left-factored grammar X ( ) Y he LL(1) parsing table: X + ε Y ε next input token + ( ) $ X X X + ε ε Y ( ) Y ε ε ε leftmost non-terminal rhs of production to use Prof. Aiken CS 143 Lecture 7 6 1

LL(1) Parsing able xample (Cont.) Consider the [, ] entry When current non-terminal is and next input is, use production X his can generate an in the first position Consider the [Y,+] entry When current non-terminal is Y and current token is +, get rid of Y Y can be followed by + only if Y ε LL(1) Parsing ables. rrors Blank entries indicate error situations Consider the [,] entry here is no way to derive a string starting with from non-terminal Prof. Aiken CS 143 Lecture 7 7 Prof. Aiken CS 143 Lecture 7 8 Using Parsing ables Method similar to recursive descent, except For the leftmost non-terminal S We look at the next input token a And choose the production shown at [S,a] A stack records frontier of parse tree Non-terminals that have yet to be expanded erminals that have yet to matched against the input op of stack = leftmost pending terminal or non-terminal Reject on reaching error state Accept on end of input & empty stack Prof. Aiken CS 143 Lecture 7 9 LL(1) Parsing Algorithm initialize stack = <S $> and next repeat case stack of <X, rest> : if [X,next] = Y 1 Y n then stack <Y 1 Y n rest>; else error (); <t, rest> : if t == next ++ then stack <rest>; else error (); until stack == < > Prof. Aiken CS 143 Lecture 7 10 LL(1) Parsing Algorithm $ marks bottom of stack LL(1) Parsing xample initialize stack = <S $> and next repeat For non-terminal X on top of stack, lookup production case stack of <X, rest> : if [X,next] = Y 1 Y n then stack <Y 1 Y n rest>; else error (); <t, rest> : if t == next ++ For terminal t on top of then stack <rest>; stack, check t matches next else error (); input token. until stack == < > Pop X, push production rhs on stack. Note leftmost symbol of rhs is on top of the stack. Prof. Aiken CS 143 Lecture 7 11 Stack Input Action $ $ X X $ $ Y Y X $ $ terminal Y X $ $ X $ $ terminal X $ $ Y Y X $ $ terminal Y X $ $ ε X $ $ ε $ $ ACCP Prof. Aiken CS 143 Lecture 7 12 2

Constructing Parsing ables: he Intuition Computing First Sets Consider non-terminal A, production A α, & token t [A,t] = α in two cases: If α t β α can derive a t in the first position We say that t First(α) If A α and α ε and S β A t δ Useful if stack has A, input is t, and A cannot derive t In this case only option is to get rid of A (by deriving ε) Can work only if t can follow A in at least one derivation We say t Follow(A) Definition First(X) = { t X tα} {ε X ε} Algorithm sketch: 1. First(t) = { t } 2. ε First(X) if X ε if X A 1 A n and ε First(A i ) for 1 i n 3. First(α) First(X) if X A 1 A n α and ε First(A i ) for 1 i n Prof. Aiken CS 143 Lecture 7 13 Prof. Aiken CS 143 Lecture 7 14 First Sets. xample Computing Follow Sets Recall the grammar X ( ) Y X + ε Y ε Definition: Follow(X) = { t S β X t δ } First sets First( ( ) = { ( } First( ) = {, ( } First( ) ) = { ) } First( ) = {, ( } First( ) = { } First( X ) = {+, ε } First( + ) = { + } First( Y ) = {, ε } First( ) = { } Intuition If X A B then First(B) Follow(A) and Follow(X) Follow(B) if B ε then Follow(X) Follow(A) If S is the start symbol then $ Follow(S) Prof. Aiken CS 143 Lecture 7 15 Prof. Aiken CS 143 Lecture 7 16 Computing Follow Sets (Cont.) Algorithm sketch: 1. $ Follow(S) 2. First(β) - {ε} Follow(X) For each production A α X β 3. Follow(A) Follow(X) For each production A α X β where ε First(β) Follow Sets. xample Recall the grammar X ( ) Y X + ε Y ε Follow sets Follow( + ) = {, ( } Follow( ) = {, ( } Follow( ( ) = {, ( } Follow( ) = {), $} Follow( X ) = {$, ) } Follow( ) = {+, ), $} Follow( ) ) = {+, ), $} Follow( Y ) = {+, ), $} Follow( ) = {, +, ), $} Prof. Aiken CS 143 Lecture 7 17 Prof. Aiken CS 143 Lecture 7 18 3

Constructing LL(1) Parsing ables Construct a parsing table for CFG G For each production A α in G do: For each terminal t First(α) do [A, t] = α If ε First(α), for each t Follow(A) do [A, t] = α If ε First(α) and $ Follow(A) do [A, $] = α Notes on LL(1) Parsing ables If any entry is multiply defined then G is not LL(1) If G is ambiguous If G is left recursive If G is not left-factored And in other cases as well Most programming language CFGs are not LL(1) Prof. Aiken CS 143 Lecture 7 19 Prof. Aiken CS 143 Lecture 7 20 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient Builds on ideas in top-down parsing Bottom-up is the preferred method Concepts today, algorithms next time An Introductory xample Bottom-up parsers don t need left-factored grammars Revert to the natural grammar for our example: + () Consider the string: + Prof. Aiken CS 143 Lecture 7 21 Prof. Aiken CS 143 Lecture 7 22 he Idea Bottom-up parsing reduces a string to the start symbol by inverting productions: + + + + + + Prof. Aiken CS 143 Lecture 7 23 Observation Read the productions in reverse (from bottom to top) his is a rightmost derivation! + + + + + + Prof. Aiken CS 143 Lecture 7 24 4

Important Fact #1 A Bottom-up Parse Important Fact #1 about bottom-up parsing: A bottom-up parser traces a rightmost derivation in reverse + + + + + + Prof. Aiken CS 143 Lecture 7 25 Prof. Aiken CS 143 Lecture 7 26 A Bottom-up Parse in Detail (1) A Bottom-up Parse in Detail (2) + + + + + Prof. Aiken CS 143 Lecture 7 27 Prof. Aiken CS 143 Lecture 7 28 A Bottom-up Parse in Detail (3) A Bottom-up Parse in Detail (4) + + + + + + + + + Prof. Aiken CS 143 Lecture 7 29 Prof. Aiken CS 143 Lecture 7 30 5

A Bottom-up Parse in Detail (5) A Bottom-up Parse in Detail (6) + + + + + + + + + + + + Prof. Aiken CS 143 Lecture 7 31 Prof. Aiken CS 143 Lecture 7 32 A rivial Bottom-Up Parsing Algorithm Let I = input string repeat pick a non-empty substring β of I where X β is a production if no such β, backtrack replace one β by X in I until I = S (the start symbol) or all possibilities are exhausted Questions Does this algorithm terminate? How fast is the algorithm? Does the algorithm handle all cases? How do we choose the substring to reduce at each step? Prof. Aiken CS 143 Lecture 7 33 Prof. Aiken CS 143 Lecture 7 34 Where Do Reductions Happen? Important Fact #1 has an eresting consequence: Let αβω be a step of a bottom-up parse Assume the next reduction is by X β hen ω is a string of terminals Why? Because αxω αβω is a step in a rightmost derivation Notation Idea: Split string o two substrings Right substring is as yet unexamined by parsing (a string of terminals) Left substring has terminals and non-terminals he dividing po is marked by a he is not part of the string Initially, all input is unexamined x 1 x 2... x n Prof. Aiken CS 143 Lecture 7 35 Prof. Aiken CS 143 Lecture 7 36 6

Shift-Reduce Parsing Bottom-up parsing uses only two kinds of actions: Shift Shift Shift: Move one place to the right Shifts a terminal to the left string ABC xyz ABCx yz Reduce Prof. Aiken CS 143 Lecture 7 37 Prof. Aiken CS 143 Lecture 7 38 Reduce he xample with Reductions Only Apply an inverse production at the right end of the left string If A xy is a production, then Cbxy ijk CbA ijk + + reduce reduce Prof. Aiken CS 143 Lecture 7 39 + + + reduce reduce reduce + Prof. Aiken CS 143 Lecture 7 40 he xample with Shift-Reduce Parsing A Shift-Reduce Parse in Detail (1) + + + + + + + + + + reduce reduce reduce reduce reduce + + + Prof. Aiken CS 143 Lecture 7 41 Prof. Aiken CS 143 Lecture 7 42 7

A Shift-Reduce Parse in Detail (2) + + A Shift-Reduce Parse in Detail (3) + + + + + Prof. Aiken CS 143 Lecture 7 43 Prof. Aiken CS 143 Lecture 7 44 A Shift-Reduce Parse in Detail (4) + + + + A Shift-Reduce Parse in Detail (5) + + + + + + + Prof. Aiken CS 143 Lecture 7 45 Prof. Aiken CS 143 Lecture 7 46 A Shift-Reduce Parse in Detail (6) A Shift-Reduce Parse in Detail (7) + + + + + + + + + + + + + + + Prof. Aiken CS 143 Lecture 7 47 Prof. Aiken CS 143 Lecture 7 48 8

A Shift-Reduce Parse in Detail (8) A Shift-Reduce Parse in Detail (9) + + + + + + + + + + + + + + + + + + + Prof. Aiken CS 143 Lecture 7 49 Prof. Aiken CS 143 Lecture 7 50 A Shift-Reduce Parse in Detail (10) A Shift-Reduce Parse in Detail (11) + + + + + + + + + + + Prof. Aiken CS 143 Lecture 7 51 + + + + + + + + + + + Prof. Aiken CS 143 Lecture 7 52 he Stack Conflicts Left string can be implemented by a stack op of the stack is the Shift pushes a terminal on the stack Reduce pops 0 or more symbols off of the stack (production rhs) and pushes a nonterminal on the stack (production lhs) In a given state, more than one action ( or reduce) may lead to a valid parse If it is legal to or reduce, there is a reduce conflict If it is legal to reduce by two different productions, there is a reduce-reduce conflict You will see such conflicts in your project! More next time... Prof. Aiken CS 143 Lecture 7 53 Prof. Aiken CS 143 Lecture 7 54 9