CS153: Compilers Lecture 5: LL Parsing


CS153: Compilers Lecture 5: LL Parsing Stephen Chong https://www.seas.harvard.edu/courses/cs153

Announcements
Proj 1 is out, due Thursday Sept 20 (2 days away).
Proj 2 is out, due Thursday Oct 4 (16 days away).

Academic Integrity
Kevin Stephen, Harvard College Honor Council

Today: LL Parsing
Nullable, FIRST, and FOLLOW sets
Constructing an LL parsing table

LL(k) Parsing
Our parser combinators backtrack:
    alt p1 p2 = fun cs -> (p1 cs) @ (p2 cs)
alt runs p1 on cs, then backs up and runs p2 on the same input!
This is inefficient: it tries all possible parses.
Could we somehow know which production to use?
Basic idea: look at the next k symbols to predict whether we want p1 or p2.
How do we predict which production to use?
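To make the contrast concrete, here is a minimal Python sketch (not the course's combinator library) of a backtracking choice next to a predictive choice with one token of lookahead. The names `token`, `predictive_alt`, and `traced` are hypothetical helpers introduced only for this illustration; parsers are assumed to map a token list to a list of (result, remaining tokens) pairs.

```python
# Parsers map a token list to a list of (result, remaining_tokens) pairs.

def token(t):
    """Parser that accepts exactly the token t (hypothetical helper)."""
    return lambda cs: [(t, cs[1:])] if cs and cs[0] == t else []

def alt(p1, p2):
    """Backtracking choice: run both parsers on the same input."""
    return lambda cs: p1(cs) + p2(cs)

def predictive_alt(table):
    """Predictive choice: `table` maps a lookahead token to the one parser to run."""
    def parse(cs):
        if cs and cs[0] in table:
            return table[cs[0]](cs)
        return []  # no alternative is predicted for this token
    return parse

calls = []  # records which alternatives were actually tried

def traced(name, p):
    """Wrap a parser so that every invocation is logged in `calls`."""
    def run(cs):
        calls.append(name)
        return p(cs)
    return run

p_id = traced("id", token("id"))
p_num = traced("num", token("num"))

backtracking = alt(p_id, p_num)
predictive = predictive_alt({"id": p_id, "num": p_num})
```

On the input `["num"]`, `backtracking` tries both alternatives, while `predictive` runs only the one selected by the lookahead token.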

FIRST Sets
Given a string γ of terminal and non-terminal symbols, FIRST(γ) is the set of all terminal symbols that can start a string derived from γ.

    E  → T E'
    E' → + T E' | - T E' | ε
    T  → F T'
    T' → * F T' | / F T' | ε
    F  → id | num | ( E )

E.g., FIRST(F T') = { id, num, ( }
We can use FIRST sets to determine which production to use!
Given a nonterminal X and all its productions X → γ1, X → γ2, ..., X → γn, if FIRST(γ1), ..., FIRST(γn) are all mutually disjoint, then the next token tells us which production to use.
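The disjointness condition can be checked mechanically. The following Python sketch assumes the FIRST sets of each right-hand side have already been computed; `pairwise_disjoint` is a hypothetical helper name, and the left-recursive grammar E → E + T | T is an illustrative counterexample that is not part of the lecture's grammar.

```python
from itertools import combinations

def pairwise_disjoint(first_sets):
    """True if no terminal appears in the FIRST set of two alternatives."""
    return all(a.isdisjoint(b) for a, b in combinations(first_sets, 2))

# FIRST sets of the right-hand sides of F -> id | num | ( E )
first_of_F_alternatives = [{"id"}, {"num"}, {"("}]

# Illustrative assumption: a left-recursive E -> E + T | T.
# Both right-hand sides start with whatever starts a T.
first_of_left_recursive_E = [{"id", "num", "("}, {"id", "num", "("}]
```

For F the next token picks the production; for the left-recursive variant both alternatives start with the same tokens, so one token of lookahead cannot choose between them.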

Computing FIRST Sets
See Appel for the algorithm. Intuition here...
Consider FIRST(X Y Z). How do we compute it?
Do we just need to know FIRST(X)?
What if X can derive the empty string? Then FIRST(Y) ⊆ FIRST(X Y Z).
What if Y can also derive the empty string? Then FIRST(Z) ⊆ FIRST(X Y Z).
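The intuition for FIRST(X Y Z) can be sketched as a left-to-right scan that stops at the first symbol that cannot derive the empty string. This is an illustrative Python sketch, not Appel's algorithm verbatim; the function name `first_of_sequence` and the sample FIRST sets for X, Y, Z are assumptions made for the example.

```python
def first_of_sequence(symbols, first, nullable):
    """FIRST of a string of symbols.

    `first` maps each symbol to its FIRST set (for a terminal t that is {t});
    `nullable` is the set of symbols that can derive the empty string.
    """
    result = set()
    for sym in symbols:
        result |= first[sym]
        if sym not in nullable:
            break  # later symbols cannot contribute
    return result

# Hypothetical FIRST sets, chosen only to make the three cases visible.
first = {"X": {"a"}, "Y": {"b"}, "Z": {"c"}}
```

With no nullable symbols only FIRST(X) contributes; making X nullable adds FIRST(Y), and making both X and Y nullable adds FIRST(Z) as well.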

Computing FIRST, FOLLOW and Nullable
To compute FIRST sets, we need to compute whether nonterminals can produce the empty string.
FIRST(γ) = all terminal symbols that can start a string derived from γ
Nullable(X) = true iff X can derive the empty string
We will also compute:
FOLLOW(X) = all terminals that can immediately follow X, i.e., t ∈ FOLLOW(X) if there is a derivation containing Xt
The algorithm iterates computing these until a fixed point is reached.
Note: knowing nullable(X) and FIRST(X) for all non-terminals X allows us to compute nullable(γ) and FIRST(γ) for arbitrary strings of symbols γ.

Example

    S  → E eof
    E  → T E'
    E' → + T E' | - T E' | ε
    T  → F T'
    T' → * F T' | / F T' | ε
    F  → id | num | ( E )

X is nullable if there is a production X → γ where γ is empty, or γ is all nullable nonterminals.

          nullable   FIRST   FOLLOW
    S     no
    E     no
    E'    yes
    T     no
    T'    yes
    F     no

T' and E' are nullable! And we've finished nullable. Why?
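The nullable rule can be run as a fixed-point iteration. Below is a minimal Python sketch under the assumption that the grammar is encoded as a list of (left-hand side, right-hand side) pairs, with an empty list standing for ε; the names `GRAMMAR` and `compute_nullable` are chosen for this example.

```python
# The lecture's example grammar; [] encodes an ε right-hand side.
GRAMMAR = [
    ("S", ["E", "eof"]),
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", ["-", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", ["/", "F", "T'"]),
    ("T'", []),
    ("F", ["id"]),
    ("F", ["num"]),
    ("F", ["(", "E", ")"]),
]

def compute_nullable(grammar):
    """Iterate the nullable rule until no production adds anything new."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            # An empty rhs satisfies all() trivially; terminals are never
            # in `nullable`, so any rhs containing a terminal fails.
            if lhs not in nullable and all(sym in nullable for sym in rhs):
                nullable.add(lhs)
                changed = True
    return nullable
```

For this grammar the first pass adds E' and T' from their ε productions, and the second pass finds nothing new, because every other right-hand side contains a terminal or a non-nullable nonterminal.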

Example
Given production X → tγ, where t is a terminal, t ∈ FIRST(X).

          nullable   FIRST        FOLLOW
    S     no
    E     no
    E'    yes        + -
    T     no
    T'    yes        * /
    F     no         id num (

Example
Given production X → γYσ, if nullable(γ) then FIRST(Y) ⊆ FIRST(X).
Repeat until no more changes...

          nullable   FIRST        FOLLOW
    S     no         id num (
    E     no         id num (
    E'    yes        + -
    T     no         id num (
    T'    yes        * /
    F     no         id num (
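Both FIRST rules can be folded into one loop over each right-hand side: add FIRST of each symbol in turn, and stop at the first symbol that is not nullable. The Python sketch below assumes the same (lhs, rhs) grammar encoding as a list of pairs and takes the nullable set {E', T'} as already known; `compute_first` is a name chosen for this illustration.

```python
GRAMMAR = [
    ("S", ["E", "eof"]),
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", ["-", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", ["/", "F", "T'"]),
    ("T'", []),
    ("F", ["id"]),
    ("F", ["num"]),
    ("F", ["(", "E", ")"]),
]
NULLABLE = {"E'", "T'"}  # taken from the nullable column of the table

def compute_first(grammar, nullable):
    """FIRST sets of all nonterminals, computed by fixed-point iteration."""
    nonterminals = {lhs for lhs, _ in grammar}
    first = {nt: set() for nt in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for sym in rhs:
                # A terminal starts only itself.
                addition = first[sym] if sym in nonterminals else {sym}
                if not addition <= first[lhs]:
                    first[lhs] |= addition
                    changed = True
                if sym not in nullable:
                    break  # symbols after a non-nullable one cannot start X
    return first
```

Running it on the example grammar reproduces the FIRST column: { id, num, ( } for S, E, T, and F, { +, - } for E', and { *, / } for T'.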

Example
Given production X → γZδσ:
    FIRST(δ) ⊆ FOLLOW(Z),
    and if δ is nullable then FIRST(σ) ⊆ FOLLOW(Z),
    and if δσ is nullable then FOLLOW(X) ⊆ FOLLOW(Z).

          nullable   FIRST        FOLLOW
    S     no         id num (
    E     no         id num (     ) eof
    E'    yes        + -          ) eof
    T     no         id num (     ) + - eof
    T'    yes        * /          ) + - eof
    F     no         id num (     ) * / + - eof
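The three FOLLOW rules amount to this: for each nonterminal Z in a right-hand side, scan the symbols after it, collecting FIRST sets until a non-nullable symbol is hit, and inherit FOLLOW of the left-hand side if the whole remainder is nullable. The Python sketch below assumes the nullable and FIRST results from the table as inputs; `compute_follow` is a name introduced for this example.

```python
GRAMMAR = [
    ("S", ["E", "eof"]),
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", ["-", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", ["/", "F", "T'"]),
    ("T'", []),
    ("F", ["id"]),
    ("F", ["num"]),
    ("F", ["(", "E", ")"]),
]
NULLABLE = {"E'", "T'"}
FIRST = {
    "S": {"id", "num", "("}, "E": {"id", "num", "("}, "E'": {"+", "-"},
    "T": {"id", "num", "("}, "T'": {"*", "/"}, "F": {"id", "num", "("},
}

def compute_follow(grammar, nullable, first):
    """FOLLOW sets of all nonterminals, computed by fixed-point iteration."""
    follow = {nt: set() for nt in first}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, z in enumerate(rhs):
                if z not in first:  # terminals have no FOLLOW set
                    continue
                addition = set()
                rest_nullable = True
                for sym in rhs[i + 1:]:
                    addition |= first.get(sym, {sym})  # terminal t gives {t}
                    if sym not in nullable:
                        rest_nullable = False
                        break
                if rest_nullable:  # everything after Z can vanish
                    addition |= follow[lhs]
                if not addition <= follow[z]:
                    follow[z] |= addition
                    changed = True
    return follow
```

On the example grammar this yields the FOLLOW column of the table; FOLLOW(S) stays empty because S never appears on a right-hand side.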

Predictive Parsing Table
Make a predictive parsing table with nonterminals as rows and terminals as columns.
Table entries are productions.
When parsing nonterminal X, and the next token is t, the entry for X and t will tell us which production to use.

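Example

    S  → E eof
    E  → T E'
    E' → + T E' | - T E' | ε
    T  → F T'
    T' → * F T' | / F T' | ε
    F  → id | num | ( E )

          nullable   FIRST        FOLLOW
    S     no         id num (
    E     no         id num (     eof )
    E'    yes        + -          eof )
    T     no         id num (     + - eof )
    T'    yes        * /          + - eof )
    F     no         id num (     * / + - eof )

For X → γ, add X → γ to row X, column t, for all t ∈ FIRST(γ).
For X → γ, if γ is nullable, add X → γ to row X, column t, for all t ∈ FOLLOW(X).

Predictive parsing table (row = nonterminal, column = next token; all other cells are empty):

    Row   Columns          Entry
    S     id, num, (       S → E eof
    E     id, num, (       E → T E'
    E'    +                E' → + T E'
    E'    -                E' → - T E'
    E'    ), eof           E' → ε
    T     id, num, (       T → F T'
    T'    *                T' → * F T'
    T'    /                T' → / F T'
    T'    +, -, ), eof     T' → ε
    F     id               F → id
    F     num              F → num
    F     (                F → ( E )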

Example
If each cell of the predictive parsing table contains at most one production, parsing is predictive!
The table tells us exactly which production to apply.
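The two table-filling rules and the "at most one production per cell" check can be sketched in Python as follows. The nullable, FIRST, and FOLLOW values are copied from the example; `build_table` and `is_ll1` are names chosen for this sketch, and the table is assumed to be a dictionary keyed by (nonterminal, token).

```python
GRAMMAR = [
    ("S", ["E", "eof"]),
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", ["-", "T", "E'"]),
    ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", ["/", "F", "T'"]),
    ("T'", []),
    ("F", ["id"]),
    ("F", ["num"]),
    ("F", ["(", "E", ")"]),
]
NULLABLE = {"E'", "T'"}
FIRST = {
    "S": {"id", "num", "("}, "E": {"id", "num", "("}, "E'": {"+", "-"},
    "T": {"id", "num", "("}, "T'": {"*", "/"}, "F": {"id", "num", "("},
}
FOLLOW = {
    "S": set(), "E": {")", "eof"}, "E'": {")", "eof"},
    "T": {")", "+", "-", "eof"}, "T'": {")", "+", "-", "eof"},
    "F": {")", "*", "/", "+", "-", "eof"},
}

def build_table(grammar, nullable, first, follow):
    """Map (nonterminal, token) to the list of productions placed in that cell."""
    table = {}
    for lhs, rhs in grammar:
        lookaheads = set()
        rhs_nullable = True
        for sym in rhs:  # FIRST of the right-hand side
            lookaheads |= first.get(sym, {sym})
            if sym not in nullable:
                rhs_nullable = False
                break
        if rhs_nullable:  # nullable right-hand side: use FOLLOW(X) as well
            lookaheads |= follow[lhs]
        for t in lookaheads:
            table.setdefault((lhs, t), []).append(rhs)
    return table

def is_ll1(table):
    """Predictive iff no cell holds more than one production."""
    return all(len(prods) == 1 for prods in table.values())

TABLE = build_table(GRAMMAR, NULLABLE, FIRST, FOLLOW)
```

Keeping a list per cell, rather than overwriting, is a deliberate choice here: a conflict shows up as a cell with two entries instead of being silently lost.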

Example
Input: ( foo + 7 ) eof, i.e., the token stream ( id + num ) eof

    Sentential form                     Action
    S                                   Parse S, next token is (, use S → E eof
    E eof                               Parse E, next token is (, use E → T E'
    T E' eof                            Parse T, next token is (, use T → F T'
    F T' E' eof                         Parse F, next token is (, use F → ( E )
    ( E ) T' E' eof                     Parse E, next token is id, use E → T E'
    ( T E' ) T' E' eof                  Parse T, next token is id, use T → F T'
    ( F T' E' ) T' E' eof               Parse F, next token is id, use F → id
    ( id T' E' ) T' E' eof              Parse T', next token is +, use T' → ε
    ( id E' ) T' E' eof                 Parse E', next token is +, use E' → + T E'
    ( id + T E' ) T' E' eof             Parse T, next token is num, use T → F T'
    ( id + F T' E' ) T' E' eof          Parse F, next token is num, use F → num
    ( id + num T' E' ) T' E' eof        Parse T', next token is ), use T' → ε
    ( id + num E' ) T' E' eof           Parse E', next token is ), use E' → ε
    ( id + num ) T' E' eof              Parse T', next token is eof, use T' → ε
    ( id + num ) E' eof                 Parse E', next token is eof, use E' → ε
    ( id + num ) eof                    No nonterminals remain
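The trace above can be reproduced by a small table-driven parser that keeps the unexpanded part of the sentential form on a stack. This Python sketch hard-codes the predictive parsing table for the example grammar and assumes a lexer has already turned foo into id and 7 into num; `add` and `parse` are hypothetical names for this illustration.

```python
NONTERMINALS = {"S", "E", "E'", "T", "T'", "F"}
TABLE = {}

def add(nonterminal, tokens, rhs):
    """Fill the cells (nonterminal, t) for each lookahead t with one production."""
    for t in tokens:
        TABLE[(nonterminal, t)] = rhs

add("S", ["id", "num", "("], ["E", "eof"])
add("E", ["id", "num", "("], ["T", "E'"])
add("E'", ["+"], ["+", "T", "E'"])
add("E'", ["-"], ["-", "T", "E'"])
add("E'", [")", "eof"], [])            # E' → ε
add("T", ["id", "num", "("], ["F", "T'"])
add("T'", ["*"], ["*", "F", "T'"])
add("T'", ["/"], ["/", "F", "T'"])
add("T'", ["+", "-", ")", "eof"], [])  # T' → ε
add("F", ["id"], ["id"])
add("F", ["num"], ["num"])
add("F", ["("], ["(", "E", ")"])

def parse(tokens):
    """Return the productions used, in order, or raise SyntaxError."""
    stack = ["S"]  # top of the stack is the end of the list
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        look = tokens[pos] if pos < len(tokens) else None
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no production for {top} on {look!r}")
            used.append((top, rhs))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1  # terminal on the stack matches the input
        else:
            raise SyntaxError(f"expected {top!r}, got {look!r}")
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return used

steps = parse(["(", "id", "+", "num", ")", "eof"])
```

Each entry of `steps` corresponds to one "use" action in the trace, in the same order; the parser never backtracks, since every decision is a single table lookup.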

LL(1), LL(k), LL(*)
Grammars whose predictive parsing tables contain at most one production per cell are called LL(1).
Left-to-right parse: go through the token stream from left to right. (Almost all parsers do this.)
Leftmost derivation: the derivation expands the leftmost non-terminal.
1-symbol lookahead.

LL(1), LL(k), LL(*)
LL(1) can be generalized to LL(2), LL(3), etc.
    Columns of the predictive parsing table have k tokens.
    FIRST(X) is generalized to FIRST-k(X).
An LL(*) grammar can determine the next production using finite (but maybe unbounded) lookahead.
An ambiguous grammar is not LL(k) for any k, or even LL(*). Why?

LR(k)
What if a grammar is unambiguous but not LL(k)?
LR(k) parsing is a more powerful technique.
Left-to-right parse
k-symbol lookahead
Rightmost derivation: the derivation expands the rightmost non-terminal. (Constructs the derivation in reverse order!)