Syntactical analysis. Syntactical analysis. Syntactical analysis. Syntactical analysis

Similar documents
Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Einführung in die Computerlinguistik

CSC 4181Compiler Construction. Context-Free Grammars Using grammars in parsers. Parsing Process. Context-Free Grammar

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

CSE302: Compiler Design

This lecture covers Chapter 5 of HMU: Context-free Grammars

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

Parsing Algorithms. CS 4447/CS Stephen Watt University of Western Ontario

Parsing -3. A View During TD Parsing

Context-free Grammars and Languages

Creating a Recursive Descent Parse Table

5/10/16. Grammar. Automata and Languages. Today s Topics. Grammars Definition A grammar G is defined as G = (V, T, P, S) where:

Lecture 11 Context-Free Languages

5 Context-Free Languages

Definition: A grammar G = (V, T, P,S) is a context free grammar (cfg) if all productions in P have the form A x where

Context-Free Grammars and Languages. Reading: Chapter 5

CPS 220 Theory of Computation Pushdown Automata (PDA)

Compiling Techniques

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Context-Free Grammars and Languages

Syntactic Analysis. Top-Down Parsing

Bottom-Up Syntax Analysis

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

Syntax Analysis Part I

Syntax Analysis - Part 1. Syntax Analysis

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Introduction to Bottom-Up Parsing

EXAM. Please read all instructions, including these, carefully NAME : Problem Max points Points 1 10 TOTAL 100

Computational Models - Lecture 5 1

CPS 220 Theory of Computation

Syntax Analysis Part III

Introduction to Bottom-Up Parsing

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Chapter 4: Context-Free Grammars

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Foundations of Informatics: a Bridging Course

CS 314 Principles of Programming Languages

Context Free Languages (CFL) Language Recognizer A device that accepts valid strings. The FA are formalized types of language recognizer.

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

Introduction to Bottom-Up Parsing

Compiler Principles, PS4

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Pushdown Automata: Introduction (2)

Syntax Analysis Part I

Introduction to Bottom-Up Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing

EXAM. CS331 Compiler Design Spring Please read all instructions, including these, carefully

MA/CSSE 474 Theory of Computation

Introduction to Theory of Computing

LR2: LR(0) Parsing. LR Parsing. CMPT 379: Compilers Instructor: Anoop Sarkar. anoopsarkar.github.io/compilers-class

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Compiler Construction Lent Term 2015 Lectures (of 16)

Context Free Grammars

Compiler Construction Lent Term 2015 Lectures (of 16)

Computer Science 160 Translation of Programming Languages

Shift-Reduce parser E + (E + (E) E [a-z] In each stage, we shift a symbol from the input to the stack, or reduce according to one of the rules.

Introduction and Motivation. Introduction and Motivation. Introduction to Computability. Introduction and Motivation. Theory. Lecture5: Context Free

Suppose h maps number and variables to ɛ, and opening parenthesis to 0 and closing parenthesis

Context free languages

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Bottom-Up Parsing. Ÿ rm E + F *idÿ rm E +id*idÿ rm T +id*id. Ÿ rm F +id*id Ÿ rm id + id * id

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

Theory of Computation (II) Yijia Chen Fudan University

CS Rewriting System - grammars, fa, and PDA

(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix.

Computer Science 160 Translation of Programming Languages

Computational Models - Lecture 4 1

Talen en Compilers. Johan Jeuring , period 2. October 29, Department of Information and Computing Sciences Utrecht University

Section 1 (closed-book) Total points 30

Computational Models - Lecture 4

Predictive parsing as a specific subclass of recursive descent parsing complexity comparisons with general parsing

Pushdown Automata. Notes on Automata and Theory of Computation. Chia-Ping Chen

1. Draw a parse tree for the following derivation: S C A C C A b b b b A b b b b B b b b b a A a a b b b b a b a a b b 2. Show on your parse tree u,

Grammars and Context Free Languages

CISC4090: Theory of Computation

Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is

Functions on languages:

Outline. CS21 Decidability and Tractability. Machine view of FA. Machine view of FA. Machine view of FA. Machine view of FA.

Theory of Computation (VI) Yijia Chen Fudan University

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Computational Models - Lecture 4 1

MA/CSSE 474 Theory of Computation

n Top-down parsing vs. bottom-up parsing n Top-down parsing n Introduction n A top-down depth-first parser (with backtracking)

CS20a: summary (Oct 24, 2002)

Context-Free Grammars and Languages. We have seen that many languages cannot be regular. Thus we need to consider larger classes of langs.

1. (a) Explain the procedure to convert Context Free Grammar to Push Down Automata.

Theory of Computation (IV) Yijia Chen Fudan University

Even More on Dynamic Programming

Chapter 5: Context-Free Languages

Parsing Regular Expressions and Regular Grammars

Intro to Theory of Computation

CS 406: Bottom-Up Parsing

The Parser. CISC 5920: Compiler Construction Chapter 3 Syntactic Analysis (I) Grammars (cont d) Grammars

Compiling Techniques

Theory Bridge Exam Example Questions

Bottom up parsing. General idea LR(0) SLR LR(1) LALR To best exploit JavaCUP, should understand the theoretical basis (LR parsing);

SCHEME FOR INTERNAL ASSESSMENT TEST 3

CA Compiler Construction

Grammars and Context Free Languages

Transcription:

Context-free grammars Derivations Parse Trees Left-recursive grammars Top-down parsing non-recursive predictive parsers construction of parse tables Bottom-up parsing shift/reduce parsers LR parsers GLR parsers SGLR parsers A context-free grammar is a 4-tuple G = (N, Σ, P, S) 1. N is a set of non terminals 2. Σ is a set of terminals (disjoint from N) 3. P is a subset of (N Σ)* N An element (α, A) P is called a production A ::= α or α A 4. S N is the start symbol The sets N, Σ, P are nite / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 0 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 1 A context-free grammar can be consider as a simple rewrite system: A if A P (,, (N Σ)*, A N) Example N = {E, Σ = {+,*,(,),-, a, S = E, P = { E + E E E * E E ( E ) E - E E a E Derivation: E -E -(E) -(E+E) -(a+e) -(a+a) The language L(G) generated by the context-free grammar G = (N, Σ, P, S) is: L(G) = {w Σ* S + w A sentence w L(G) contains only terminals A sentential form is a string of terminals and nonterminals which can be derived from S: S * with (N Σ) * A sentence in L(G) is a sentential form in which no non-terminals occur / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 2 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 3

Left/right derivations There are choices to be made for each derivation step: which non-terminal must be replaced? which alternative of the selected non-terminal must be applied? Always selecting the leftmost non-terminal in the sentential form gives a leftmost derivation: lm There exists also a rightmost derivation: rm Consider the context-free grammar for expressions: Leftmost derivation for -(a+a) E -E -(E) -(E+E) -(a+e) -(a+a) Rightmost t derivation for -(a+a) E -E -(E) -(E+E) -(E+a) -(a+a) A parse tree for a context-free grammar is a G = (N, Σ, P, S) tree: 1. The root is labeled with S (the start non-terminal) 2. Each leaf is labeled with a terminal ( Σ)or ε 3. All other nodes are labeled with a non-terminal If A is the label of a node and X 1,,X n are the labels of the children (from left to right) then X 1,,X n A must be a production rule in G (with X i is either a terminal or a non-terminal) Special case: ε A with label A which has exactly one child with label ε / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 4 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 5 Example: E -E -(E) -(E+E) -(a+e) -(a+a) E E E E E E E - E - E - E - E - E - E ( E ) ( E ) ( E ) ( E ) ( E ) E + E E + E E + E a a a The parse tree abstracts from the derivation order Acceptor and parser For each grammar G there exists a decision procedure (acceptor) AG for L(G): AG: STRING {true, false such that AG(w) =true w L(G) A parser is an acceptor which constructs a parse tree as well. A top-down parser constructs the tree starting from the root A bottom-up parser constructs the tree starting from the leafs / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 6 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 7

During parsing the following problems may occur: The grammar is ambiguous The grammar is left recursive The grammar contains cycles A grammar G is ambiguous if one word w L(G) ( ) has at least two parse trees Expression grammar without associativities and priorities Dangling else problem A grammar is immediate left recursive if the grammar contains a rule of the form A A grammar is left recursive if there exists a non- terminal A and a string (N Σ)* such that A A This means that after one or more steps in a derivation an occurrence of A reduces again to an occurrence of A without recognizing any terminal in the input sentence. / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 8 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 9 Examples of indirect left recursion B A A B or worse B A DA B ε D G D It is easy to remove left recursion from a context-free grammar Elimination of left recursion A A A produce the sentential forms: n A set of equivalent (non left recursive) rules are A A A A ε A / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 10 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 11

Example: (1) E + T E (immediate left rec.) (2) T E (3) T * F T (immediate left rec.) (4) F T (5) ( E ) F (6) a F Applying the left recursion elimination transformation: (1) E E (with = + T) (2) E (with = T) Example: (1 ) T E E (2 ) + T E E (2 ) ε E the same for: (3 ) F T T (4 ) * F T T (4 ) ε T / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 12 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 13 Indirect left recursion elimination Suppose we have a rule of the form B A 1 B 2 B n B The rule B A is now transformed into: 1 A 2 A n A This process is repeated until either t A; the process stops, or A A; the immediately left recursion elimination rule can be applied / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 14 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 15

Left factorization In general it is efcient to move the difference between een the alternatives of a non-terminal as far as possible to the left Productions of the form 1 A 2 A n A Are equivalent with A A 1 A n A Example if b then S else S S if b then S S Only at the occurrence of else it can be decided which alternative should have been selected An equivalent grammar is if b then S S S else S S ε S / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 16 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 17 Top-down parsing A top-down parser guesses the next alternative to be recognized, and veries whether this alternative can be recognized in the input. If not, another alternative will be tried Constructs the parse tree starting at the root Finds the leftmost derivation of the sentence Alternative types of top-down parsers: recursive descent parser with backtracking recursive descent parser without backtracking ( predictive parser ) non-recursive predictive parser (uses push-down automaton) Recursive descent parser with backtracking Grammar c A d S a A a b A Parser bool proc S() { if input = c then inptr +:= 1; if A() then if input = d then inptr +:= 1; check for EOF;return(true) ; return(false) / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 18 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 19

bool proc A() { isave := inptr; if input = a then inptr +:= 1; if input = b then inptr +:= 1; return(true) ; inptr := isave; if input = a then inptr +:= 1; return(true) else return(false) Recursive descent without backtracking For ``some'' context-free grammars a recursive descent grammar without backtracking can be derived Grammar: T E E + T E E ε E F T T * F T T ε T ( E ) F a F / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 20 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 21 Recursive descent without backtracking Parser: proc E() { proc E () { T(); if input = + E () then inptr +:= 1; T(); E () proc T() { proc T () { F(); if input = * T () then inptr +:= 1; F(); T () Recursive descent without backtracking proc F() { if input = a then inptr +:= 1 else if input = ( then inptr +:= 1; E(); if input = ) then inptr +:= 1 else ERROR() else ERROR() / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 22 / Faculteit Wiskunde en Informatica 15-9-2010 PAGE 23