Vertalerbouw HC 9. Parsing: Top-Down & LL(1)

Thursday 3 May 2001 (56), Vertalerbouw - HC 9
Theo Ruys, INF 5037, tel. 3716, ruys@cs.utwente.nl

Overview HC 9
Ch. 8 - Parsing
  8.1 Context-free Grammars
  8.2 Top-down Parsing
  8.3 LL(1) Grammars
See also [Aho, Sethi & Ullman 1986] for a more thorough discussion.

Introduction
A parser (= syntax analyser) checks whether the input program is syntactically correct.
- The syntax is usually specified by a context-free grammar:
    regular expressions → finite-state automaton
    context-free grammar → stack automaton
- The parser is usually augmented with actions for context constraints, code optimisation and code generation.
- Parsers are needed not only for programming languages, but for all programs that process structured data.
- Parsing strategies: top-down parsing and bottom-up parsing.

Context-free Grammars (1)
A context-free grammar (CFG) G is defined by a 4-tuple (N, T, P, S):
- N: finite set of non-terminals (these define structure)
- T: finite set of terminals (the tokens that occur)
- P: production rules of the form A → β, with A ∈ N and β ∈ (N ∪ T)*
- S: start symbol, S ∈ N
Example: G = ({A, B}, {a, b, c}, P, A) where P is
    A → aA
    A → B
    B → b
    B → c
The corresponding regexp is a*(b | c).
Notational conveniences:
- only provide the production rules
- use the choice operator: A → aA | B

Context-free Grammars (2)
A CFG is a specification of a rewrite system; CFGs are used to derive strings of terminals.
Notation:
    α, β, γ, δ ∈ (N ∪ T)* = V*    string of symbols
    u, v, w ∈ T*                   string of terminals
    X, Y, Z ∈ (N ∪ T)              single grammar symbol
    A, B, C, D ∈ N                 single non-terminal
    a, b, c ∈ T                    single terminal
1-step derivation: αAγ ⇒ αβγ, using production rule A → β

CFGs (3)
Derivation: αAγ ⇒ αβγ
- left-most derivation: if α = w, then wAγ ⇒l wβγ
- right-most derivation: if γ = w, then αAw ⇒r αβw
- zero or more steps: α ⇒* β
- one or more steps: α ⇒+ β
Recursion:
- left-recursive derivation: if A ⇒+ Aα then the CFG is left-recursive
- right-recursive derivation: if A ⇒+ αA then the CFG is right-recursive

CFGs (4)
Terminology (cont.):
- if S ⇒* β then β is a sentential form
- if S ⇒* w then w is a sentence
Example: A → aA | D, D → b | c

Context-free Grammars (5)
Context-free language (CFL): the set of all sentences derived from a CFG:
    CFL(G) = { w ∈ T* | S ⇒* w }
Previous example (A → aA | D, D → b | c):
    CFL = { b, c, ab, ac, aab, aac, aaab, aaac, ... }
Parse tree: another representation of a derivation.
[figure: a derivation, consisting of sentential forms and ending in a sentence, and the parse tree it corresponds with]

Context-free Grammars (6)
Example: [grammar with three productions; not recoverable from the transcript]
- There are two ways of deriving a sentence in the CFL corresponding to G: G is ambiguous!
- In this case, G does not define the relative priorities of its two operators.
- Both derivations are left-derivations.

Context-free Grammars (7)
Unambiguous grammar: [grammar with four productions; not recoverable from the transcript]
- One operator has priority over the other.
- An extra nonterminal is used to solve the priority/ambiguity problem.

Context-free Grammars (8)
Infamous dangling-else problem:
    S → if E then S
    S → if E then S else S
The sentence
    if E then if E then S else S
has two parse trees: the else can belong to either if.
[figure: the two parse trees]

Context-free Grammars (9)
Problem: verifying that the language L is generated by grammar G, i.e. proving that L(G) = L.
    ⇒ verify: if S ⇒* w then w ∈ L
    ⇐ verify: if w ∈ L then S ⇒* w
Example: L is the language consisting of balanced parentheses.
    G: S → ( S ) S | ε
Proof sketch: use induction on the number of derivation steps and on the length of the sentence.
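Before attempting a proof, the claim L(G) = L can be checked by brute force for short strings. The sketch below assumes the grammar S → ( S ) S | ε; the helper names `generated`, `is_balanced` and `balanced` are illustrative. It is a sanity check on small cases, not a substitute for the induction proof.

```python
from itertools import product

def generated(max_len):
    """All sentences of S -> ( S ) S | ε with length <= max_len."""
    sentences = {0: {""}}            # sentences[n]: sentences of length exactly n
    for n in range(2, max_len + 1, 2):
        sentences[n] = set()
        # A non-empty sentence has the form (x)y with len(x) + len(y) = n - 2.
        for len_x in range(0, n - 1, 2):
            len_y = n - 2 - len_x
            for x in sentences[len_x]:
                for y in sentences[len_y]:
                    sentences[n].add("(" + x + ")" + y)
    return set().union(*sentences.values())

def is_balanced(s):
    """True if every prefix has at least as many '(' as ')' and the totals match."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def balanced(max_len):
    """All balanced strings over '(' and ')' with length <= max_len."""
    out = set()
    for n in range(max_len + 1):
        for chars in product("()", repeat=n):
            s = "".join(chars)
            if is_balanced(s):
                out.add(s)
    return out

print(generated(6) == balanced(6))  # True
```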

CFGs (10)
G: S → ( S ) S | ε
⇒ Verify that every generated string is balanced, by induction on the number n of derivation steps:
- n = 1 (one-step derivation): ε is balanced.
- n > 1: assume that all strings derived in fewer than n steps are balanced. Consider an n-step derivation, which will be of the form
      S ⇒ ( S ) S ⇒* ( x ) S ⇒* ( x ) y
  x and y must be balanced (both are derived in fewer than n steps), hence (x)y is balanced.

CFGs (11)
⇐ Verify that all balanced-parenthesis strings can be generated from S, by induction on the length 2n of the sentence:
- n = 0: ε is derivable from S.
- n > 0: assume that every balanced string of length < 2n is derivable. Consider a balanced string of length 2n (n ≥ 1). Let (x) be its shortest non-empty balanced prefix. The string can then be written as (x)y, where x and y are both balanced and both shorter than 2n; therefore they are derivable. Hence we can find
      S ⇒ ( S ) S ⇒* ( x ) S ⇒* ( x ) y

Context-free Grammars (12)
REs vs CFGs: a RE can always be expressed as a CFG.
Algorithm (starting from an NFA for the RE):
 1. ∀ state s, create a nonterminal A_s.
 2. ∀ transition from s to t labelled a, write A_s → a A_t.
 3. For accept states s, write A_s → ε.
 4. The start state is the begin symbol.
Example:
    RE:  (a | b)* a b b
    NFA: states 0, 1, 2, 3 [figure]
    CFG: A0 → aA0
         A0 → bA0
         A0 → aA1
         A1 → bA2
         A2 → bA3
         A3 → ε

Context-free Grammars (13)
So: a RL is always context-free; a CFL is usually not regular.
Examples:
    L1 = { a^n | n ≥ 1 }            regular: a+
    L2 = { a^n b^n | n ≥ 1 }        not regular; context-free: S → aSb | ab
    L3 = { a^n b^n c^n | n ≥ 1 }    not regular; not context-free
Idea: a RL/CFL can always be written in a form in which a substring/state is repeated. The Pumping Lemmas for regular expressions and grammars should be used to prove that a language L is not a RL or a CFL (see [Sudkamp 1991]).
- a finite automaton cannot keep count
- a grammar can count two items, but not three

Top-down Parsing (1)
RE:
- Use the subset construction (element construction) to generate a DFA (= scanner).
- DFA = finite-state automaton
CFG:
- Can we also generate a parser for a CFG?
- SA = stack automaton. A SA is a NFA (or DFA) with an extra stack; the stack gives the FA the extra power.
- A parser is an algorithm based on a SA that begins with the start symbol of the CFG and derives a sentence.

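The four-step construction of a grammar from an automaton (one nonterminal per state, one production per transition, an ε-production per accept state) can be sketched in a few lines of Python. The NFA below, for (a | b)* a b b with states 0 to 3, is an assumption based on the usual textbook automaton; the function name `nfa_to_grammar` and the tuple encoding are illustrative.

```python
def nfa_to_grammar(transitions, accept_states):
    """Turn NFA transitions (source, label, target) into productions.

    Each transition s --label--> t yields A_s -> label A_t,
    and each accept state s yields A_s -> ε.
    """
    productions = []
    for source, label, target in transitions:
        productions.append((f"A{source}", f"{label}A{target}"))
    for state in sorted(accept_states):
        productions.append((f"A{state}", "ε"))
    return productions

# Assumed NFA for (a|b)*abb: start state 0, accept state 3.
NFA = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2), (2, "b", 3)]
grammar = nfa_to_grammar(NFA, {3})
for lhs, rhs in grammar:
    print(f"{lhs} -> {rhs}")
```

The nonterminal for the start state (here A0) serves as the begin symbol of the resulting grammar.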
Top-down Parsing (2)
Recall recursive-descent parsing (Ch. 1):
- A procedure is associated with each nonterminal N in the grammar.
- The body of the procedure may contain:
    - statements that match terminals;
    - statements that call procedures for each nonterminal in the right-hand side of the production of N;
    - semantic actions.
- Recursive-descent parsers implicitly use a stack, i.e. the call stack of the procedures.
Top-down parsing: building the parse tree from the root (i.e. the start symbol).
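As an illustration of the recursive-descent scheme, here is a minimal Python sketch for the balanced-parentheses grammar S → ( S ) S | ε. The class and method names (`Parser`, `match`, `parse_S`, `accepts`) are assumptions made for this example, and the input is assumed not to contain the end marker "$". There are no semantic actions; the parser only accepts or rejects.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, text):
        self.text = text + "$"   # "$" marks the end of the input
        self.pos = 0

    @property
    def look_ahead(self):
        return self.text[self.pos]

    def match(self, terminal):
        """Consume one terminal or fail."""
        if self.look_ahead != terminal:
            raise ParseError(f"expected {terminal!r} at position {self.pos}")
        self.pos += 1

    def parse_S(self):
        """One procedure for the nonterminal S."""
        if self.look_ahead == "(":
            # S -> ( S ) S
            self.match("(")
            self.parse_S()
            self.match(")")
            self.parse_S()
        # otherwise S -> ε: consume nothing

def accepts(text):
    parser = Parser(text)
    try:
        parser.parse_S()
        parser.match("$")        # the whole input must have been consumed
        return True
    except ParseError:
        return False

print(accepts("(()())()"), accepts("(()"))  # True False
```

The nesting of the `parse_S` calls on Python's call stack plays the role of the parser's stack.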

Top-down Parsing (3)
Example (using an explicit stack):
[figure: example grammar, input string, and a table of successive stack contents and remaining input; not recoverable from the transcript]
- For the nonterminal on top of the stack, a production rule is executed.
- Terminals on top of the stack get popped, while advancing the look-ahead pointer in the input.
- The steps taken are all left-derivations.

TD Parsing (4)
Top-down Table-driven (TD-TD) algorithm:

    bool TDTD() {
        Stack s;
        bool accept = true;
        s.init();
        s.push(S);                          // start symbol
        while (accept && (look_ahead != $ || !s.empty())) {
            top = s.pop();
            if (top ∈ T) {                  // terminal: must match the input
                if (top != look_ahead) accept = false;
                else look_ahead = read_input();
            } else if (top ∈ N) {           // nonterminal: assume top == A
                select some production A → X1 ... Xn;   // might be nondeterministic...
                s.push(X1, ..., Xn);        // X1 ends up on top of the stack
            } else accept = false;          // neither terminal nor nonterminal, e.g. empty stack
        }
        return accept;
    }

- Note that in each iteration a symbol is popped from the stack.
- In the TD-TD algorithm, the selection of a production rule is driven by a table.

LL(1) (1)
So we may have a choice of production rules A → α.
Choosing a production rule:
- non-predictive: randomly (requires backtracking!)
- predictive: using the look-ahead symbols in the input
LL(k): if, by looking ahead k symbols in the input stream, we can always choose the right production rule, the given grammar is (strong) LL(k).
- L: left-to-right scanning through the input stream
- L: left-derivation

LL(1) (2)
strong LL(k) vs. (normal) LL(k):
- strong LL(k): we only consider the look-ahead tokens in the input stream when choosing a production rule.
- LL(k): apart from the look-ahead symbols in the input, we may also use the input tokens that have already been read to choose a production rule.
- class of strong LL(k) grammars ⊆ class of LL(k) grammars
- LL(1) ⊂ LL(2) ⊂ LL(3) ⊂ ...
Example: p1: A → [...], p2: A → [...]   [right-hand sides not recoverable from the transcript]
If k = 1, we cannot tell if p1 or p2 should be applied. Therefore, the grammar is not LL(1); it is LL(2).

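A table-driven top-down parser with an explicit stack can be sketched as follows. The grammar S → ( S ) S | ε and its parse table are assumptions chosen for illustration (they are not the example from the slide), as are the names `tdtd`, `TABLE` and the optional `trace` list that records the stack (top on the left) and the remaining input before each step.

```python
NONTERMINALS = {"S"}
TERMINALS = {"(", ")"}

# Hypothetical parse table: (nonterminal, look-ahead) -> right-hand side.
TABLE = {
    ("S", "("): ["(", "S", ")", "S"],   # S -> ( S ) S
    ("S", ")"): [],                     # S -> ε
    ("S", "$"): [],                     # S -> ε
}

def tdtd(text, trace=None):
    tokens = list(text) + ["$"]         # "$" marks the end of the input
    pos = 0
    stack = ["S"]                       # top of the stack is the end of the list
    while True:
        look_ahead = tokens[pos]
        if trace is not None:
            trace.append(("".join(reversed(stack)), "".join(tokens[pos:])))
        if not stack:
            # Accept only if the whole input has been consumed.
            return pos == len(tokens) - 1
        top = stack.pop()
        if top in TERMINALS:
            if top != look_ahead:
                return False
            pos += 1                    # advance the look-ahead pointer
        else:
            rhs = TABLE.get((top, look_ahead))
            if rhs is None:             # empty table entry: syntax error
                return False
            stack.extend(reversed(rhs)) # first symbol of rhs ends up on top

steps = []
print(tdtd("()", steps))
for stack_contents, remaining in steps:
    print(f"{stack_contents:6} {remaining}")
```

Because each table entry holds at most one production, the selection step is deterministic and no backtracking is needed.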
LL(1) (3)
Consider k = 1: LL(1) = strong LL(1).
LL(1) grammars are sufficient to describe most programming constructs.
Define:
    prefix(w) = first terminal of w
    FIRST(α)  = { terminals that are first in a sentence w derived from α }
              = { a | α ⇒* w and a = prefix(w), for some w }
    FOLLOW(A) = { terminals that are in FIRST(γ) in some sentential form βAγ }
              = { a | S ⇒* βAγ and a ∈ FIRST(γ), for some β, γ ∈ V* }
New definition of LL(1): given G, if for every pair of distinct productions A → α and A → β
    FIRST(α.FOLLOW(A)) ∩ FIRST(β.FOLLOW(A)) = ∅
then G is LL(1). Note that α and β might be ε.

LL(1) (4)
Example: G1 is defined by
    p1: A → aBA
    p2: A → ε
    p3: B → b
    p4: B → c
L(G1) = { ε, ab, ac, abab, abac, acab, acac, ... }
LL(1)-test for G1:
- p1 and p2: FIRST(aBA.FOLLOW(A)) ∩ FIRST(ε.FOLLOW(A)) = {a} ∩ {$} = ∅
- p3 and p4: FIRST(b.FOLLOW(B)) ∩ FIRST(c.FOLLOW(B)) = {b} ∩ {c} = ∅
Hence, G1 is LL(1).
[figure: parse tree of an example sentence]
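The LL(1)-test can be automated by computing FIRST and FOLLOW as a fixed point and then comparing, per nonterminal, the look-ahead sets of its alternatives. The sketch below is one common way to do this, not the algorithm from the book/reader. The grammars G1 (A → aBA | ε, B → b | c) and G2 (A → AaB | ε, B → b | c) are assumed readings of the examples, and all function names are illustrative.

```python
EPS = "ε"
END = "$"

def first_of(symbols, first, nonterminals):
    """FIRST of a symbol sequence; contains EPS if the sequence can derive ε."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in nonterminals else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def first_follow(grammar, start):
    """Iterate until the FIRST and FOLLOW sets no longer grow."""
    nonterminals = set(grammar)
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new_first = first_of(rhs, first, nonterminals)
                if not new_first <= first[lhs]:
                    first[lhs] |= new_first
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in nonterminals:
                        tail = first_of(rhs[i + 1:], first, nonterminals)
                        new_follow = tail - {EPS}
                        if EPS in tail:          # the rest can vanish
                            new_follow |= follow[lhs]
                        if not new_follow <= follow[sym]:
                            follow[sym] |= new_follow
                            changed = True
    return first, follow

def is_ll1(grammar, start):
    """True if the look-ahead sets of all alternatives are pairwise disjoint."""
    first, follow = first_follow(grammar, start)
    nonterminals = set(grammar)
    for lhs, alternatives in grammar.items():
        seen = set()
        for rhs in alternatives:
            director = first_of(rhs, first, nonterminals)
            if EPS in director:
                director = (director - {EPS}) | follow[lhs]
            if director & seen:
                return False
            seen |= director
    return True

# Assumed readings of the example grammars; [] encodes an ε-production.
G1 = {"A": [["a", "B", "A"], []], "B": [["b"], ["c"]]}
G2 = {"A": [["A", "a", "B"], []], "B": [["b"], ["c"]]}
print(is_ll1(G1, "A"), is_ll1(G2, "A"))  # True False
```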

LL(1) (5)
Example: G2 is defined by
    p1: A → AaB
    p2: A → ε
    p3: B → b
    p4: B → c
L(G2) = { ε, ab, ac, abab, abac, acab, acac, ... }
LL(1)-test for G2:
- p1 and p2: FIRST(AaB.FOLLOW(A)) ∩ FIRST(ε.FOLLOW(A)) = {a} ∩ FOLLOW(A) = {a} ∩ {a, $} = {a}
G2 is not LL(1).

LL(1) (6)
Notational convenience: instead of using the expression FIRST(α.FOLLOW(A)) for production rule A → α, we define
    DIRSET(A → α) = FIRST(α),              if ¬EMPTY(α)
                  = FIRST(α) ∪ FOLLOW(A),  otherwise
where EMPTY(α) holds if α ⇒* ε.
The DIRSET can be used to compute the parse table M:
    DIRSET(A → α) = { a1, a2, ... }
    DIRSET(B → β) = { b1, b2, ... }
Now: M(A, ai) = A → α and M(B, bi) = B → β.
In general: M(A, a) = A → α, if a ∈ DIRSET(A → α).

LL(1) (7)
... we know how to check whether G is LL(1) ...
When G is not LL(1), can it be made LL(1)? Answer: sometimes. It is not decidable, though.
Left factorisation:
    A → αβ
    A → αγ        cannot be LL(1)
becomes
    A → αB
    B → β
    B → γ         which is LL(1) if FIRST(β) ∩ FIRST(γ) = ∅
Eliminate left recursion:
    A → Aα | β    cannot be LL(1)
becomes
    A → BC
    B → β
    C → αC
    C → ε         which is LL(1) if FIRST(α.FOLLOW(C)) ∩ FIRST(ε.FOLLOW(C)) = ∅

LL(1) (8)
... it does not always work (e.g. left factorisation).
[example grammar and its repeated left-factorisation steps; not recoverable from the transcript. D is not important.]
- It may not be directly clear why this G is not LL(1).
- After a transformation step, a new nonterminal assumes the role of the original one: this method will not terminate!

LL(1) (9)
Concluding remarks:
- Section 8.3.3 of the book/reader contains an extensive and formal discussion of the LL(1)-test using the function EMPTY and the sets LEADING (generalisation of FIRST), TRAILING, FOLLOW and DIRSET.
- Algorithms are presented to automatically calculate these sets to perform the LL(1)-test; and if the grammar is LL(1), the DIRSET can directly be used to construct the parse table.
- We will briefly discuss these sets (and algorithms) in HC10 when presenting CFGs.
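Eliminating immediate left recursion is mechanical enough to sketch in code. The version below uses the common textbook form A → βA', A' → αA' | ε, which differs slightly in naming from the transformation on the slide. The function name, the list encoding (with [] for ε) and the fresh nonterminal name "A'" are assumptions, and the sketch assumes there is no production A → A.

```python
def eliminate_left_recursion(nonterminal, alternatives, fresh):
    """Rewrite A -> A α1 | ... | β1 | ... without immediate left recursion.

    Result: A -> βi fresh, and fresh -> αi fresh | ε.
    """
    # Alternatives starting with A itself, with that leading A stripped (the αi).
    recursive = [alt[1:] for alt in alternatives if alt[:1] == [nonterminal]]
    # All remaining alternatives (the βi); [] stands for ε.
    others = [alt for alt in alternatives if alt[:1] != [nonterminal]]
    if not recursive:
        return {nonterminal: alternatives}
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[]],
    }

# Assumed left-recursive grammar fragment: A -> A a B | ε
result = eliminate_left_recursion("A", [["A", "a", "B"], []], "A'")
for lhs, alts in result.items():
    for alt in alts:
        print(lhs, "->", " ".join(alt) if alt else "ε")
```

For this input the result is A → A', A' → a B A' | ε, which generates the same language with right recursion instead of left recursion.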