Parsing and Pattern Recognition


Topics in IT: Parsing and Pattern Recognition
Week: Context-Free Parsing
College of Information Science and Engineering, Ritsumeikan University

this week
  ambiguity
    in natural language
    in machine languages
  top-down, breadth-first parser: Earley's algorithm
  obtaining a derivation from the parser chart

ambiguity
[Figure: page excerpt from Jungyun Seo and Robert F. Simmons, "Syntactic Graphs: A Representation for the Union of All Ambiguous Parse Trees". Figure 6: Graph Representation and Parse Trees of a Highly Ambiguous Sentence. Sentence: "Time flies like an arrow."]
Excerpt: "... reading, because each node can be a modifier node only once in one reading. Therefore, we can focus on the arcs pointing to the same node as ambiguous points. In terms of triples, any two triples with identical modifier terms reveal a point of ambiguity, where a modifier term is dominated by more than one node. [...] every node which is not a root node and has only one in-arc must always be included in every syntactic reading. Such unambiguous nodes are common to the intersections of all possible readings. When we know the exact locations of several pieces in a jigsaw puzzle, it is much easier to place the other ..."
OED: most-used 00 words average meanings each

ambiguity
[Figure: page excerpt from Jungyun Seo and Robert F. Simmons, "Syntactic Graphs: A Representation for the Union of All Ambiguous Parse Trees". Figure: Syntactic Graph of the Example Sentence. Sentence: "I saw a man on the hill with a telescope."]
Excerpt: "... trees in a shared, packed-parse forest). We claim that a syntactic graph represented by the triples and an exclusion matrix contains all important syntactic information in the parse forest. In the next section, we motivate this work with an example. Then we briefly introduce X-bar theory ..."
Excerpt: "I saw a man on the hill with a telescope. I cleaned the lens to get a better view. When we read the first sentence, we cannot determine whether the man has a telescope or the telescope is used to see the man. This is known as the PP-attachment problem, and many researchers have proposed various ways to solve it (Frazier and Fodor 1979; Shubert 198..."
OED: most-used 00 words average meanings each

ambiguity
any grammar can be made ambiguous by adding duplicates:
  two derivations for the empty string:
    A → ɛ
    B → ɛ
  infinite number of derivations for the empty string:
    A → A | ɛ
  left- and right-recursive derivations (and any combination of them):
    A → A A | ɛ
  and ...
    A → A + A | A - A
a classic example, the dangling else:
    if (x) if (y) P; else Q;
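To make the cost of the left- and right-recursive rule concrete, the following is a minimal sketch that counts how many distinct parse trees the grammar A → A + A | A - A assigns to a flat expression. The base case (a single digit is an A) and the function name `count_parses` are assumptions added for illustration; the slide leaves the remaining alternatives of A unspecified.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count parse trees for  A -> A + A | A - A | digit  (digit case assumed)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways tokens[i:j] can be derived from A.
        if j - i == 1:
            return 1 if tokens[i].isdigit() else 0
        total = 0
        # Try every operator position k as the top-level split point.
        for k in range(i + 1, j - 1):
            if tokens[k] in "+-":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parses("1-2+3"))    # 2: (1-2)+3 and 1-(2+3)
print(count_parses("1-2+3-4"))  # 5
```

Every extra operator multiplies the number of readings, which is why a parser for such grammars has to represent many derivations at once rather than enumerate them one by one.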

ambiguity
not a problem in traditional languages
  designed/implemented by one person
  who is experienced in grammars
  who would never allow the above examples in their grammar
but: there are languages with extensible grammars
  the end user can add rules to an existing grammar
  without complete knowledge of the base grammar
  without experience in language design
and while we cannot stop them breaking the language...
  we can attempt to produce correct results even when ambiguity is introduced

ambiguity
recursive descent cannot handle these constructions
something more powerful is required:
  embracing multiple derivations (for ambiguity)
  immune to infinite left-recursion
  immune to infinite right-recursion
  etc.
in other words: a technique to parse any CFG
we already saw NFAs dealing easily with ambiguity
  the parallel matching algorithm
let's review how...

parallel recognition with NFAs
let's try to match an input string ending in "c"
  begin by adding just the start state to the first set (input position 0)
[Figure sequence: an NFA for a pattern of the form ( ... )*?c is stepped across the input one transition at a time. Each frame shows the input string and the set of permitted states at every input position. The final frame is marked "Success!".]
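The parallel matching idea can be sketched in a few lines: keep the set of all states the NFA could be in, and advance the whole set on each input character. The automaton below is a hypothetical NFA for the pattern (a|b)*c, chosen only for illustration; its state numbers, the tables `EPS` and `DELTA`, and the function names are assumptions and do not correspond to the state numbering in the slides.

```python
def epsilon_closure(states, eps):
    """All states reachable from `states` using only epsilon transitions."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def nfa_match(text, start, accept, delta, eps):
    """Match `text` by tracking the set of permitted states; no back-tracking."""
    current = epsilon_closure({start}, eps)       # set for input position 0
    for ch in text:
        moved = {t for s in current for t in delta.get((s, ch), ())}
        current = epsilon_closure(moved, eps)     # set for the next position
        if not current:                           # every alternative has died
            return False
    return accept in current


# Hypothetical NFA for (a|b)*c: state 0 is the loop entry, state 4 accepts.
EPS = {0: [1, 3], 2: [0]}
DELTA = {(1, "a"): [2], (1, "b"): [2], (3, "c"): [4]}

print(nfa_match("abc", 0, 4, DELTA, EPS))  # True
print(nfa_match("ab", 0, 4, DELTA, EPS))   # False
```

Each iteration of the loop corresponds to one column of permitted states in the animation: the set for position k+1 is computed entirely from the set for position k.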

top-down parsing
that technique was useful to turn a back-tracking NFA implementation into a parallel (non-back-tracking) NFA implementation
we saw (last week) how a top-down back-tracking parser can parse input according to a (restricted) CFG grammar by
  expanding sentential forms
  remembering positions as parser items
  backing up when the derivation cannot continue
we will remove back-tracking from top-down parsing
  using a similar technique to the one we used to remove back-tracking from NFA matching
first, a review of back-tracking top-down parsing...

top-down parsing: back-tracking
grammar for a list of two or more digits:
  L → N , L | N , N
  N → [0-9]
[Table: numbered trace of a back-tracking parse of a comma-separated list of digits, with columns for the production applied, the sentential form, and the input. The parser repeatedly expands L → N , L until the sentential form no longer matches the input, back-tracks to an earlier step and tries a different production for L, and does so a second time before the derivation succeeds. The last row lists the steps that make up the final derivation.]
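The back-tracking behaviour in the trace can be reproduced with a small recursive-descent sketch for the same grammar, L → N , L | N , N and N → [0-9]. Python generators stand in for the explicit back-tracking: each function yields every input position at which its non-terminal could end, and the caller simply moves on to the next alternative when one fails. The function names and the example input "1,2,3" are assumptions made for illustration.

```python
def parse_N(text, pos):
    # N -> [0-9]
    if pos < len(text) and text[pos].isdigit():
        yield pos + 1


def parse_comma(text, pos):
    if pos < len(text) and text[pos] == ",":
        yield pos + 1


def parse_L(text, pos):
    # First alternative: L -> N , L
    for p1 in parse_N(text, pos):
        for p2 in parse_comma(text, p1):
            yield from parse_L(text, p2)
    # Second alternative, tried after the first is exhausted: L -> N , N
    for p1 in parse_N(text, pos):
        for p2 in parse_comma(text, p1):
            yield from parse_N(text, p2)


def recognise(text):
    # Accept if any derivation of L consumes the whole input.
    return any(end == len(text) for end in parse_L(text, 0))


print(recognise("1,2,3"))  # True
print(recognise("1"))      # False: a list needs two or more digits
```

This only terminates because the grammar is right-recursive; a left-recursive rule such as S → S + M would make `parse_L`-style code call itself forever at the same position, which is the limitation the parallel technique removes.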

example: parallel parsing of CFGs
using the grammar
  P → S              # the start rule
  S → S + M | M
  M → M * T | T
  T → [0-9]
let's consider parsing a sentence of the form: digit + digit * digit
the initial input position will be: • digit + digit * digit
the initial state (item) set will be: S0 = { (P → • S) }

example: parallel parsing of CFGs
S0 =
  1 (P → • S)          initial state
inspecting the first item in the set...
  the item's position (•) is immediately before non-terminal S
  we predict an S might appear at this input position
  we add items corresponding to each production of S to the set (there are two of them)
  2 (S → • S + M)      predict from 1
  3 (S → • M)          predict from 1
inspecting the next item in the set (2)...
  the items corresponding to each production of S are already in the set, so there is nothing to do
inspecting the next item in the set (3)...
  we predict an M might appear at this input position
  we add items for each production of M to the set (there are two of them)
  4 (M → • M * T)      predict from 3
  5 (M → • T)          predict from 3
inspecting the next item in the set (4)...
  we predict an M might appear at this input position
  the items corresponding to each production of M are already in the set, so there is nothing to do
inspecting the next item in the set (5)...
  we predict a T might appear at this input position
  add an item corresponding to its production...
  6 (T → • [0-9])      predict from 5
the current item (6) is positioned before a terminal
  we check the input for the same terminal, and if present...
  add an item to the next set representing the parser state after scanning past the terminal
S1 =
  1 (T → [0-9] •)      scan from S0.6
there are no more states left in set S0, so...
  advance the input position to the next terminal symbol
  repeat the process with the next set, S1
the current item (now in S1) is positioned at the end of its production
  we have completely recognised a T
  we can advance past it in the set where it was originally predicted: • T becomes T •
  but, how do we know that the item we are looking for is number 5 in set S0?

example: parallel parsing of CFGs
let's add the original input position to our state set items
  instead of storing sets of (X → α • β)
  we will store sets of (j : X → α • β)
  where j is the input position where this X was originally predicted
then, when we reach the end of an item, such as (j : X → α β •)
  we look in set Sj for any items that originally predicted X
  those items will contain α • X β
  which can now be advanced to α X • β

rebooted example
using the grammar
  P → S              # the start rule
  S → S + M | M
  M → M * T | T
  T → [0-9]
let's consider parsing a sentence of the form: digit + digit * digit
the initial input position will be: • digit + digit * digit
the initial state set will be: S0 = { (0 : P → • S) }

complete example
state sets and input positions: S0 digit S1 + S2 digit S3 * S4 digit S5

S0, filled by prediction exactly as before:
  (0 : P → • S)
  (0 : S → • S + M)
  (0 : S → • M)
  (0 : M → • M * T)
  (0 : M → • T)
  (0 : T → • [0-9])
scanning the first digit puts (0 : T → [0-9] •) into S1

S1
at (0 : T → [0-9] •)
  we completed a T at position 0, so...
  copy the corresponding item from S0 to S1 with the • T changed to T •
  adds (0 : M → T •)
at (0 : M → T •)
  we completed an M at position 0, so...
  copy the corresponding items to S1 with the • M changed to M •
  adds (0 : M → M • * T) and (0 : S → M •)
at (0 : M → M • * T)
  the next item requires * on the input, which is not there
  move on to the next item
at (0 : S → M •)
  we have completed an S at position 0
  copy the corresponding items with • S changed to S •
  adds (0 : S → S • + M) and (0 : P → S •)
at (0 : S → S • + M)
  we need to scan +, which is the next input symbol, so...
  copy the current item to the next set, with • + changed to + •
  adds (0 : S → S + • M) to S2
at (0 : P → S •)
  we have also completed the start rule P in position 0
  this input (the first digit alone) would be a valid parse, but...
  we are not at the end of the input, so we can continue
there are no more items left in S1, so...
  advance the input to the next terminal symbol
  begin considering the next state set, S2

S2
at (0 : S → S + • M)
  predict an M
  add two items corresponding to the two productions for M
  adds (2 : M → • M * T) and (2 : M → • T)
at (2 : M → • M * T)
  predict an M again
  all corresponding items are already present in S2, so do nothing
at (2 : M → • T)
  predict a T
  add the corresponding item
  adds (2 : T → • [0-9])
at (2 : T → • [0-9])
  we need to scan a digit, which we have, so add the scanned item to the next state set, S3
  adds (2 : T → [0-9] •) to S3
no more items left in S2
  advance the input to the next terminal, and consider the next state set, S3

S3
at (2 : T → [0-9] •)
  we have completed a T in S2
  add the corresponding item to the set, changing • T to T •
  adds (2 : M → T •)
at (2 : M → T •)
  we have completed an M in S2
  add the corresponding items to the set, changing • M to M •
  adds (2 : M → M • * T) and (0 : S → S + M •)
at (2 : M → M • * T)
  we need to scan *, which is present on the input
  add the corresponding item to the next set, changing • * to * •
  adds (2 : M → M * • T) to S4
at (0 : S → S + M •)
  we have completed an S in S0
  add the corresponding items to the set, changing • S to S •
  adds (0 : P → S •) and (0 : S → S • + M)
at (0 : P → S •)
  we have completed P in S0
  this input (digit + digit) would be a valid parse, but there is more input
  continue with the next item
at (0 : S → S • + M)
  we need to scan +, which is not present
  move on to the next item
no more items in S3
  advance input, consider the next state set S4

S4
at (2 : M → M * • T)
  predict a T
  add the corresponding item
  adds (4 : T → • [0-9])
at (4 : T → • [0-9])
  need to scan a digit, which is present on the input
  copy the item to the next set, with • [0-9] changed to [0-9] •
  adds (4 : T → [0-9] •) to S5
no more items in S4
  advance input, consider S5

S5
at (4 : T → [0-9] •)
  completed a T in S4
  copy the corresponding item to the current set, with • T changed to T •
  adds (2 : M → M * T •)
at (2 : M → M * T •)
  completed an M in S2
  copy the corresponding items to the current set, with • M changed to M •
  adds (2 : M → M • * T) and (0 : S → S + M •)
at (2 : M → M • * T)
  need to scan *, which is not present
  ignore the item, move on to the next item
at (0 : S → S + M •)
  completed an S in S0
  copy the items, with • S changed to S •
  adds (0 : S → S • + M) and (0 : P → S •)
at (0 : S → S • + M)
  need to scan +, which is not present on the input
  move on to the next item
at (0 : P → S •)
  completed P (the start symbol) in S0 (start of input)
  this (digit + digit * digit) is a valid parse of the input
  there is no more input: accept this as one possible derivation
there are no more items in this set, and no more sets
  there are no more possible parses of the input
  all derivations accepted so far are possible parses of the input

complete example
S0 (before the first digit)
  (0 : P → • S)
  (0 : S → • S + M)
  (0 : S → • M)
  (0 : M → • M * T)
  (0 : M → • T)
  (0 : T → • [0-9])
S1 (after the first digit)
  (0 : T → [0-9] •)
  (0 : M → T •)
  (0 : M → M • * T)
  (0 : S → M •)
  (0 : S → S • + M)
  (0 : P → S •)
S2 (after +)
  (0 : S → S + • M)
  (2 : M → • M * T)
  (2 : M → • T)
  (2 : T → • [0-9])
S3 (after the second digit)
  (2 : T → [0-9] •)
  (2 : M → T •)
  (2 : M → M • * T)
  (0 : S → S + M •)
  (0 : S → S • + M)
  (0 : P → S •)
S4 (after *)
  (2 : M → M * • T)
  (4 : T → • [0-9])
S5 (after the third digit)
  (4 : T → [0-9] •)
  (2 : M → M * T •)
  (2 : M → M • * T)
  (0 : S → S + M •)
  (0 : S → S • + M)
  (0 : P → S •)
we can formulate this process as a few simple rules...

parsing CFGs: items
let
  upper-case letters (e.g., X or Y) be non-terminal symbols
  lower-case letters (e.g., p or q) be terminal symbols
  Greek letters (e.g., α or β) be any sequence of symbols
an item is a production with a dot indicating the current position
  X → α • β
and so...
  X → • α β γ  =  an item that may soon match X
  X → α • β γ  =  an item that has begun matching X
  X → α β • γ  =  an item that has almost matched X
  X → α β γ •  =  an item that has finished matching X

parsing CFGs: state sets
a state is a tuple containing an item and an input position
  (i : X → α • β)
  the item's production says what we are trying to match
  the dot says how much of the production we have matched
  the position i indicates where the production began: the origin position of X
for every input position, the parser generates a state set
  position 0 is before the first token of the input sentence
  position n is the position after accepting the n-th token
  the state set at input position k is called Sk
if the start symbol is S and the start rule is S → α then, initially,
  S0 = { (0 : S → • α) }
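One possible in-memory form of such a state is sketched below. The class name `State`, its field names, and the helper methods are assumptions chosen for illustration, not something prescribed by the slides. The class is frozen so that states are hashable and can be stored in sets without duplicates.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    origin: int              # input position where this production began
    lhs: str                 # the non-terminal being matched (X)
    rhs: Tuple[str, ...]     # the symbols of the production
    dot: int = 0             # how many symbols of rhs have been matched

    def next_symbol(self) -> Optional[str]:
        """Symbol just after the dot, or None when the item is finished."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def advanced(self) -> "State":
        """The same item with the dot moved one symbol to the right."""
        return State(self.origin, self.lhs, self.rhs, self.dot + 1)

    def __str__(self) -> str:
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"({self.origin} : {self.lhs} → {' '.join(symbols)})"


s = State(0, "S", ("S", "+", "M"))
print(s)             # (0 : S → • S + M)
print(s.advanced())  # (0 : S → S • + M)
```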

prsing CFGs there re three possile forms tht ech stte might tke: stte in S k interprettion (j : X α p β) we expect to see terminl p next on the input (j : X α Y β) we expect to mtch the entire non-terminl Y next (j : X γ ) we hve completely mtched n X t input position j ech cse suggests n pproprite response: stte in S k interprettion (j : X α p β) try to scn p t the current position (j : X α Y β) predict we might see Y t the current position (j : X γ ) complete the mtching of X in stte S j 88

parsing CFGs

each response

    state in S_k           response
    (j : X → α • p β)      try to scan p at the current position
    (j : X → α • Y β)      predict we might see Y at the current position
    (j : X → γ •)          complete the matching of X in set S_j

translates into manipulations of (additions to) the state sets:

    state in S_k           new state(s)
    (j : X → α • p β)      scan p: if the next input token is p, then
                           add (j : X → α p • β) to state set S_{k+1}
    (j : X → α • Y β)      predict Y: for every production in the grammar for Y, Y → γ,
                           add (k : Y → • γ) to S_k
    (j : X → γ •)          complete X: for all states in S_j of the form (i : Y → α • X β),
                           add (i : Y → α X • β) to S_k

parsing CFGs

the result is a chart of the paths taken during parsing
    both successful and unsuccessful

all successful derivations will have a completion of the start rule

steps taken within derivations can be found easily
    identify a completion of the initial (start) rule
    follow completions and scans backwards to the initial rule in S_0
    mark each step encountered

a parse tree can then be reconstructed from the marked steps
    either following scans and predictions forwards (top-down)
    or following scans and completions backwards (bottom-up)
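One way to make the backward walk concrete is to record, for every state added by a scan or a completion, which state it was advanced from and what it advanced over. The sketch below does this for the example grammar. It is an illustration under assumptions, not the slides' implementation: the names `GRAMMAR`, `matches`, `parse`, `back` and `tree` are hypothetical, the terminal `[0-9]` is matched with `str.isdigit`, only the first derivation of each state is kept, and empty productions are not handled. The input `2+3*4` is an arbitrary example.

```python
GRAMMAR = {
    "P": [("S",)],                       # the start rule
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("[0-9]",)],
}


def matches(terminal, token):
    # Assumption: "[0-9]" stands for any single digit; other terminals are literal.
    return token.isdigit() if terminal == "[0-9]" else terminal == token


def parse(tokens, grammar=GRAMMAR, start="P"):
    # A state is (origin, lhs, rhs, dot); chart[k] is state set S_k.
    chart = [[] for _ in range(len(tokens) + 1)]
    back = {}  # (k, state) -> ((k', previous state), token or (k, completed state))

    def add(k, state, link=None):
        if state not in chart[k]:
            chart[k].append(state)
            if link is not None:
                back[(k, state)] = link      # remember how this state arose

    for rhs in grammar[start]:
        add(0, (0, start, rhs, 0))
    for k in range(len(chart)):
        n = 0
        while n < len(chart[k]):             # S_k may grow while we walk it
            state = chart[k][n]
            origin, lhs, rhs, dot = state
            n += 1
            if dot == len(rhs):              # complete
                for prev in list(chart[origin]):
                    if prev[3] < len(prev[2]) and prev[2][prev[3]] == lhs:
                        advanced = prev[:3] + (prev[3] + 1,)
                        add(k, advanced, ((origin, prev), (k, state)))
            elif rhs[dot] in grammar:        # predict
                for gamma in grammar[rhs[dot]]:
                    add(k, (k, rhs[dot], gamma, 0))
            elif k < len(tokens) and matches(rhs[dot], tokens[k]):   # scan
                add(k + 1, (origin, lhs, rhs, dot + 1), ((k, state), tokens[k]))

    def tree(key):
        # Follow scans and completions backwards, collecting children right to left.
        lhs = key[1][1]
        children = []
        while key in back:
            key, child = back[key]
            children.append(child if isinstance(child, str) else tree(child))
        return (lhs, children[::-1])

    for state in chart[-1]:
        origin, lhs, rhs, dot = state
        if origin == 0 and lhs == start and dot == len(rhs):
            return tree((len(tokens), state))    # a completion of the start rule
    return None                                  # no valid parse


print(parse(list("2+3*4")))
```

The nested tuples returned by `parse` group `3 * 4` under a single M, so multiplication binds tighter than addition, as the grammar intends.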

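parsing CFGs

first valid parse: a single digit
(Sk.n denotes the n-th state of state set S_k; a bare number refers to a state in the same set)

    S0.1   (P → • S, 0)        initial state
    S0.3   (S → • M, 0)        predict from 1
    S0.5   (M → • T, 0)        predict from 3
    S0.6   (T → • [0-9], 0)    predict from 5
    S1.1   (T → [0-9] •, 0)    scan from S0.6 = (T → • [0-9], 0)
    S1.2   (M → T •, 0)        complete from 1, S0.5 = (M → • T, 0)
    S1.4   (S → M •, 0)        complete from 2, S0.3 = (S → • M, 0)
    S1.6   (P → S •, 0)        complete from 4, S0.1 = (P → • S, 0)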

parsing CFGs

another valid parse: digit + digit

    S0.1   (P → • S, 0)        initial state
    S0.3   (S → • M, 0)        predict from 1
    S0.5   (M → • T, 0)        predict from 3
    S0.6   (T → • [0-9], 0)    predict from 5
    S1.1   (T → [0-9] •, 0)    scan from S0.6 = (T → • [0-9], 0)
    S1.2   (M → T •, 0)        complete from 1, S0.5 = (M → • T, 0)
    S1.4   (S → M •, 0)        complete from 2, S0.3 = (S → • M, 0)
    S1.5   (S → S • + M, 0)    complete from 4, S0.2 = (S → • S + M, 0)
    S2.1   (S → S + • M, 0)    scan from S1.5 = (S → S • + M, 0)
    S2.3   (M → • T, 2)        predict from 1
    S2.4   (T → • [0-9], 2)    predict from 3
    S3.1   (T → [0-9] •, 2)    scan from S2.4
    S3.2   (M → T •, 2)        complete from 1, S2.3 = (M → • T, 2)
    S3.4   (S → S + M •, 0)    complete from 2, S2.1 = (S → S + • M, 0)
    S3.6   (P → S •, 0)        complete from 4, S0.1 = (P → • S, 0)

parsing CFGs

final valid parse: digit + digit * digit

    S0.1   (P → • S, 0)        initial state
    S0.3   (S → • M, 0)        predict from 1
    S0.5   (M → • T, 0)        predict from 3
    S0.6   (T → • [0-9], 0)    predict from 5
    S1.1   (T → [0-9] •, 0)    scan from S0.6
    S1.2   (M → T •, 0)        complete from 1, S0.5
    S1.4   (S → M •, 0)        complete from 2, S0.3
    S1.5   (S → S • + M, 0)    complete from 4, S0.2
    S2.1   (S → S + • M, 0)    scan from S1.5
    S2.3   (M → • T, 2)        predict from 1
    S2.4   (T → • [0-9], 2)    predict from 3
    S3.1   (T → [0-9] •, 2)    scan from S2.4
    S3.2   (M → T •, 2)        complete from 1, S2.3
    S3.3   (M → M • * T, 2)    complete from 2, S2.2
    S4.1   (M → M * • T, 2)    scan from S3.3
    S4.2   (T → • [0-9], 4)    predict from 1
    S5.1   (T → [0-9] •, 4)    scan from S4.2
    S5.2   (M → M * T •, 2)    complete from 1, S4.1
    S5.4   (S → S + M •, 0)    complete from 2, S2.1
    S5.6   (P → S •, 0)        complete from 4, S0.1

Earley parsing

this form of chart parser was invented by Jay Earley in 1968
    hence we call it an Earley Parser

it can parse any context-free grammar
    LL and LR compatible grammars in linear time: O(n)
    any non-ambiguous grammars in at most quadratic time: O(n^2)
    any ambiguous CF grammar in (at most) cubic time: O(n^3)
    where n is the size of the input sentence

it performs particularly well with left-recursive rules
    very good for left-associative operators (e.g., most arithmetic operators)

it will deliver all valid parses of the input
    and useful information even if there are no valid parses

Earley parsers are popular for natural language processing

Earley parsing

    function earley-parse(input, grammar) =
        add (0 : S → • γ) to S_0
        for each state set S_0, S_1, ..., S_k
            for each state s ∈ S_k
                if s = (j : X → α • p β)                 /* scan */
                    if input[k] = p
                        add (j : X → α p • β) to S_{k+1}
                else if s = (j : X → α • Y β)            /* predict */
                    for each (Y → γ) ∈ grammar
                        add (k : Y → • γ) to S_k
                else /* s = (j : X → γ •) */             /* complete */
                    for each (i : A → α • X β) ∈ S_j
                        add (i : A → α X • β) to S_k
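The pseudocode maps almost line for line onto a short recogniser. The sketch below is one possible rendering, not a definitive implementation: the names `GRAMMAR`, `matches`, `earley_chart` and `accepts` are hypothetical, states are stored as `(origin, lhs, rhs, dot)` tuples, the terminal `[0-9]` is assumed to match any single digit, and grammars with empty productions are not handled. The input `2+3*4` is an arbitrary example.

```python
GRAMMAR = {
    "P": [("S",)],                       # the start rule
    "S": [("S", "+", "M"), ("M",)],
    "M": [("M", "*", "T"), ("T",)],
    "T": [("[0-9]",)],
}


def matches(terminal, token):
    # Assumption: "[0-9]" stands for any single digit; other terminals are literal.
    return token.isdigit() if terminal == "[0-9]" else terminal == token


def earley_chart(tokens, grammar=GRAMMAR, start="P"):
    # chart[k] is state set S_k, kept as a list so that order of discovery is visible.
    chart = [[] for _ in range(len(tokens) + 1)]

    def add(k, state):
        if state not in chart[k]:
            chart[k].append(state)

    for gamma in grammar[start]:
        add(0, (0, start, gamma, 0))             # add (0 : S -> . gamma) to S_0

    for k in range(len(chart)):
        n = 0
        while n < len(chart[k]):                 # S_k may grow while we walk it
            j, lhs, rhs, dot = chart[k][n]
            n += 1
            if dot == len(rhs):                  # complete
                for (i, a, a_rhs, a_dot) in list(chart[j]):
                    if a_dot < len(a_rhs) and a_rhs[a_dot] == lhs:
                        add(k, (i, a, a_rhs, a_dot + 1))
            elif rhs[dot] in grammar:            # predict
                for gamma in grammar[rhs[dot]]:
                    add(k, (k, rhs[dot], gamma, 0))
            elif k < len(tokens) and matches(rhs[dot], tokens[k]):   # scan
                add(k + 1, (j, lhs, rhs, dot + 1))
    return chart


def accepts(tokens, grammar=GRAMMAR, start="P"):
    # The input is valid if the last state set holds a finished start rule with origin 0.
    chart = earley_chart(tokens, grammar, start)
    return any(j == 0 and lhs == start and dot == len(rhs)
               for (j, lhs, rhs, dot) in chart[-1])


chart = earley_chart(list("2+3*4"))
print([len(state_set) for state_set in chart])
print(accepts(list("2+3*4")), accepts(list("2+")))
```

Walking each state set with an index rather than a `for` loop lets predictions and completions added to S_k during the walk be processed in the same pass, which is what the pseudocode's inner loop relies on.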

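homework

practice Earley parsing using an input of the form 9 * d + d, where each d stands for a single digit

    P → S                # the start rule
    S → S + M
      | M
    M → M * T
      | T
    T → [0-9]

fill in the state sets:

    S_0
    9    S_1
    *    S_2
    d    S_3
    +    S_4
    d    S_5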

glossary

complete: a step in an Earley parser that finishes recognising a production. All states that were predicting the corresponding non-terminal at the position where the rule began can be advanced and added to the current state set.

extensible: a language whose syntax (or semantics) can be extended by the user, who is often not an expert in the domain of syntax (or semantics).

item: a parser state, used as an element of a state set or parser state machine, containing a production rule and a dot representing the progress that has been made in recognising the right-hand side of that rule.

origin position: the input position at which the recognising of a production's right-hand side was begun.

position: a numerical offset from the start of the sentence, measured in tokens.

predict: a step in an Earley parser that predicts seeing a non-terminal next in the input. All productions corresponding to the non-terminal are added to the current state set.

scan: a step in an Earley parser that makes progress by advancing a dot over a terminal symbol when the next input symbol matches it. Progress is noted by adding a new item to the next state set with the dot moved to after the terminal symbol in the production's right-hand side.

state: a tuple in an Earley parser combining a parsing item with an origin position recording where in the sentence the parsing of the rule was begun.

state set: in an Earley parser, a set of states associated with a specific position in the input.

tuple: a pair of values that are associated with each other. In an Earley parser, each state is a tuple that combines a parser item with an input position.