CSCI Compiler Construction

Similar documents
CYK Algorithm for Parsing General Context-Free Grammars

Properties of Context-Free Languages

Lecture 12 Simplification of Context-Free Grammars and Normal Forms

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Even More on Dynamic Programming

Grammars and Context Free Languages

Grammars and Context Free Languages

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

CSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages

Computing if a token can follow

Remembering subresults (Part I): Well-formed substring tables

Formal Languages, Grammars and Automata Lecture 5

Introduction to Computational Linguistics

h>p://lara.epfl.ch Compiler Construc/on 2011 CYK Algorithm and Chomsky Normal Form

straight segment and the symbol b representing a corner, the strings ababaab, babaaba and abaabab represent the same shape. In order to learn a model,

RNA Secondary Structure Prediction

Chap. 7 Properties of Context-free Languages

CS311 Computational Structures. NP-completeness. Lecture 18. Andrew P. Black Andrew Tolmach. Thursday, 2 December 2010

On the Sizes of Decision Diagrams Representing the Set of All Parse Trees of a Context-free Grammar

Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings

Formal Languages and Automata

CS20a: summary (Oct 24, 2002)

Context-Free Grammars (and Languages) Lecture 7

Context-Free Languages (Pre Lecture)

CFLs and Regular Languages. CFLs and Regular Languages. CFLs and Regular Languages. Will show that all Regular Languages are CFLs. Union.

The Pumping Lemma for Context Free Grammars

Introduction to Theory of Computing

A parsing technique for TRG languages

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Chomsky Normal Form for Context-Free Gramars

Follow sets. LL(1) Parsing Table

CS 373: Theory of Computation. Fall 2010

CPSC 313 Introduction to Computability

Non-context-Free Languages. CS215, Lecture 5 c

Foundations of Informatics: a Bridging Course

Statistical Machine Translation

Computability Theory

Address for Correspondence

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs

Automata Theory CS F-08 Context-Free Grammars

Decidable and undecidable languages

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

Pushdown Automata (Pre Lecture)

CS375: Logic and Theory of Computing

Finite Automata and Formal Languages TMV026/DIT321 LP Useful, Useless, Generating and Reachable Symbols

This lecture covers Chapter 7 of HMU: Properties of CFLs

Compiling Techniques

Maschinelle Sprachverarbeitung

CPS 220 Theory of Computation

Theory of Computation 8 Deterministic Membership Testing

Maschinelle Sprachverarbeitung

Decision problem of substrings in Context Free Languages.

MTH401A Theory of Computation. Lecture 17

Theory of Computation - Module 3

Parsing. Context-Free Grammars (CFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 26

Simplification and Normalization of Context-Free Grammars

MA/CSSE 474 Theory of Computation

TAFL 1 (ECS-403) Unit- III. 3.1 Definition of CFG (Context Free Grammar) and problems. 3.2 Derivation. 3.3 Ambiguity in Grammar

NPDA, CFG equivalence

Einführung in die Computerlinguistik

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Introduction to Bottom-Up Parsing

Plan for 2 nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form

Context Free Grammars: Introduction

Note: In any grammar here, the meaning and usage of P (productions) is equivalent to R (rules).

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Introduction to Bottom-Up Parsing

Lecture 11 Context-Free Languages

Homework 4 Solutions. 2. Find context-free grammars for the language L = {a n b m c k : k n + m}. (with n 0,

Computational Models - Lecture 5 1

Grammars and Context-free Languages; Chomsky Hierarchy

Parsing beyond context-free grammar: Parsing Multiple Context-Free Grammars

Syntax Analysis Part I

Syntax Analysis Part I

Context-Free Grammars: Normal Forms

Notes for Comp 497 (454) Week 10

Context-Free Grammar

Solution. S ABc Ab c Bc Ac b A ABa Ba Aa a B Bbc bc.

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Recursive descent for grammars with contexts

Introduction to Bottom-Up Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Context Free Grammars

Compiling Techniques

Introduction to Bottom-Up Parsing

Properties of context-free Languages

Chomsky and Greibach Normal Forms

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Einführung in die Computerlinguistik Kontextfreie Grammatiken - Formale Eigenschaften

Syntactic Analysis. Top-Down Parsing

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Context-free Grammars and Languages

Context Free Languages and Grammars

Transcription:

CSCI 742 - Compiler Construction Lecture 12 Cocke-Younger-Kasami (CYK) Algorithm Instructor: Hossein Hojjat February 20, 2017

Recap: Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule is of the form A BC A a where a is any terminal A,B,C are non-terminals B, C cannot be start variable We allow the rule S ɛ if ɛ L 1

Recap: Conversion to CNF 1. remove unproductive non-terminals (optional) 2. remove unreachable non-terminals (optional) 3. make terminals occur alone on right-hand side 4. reduce arity of every production to 2 5. remove ɛ-production rules X ɛ 6. remove unit productions X Y 7. unproductive non-terminals 8. unreachable non-terminals 2

Approaches to Parsing Top Down (Goal driven) S Start from the start non-terminal S Grow parse tree downwards to match the input word Expr Plus Term Bottom Up (Data Driven) Start from the input word Term num Build up parse tree whose root is the start non-terminal S num 3

CYK (Cocke-Younger-Kasami) One of the earliest recognition and parsing algorithms Bottom-up parsing (starts with words) Independently developed by J. Cocke, D. Younger, T. Kasami (CYK) Build solutions compositionally from sub-solutions Based on a dynamic programming approach 4

CYK algorithm: Basic idea For a grammar G and a word w: For every substring v 1 of length 1, find all non-terminals A such that A v 1 For every substring v 2 of length 2, find all non-terminals A such that A v 2. For the unique substring w of length w, find all non-terminals A such that A w Check whether S belongs to the last set 5

CYK algorithm Standard CYK operates only on context-free grammars in Chomsky Normal Form (CNF) To illustrate the benefits of CNF we use a non-cnf grammar 6

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9 Sign + 1 2. 3 e + 4 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ Sign 0 1 8 9 Sign + 1 2. 3 e + 4 We start with substrings of length 1 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9 Sign + 1 2. 3 e + Sign 4 We start with substrings of length 1 Grammar has unit-rules: after finding a non-terminal, we need to repetitively search for other non-terminals that derive the same word Chain of rules: 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9, Fraction Sign + 1 2. 3 e + Sign 4 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9, Fraction Scale Sign + 1 2. 3 e + Sign 4 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9, Fraction, Real Scale Sign + 1 2. 3 e + Sign 4 7

Example word: 12.3e+4 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9,, Real Fraction, Real Scale Sign + 1 2. 3 e + Sign 4 7

Example word: 12.3 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9, Fraction Sign + 1 2. 3 8

Example word: 12.3 Real Real Fraction Scale Fraction. Scale e Sign ɛ 0 1 8 9, Fraction Scale, Real, Real ɛ Sign + 1 ɛ 2 ɛ. ɛ 3 ɛ We neglected ɛ-production rules until now ɛ-rules can be used between any two symbols, at the beginning and at the end 8

Substring Recognition s 1,4 s 2,4 word: w = a 1 a 2 a n s 1,3 s 3,4 s 2,3 s s 1,1 1,2 s 2,2 s 3,3 s 4,4 a 1 a 2 a 3 a 4 X i,j set of all non-terminals deriving the substring s i,j Start non-terminal S derives w (= s 1,n ) if and only if S is a member of X 1,n 9

Substring Recognition How to check if the RHS of rule A A 1 A 2 A n derives s i,j? A 1 A 2 A n (i < k < j) a i a k 1 a k a j a k+1 We need to check all the possible matching of substrings with A i non-terminals Results of checking subproblem solutions are reusable Subproblem results are computed once and then memoized for later Matching in CNF: A 1 A 2 a i a k 1 a k a j 10

Recognition Table X 1,5 X 1,4 X 1,3 X 2,5 X 2,4 X 3,5 X4,5 X 1,2 X 2,3 X 3,4 X 1,1 X 2,2 X 3,3 X 4,4 X 5,5 a 1 a 2 a 3 a 4 a 5 Each row corresponds to one length of substrings Bottom Row: words of length 1 Second from Bottom Row: words of length 2. Top Row: word w 11

Example S AB CB grammar: A BA a B BC b intput: babab C AC a B A, C B A, C B b a b a b 12

Example S AB CB grammar: A BA a B BC b intput: babab C AC a A, B S A, B S B A, C B A, C B b a b a b 12

Example S AB CB grammar: A BA a B BC b intput: babab C AC a S S S A, B S A, B S B A, C B A, C B b a b a b 12

Example S AB CB grammar: A BA a B BC b intput: babab C AC a S, A S S S A, B S A, B S B A, C B A, C B b a b a b 12

Example S AB CB grammar: A BA a B BC b intput: babab C AC a S S, A S S S A, B S A, B S B A, C B A, C B b a b a b 12

Exercise What is the time and space complexity of CYK? Assume size of input word is n, number of non-terminals is r 13

CYK Problem Parsers usually parse the original grammar without further modification Grammar reflects the actual structure of language Conversion to CNF can make it difficult to keep the intended structure 14