Introduction to Computational Linguistics

Size: px
Start display at page:

Download "Introduction to Computational Linguistics"

Transcription

1 Introduction to Computational Linguistics Olga Zamaraeva (2018) Based on Bender (prev. years) University of Washington May 3, / 101

2 Midterm Project Milestone 2: due Friday Assgnments 4& 5 due dates moved Start as soon as you can though 2 / 101

3 Overview We will focus on the concept of parsing and why naive parsing is inefficient idea Then will briefly cover and then Assignment 4 asks you to study it in detail next time: probabilistic parsing (also need for Assignment 4) 3 / 101

4 Parsing Recognizing string as input and assigning structure to it Syntactic parsing: assigning syntactic structure Semantic parsing: assigning semantic structure 4 / 101

5 Parsing: Making explicit structure that is inherent (implicit) in natural language strings What is that structure? Why would we need it? 5 / 101

6 Bottom Up vs. Top Down parsing S / \??? a b c d d f 6 / 101

7 Recursive Top-Down Parsing slide from: 7 / 101

8 Recursive Top-Down Parsing slide from: 8 / 101

9 Recursive Top-Down Parsing slide from: 9 / 101

10 Recursive Top-Down Parsing slide from: 10 / 101

11 Recursive Top-Down Parsing slide from: 11 / 101

12 Recursive Top-Down Parsing What is the time complexity of what we just saw? 12 / 101

13 Recursive Top-Down Parsing What is the time complexity of what we just saw? Exponential in n! (Meaning, as we increase the length of input, the time to do parsing increases exponentially) (Which is very very bad) 13 / 101

14 **Optional exercise: Recursive Parsing Running Time S A A a A b a A c ɛ Consider a string a n c n Convince yourself that you will expand A 2 n times 14 / 101

15 Example 15 / 101

16 Bottom-up parsing Book the flight through Houston 16 / 101

17 Bottom-up parsing N D N P PN Book the flight through Houston 17 / 101

18 Bottom-up parsing D P PN N the N through Houston Book flight 18 / 101

19 Bottom-up parsing NP PP N D P PN Book the N through Houston flight No more possibilities! Backtrack / 101

20 Bottom-up parsing D PP N the N P PN Book flight through Houston 20 / 101

21 Bottom-up parsing D N the PP Book N P PN flight through Houston 21 / 101

22 Bottom-up parsing NP N D Book the PP N P PN flight through Houston No more possibilities! Backtrack... Up to where?.. 22 / 101

23 Bottom-up parsing Backtrack to the very beginning, actually! Book the flight through Houston 23 / 101

24 Bottom-up parsing V D N P PN Book the flight through Houston 24 / 101

25 Bottom-up parsing V D P PN Book the N through Houston flight 25 / 101

26 Bottom-up parsing V NP PP Book D P PN the N through Houston flight 26 / 101

27 Bottom-up parsing VP PP V NP P PN Book D through Houston the N flight 27 / 101

28 Bottom-up parsing VP VP PP V NP P PN Book D through Houston the N flight 28 / 101

29 Bottom-up parsing S VP VP PP V NP P PN Book D through Houston the N flight 29 / 101

30 Bottom-up parsing Or, we could have instead done: V D PP Book the N P PN flight through Houston 30 / 101

31 Bottom-up parsing V D Book the PP N P PN flight through Houston 31 / 101

32 Bottom-up parsing V NP Book D the PP N P PN flight through Houston 32 / 101

33 Bottom-up parsing VP V NP Book D the PP N P PN flight through Houston 33 / 101

34 Bottom-up parsing S VP V NP Book D the PP N P PN flight through Houston 34 / 101

35 How do we make sure we get both trees? Go through all possibilities for productions 35 / 101

36 Top-Down Parsing S 36 / 101

37 Top-Down Parsing S NP VP 37 / 101

38 Top-Down Parsing S NP VP Pronoun 38 / 101

39 Top-Down Parsing S NP VP 39 / 101

40 Top-Down Parsing S NP VP PN 40 / 101

41 Top-Down Parsing S NP VP 41 / 101

42 Top-Down Parsing S NP VP D 42 / 101

43 Top-Down Parsing S NP VP 43 / 101

44 Top-Down Parsing S 44 / 101

45 Top-Down Parsing S Aux NP VP 45 / 101

46 Top-Down Parsing S 46 / 101

47 Top-Down Parsing S VP 47 / 101

48 Top-Down Parsing S VP V 48 / 101

49 Top-Down Parsing S VP V Book Yes, but we have more input still / 101

50 Top-Down Parsing S VP V NP 50 / 101

51 Top-Down Parsing S VP V NP Book 51 / 101

52 Top-Down Parsing S VP V NP Book Pronoun 52 / 101

53 Top-Down Parsing S VP V NP Book 53 / 101

54 Top-Down Parsing S VP V NP Book PN 54 / 101

55 Top-Down Parsing S VP V NP Book 55 / 101

56 Top-Down Parsing S VP V NP Book D 56 / 101

57 Top-Down Parsing S VP V NP Book D the 57 / 101

58 Top-Down Parsing S VP V NP Book D the N 58 / 101

59 Top-Down Parsing S VP V NP Book D the N flight Yes, but we have more input still / 101

60 Top-Down Parsing S VP V NP Book D the N 60 / 101

61 Top-Down Parsing S VP V NP Book D the N N 61 / 101

62 Top-Down Parsing S VP V NP Book D the N N flight Nope... Backtrack again / 101

63 Top-Down Parsing S VP V NP Book D the 63 / 101

64 Top-Down Parsing S VP V NP Book D the PP 64 / 101

65 Top-Down Parsing S VP V NP Book D the PP N 65 / 101

66 Top-Down Parsing S VP V NP Book D the PP N flight 66 / 101

67 Top-Down Parsing S VP V NP Book D the PP N P NP flight 67 / 101

68 Top-Down Parsing S VP V NP Book D the PP N P NP flight through 68 / 101

69 Top-Down Parsing S VP V NP Book D the PP N P NP flight through Pronoun 69 / 101

70 Top-Down Parsing S VP V NP Book D the PP N P NP flight through 70 / 101

71 Top-Down Parsing S VP V NP Book D the PP N P NP flight through PN 71 / 101

72 Top-Down Parsing S V VP NP Book D the PP N P NP flight through PN Houston 72 / 101

73 Top-Down Parsing Could we have gotten the second tree by top-down parsing? 73 / 101

74 Top-Down Parsing Could we have gotten the second tree by top-down parsing? Yes; it is a matter of which rule happened to be on the top of the stack We grabbed VP V NP But the option VP VP PP is also on the stack somewhere Thus the returned parse is subject to an arbitrary listing of rules in the grammar 74 / 101

75 Bottom Up vs. Top Down parsing Top-down parsers do not waste time exploring hypotheses not leading to S...but do waste time exploring hypotheses not matching the input Bottom-up parsers do not waste time exploring hypotheses not matching input...but do waste time exploring hypotheses not leading to S Both can take exponential time (in the worst case, easier shown on abstract CFG) Some recursive parsers are O(n 4 ) An answer to poor time complexity: dynamic O(n 3 ) 75 / 101

76 Example: The Fibonacci numbers Recursive definition: f(0) = 0; f(1) = 1 f(n) = f(n-1) + f(n-2) f(100) = / 101

77 The Fibonacci numbers: naive implementation Since we have a recursive definition, let s implement the Fibonacci numbers printer recursively! def fibonacci(n): if n in [0,1]: return n return fibonacci(n-1) + fibonacci(n-2) What s the problem with this? 77 / 101

78 The Fibonacci numbers: better implementation def fibonacci(n): return fibonacci_helper(n,{}) def fibonacci_helper(n,memo): if n in [0,1]: return n if not n in memo: memo[n] = fibonacci_helper(n-1,memo) + fibonacci_helper(n-2,memo) return memo[n] 78 / 101

79 Fill in a table with solutions to subproblems Then can just look up momentarily the precomputed solution No need to perform the same computation many times 79 / 101

80 Fibonacci numbers f(0) = 0 f(1) = 1 f(2) = f(1) + f(0) what s f(1)? what s f(0)? 80 / 101

81 Fibonacci numbers f(3) = f(2) + f(1) f(2) = f(1) + f(0) f(1) = 1 f(0) = 0 81 / 101

82 Fibonacci numbers f(4) = f(3) + f(2) f(3) = f(2) + f(1) f(2) = f(1) + f(0) f(1) = 1 f(0) = 0 82 / 101

83 Fibonacci numbers f(5) = f(4) + f(3) f(4) = f(3) + f(2) f(3) = f(2) + f(1) f(2) = f(1) + f(0) f(1) = 1 f(0) = 0 etc... (deep recursion; slow; do the same computation again and again) 83 / 101

84 Fibonacci numbers f(0) =? / 101

85 Fibonacci numbers f(0) =? Not in the table, so compute: f(0)=0 (or rather, return the base case) fill in the cell / 101

86 Fibonacci numbers f(1) =? Not in the table, so compute: f(1)=1 (or rather, return the base case) fill in the cell / 101

87 Fibonacci numbers f(2) =? Not in the table, so compute: f(2)=f(2-1) + f(2-2) = f(1) + f(0) But both f(1) and f(0) are already in the table! No need to compute! Just look up! fill in the cell / 101

88 Fibonacci numbers f(3) =? Not in the table, so compute: f(3)=f(3-1) + f(3-2) = f(2) + f(1) But both f(2) and f(1) are already in the table! No need to compute! Just look up! fill in the cell / 101

89 Fibonacci numbers f(4) =? Not in the table, so compute: f(4)=f(4-1) + f(4-2) = f(3) + f(2) But both f(3) and f(2) are already in the table! No need to compute! Just look up! fill in the cell / 101

90 for parsing Once the constituent has been discovered, store the information Example: The algorithm (Cocke-Kazami-Younger) 90 / 101

91 Chomsky Normal Form All productions must conform to two forms: A BC A w i.e. only binary branching trees (and leaves) 91 / 101

92 Chomsky Normal Form To convert to CNF: copy all conforming rules as is Flatten unit productions N, N cat dog becomes: cat, dog Introduce dummy rules to get rid of mixed terminal-nonterminal RHS (right-hand side) INF to VP becomes: TO to and INF TO VP Introduce dummy rules to expand rules with RHS greater than 2 nonterminals A B C D becomes: A X D, X B C 92 / 101

93 Original grammar 93 / 101

94 Chomsky Normal Form 94 / 101

95 : Main idea A 2-dimensional array (aka a table) can encode the structure of the tree Each cell [i,j] contains all constituents that span positions i through j of the input string 0 Book 1 that 2 flight 3 Cell [0,n] must have the Start symbol if we have a parse...and can have more than one! 95 / 101

96 96 / 101

97 97 / 101

98 98 / 101

99 99 / 101

100 Limitations of classical It is a recognizer Turn a recognizer into a parser by storing all tree paths leading to S...but returning all possible trees is again exponential time! Also, we modified the grammar! 100 / 101

101 Limitations of classical : Solutions Probabilistic parsing Train a probabilistic grammar and then return the most probable parse Modify to be able to recover original grammar Employ e.g. partial parsing to get accommodate CFG directly (not in CNF) 101 / 101

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering Review Earley Algorithm Chapter 13.4 Lecture #9 October 2009 Top-Down vs. Bottom-Up Parsers Both generate too many useless trees Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up

More information

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016 CKY & Earley Parsing Ling 571 Deep Processing Techniques for NLP January 13, 2016 No Class Monday: Martin Luther King Jr. Day CKY Parsing: Finish the parse Recognizer à Parser Roadmap Earley parsing Motivation:

More information

CMPT-825 Natural Language Processing. Why are parsing algorithms important?

CMPT-825 Natural Language Processing. Why are parsing algorithms important? CMPT-825 Natural Language Processing Anoop Sarkar http://www.cs.sfu.ca/ anoop October 26, 2010 1/34 Why are parsing algorithms important? A linguistic theory is implemented in a formal system to generate

More information

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing

Parsing with CFGs L445 / L545 / B659. Dept. of Linguistics, Indiana University Spring Parsing with CFGs. Direction of processing L445 / L545 / B659 Dept. of Linguistics, Indiana University Spring 2016 1 / 46 : Overview Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the

More information

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46.

Parsing with CFGs. Direction of processing. Top-down. Bottom-up. Left-corner parsing. Chart parsing CYK. Earley 1 / 46. : Overview L545 Dept. of Linguistics, Indiana University Spring 2013 Input: a string Output: a (single) parse tree A useful step in the process of obtaining meaning We can view the problem as searching

More information

Remembering subresults (Part I): Well-formed substring tables

Remembering subresults (Part I): Well-formed substring tables Remembering subresults (Part I): Well-formed substring tables Detmar Meurers: Intro to Computational Linguistics I OSU, LING 684.01, 1. February 2005 Problem: Inefficiency of recomputing subresults Two

More information

CSCI Compiler Construction

CSCI Compiler Construction CSCI 742 - Compiler Construction Lecture 12 Cocke-Younger-Kasami (CYK) Algorithm Instructor: Hossein Hojjat February 20, 2017 Recap: Chomsky Normal Form (CNF) A CFG is in Chomsky Normal Form if each rule

More information

Parsing with Context-Free Grammars

Parsing with Context-Free Grammars Parsing with Context-Free Grammars Berlin Chen 2005 References: 1. Natural Language Understanding, chapter 3 (3.1~3.4, 3.6) 2. Speech and Language Processing, chapters 9, 10 NLP-Berlin Chen 1 Grammars

More information

Even More on Dynamic Programming

Even More on Dynamic Programming Algorithms & Models of Computation CS/ECE 374, Fall 2017 Even More on Dynamic Programming Lecture 15 Thursday, October 19, 2017 Sariel Har-Peled (UIUC) CS374 1 Fall 2017 1 / 26 Part I Longest Common Subsequence

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

Context- Free Parsing with CKY. October 16, 2014

Context- Free Parsing with CKY. October 16, 2014 Context- Free Parsing with CKY October 16, 2014 Lecture Plan Parsing as Logical DeducBon Defining the CFG recognibon problem BoHom up vs. top down Quick review of Chomsky normal form The CKY algorithm

More information

LECTURER: BURCU CAN Spring

LECTURER: BURCU CAN Spring LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can

More information

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):

More information

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Probabilistic Context-Free Grammars. Michael Collins, Columbia University Probabilistic Context-Free Grammars Michael Collins, Columbia University Overview Probabilistic Context-Free Grammars (PCFGs) The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF)

CS5371 Theory of Computation. Lecture 7: Automata Theory V (CFG, CFL, CNF) CS5371 Theory of Computation Lecture 7: Automata Theory V (CFG, CFL, CNF) Announcement Homework 2 will be given soon (before Tue) Due date: Oct 31 (Tue), before class Midterm: Nov 3, (Fri), first hour

More information

Context Free Grammars: Introduction

Context Free Grammars: Introduction Context Free Grammars: Introduction Context free grammar (CFG) G = (V, Σ, R, S), where V is a set of grammar symbols Σ V is a set of terminal symbols R is a set of rules, where R (V Σ) V S is a distinguished

More information

Parsing with Context-Free Grammars

Parsing with Context-Free Grammars Parsing with Context-Free Grammars CS 585, Fall 2017 Introduction to Natural Language Processing http://people.cs.umass.edu/~brenocon/inlp2017 Brendan O Connor College of Information and Computer Sciences

More information

Natural Language Processing. Lecture 13: More on CFG Parsing

Natural Language Processing. Lecture 13: More on CFG Parsing Natural Language Processing Lecture 13: More on CFG Parsing Probabilistc/Weighted Parsing Example: ambiguous parse Probabilistc CFG Ambiguous parse w/probabilites 0.05 0.05 0.20 0.10 0.30 0.20 0.60 0.75

More information

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21

Parsing. Unger s Parser. Laura Kallmeyer. Winter 2016/17. Heinrich-Heine-Universität Düsseldorf 1 / 21 Parsing Unger s Parser Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2016/17 1 / 21 Table of contents 1 Introduction 2 The Parser 3 An Example 4 Optimizations 5 Conclusion 2 / 21 Introduction

More information

Statistical Machine Translation

Statistical Machine Translation Statistical Machine Translation -tree-based models (cont.)- Artem Sokolov Computerlinguistik Universität Heidelberg Sommersemester 2015 material from P. Koehn, S. Riezler, D. Altshuler Bottom-Up Decoding

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

RNA Secondary Structure Prediction

RNA Secondary Structure Prediction RNA Secondary Structure Prediction 1 RNA structure prediction methods Base-Pair Maximization Context-Free Grammar Parsing. Free Energy Methods Covariance Models 2 The Nussinov-Jacobson Algorithm q = 9

More information

Computability and Complexity

Computability and Complexity Computability and Complexity Lecture 10 More examples of problems in P Closure properties of the class P The class NP given by Jiri Srba Lecture 10 Computability and Complexity 1/12 Example: Relatively

More information

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition

Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition Harvard CS 121 and CSCI E-207 Lecture 12: General Context-Free Recognition Salil Vadhan October 11, 2012 Reading: Sipser, Section 2.3 and Section 2.1 (material on Chomsky Normal Form). Pumping Lemma for

More information

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09 Natural Language Processing : Probabilistic Context Free Grammars Updated 5/09 Motivation N-gram models and HMM Tagging only allowed us to process sentences linearly. However, even simple sentences require

More information

In this chapter, we explore the parsing problem, which encompasses several questions, including:

In this chapter, we explore the parsing problem, which encompasses several questions, including: Chapter 12 Parsing Algorithms 12.1 Introduction In this chapter, we explore the parsing problem, which encompasses several questions, including: Does L(G) contain w? What is the highest-weight derivation

More information

Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is

Parsing. Unger s Parser. Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Introduction (1) Unger s parser [Grune and Jacobs, 2008] is a CFG parser that is Unger s Parser Laura Heinrich-Heine-Universität Düsseldorf Wintersemester 2012/2013 a top-down parser: we start with S and

More information

CPS 220 Theory of Computation

CPS 220 Theory of Computation CPS 22 Theory of Computation Review - Regular Languages RL - a simple class of languages that can be represented in two ways: 1 Machine description: Finite Automata are machines with a finite number of

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 14 Ana Bove May 14th 2018 Recap: Context-free Grammars Simplification of grammars: Elimination of ǫ-productions; Elimination of

More information

Plan for 2 nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form

Plan for 2 nd half. Just when you thought it was safe. Just when you thought it was safe. Theory Hall of Fame. Chomsky Normal Form Plan for 2 nd half Pumping Lemma for CFLs The Return of the Pumping Lemma Just when you thought it was safe Return of the Pumping Lemma Recall: With Regular Languages The Pumping Lemma showed that if a

More information

Probabilistic Context Free Grammars

Probabilistic Context Free Grammars 1 Defining PCFGs A PCFG G consists of Probabilistic Context Free Grammars 1. A set of terminals: {w k }, k = 1..., V 2. A set of non terminals: { i }, i = 1..., n 3. A designated Start symbol: 1 4. A set

More information

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation

Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation atural Language Processing 1 lecture 7: constituent parsing Ivan Titov Institute for Logic, Language and Computation Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing

More information

Probabilistic Context-free Grammars

Probabilistic Context-free Grammars Probabilistic Context-free Grammars Computational Linguistics Alexander Koller 24 November 2017 The CKY Recognizer S NP VP NP Det N VP V NP V ate NP John Det a N sandwich i = 1 2 3 4 k = 2 3 4 5 S NP John

More information

Top-Down Parsing and Intro to Bottom-Up Parsing

Top-Down Parsing and Intro to Bottom-Up Parsing Predictive Parsers op-down Parsing and Intro to Bottom-Up Parsing Lecture 7 Like recursive-descent but parser can predict which production to use By looking at the next few tokens No backtracking Predictive

More information

Notes for Comp 497 (Comp 454) Week 10 4/5/05

Notes for Comp 497 (Comp 454) Week 10 4/5/05 Notes for Comp 497 (Comp 454) Week 10 4/5/05 Today look at the last two chapters in Part II. Cohen presents some results concerning context-free languages (CFL) and regular languages (RL) also some decidability

More information

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs

Context Free Grammars: Introduction. Context Free Grammars: Simplifying CFGs Context Free Grammars: Introduction CFGs are more powerful than RGs because of the following 2 properties: 1. Recursion Rule is recursive if it is of the form X w 1 Y w 2, where Y w 3 Xw 4 and w 1, w 2,

More information

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22

Parsing. Probabilistic CFG (PCFG) Laura Kallmeyer. Winter 2017/18. Heinrich-Heine-Universität Düsseldorf 1 / 22 Parsing Probabilistic CFG (PCFG) Laura Kallmeyer Heinrich-Heine-Universität Düsseldorf Winter 2017/18 1 / 22 Table of contents 1 Introduction 2 PCFG 3 Inside and outside probability 4 Parsing Jurafsky

More information

Simplification and Normalization of Context-Free Grammars

Simplification and Normalization of Context-Free Grammars implification and Normalization, of Context-Free Grammars 20100927 Slide 1 of 23 Simplification and Normalization of Context-Free Grammars 5DV037 Fundamentals of Computer Science Umeå University Department

More information

Properties of Context-Free Languages

Properties of Context-Free Languages Properties of Context-Free Languages Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

Introduction to Theory of Computing

Introduction to Theory of Computing CSCI 2670, Fall 2012 Introduction to Theory of Computing Department of Computer Science University of Georgia Athens, GA 30602 Instructor: Liming Cai www.cs.uga.edu/ cai 0 Lecture Note 3 Context-Free Languages

More information

Intro to Theory of Computation

Intro to Theory of Computation Intro to Theory of Computation LECTURE 7 Last time: Proving a language is not regular Pushdown automata (PDAs) Today: Context-free grammars (CFG) Equivalence of CFGs and PDAs Sofya Raskhodnikova 1/31/2016

More information

Chapter 14 (Partially) Unsupervised Parsing

Chapter 14 (Partially) Unsupervised Parsing Chapter 14 (Partially) Unsupervised Parsing The linguistically-motivated tree transformations we discussed previously are very effective, but when we move to a new language, we may have to come up with

More information

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment.

Attendee information. Seven Lectures on Statistical Parsing. Phrase structure grammars = context-free grammars. Assessment. even Lectures on tatistical Parsing Christopher Manning LA Linguistic Institute 7 LA Lecture Attendee information Please put on a piece of paper: ame: Affiliation: tatus (undergrad, grad, industry, prof,

More information

Probabilistic Context-Free Grammar

Probabilistic Context-Free Grammar Probabilistic Context-Free Grammar Petr Horáček, Eva Zámečníková and Ivana Burgetová Department of Information Systems Faculty of Information Technology Brno University of Technology Božetěchova 2, 612

More information

Foundations of Informatics: a Bridging Course

Foundations of Informatics: a Bridging Course Foundations of Informatics: a Bridging Course Week 3: Formal Languages and Semantics Thomas Noll Lehrstuhl für Informatik 2 RWTH Aachen University noll@cs.rwth-aachen.de http://www.b-it-center.de/wob/en/view/class211_id948.html

More information

Decision problem of substrings in Context Free Languages.

Decision problem of substrings in Context Free Languages. Decision problem of substrings in Context Free Languages. Mauricio Osorio, Juan Antonio Navarro Abstract A context free grammar (CFG) is a set of symbols and productions used to define a context free language.

More information

Algebraic Dynamic Programming. Solving Satisfiability with ADP

Algebraic Dynamic Programming. Solving Satisfiability with ADP Algebraic Dynamic Programming Session 12 Solving Satisfiability with ADP Robert Giegerich (Lecture) Stefan Janssen (Exercises) Faculty of Technology Summer 2013 http://www.techfak.uni-bielefeld.de/ags/pi/lehre/adp

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning

Probabilistic Context Free Grammars. Many slides from Michael Collins and Chris Manning Probabilistic Context Free Grammars Many slides from Michael Collins and Chris Manning Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic

More information

Notes for Comp 497 (454) Week 10

Notes for Comp 497 (454) Week 10 Notes for Comp 497 (454) Week 10 Today we look at the last two chapters in Part II. Cohen presents some results concerning the two categories of language we have seen so far: Regular languages (RL). Context-free

More information

Advanced Natural Language Processing Syntactic Parsing

Advanced Natural Language Processing Syntactic Parsing Advanced Natural Language Processing Syntactic Parsing Alicia Ageno ageno@cs.upc.edu Universitat Politècnica de Catalunya NLP statistical parsing 1 Parsing Review Statistical Parsing SCFG Inside Algorithm

More information

Final exam study sheet for CS3719 Turing machines and decidability.

Final exam study sheet for CS3719 Turing machines and decidability. Final exam study sheet for CS3719 Turing machines and decidability. A Turing machine is a finite automaton with an infinite memory (tape). Formally, a Turing machine is a 6-tuple M = (Q, Σ, Γ, δ, q 0,

More information

NPDA, CFG equivalence

NPDA, CFG equivalence NPDA, CFG equivalence Theorem A language L is recognized by a NPDA iff L is described by a CFG. Must prove two directions: ( ) L is recognized by a NPDA implies L is described by a CFG. ( ) L is described

More information

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG

Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing. Part I. Formal Properties of TAG. Outline: Formal Properties of TAG Grammar formalisms Tree Adjoining Grammar: Formal Properties, Parsing Laura Kallmeyer, Timm Lichte, Wolfgang Maier Universität Tübingen Part I Formal Properties of TAG 16.05.2007 und 21.05.2007 TAG Parsing

More information

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars

Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars Harvard CS 121 and CSCI E-207 Lecture 9: Regular Languages Wrap-Up, Context-Free Grammars Salil Vadhan October 2, 2012 Reading: Sipser, 2.1 (except Chomsky Normal Form). Algorithmic questions about regular

More information

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation MA/CSSE 474 Theory of Computation CFL Hierarchy CFL Decision Problems Your Questions? Previous class days' material Reading Assignments HW 12 or 13 problems Anything else I have included some slides online

More information

Decoding and Inference with Syntactic Translation Models

Decoding and Inference with Syntactic Translation Models Decoding and Inference with Syntactic Translation Models March 5, 2013 CFGs S NP VP VP NP V V NP NP CFGs S NP VP S VP NP V V NP NP CFGs S NP VP S VP NP V NP VP V NP NP CFGs S NP VP S VP NP V NP VP V NP

More information

CS20a: summary (Oct 24, 2002)

CS20a: summary (Oct 24, 2002) CS20a: summary (Oct 24, 2002) Context-free languages Grammars G = (V, T, P, S) Pushdown automata N-PDA = CFG D-PDA < CFG Today What languages are context-free? Pumping lemma (similar to pumping lemma for

More information

Syntax-Based Decoding

Syntax-Based Decoding Syntax-Based Decoding Philipp Koehn 9 November 2017 1 syntax-based models Synchronous Context Free Grammar Rules 2 Nonterminal rules NP DET 1 2 JJ 3 DET 1 JJ 3 2 Terminal rules N maison house NP la maison

More information

Algorithms for Syntax-Aware Statistical Machine Translation

Algorithms for Syntax-Aware Statistical Machine Translation Algorithms for Syntax-Aware Statistical Machine Translation I. Dan Melamed, Wei Wang and Ben Wellington ew York University Syntax-Aware Statistical MT Statistical involves machine learning (ML) seems crucial

More information

Chap. 7 Properties of Context-free Languages

Chap. 7 Properties of Context-free Languages Chap. 7 Properties of Context-free Languages 7.1 Normal Forms for Context-free Grammars Context-free grammars A where A N, (N T). 0. Chomsky Normal Form A BC or A a except S where A, B, C N, a T. 1. Eliminating

More information

Pushdown Automata (Pre Lecture)

Pushdown Automata (Pre Lecture) Pushdown Automata (Pre Lecture) Dr. Neil T. Dantam CSCI-561, Colorado School of Mines Fall 2017 Dantam (Mines CSCI-561) Pushdown Automata (Pre Lecture) Fall 2017 1 / 41 Outline Pushdown Automata Pushdown

More information

Logic. proof and truth syntacs and semantics. Peter Antal

Logic. proof and truth syntacs and semantics. Peter Antal Logic proof and truth syntacs and semantics Peter Antal antal@mit.bme.hu 10/9/2015 1 Knowledge-based agents Wumpus world Logic in general Syntacs transformational grammars Semantics Truth, meaning, models

More information

Aspects of Tree-Based Statistical Machine Translation

Aspects of Tree-Based Statistical Machine Translation Aspects of Tree-Based Statistical Machine Translation Marcello Federico Human Language Technology FBK 2014 Outline Tree-based translation models: Synchronous context free grammars Hierarchical phrase-based

More information

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing! CS 6120 Spring 2014! Northeastern University!! David Smith! with some slides from Jason Eisner & Andrew

More information

Non-context-Free Languages. CS215, Lecture 5 c

Non-context-Free Languages. CS215, Lecture 5 c Non-context-Free Languages CS215, Lecture 5 c 2007 1 The Pumping Lemma Theorem. (Pumping Lemma) Let be context-free. There exists a positive integer divided into five pieces, Proof for for each, and..

More information

Processing/Speech, NLP and the Web

Processing/Speech, NLP and the Web CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 25 Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 14 th March, 2011 Bracketed Structure: Treebank Corpus [ S1[

More information

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10 Ambiguity, Precedence, Associativity & Top-Down Parsing Lecture 9-10 (From slides by G. Necula & R. Bodik) 2/13/2008 Prof. Hilfinger CS164 Lecture 9 1 Administrivia Team assignments this evening for all

More information

Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings

Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings Recitation 4: Converting Grammars to Chomsky Normal Form, Simulation of Context Free Languages with Push-Down Automata, Semirings 11-711: Algorithms for NLP October 10, 2014 Conversion to CNF Example grammar

More information

Everything You Always Wanted to Know About Parsing

Everything You Always Wanted to Know About Parsing Everything You Always Wanted to Know About Parsing Part V : LR Parsing University of Padua, Italy ESSLLI, August 2013 Introduction Parsing strategies classified by the time the associated PDA commits to

More information

A* Search. 1 Dijkstra Shortest Path

A* Search. 1 Dijkstra Shortest Path A* Search Consider the eight puzzle. There are eight tiles numbered 1 through 8 on a 3 by three grid with nine locations so that one location is left empty. We can move by sliding a tile adjacent to the

More information

CSE 355 Test 2, Fall 2016

CSE 355 Test 2, Fall 2016 CSE 355 Test 2, Fall 2016 28 October 2016, 8:35-9:25 a.m., LSA 191 Last Name SAMPLE ASU ID 1357924680 First Name(s) Ima Regrading of Midterms If you believe that your grade has not been added up correctly,

More information

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing Introduction to Bottom-Up Parsing Outline Review LL parsing Shift-reduce parsing The LR parsing algorithm Constructing LR parsing tables 2 Top-Down Parsing: Review Top-down parsing expands a parse tree

More information

Syntactic Analysis. Top-Down Parsing

Syntactic Analysis. Top-Down Parsing Syntactic Analysis Top-Down Parsing Copyright 2015, Pedro C. Diniz, all rights reserved. Students enrolled in Compilers class at University of Southern California (USC) have explicit permission to make

More information

Context Free Grammars

Context Free Grammars Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.

More information

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing Introduction to Bottom-Up Parsing Outline Review LL parsing Shift-reduce parsing The LR parsing algorithm Constructing LR parsing tables Compiler Design 1 (2011) 2 Top-Down Parsing: Review Top-down parsing

More information

CS 6120/CS4120: Natural Language Processing

CS 6120/CS4120: Natural Language Processing CS 6120/CS4120: Natural Language Processing Instructor: Prof. Lu Wang College of Computer and Information Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Assignment/report submission

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins

Probabilistic Context Free Grammars. Many slides from Michael Collins Probabilistic Context Free Grammars Many slides from Michael Collins Overview I Probabilistic Context-Free Grammars (PCFGs) I The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write:

Languages. Languages. An Example Grammar. Grammars. Suppose we have an alphabet V. Then we can write: Languages A language is a set (usually infinite) of strings, also known as sentences Each string consists of a sequence of symbols taken from some alphabet An alphabet, V, is a finite set of symbols, e.g.

More information

Syntax Analysis Part I

Syntax Analysis Part I 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2013 2 Position of a Parser in the Compiler Model Source Program Lexical Analyzer

More information

Theory of Computation 8 Deterministic Membership Testing

Theory of Computation 8 Deterministic Membership Testing Theory of Computation 8 Deterministic Membership Testing Frank Stephan Department of Computer Science Department of Mathematics National University of Singapore fstephan@comp.nus.edu.sg Theory of Computation

More information

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing Outline Introduction to Bottom-Up Parsing Review LL parsing Shift-reduce parsing he LR parsing algorithm Constructing LR parsing tables 2 op-down Parsing: Review op-down parsing expands a parse tree from

More information

Computability Theory

Computability Theory CS:4330 Theory of Computation Spring 2018 Computability Theory Decidable Problems of CFLs and beyond Haniel Barbosa Readings for this lecture Chapter 4 of [Sipser 1996], 3rd edition. Section 4.1. Decidable

More information

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4

Syntax Analysis Part I. Position of a Parser in the Compiler Model. The Parser. Chapter 4 1 Syntax Analysis Part I Chapter 4 COP5621 Compiler Construction Copyright Robert van ngelen, Flora State University, 2007 Position of a Parser in the Compiler Model 2 Source Program Lexical Analyzer Lexical

More information

CSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages

CSCI 1010 Models of Computa3on. Lecture 17 Parsing Context-Free Languages CSCI 1010 Models of Computa3on Lecture 17 Parsing Context-Free Languages Overview BoCom-up parsing of CFLs. BoCom-up parsing via the CKY algorithm An O(n 3 ) algorithm John E. Savage CSCI 1010 Lect 17

More information

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2016 Northeastern University David Smith with some slides from Jason Eisner & Andrew

More information

Introduction to Bottom-Up Parsing

Introduction to Bottom-Up Parsing Outline Introduction to Bottom-Up Parsing Review LL parsing Shift-reduce parsing he LR parsing algorithm Constructing LR parsing tables Compiler Design 1 (2011) 2 op-down Parsing: Review op-down parsing

More information

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example

Administrivia. Test I during class on 10 March. Bottom-Up Parsing. Lecture An Introductory Example Administrivia Test I during class on 10 March. Bottom-Up Parsing Lecture 11-12 From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS14 Lecture 11 1 2/20/08 Prof. Hilfinger CS14 Lecture 11 2 Bottom-Up

More information

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION FORMAL LANGUAGES, AUTOMATA AND COMPUTATION DECIDABILITY ( LECTURE 15) SLIDES FOR 15-453 SPRING 2011 1 / 34 TURING MACHINES-SYNOPSIS The most general model of computation Computations of a TM are described

More information

The Pumping Lemma for Context Free Grammars

The Pumping Lemma for Context Free Grammars The Pumping Lemma for Context Free Grammars Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar is in the form A BC A a Where a is any terminal

More information

CS375: Logic and Theory of Computing

CS375: Logic and Theory of Computing CS375: Logic and Theory of Computing Fuhua (Frank) Cheng Department of Computer Science University of Kentucky 1 Table of Contents: Week 1: Preliminaries (set algebra, relations, functions) (read Chapters

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Spring 2016 http://cseweb.ucsd.edu/classes/sp16/cse105-ab/ Today's learning goals Sipser Ch 2 Define push down automata Trace the computation of a push down automaton Design

More information

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1

Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS Berkley University 1 Lecture VII Part 2: Syntactic Analysis Bottom-up Parsing: LR Parsing. Prof. Bodik CS 164 -- Berkley University 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

More information

Formal Languages, Grammars and Automata Lecture 5

Formal Languages, Grammars and Automata Lecture 5 Formal Languages, Grammars and Automata Lecture 5 Helle Hvid Hansen helle@cs.ru.nl http://www.cs.ru.nl/~helle/ Foundations Group Intelligent Systems Section Institute for Computing and Information Sciences

More information

Sharpening the empirical claims of generative syntax through formalization

Sharpening the empirical claims of generative syntax through formalization Sharpening the empirical claims of generative syntax through formalization Tim Hunter University of Minnesota, Twin Cities NASSLLI, June 2014 Part 1: Grammars and cognitive hypotheses What is a grammar?

More information

A brief introduction to Logic. (slides from

A brief introduction to Logic. (slides from A brief introduction to Logic (slides from http://www.decision-procedures.org/) 1 A Brief Introduction to Logic - Outline Propositional Logic :Syntax Propositional Logic :Semantics Satisfiability and validity

More information

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University Simplification of CFG and Normal Forms Wen-Guey Tzeng Computer Science Department National Chiao Tung University Normal Forms We want a cfg with either Chomsky or Greibach normal form Chomsky normal form

More information

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University

Simplification of CFG and Normal Forms. Wen-Guey Tzeng Computer Science Department National Chiao Tung University Simplification of CFG and Normal Forms Wen-Guey Tzeng Computer Science Department National Chiao Tung University Normal Forms We want a cfg with either Chomsky or Greibach normal form Chomsky normal form

More information

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2017 Northeastern University David Smith with some slides from Jason Eisner & Andrew

More information