Natural Language Processing 1. lecture 7: constituent parsing. Ivan Titov. Institute for Logic, Language and Computation


1 Natural Language Processing 1 lecture 7: constituent parsing Ivan Titov Institute for Logic, Language and Computation

2 Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing evaluation Unsupervised estimation of PCFGs (?) Grammar refinement (producing a state-of-the-art parser) (?)

3 Syntactic parsing Syntax: the study of the patterns of formation of sentences and phrases from words; its borders with semantics and morphology are sometimes blurred (e.g., Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine in Turkish means "as if you are one of the people that we thought to be originating from Afyonkarahisar" [Wikipedia]). Parsing: the process of predicting syntactic representations. Syntactic representations: different types are possible, for example constituent (a.k.a. phrase-structure) trees, where each internal node corresponds to a phrase, and dependency trees; the slide shows both for "My dog ate a sausage". 3

4 Constituent trees Internal nodes correspond to phrases: S - a sentence; NP (Noun Phrase): "My dog", "a sandwich", "lakes", ..; VP (Verb Phrase): "ate a sausage", "barked", ..; PP (Prepositional Phrase): "with a friend", "in a car", .. Nodes immediately above words are PoS tags (aka preterminals): PRP - pronoun, DT - determiner, V - verb, N - noun, P - preposition. The slide shows the tree for "My dog ate a sausage". 4

5 Bracketing notation It is often convenient to represent a tree as a bracketed sequence: (S (NP (PRP My) (N dog)) (VP (V ate) (NP (DT a) (N sausage)))) We will use this format in the assignment 5
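Since the assignment uses this bracketed format, a minimal Python sketch (ours, not from the slides) of turning a nested-tuple tree into that string may help make it concrete:

```python
def to_brackets(tree):
    """Render a tree as a bracketed string.

    A tree is either a (label, children...) tuple for an internal node
    or a plain string for a word (leaf).
    """
    if isinstance(tree, str):                      # leaf: just the word
        return tree
    label, *children = tree
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"

tree = ("S",
        ("NP", ("PRP", "My"), ("N", "dog")),
        ("VP", ("V", "ate"), ("NP", ("DT", "a"), ("N", "sausage"))))
print(to_brackets(tree))
# (S (NP (PRP My) (N dog)) (VP (V ate) (NP (DT a) (N sausage))))
```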

6 Dependency trees root - designated root symbol. Nodes are words (along with PoS tags). Directed arcs encode syntactic dependencies between them. Labels are types of relations between the words: poss - possessive, dobj - direct object, nsubj - subject, det - determiner. The slide shows the dependency tree for "My dog ate a sausage". 6

7 Constituent and dependency representations Constituent trees can (potentially) be converted to dependency trees; one potential rule extracts the nsubj dependency from the subject NP under S (in practice, the rules are specified a bit differently). Dependency trees can (potentially) be converted to constituent trees, though recovering labels is harder. Roughly: every word along with all its dependents corresponds to a phrase, i.e. to an inner node in the constituent tree (illustrated on "My dog barked"). 7

8 Recovering shallow semantics Some semantic information can be (approximately) derived from syntactic information: subjects (nsubj) are (often) agents ("initiators / doers of an action"), direct objects (dobj) are (often) patients ("affected entities"). 8

9 Recovering shallow semantics Some semantic information can be (approximately) derived from syntactic information: subjects (nsubj) are (often) agents ("initiators / doers of an action"), direct objects (dobj) are (often) patients ("affected entities"). But even for agents and patients, consider: "Mary is baking a cake in the oven" (the syntactic subject corresponds to an agent: who is baking?) vs. "A cake is baking in the oven" (the syntactic subject corresponds to a patient: what is being baked?). In general this is not trivial even for the most shallow forms of semantics; e.g., consider prepositions: "in" can encode direction, position, temporal information, ... 9

10 Brief history of parsing Before mid-90s: syntactic rule-based parsers producing rich linguistic information; they provide low coverage (though hard to evaluate) and predict much more information than the formalisms we have just discussed. Realization that basic PCFGs do not result in accurate predictive models (we will discuss this in detail). Mid-90s: first accurate statistical parsers (e.g., [Magerman 1994, Collins 1996]); no handcrafted grammars, estimated from large datasets (Penn Treebank WSJ). Now: better models, more efficient algorithms, more languages, ... 10

11 Ambiguity Ambiguous sentence: "I saw a girl with a telescope". What kind of ambiguity does it have? What implications does it have for parsing? 11

12 Why is parsing hard? Ambiguity Prepositional phrase attachment ambiguity: in "I saw a girl with a telescope", the PP "with a telescope" can attach either to the VP (the seeing was done with a telescope) or to the NP "a girl" (the girl has a telescope). What kind of clues would be useful? PP-attachment ambiguity is just one example type of ambiguity. How serious is this problem in practice? 12

13 Why is parsing hard? Ambiguity Example with 3 prepositional phrases, 5 interpretations: Put the block ((in the box on the table) in the kitchen). Put the block (in the box (on the table in the kitchen)). Put ((the block in the box) on the table) in the kitchen. Put (the block (in the box on the table)) in the kitchen. Put (the block in the box) (on the table in the kitchen). 13

14 Why is parsing hard? Ambiguity Example with 3 prepositional phrases, 5 interpretations (as above). The general case is counted by the Catalan numbers: $\mathrm{Cat}_n = \frac{1}{n+1}\binom{2n}{n} \approx \frac{4^n}{n^{3/2}\sqrt{\pi}}$; the sequence grows as 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, ... 14

15 Why is parsing hard? Ambiguity The same example: 3 prepositional phrases, 5 interpretations; in general the number of attachments is given by the Catalan numbers, $\mathrm{Cat}_n = \frac{1}{n+1}\binom{2n}{n} \approx \frac{4^n}{n^{3/2}\sqrt{\pi}}$ (1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, ...). And this is only one type of ambiguity; you have many types in almost all sentences. 15
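As a quick sanity check of the Catalan formula above, a short Python snippet (ours, not from the slides) reproduces the sequence:

```python
from math import comb

def catalan(n):
    """Number of binary bracketings: Cat_n = C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

# Growth of the ambiguity count quoted on the slide.
print([catalan(n) for n in range(1, 12)])
# [1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786]
```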

16 Why is parsing hard? Ambiguity A typical tree from a standard dataset (Penn Treebank WSJ) [from Michael Collins's slides]: "Canadian Utilities had 1988 revenue of $ 1.16 billion, mainly from its natural gas and electric utility businesses in Alberta, where the company serves about 800,000 customers." 16

17 Key problems Recognition problem: is the sentence grammatical? Parsing problem: what is a (most plausible) derivation (tree) corresponding to the sentence? The parsing problem encompasses the recognition problem and is the more interesting question from a practical point of view. 17

18 Context-Free Grammar A context-free grammar is a tuple of 4 elements, G = (V, Σ, R, S). V - the set of non-terminals; in our case: phrase categories (S, NP, VP, PP, ...) and PoS tags (N, V, ..., aka preterminals). Σ - the set of terminals: words. R - the set of rules of the form X → Y_1 Y_2 ... Y_n (n ≥ 0), where X ∈ V and Y_i ∈ V ∪ Σ. S is a dedicated start symbol. Example rules: S → NP VP, N → girl, N → telescope, V → saw, V → eat, NP → NP PP, PP → P NP.

19 An example grammar V = {S, NP, VP, PP, N, V, P, PRP, DT}, Σ = {girl, telescope, sandwich, I, saw, ate, with, in, a, the}, start symbol S. R contains inner rules: S → NP VP, VP → V NP (e.g., (V ate) (NP a sandwich)), VP → V, VP → VP PP (e.g., (VP saw a girl) (PP with a telescope)), NP → NP PP (e.g., (NP a girl) (PP with a sandwich)), NP → DT N (e.g., (DT a) (N sandwich)), NP → PRP, PP → P NP (e.g., (P with) (NP a sandwich)); and preterminal rules (these correspond to the HMM emission model, aka task model): N → girl, N → telescope, N → sandwich, PRP → I, V → saw, V → ate, P → with, P → in, DT → a, DT → the. 19

20 CFGs (slides 20-28 build the derivation step by step) With the example grammar we can derive "I saw a girl with a telescope": S → NP VP, the subject NP is rewritten to (PRP I), the VP to (V saw) followed by an object NP, and the object NP to (NP (NP (DT a) (N girl)) (PP (P with) (NP (DT a) (N telescope)))). A CFG defines both: a set of strings (a language) and the structures used to represent sentences (constituent trees). 28

29 Why context-free? What can be a sub-tree is only affected by what the phrase type is (e.g., VP) but not by the context; illustrated on "The dog ate a sandwich". 29

30 Why context-free? What can be a sub-tree is only affected by what the phrase type is (VP) but not by the context: the VP (V ate) (NP a sandwich) can be replaced by another VP such as (V bark) (ADVP (A often)), producing the not grammatical "The dog bark often". 30

31 Why context-free? What can be a sub-tree is only affected by what the phrase type is (VP) but not by the context, so substituting (VP (V bark) (ADVP (A often))) yields the not grammatical "The dog bark often". This matters if we want to generate language (e.g., language modeling), but is this relevant to parsing? 31

32 Introduced coordination ambiguity Here, the coarse VP and NP categories cannot enforce subject-verb agreement in number, resulting in a coordination ambiguity for "koalas eat leaves and barks": either the NP "leaves and barks" is coordinated, or the VPs "eat leaves" and "barks" are coordinated. "Barks" can be both a noun and a verb. The VP-coordination tree would be ruled out if the context (subject-verb agreement) were somehow captured. 32

33 Introduced coordination ambiguity Even more detailed PoS tags are not going to help here: with VBP for "eat" and VBZ for "barks", the coarse VP and NP categories still cannot enforce subject-verb agreement, so the VP-coordination tree for "koalas eat leaves and barks" is not ruled out. 33

34 Key problems Recognition problem: does the sentence belong to the language defined by the CFG? That is: is there a derivation which yields the sentence? Parsing problem: what is a (most plausible) derivation (tree) corresponding to the sentence? The parsing problem encompasses the recognition problem. 34

35 How to deal with ambiguity? There are (exponentially) many derivations for a typical sentence; the slide shows three of the parses of "Put the block in the box on the table in the kitchen". We want to score all the derivations to encode how plausible they are. 35

36 An example probabilistic CFG Associate probabilities with the rules, p(X → α): $\forall X \to \alpha \in R: 0 \le p(X \to \alpha) \le 1$ and $\forall X \in V: \sum_{\alpha: X \to \alpha \in R} p(X \to \alpha) = 1$. Now we can score a tree as a product of the probabilities corresponding to the used rules. The rules are those of the example grammar above (inner rules S → NP VP, VP → V NP, VP → V, VP → VP PP, NP → NP PP, NP → DT N, NP → PRP, PP → P NP, and the preterminal rules), each annotated with a probability.
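To make the scoring concrete, here is a small Python sketch (ours, with made-up rule probabilities, not the numbers from the slides) that scores a tree as the product of the probabilities of the rules used in it:

```python
# A toy PCFG: rule -> probability. Probabilities of rules with the same
# left-hand side sum to 1. The numbers are illustrative only.
pcfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "N")): 0.5, ("NP", ("PRP",)): 0.5,
    ("VP", ("V", "NP")): 1.0,
    ("PRP", ("I",)): 1.0, ("V", ("saw",)): 1.0,
    ("DT", ("a",)): 1.0, ("N", ("girl",)): 1.0,
}

def tree_prob(tree):
    """Probability of a tree = product of the probabilities of its rules.

    A tree is (label, child_1, ..., child_n); a leaf child is a plain word string.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = pcfg[(label, rhs)]
    for c in children:
        if not isinstance(c, str):      # recurse into non-terminal children
            p *= tree_prob(c)
    return p

t = ("S", ("NP", ("PRP", "I")), ("VP", ("V", "saw"), ("NP", ("DT", "a"), ("N", "girl"))))
print(tree_prob(t))   # 1.0 * 0.5 * 1.0 * 1.0 * 1.0 * 0.5 * 1.0 * 1.0 = 0.25
```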

37 PCFGs (slides 37-47 step through the computation) For a tree of "I saw a girl with a telescope", p(T) is computed as the product of the probabilities of all the rules used in its derivation, multiplying in one inner-rule or preterminal-rule probability per node of the tree. 47

48 Distribution over trees We defined a distribution over production rules for each non-terminal; our goal was to define a distribution over parse trees. Unfortunately, not all PCFGs give rise to a proper distribution over trees, i.e. the sum of the probabilities of all trees the grammar can generate may be less than 1: $\sum_T P(T) < 1$. Good news: any PCFG estimated with the maximum likelihood procedure is always proper [Chi and Geman, 98]. 49

50 Distribution over trees Let us denote by G(x) the set of derivations for the sentence x. The probability distribution defines the scoring P(T) over the trees T ∈ G(x). Finding the best parse for the sentence according to the PCFG: $\arg\max_{T \in G(x)} P(T)$. We will look at this in the next class. 50

51 Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing evaluation Unsupervised estimation of PCFGs (?) Grammar refinement (producing a state-of-the-art parser) (?) 51

52 ML estimation A treebank: a collection of sentences annotated with constituent trees (the slide shows trees for "My dog ate a sausage", "Put the block in the box on the table in the kitchen", "I saw a girl with a telescope", "koalas eat leaves and barks", ...). 52

53 ML estimation A treebank: a collection of sentences annotated with constituent trees. An estimated probability of a rule (maximum likelihood estimate): $p(X \to \alpha) = \frac{C(X \to \alpha)}{C(X)}$, where $C(X \to \alpha)$ is the number of times the rule is used in the corpus and $C(X)$ is the number of times the non-terminal X appears in the treebank. 54

55 ML estimation Smoothing is helpful, especially for preterminal rules, i.e. the generation of words (= the task model in PoS tagging). See smoothing techniques in Section 4.5 of the J&M book (+ the section with some techniques specifically dealing with unknown words). 55
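As a sketch of how these counts turn into probabilities, here is a small Python example (ours; trees as nested tuples, as in the earlier sketches) that extracts rules from a toy treebank and computes the maximum likelihood estimates without smoothing:

```python
from collections import Counter

def rules(tree):
    """Yield (lhs, rhs) for every rule used in a nested-tuple tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for c in children:
        if not isinstance(c, str):
            yield from rules(c)

def mle_pcfg(treebank):
    """p(X -> alpha) = C(X -> alpha) / C(X), no smoothing."""
    rule_counts = Counter(r for t in treebank for r in rules(t))
    lhs_counts = Counter()
    for (lhs, _), c in rule_counts.items():
        lhs_counts[lhs] += c
    return {r: c / lhs_counts[r[0]] for r, c in rule_counts.items()}

treebank = [
    ("S", ("NP", ("PRP", "I")), ("VP", ("V", "saw"), ("NP", ("DT", "a"), ("N", "girl")))),
    ("S", ("NP", ("DT", "a"), ("N", "girl")), ("VP", ("V", "ate"))),
]
for rule, p in sorted(mle_pcfg(treebank).items()):
    print(rule, round(p, 2))
```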

56 ML estimation: an example [Ex. from Collins IWPT 01] A toy treebank with three tree types: (S (B a a) (C a a)) occurring n_1 times, (S (C a a a)) occurring n_2 times, and (S (B a)) occurring n_3 times. Without smoothing:
Rule       Count   Prob. estimate
S → B C    n_1     n_1 / (n_1 + n_2 + n_3)
S → C      n_2     n_2 / (n_1 + n_2 + n_3)
S → B      n_3     n_3 / (n_1 + n_2 + n_3)
B → a a    n_1     n_1 / (n_1 + n_3)
B → a      n_3     n_3 / (n_1 + n_3)
C → a a    n_1     n_1 / (n_1 + n_2)
C → a a a  n_2     n_2 / (n_1 + n_2)
58

59 Penn Treebank: peculiarities Wall Street Journal: around 40,000 annotated sentences, 1,000,000 words. Fine-grained part-of-speech tags (45), e.g., for verbs: VBD - verb, past tense; VBG - verb, gerund or present participle; VBP - verb, present (non-3rd person singular); VBZ - verb, present (3rd person singular); MD - modal. Flat NPs (no attempt to disambiguate attachment), e.g. (NP (DT a) (JJ hot) (NN dog) (NN food) (NN cart)). 59

60 Outline Unsupervised estimation: forward-backward Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing evaluation Unsupervised estimation of PCFGs (?) Grammar refinement (producing a state-of-the-art parser) (?) 60

61 Parsing Parsing is search through the space of all possible parses: we may want either any parse, all parses, or (if PCFG) the highest scoring parse, $\arg\max_{T \in G(x)} P(T)$, where G(x) is the set of all trees given by the grammar for the sentence x and P(T) is the probability under the PCFG model. Bottom-up: one starts from the words and attempts to construct the full tree. Top-down: start from the start symbol and attempt to expand it to get the sentence. 61

62 CKY algorithm (aka CYK) Cocke-Kasami-Younger algorithm, independently discovered in the late 60s / early 70s. An efficient bottom-up parsing algorithm for (P)CFGs; can be used both for the recognition and the parsing problem. Very important in NLP (and beyond). We will start with the non-probabilistic version. 62

63 Constraints on the grammar The basic CKY algorithm supports only rules in Chomsky Normal Form (CNF): unary preterminal rules C → x (generation of words given PoS tags: N → telescope, DT → the, ...) and binary inner rules C → C_1 C_2 (e.g., S → NP VP, NP → DT N). 63

64 Constraints on the grammar The basic CKY algorithm supports only rules in Chomsky Normal Form (CNF): unary preterminal rules C → x and binary inner rules C → C_1 C_2. Any CFG can be converted to an equivalent CNF grammar. Equivalent means that they define the same language; however, the (syntactic) trees will look different, which makes linguists unhappy. It is possible to address this by defining transformations that allow for an easy reverse transformation. 64

65 Transformation to CNF form What one needs to do to convert to CNF: get rid of empty (aka epsilon) productions C → ε (generally not a problem, as there are no empty productions in the standard post-processed treebanks); get rid of unary rules C → C_1 (not a problem, as our CKY algorithm will support unary rules); binarize n-ary rules C → C_1 C_2 ... C_n with n > 2 (crucial to process them, as required for efficient parsing). 65

66 Transformation to CNF form: binarization Consider NP → DT NNP VBG NN, as in (NP (DT the) (NNP Dutch) (VBG publishing) (NN group)). How do we get a set of binary rules which are equivalent? 66

67 Transformation to CNF form: binarization Consider NP → DT NNP VBG NN ("the Dutch publishing group"). How do we get a set of binary rules which are equivalent? Introduce new non-terminals: NP → DT X, X → NNP Y, Y → VBG NN. 67

68 Transformation to CNF form: binarization The same binarization, but with a more systematic way to refer to the new non-terminals: name each intermediate symbol after the parent and the children generated so far, so that the transformation can be reversed deterministically. 68

69 Transformation to CNF form: binarization Instead of binarizing rules we can binarize trees during preprocessing: the flat (NP (DT the) (NNP Dutch) (VBG publishing) (NN group)) becomes a right-branching binary tree with intermediate nodes. Also known as lossless Markovization in the context of PCFGs. Can be easily reversed during postprocessing. 69

70 Transformation to CNF form: binarization The binarized tree: under NP, the first child (DT the) is split off, an intermediate node covers (NNP Dutch) and a further intermediate node over (VBG publishing) (NN group). 70
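A minimal Python sketch (ours) of this kind of right-branching tree binarization and its reversal; the intermediate label scheme (`@NP` etc.) is just an illustrative choice, not the exact one from the slides:

```python
def binarize(tree):
    """Right-branching binarization of a nested-tuple tree (label, children...)."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    children = [binarize(c) for c in children]
    if len(children) <= 2:
        return (label, *children)
    # fold right-to-left: (NP a b c d) -> (NP a (@NP b (@NP c d)))
    right = children[-1]
    for c in reversed(children[1:-1]):
        right = ("@" + label, c, right)
    return (label, children[0], right)

def unbinarize(tree):
    """Reverse the binarization by splicing out @-nodes (lossless)."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    out = []
    for c in map(unbinarize, children):
        if not isinstance(c, str) and c[0].startswith("@"):
            out.extend(c[1:])       # splice the @-node's children in place
        else:
            out.append(c)
    return (label, *out)

np = ("NP", ("DT", "the"), ("NNP", "Dutch"), ("VBG", "publishing"), ("NN", "group"))
assert unbinarize(binarize(np)) == np
print(binarize(np))
```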

71 CKY: Parsing task We are given a grammar G = (V, Σ, R, S) with start symbol S and a sequence of words w = (w_1, w_2, ..., w_n). Our goal is to produce a parse tree for w. 71

72 CKY: Parsing task [Some illustrations and slides in this lecture are from Marco Kuhlmann] We are given a grammar G = (V, Σ, R, S) and a sequence of words w = (w_1, w_2, ..., w_n); our goal is to produce a parse tree for w. We need an easy way to refer to substrings of w: indices refer to fenceposts, and span (i, j) refers to the words between fenceposts i and j. 72

73 Key problems Recognition problem: does the sentence belong to the language defined by the CFG? Is there a derivation which yields the sentence? Parsing problem: what is a derivation (tree) corresponding to the sentence? Probabilistic parsing: what is the most probable tree for the sentence? 73

74 Parsing one word (slides 74-76) A single word w_i is covered by any preterminal C with a rule C → w_i; this fills the chart cells for spans of length 1. 76

77 Parsing longer spans (slides 77-79) Longer spans are built with binary rules C → C_1 C_2: check through all C_1, C_2 and all split points mid. 79

80 Signatures The application of rules is independent of the inner structure of a parse tree; we only need to know the corresponding span and the root label of the tree, i.e. its signature [min, max, C], also known as an edge. 80

81 CKY idea (slides 81-84) Compute for every span a set of admissible labels (may be empty for some spans). Start from small trees (single words) and proceed to larger ones. When done, how do we detect that the sentence is grammatical? Check if S is among the admissible labels for the whole sentence; if yes, the sentence belongs to the language, that is, if a tree with signature [0, n, S] exists. What about unary rules? 84

85 CKY in action (slides 85-107) Example: "lead can poison", parsed with a toy grammar whose preterminal rules are N → can, N → lead, N → poison, M → can, M → must, V → poison, V → lead, plus a few inner rules (S → NP VP, VP → M V, and a handful of NP and VP rules). The chart (aka parsing triangle) is filled bottom-up: first the width-1 spans receive their admissible labels from the preterminal rules, then longer spans are built from pairs of adjacent smaller spans by trying every inner rule and every split point mid, checking for applicable unary rules after each cell (here there are none to apply). For the full span the label S is derived via two different split points, so apparently the sentence is ambiguous for the grammar (as the grammar overgenerates). 107

108 Ambiguity The two parses: one where (NP lead) is the subject and (VP (M can) (V poison)) the predicate, and one where (NP lead can) is a compound subject and (VP (V poison)) the predicate. With no subject-verb agreement in the grammar, and with "poison" usable as an intransitive verb, both parses are admitted; apparently the sentence is ambiguous for the grammar (as the grammar overgenerates). 108

109 CKY more formally (slides 109-110) The chart can be represented by a Boolean array chart[min][max][label]. Relevant entries have 0 ≤ min < max ≤ n. Here we assume that the labels C are integer indices. chart[min][max][C] = true if the signature (min, max, C) has already been added to the chart, and false otherwise. 110

111 Implementation: preterminal rules 111
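The code shown on this slide did not survive the transcription; a minimal Python sketch of the preterminal step (ours, Boolean-chart version with our own variable names) might look like this:

```python
def apply_preterminal_rules(words, preterminal_rules, labels):
    """Fill width-1 chart cells: chart[i][i+1][C] = True if C -> words[i] is a rule.

    preterminal_rules: dict mapping a word to the set of labels C with C -> word.
    labels: list of all non-terminal labels (their positions index the label axis).
    """
    n = len(words)
    index = {c: k for k, c in enumerate(labels)}
    # chart[min][max][label_index], everything False initially
    chart = [[[False] * len(labels) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for c in preterminal_rules.get(w, ()):
            chart[i][i + 1][index[c]] = True
    return chart

chart = apply_preterminal_rules(
    ["lead", "can", "poison"],
    {"lead": {"N", "V"}, "can": {"N", "M"}, "poison": {"N", "V"}},
    ["S", "NP", "VP", "N", "V", "M"])
```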

112 Implementation: binary rules 112
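Again the slide's code is missing; here is a hedged sketch (ours) of the binary-rule step for one span (min, max), in the same Boolean-chart style as above:

```python
def apply_binary_rules(chart, binary_rules, index, min_, max_):
    """Try every rule C -> C1 C2 and every split point mid for the span (min_, max_).

    binary_rules: iterable of (C, C1, C2) label triples.
    index: dict mapping labels to their integer indices in the chart.
    """
    for mid in range(min_ + 1, max_):
        for c, c1, c2 in binary_rules:
            if chart[min_][mid][index[c1]] and chart[mid][max_][index[c2]]:
                chart[min_][max_][index[c]] = True
```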

113 Unary rules How to integrate unary rules C → C_1? 113

114 Implementation: unary rules 114
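A sketch of the unary pass (ours): after a cell has been filled, apply the unary rules C → C_1 to it.

```python
def apply_unary_rules(chart, unary_rules, index, min_, max_):
    """One pass over unary rules C -> C1 for the cell (min_, max_).

    Chains like A -> B -> C are not followed here; instead the grammar is
    extended with its unary (reflexive transitive) closure, see the next slides.
    """
    for c, c1 in unary_rules:
        if chart[min_][max_][index[c1]]:
            chart[min_][max_][index[c]] = True
```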

115 Implementation: unary rules But we forgot something! 115

116 Unary closure What if the grammar contained 2 rules, A → B and B → C? Then C can be derived from A by a chain of rules: A → B → C. One could support chains in the algorithm, but it is easier to extend the grammar to get the transitive closure: {A → B, B → C} becomes {A → B, B → C, A → C}. 116

117 Unary closure What if the grammar contained 2 rules, A → B and B → C? Then C can be derived from A by a chain of rules: A → B → C. One could support chains in the algorithm, but it is easier to extend the grammar to get the reflexive transitive closure: {A → B, B → C} becomes {A → B, B → C, A → C, A → A, B → B, C → C}. The trivial rules are convenient for programming reasons in the PCFG case. 117
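A small sketch (ours) of computing the reflexive transitive closure of the unary rules; in the plain CFG case we only track reachability:

```python
def unary_closure(labels, unary_rules):
    """Reflexive transitive closure of unary rules over the given labels.

    unary_rules: set of (A, B) pairs meaning A -> B.
    Returns the closed set of pairs, including A -> A for every label.
    """
    closure = {(a, a) for a in labels} | set(unary_rules)
    changed = True
    while changed:                              # saturate until a fixed point
        changed = False
        for a, b in list(closure):
            for b2, c in list(closure):
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure

print(sorted(unary_closure({"A", "B", "C"}, {("A", "B"), ("B", "C")})))
# [('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')]
```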

118 Implementation: skeleton 118
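The skeleton on the slide is also missing; here is a hedged sketch (ours) of how the pieces above fit together into a CKY recognizer, assuming the unary rules passed in already include their closure and reusing the helper functions from the earlier sketches:

```python
def cky_recognize(words, labels, preterminal_rules, binary_rules, unary_rules, start="S"):
    """Return True if the start symbol covers the whole sentence."""
    index = {c: k for k, c in enumerate(labels)}
    chart = apply_preterminal_rules(words, preterminal_rules, labels)
    n = len(words)
    for i in range(n):                                   # unary pass on width-1 cells
        apply_unary_rules(chart, unary_rules, index, i, i + 1)
    for width in range(2, n + 1):                        # build longer spans bottom-up
        for min_ in range(0, n - width + 1):
            max_ = min_ + width
            apply_binary_rules(chart, binary_rules, index, min_, max_)
            apply_unary_rules(chart, unary_rules, index, min_, max_)
    return chart[0][n][index[start]]
```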

119 Algorithm analysis (slides 119-122) Time complexity? $O(n^3 |R|)$, where |R| is the number of rules in the grammar; a few seconds for sentences under 20 words for a non-optimized parser. There exist algorithms with better asymptotic time complexity, but the 'constant' makes them slower in practice (in general). 122

123 Practical time complexity Time complexity (for the PCFG version)? Empirically roughly $n^{3.6}$ [plot by Dan Klein]. 123

124 Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing evaluation Unsupervised estimation of PCFGs (?) Grammar refinement (producing a state-of-the-art parser) (?) 124

125 Probabilistic parsing We discussed the recognition problem: check if a sentence is parsable with a CFG. Now we consider parsing with PCFGs. Recognition with PCFGs: what is the probability of the most probable parse tree? Parsing with PCFGs: what is the most probable parse tree? 125

126 PCFGs (recap) The probability of a tree of "I saw a girl with a telescope" is the product of the probabilities of all rules used in its derivation, as on the earlier PCFG slides. 126

127 Distribution over trees Let us denote by G(x) the set of derivations for the sentence x. The probability distribution defines the scoring P(T) over the trees T ∈ G(x). Finding the best parse for the sentence according to the PCFG: $\arg\max_{T \in G(x)} P(T)$. This will be another Viterbi / max-product algorithm! 127

128 CKY with PCFGs The chart is now an array of doubles, chart[min][max][label]. It stores the probability of the most probable subtree with a given signature; chart[0][n][S] will store the probability of the most probable full parse tree. 128

129 Intuition For a binary rule C → C_1 C_2 and a given span: for every C, choose C_1, C_2 and mid such that $P(T_1) \cdot P(T_2) \cdot P(C \to C_1 C_2)$ is maximal, where T_1 and T_2 are the left and right subtrees. 129

130 Implementation: preterminal rules 130

131 Implementation: binary rules 131
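The probabilistic versions of these two steps were also code screenshots; a minimal Viterbi-CKY sketch (ours) replaces the Booleans with probabilities and keeps the max over rules and split points:

```python
from collections import defaultdict

def viterbi_cky(words, preterminal_probs, binary_probs, start="S"):
    """Probability of the most probable parse (grammar in CNF, no unary rules).

    preterminal_probs: dict word -> {label: prob} for rules C -> word.
    binary_probs: dict (C, C1, C2) -> prob for rules C -> C1 C2.
    """
    n = len(words)
    chart = defaultdict(dict)                  # (min, max) -> {label: best prob}
    for i, w in enumerate(words):
        chart[i, i + 1] = dict(preterminal_probs.get(w, {}))
    for width in range(2, n + 1):
        for min_ in range(0, n - width + 1):
            max_ = min_ + width
            cell = chart[min_, max_]
            for mid in range(min_ + 1, max_):
                for (c, c1, c2), p_rule in binary_probs.items():
                    p = (chart[min_, mid].get(c1, 0.0)
                         * chart[mid, max_].get(c2, 0.0) * p_rule)
                    if p > cell.get(c, 0.0):   # keep only the best subtree per label
                        cell[c] = p
    return chart[0, n].get(start, 0.0)
```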

132 Unary rules Similarly to CFGs: after producing scores for signatures (c, i, j), try to improve the scores by applying unary rules (and rule chains) If improved, update the scores 132

133 Unary (reflexive transitive) closure (slides 133-136) As in the CFG case, {A → B, B → C} is extended to {A → B, B → C, A → C, A → A, B → B, C → C}, where a composite rule such as A → C carries the probability accumulated along its chain. Note that this is not a PCFG anymore, as the rules do not sum to 1 for each parent. The fact that a rule is composite needs to be stored to recover the true tree. What about loops, like A → B → A → C? 136

137 Recovery of the tree For each signature we store backpointers to the elements from which it was built (e.g., the rule and, for binary rules, the midpoint); start recovering from [0, n, S]. Be careful with unary rules: basically you can assume that you always used a unary rule from the closure (but it could be the trivial one, C → C). 137
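As a sketch of what following backpointers can look like (ours; it assumes a backpointer table produced by a Viterbi CKY like the one above, with trivial C → C closure rules not stored as unary backpointers):

```python
def build_tree(back, words, min_, max_, label):
    """Recover a tree from backpointers.

    back[(min_, max_)][label] is one of:
      ("word",)                             -- preterminal rule label -> words[min_]
      ("unary", child_label)                -- a (non-trivial) unary rule from the closure
      ("binary", mid, left_lab, right_lab)  -- a binary rule with split point mid
    """
    kind, *rest = back[(min_, max_)][label]
    if kind == "word":
        return (label, words[min_])
    if kind == "unary":
        return (label, build_tree(back, words, min_, max_, rest[0]))
    mid, left_lab, right_lab = rest
    return (label,
            build_tree(back, words, min_, mid, left_lab),
            build_tree(back, words, mid, max_, right_lab))

# Recovery starts from the signature [0, n, S]:
# tree = build_tree(back, words, 0, len(words), "S")
```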

138 Speeding up the algorithm (approximate search) Any ideas? 138

139 Speeding up the algorithm (approximate search) Basic pruning (roughly): for every span (i, j), store only labels whose probability is at most some fixed factor smaller than the probability of the most probable label for this span; and check not all rules, but only rules whose child labels have non-zero probability. Coarse-to-fine pruning: parse with a smaller (simpler) grammar, precompute (posterior) probabilities for each span, and use only the labels with non-negligible probability under that coarser grammar. 139

140 Outline Syntax: intro, CFGs, PCFGs PCFGs: Estimation CFGs: Parsing PCFGs: Parsing Parsing evaluation Unsupervised estimation of PCFGs (?) Grammar refinement (producing a state-of-the-art parser) (?) 140

141 Parser evaluation Intrinsic evaluation: automatic - evaluate against annotation provided by human experts (the gold standard) according to some predefined measure (though this has many drawbacks, it is easier and allows tracking the state of the art across many years); manual - according to human judgment. Extrinsic evaluation: score a syntactic representation by comparing how well a system using this representation performs on some task, e.g., use the syntactic representation as input for a semantic analyzer and compare the results of the analyzer using syntax predicted by different parsers. 141

142 Standard evaluation setting in parsing Automatic intrinsic evaluation is used: parsers are evaluated against a gold standard provided by linguists. There is a standard split into three parts: training set, used for estimation of model parameters; development set, used for tuning the model (initial experiments); test set, for final experiments to compare against previous work. 142

143 Automatic evaluation of constituent parsers Exact match: percentage of trees predicted correctly. Bracket score: scores how well individual phrases (and their boundaries) are identified; the most standard measure, and the one we will focus on. Crossing brackets: percentage of crossing phrase boundaries. Dependency metrics: score the dependency structure corresponding to the constituent tree (percentage of correctly identified heads). 143

144 Bracket scores The most standard score is the bracket score. It regards a tree as a collection of brackets [min, max, C] (the subtree signatures from CKY). The set of brackets predicted by a parser is compared against the set of brackets in the tree annotated by a linguist; precision, recall and F1 are used as scores. 144

145 Bracketing notation The same tree as a bracketed sequence: (S (NP (PRP My) (N dog)) (VP (V ate) (NP (DT a) (N sausage)))) 145

146 Bracket scores Pr = (number of brackets the parser and annotation agree on) / (number of brackets predicted by the parser); Re = (number of brackets the parser and annotation agree on) / (number of brackets in the annotation); $F_1 = \frac{2 \cdot \mathrm{Pr} \cdot \mathrm{Re}}{\mathrm{Pr} + \mathrm{Re}}$, the harmonic mean of precision and recall. 146
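A tiny Python sketch (ours) of computing these scores from two sets of labeled brackets, each represented as (min, max, label) triples:

```python
def bracket_scores(predicted, gold):
    """Labeled bracket precision, recall and F1.

    predicted, gold: sets of (min, max, label) triples for a sentence (for a
    corpus, add a sentence id to the triples so brackets from different
    sentences never match).
    """
    agree = len(predicted & gold)
    precision = agree / len(predicted) if predicted else 0.0
    recall = agree / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP"), (0, 5, "S")}
pred = {(0, 2, "NP"), (2, 5, "VP"), (2, 4, "NP"), (0, 5, "S")}
print(bracket_scores(pred, gold))   # (0.75, 0.75, 0.75)
```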

147 Preview: F1 bracket score Systems compared: a treebank PCFG, an unlexicalized PCFG (Klein and Manning, 2003), a lexicalized PCFG (Collins, 1999), an automatically induced PCFG (Petrov et al., 2006), and the best results reported (as of 2012). 147

148 Preview: F1 bracket score The same comparison as above; one of these approaches was looked into in the BSc KI at UvA, and we will look into the automatically induced (refined) PCFG of Petrov et al. (2006). 148


More information

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Language Models & Hidden Markov Models

INF4820: Algorithms for Artificial Intelligence and Natural Language Processing. Language Models & Hidden Markov Models 1 University of Oslo : Department of Informatics INF4820: Algorithms for Artificial Intelligence and Natural Language Processing Language Models & Hidden Markov Models Stephan Oepen & Erik Velldal Language

More information

The Infinite PCFG using Hierarchical Dirichlet Processes

The Infinite PCFG using Hierarchical Dirichlet Processes S NP VP NP PRP VP VBD NP NP DT NN PRP she VBD heard DT the NN noise S NP VP NP PRP VP VBD NP NP DT NN PRP she VBD heard DT the NN noise S NP VP NP PRP VP VBD NP NP DT NN PRP she VBD heard DT the NN noise

More information

Today s Agenda. Need to cover lots of background material. Now on to the Map Reduce stuff. Rough conceptual sketch of unsupervised training using EM

Today s Agenda. Need to cover lots of background material. Now on to the Map Reduce stuff. Rough conceptual sketch of unsupervised training using EM Today s Agenda Need to cover lots of background material l Introduction to Statistical Models l Hidden Markov Models l Part of Speech Tagging l Applying HMMs to POS tagging l Expectation-Maximization (EM)

More information

Introduction to Probablistic Natural Language Processing

Introduction to Probablistic Natural Language Processing Introduction to Probablistic Natural Language Processing Alexis Nasr Laboratoire d Informatique Fondamentale de Marseille Natural Language Processing Use computers to process human languages Machine Translation

More information

Roger Levy Probabilistic Models in the Study of Language draft, October 2,

Roger Levy Probabilistic Models in the Study of Language draft, October 2, Roger Levy Probabilistic Models in the Study of Language draft, October 2, 2012 224 Chapter 10 Probabilistic Grammars 10.1 Outline HMMs PCFGs ptsgs and ptags Highlight: Zuidema et al., 2008, CogSci; Cohn

More information

Probabilistic Context Free Grammars

Probabilistic Context Free Grammars 1 Defining PCFGs A PCFG G consists of Probabilistic Context Free Grammars 1. A set of terminals: {w k }, k = 1..., V 2. A set of non terminals: { i }, i = 1..., n 3. A designated Start symbol: 1 4. A set

More information

Natural Language Processing Prof. Pawan Goyal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Natural Language Processing Prof. Pawan Goyal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Natural Language Processing Prof. Pawan Goyal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 18 Maximum Entropy Models I Welcome back for the 3rd module

More information

Hidden Markov Models

Hidden Markov Models CS769 Spring 2010 Advanced Natural Language Processing Hidden Markov Models Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu 1 Part-of-Speech Tagging The goal of Part-of-Speech (POS) tagging is to label each

More information

CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009

CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models. The ischool University of Maryland. Wednesday, September 30, 2009 CMSC 723: Computational Linguistics I Session #5 Hidden Markov Models Jimmy Lin The ischool University of Maryland Wednesday, September 30, 2009 Today s Agenda The great leap forward in NLP Hidden Markov

More information

Midterm sample questions

Midterm sample questions Midterm sample questions CS 585, Brendan O Connor and David Belanger October 12, 2014 1 Topics on the midterm Language concepts Translation issues: word order, multiword translations Human evaluation Parts

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,

More information

Natural Language Processing. Lecture 13: More on CFG Parsing

Natural Language Processing. Lecture 13: More on CFG Parsing Natural Language Processing Lecture 13: More on CFG Parsing Probabilistc/Weighted Parsing Example: ambiguous parse Probabilistc CFG Ambiguous parse w/probabilites 0.05 0.05 0.20 0.10 0.30 0.20 0.60 0.75

More information

Recap: Lexicalized PCFGs (Fall 2007): Lecture 5 Parsing and Syntax III. Recap: Charniak s Model. Recap: Adding Head Words/Tags to Trees

Recap: Lexicalized PCFGs (Fall 2007): Lecture 5 Parsing and Syntax III. Recap: Charniak s Model. Recap: Adding Head Words/Tags to Trees Recap: Lexicalized PCFGs We now need to estimate rule probabilities such as P rob(s(questioned,vt) NP(lawyer,NN) VP(questioned,Vt) S(questioned,Vt)) 6.864 (Fall 2007): Lecture 5 Parsing and Syntax III

More information

Marrying Dynamic Programming with Recurrent Neural Networks

Marrying Dynamic Programming with Recurrent Neural Networks Marrying Dynamic Programming with Recurrent Neural Networks I eat sushi with tuna from Japan Liang Huang Oregon State University Structured Prediction Workshop, EMNLP 2017, Copenhagen, Denmark Marrying

More information

Soft Inference and Posterior Marginals. September 19, 2013

Soft Inference and Posterior Marginals. September 19, 2013 Soft Inference and Posterior Marginals September 19, 2013 Soft vs. Hard Inference Hard inference Give me a single solution Viterbi algorithm Maximum spanning tree (Chu-Liu-Edmonds alg.) Soft inference

More information

Decoding and Inference with Syntactic Translation Models

Decoding and Inference with Syntactic Translation Models Decoding and Inference with Syntactic Translation Models March 5, 2013 CFGs S NP VP VP NP V V NP NP CFGs S NP VP S VP NP V V NP NP CFGs S NP VP S VP NP V NP VP V NP NP CFGs S NP VP S VP NP V NP VP V NP

More information

Statistical Methods for NLP

Statistical Methods for NLP Statistical Methods for NLP Information Extraction, Hidden Markov Models Sameer Maskey Week 5, Oct 3, 2012 *many slides provided by Bhuvana Ramabhadran, Stanley Chen, Michael Picheny Speech Recognition

More information

HMM and Part of Speech Tagging. Adam Meyers New York University

HMM and Part of Speech Tagging. Adam Meyers New York University HMM and Part of Speech Tagging Adam Meyers New York University Outline Parts of Speech Tagsets Rule-based POS Tagging HMM POS Tagging Transformation-based POS Tagging Part of Speech Tags Standards There

More information

Lecture 9: Hidden Markov Model

Lecture 9: Hidden Markov Model Lecture 9: Hidden Markov Model Kai-Wei Chang CS @ University of Virginia kw@kwchang.net Couse webpage: http://kwchang.net/teaching/nlp16 CS6501 Natural Language Processing 1 This lecture v Hidden Markov

More information

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this.

This kind of reordering is beyond the power of finite transducers, but a synchronous CFG can do this. Chapter 12 Synchronous CFGs Synchronous context-free grammars are a generalization of CFGs that generate pairs of related strings instead of single strings. They are useful in many situations where one

More information

CS460/626 : Natural Language

CS460/626 : Natural Language CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 27 SMT Assignment; HMM recap; Probabilistic Parsing cntd) Pushpak Bhattacharyya CSE Dept., IIT Bombay 17 th March, 2011 CMU Pronunciation

More information

Computational Models - Lecture 4

Computational Models - Lecture 4 Computational Models - Lecture 4 Regular languages: The Myhill-Nerode Theorem Context-free Grammars Chomsky Normal Form Pumping Lemma for context free languages Non context-free languages: Examples Push

More information

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing

Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Context-Free Parsing: CKY & Earley Algorithms and Probabilistic Parsing Natural Language Processing CS 4120/6120 Spring 2016 Northeastern University David Smith with some slides from Jason Eisner & Andrew

More information

Sequence Labeling: HMMs & Structured Perceptron

Sequence Labeling: HMMs & Structured Perceptron Sequence Labeling: HMMs & Structured Perceptron CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu HMM: Formal Specification Q: a finite set of N states Q = {q 0, q 1, q 2, q 3, } N N Transition

More information

Probabilistic Graphical Models

Probabilistic Graphical Models CS 1675: Intro to Machine Learning Probabilistic Graphical Models Prof. Adriana Kovashka University of Pittsburgh November 27, 2018 Plan for This Lecture Motivation for probabilistic graphical models Directed

More information

Unit 2: Tree Models. CS 562: Empirical Methods in Natural Language Processing. Lectures 19-23: Context-Free Grammars and Parsing

Unit 2: Tree Models. CS 562: Empirical Methods in Natural Language Processing. Lectures 19-23: Context-Free Grammars and Parsing CS 562: Empirical Methods in Natural Language Processing Unit 2: Tree Models Lectures 19-23: Context-Free Grammars and Parsing Oct-Nov 2009 Liang Huang (lhuang@isi.edu) Big Picture we have already covered...

More information

Sharpening the empirical claims of generative syntax through formalization

Sharpening the empirical claims of generative syntax through formalization Sharpening the empirical claims of generative syntax through formalization Tim Hunter University of Minnesota, Twin Cities NASSLLI, June 2014 Part 1: Grammars and cognitive hypotheses What is a grammar?

More information

NLP Homework: Dependency Parsing with Feed-Forward Neural Network

NLP Homework: Dependency Parsing with Feed-Forward Neural Network NLP Homework: Dependency Parsing with Feed-Forward Neural Network Submission Deadline: Monday Dec. 11th, 5 pm 1 Background on Dependency Parsing Dependency trees are one of the main representations used

More information

A brief introduction to Conditional Random Fields

A brief introduction to Conditional Random Fields A brief introduction to Conditional Random Fields Mark Johnson Macquarie University April, 2005, updated October 2010 1 Talk outline Graphical models Maximum likelihood and maximum conditional likelihood

More information

A DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005

A DOP Model for LFG. Rens Bod and Ronald Kaplan. Kathrin Spreyer Data-Oriented Parsing, 14 June 2005 A DOP Model for LFG Rens Bod and Ronald Kaplan Kathrin Spreyer Data-Oriented Parsing, 14 June 2005 Lexical-Functional Grammar (LFG) Levels of linguistic knowledge represented formally differently (non-monostratal):

More information

Quiz 1, COMS Name: Good luck! 4705 Quiz 1 page 1 of 7

Quiz 1, COMS Name: Good luck! 4705 Quiz 1 page 1 of 7 Quiz 1, COMS 4705 Name: 10 30 30 20 Good luck! 4705 Quiz 1 page 1 of 7 Part #1 (10 points) Question 1 (10 points) We define a PCFG where non-terminal symbols are {S,, B}, the terminal symbols are {a, b},

More information

RNA Secondary Structure Prediction

RNA Secondary Structure Prediction RNA Secondary Structure Prediction 1 RNA structure prediction methods Base-Pair Maximization Context-Free Grammar Parsing. Free Energy Methods Covariance Models 2 The Nussinov-Jacobson Algorithm q = 9

More information

Probabilistic Context-Free Grammars and beyond

Probabilistic Context-Free Grammars and beyond Probabilistic Context-Free Grammars and beyond Mark Johnson Microsoft Research / Brown University July 2007 1 / 87 Outline Introduction Formal languages and Grammars Probabilistic context-free grammars

More information

Introduction to Semantic Parsing with CCG

Introduction to Semantic Parsing with CCG Introduction to Semantic Parsing with CCG Kilian Evang Heinrich-Heine-Universität Düsseldorf 2018-04-24 Table of contents 1 Introduction to CCG Categorial Grammar (CG) Combinatory Categorial Grammar (CCG)

More information

CHAPTER THREE: RELATIONS AND FUNCTIONS

CHAPTER THREE: RELATIONS AND FUNCTIONS CHAPTER THREE: RELATIONS AND FUNCTIONS 1 Relations Intuitively, a relation is the sort of thing that either does or does not hold between certain things, e.g. the love relation holds between Kim and Sandy

More information

Recap: HMM. ANLP Lecture 9: Algorithms for HMMs. More general notation. Recap: HMM. Elements of HMM: Sharon Goldwater 4 Oct 2018.

Recap: HMM. ANLP Lecture 9: Algorithms for HMMs. More general notation. Recap: HMM. Elements of HMM: Sharon Goldwater 4 Oct 2018. Recap: HMM ANLP Lecture 9: Algorithms for HMMs Sharon Goldwater 4 Oct 2018 Elements of HMM: Set of states (tags) Output alphabet (word types) Start state (beginning of sentence) State transition probabilities

More information

Sharpening the empirical claims of generative syntax through formalization

Sharpening the empirical claims of generative syntax through formalization Sharpening the empirical claims of generative syntax through formalization Tim Hunter University of Minnesota, Twin Cities ESSLLI, August 2015 Part 1: Grammars and cognitive hypotheses What is a grammar?

More information