LECTURER: BURCU CAN Spring

1 LECTURER: BURCU CAN Spring

2 (Chomsky hierarchy, from least to most expressive: Regular Language — Hidden Markov Model (HMM); Context-Free Language — Probabilistic Context-Free Grammar (PCFG); Context-Sensitive Language; Unrestricted Language.) PCFGs can model a more powerful class of languages than HMMs. Can we take advantage of this property?

3 Production Rule: <Left-Hand Side> → <Right-Hand Side> (Probability) Example Grammar: S → N V (1.0); N → Bob (0.3); N → Jane (0.7); V → V N (0.4); V → loves (0.6) Example Parse (yield: "Jane loves Bob"): S → N V; N → Jane; V → V N; V → loves; N → Bob

4 Natural Language Processing: parsing written sentences BioInformatics: RNA sequences Stock Markets: model rise/fall of the Dow Jones Computer Vision: parsing architectural scenes

5 Statistical parsing uses a probabilistic model of syntax in order to assign probabilities to each parse tree. Provides principled approach to resolving syntactic ambiguity. Allows supervised learning of parsers from tree-banks of parse trees provided by human linguists. Also allows unsupervised learning of parsers from unannotated text, but the accuracy of such parsers has been limited. 5

6 A PCFG is a probabilistic version of a CFG where each production has a probability. Probabilities of all productions rewriting a given non-terminal must add to 1, defining a distribution for each non-terminal.
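
As a minimal sketch (Python; the rule-table layout is an assumption of this example, and the rules are the toy grammar from slide 3), the normalization constraint can be checked per non-terminal:

from collections import defaultdict

# Toy PCFG from slide 3; each key is (left-hand side, right-hand side).
rules = {
    ("S", ("N", "V")):   1.0,
    ("N", ("Bob",)):     0.3,
    ("N", ("Jane",)):    0.7,
    ("V", ("V", "N")):   0.4,
    ("V", ("loves",)):   0.6,
}

def check_normalization(rules, tol=1e-9):
    # The probabilities of all productions rewriting a given non-terminal must sum to 1.
    totals = defaultdict(float)
    for (lhs, _), p in rules.items():
        totals[lhs] += p
    return {lhs: abs(total - 1.0) < tol for lhs, total in totals.items()}

print(check_normalization(rules))  # {'S': True, 'N': True, 'V': True}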

7

8

9 S → NP VP; VP → V NP; VP → V PP; VP → V NP PP; NP → NP NP; NP → NP PP; NP → N; PP → P NP; N → people | fish | tanks | rods; V → people | fish | tanks; P → with. Example sentences: people fish tanks; people fish with rods.

10 S → NP VP 1.0; VP → V NP 0.6; VP → V NP PP 0.4; NP → NP NP 0.1; NP → NP PP 0.2; NP → N 0.7; PP → P NP 1.0; N → people 0.5; N → fish 0.2; N → tanks 0.2; N → rods 0.1; V → people 0.1; V → fish 0.6; V → tanks 0.3; P → with 1.0

11 P(t): the probability of a tree t is the product of the probabilities of the rules used to generate it. P(s): the probability of the string s is the sum of the probabilities of the trees which have that string as their yield: P(s) = Σ_j P(s, t_j) = Σ_j P(t_j), where the t_j are the parses of s.

12 EXAMPLE - 1

13

14 s = people fish tanks with rods P(t_1) = 1.0 × 0.7 × 0.5 × 0.4 × 0.6 × 0.7 × 0.2 × 1.0 × 1.0 × 0.7 × 0.1 = 0.0008232 (PP attached to the VP) P(t_2) = 1.0 × 0.7 × 0.5 × 0.6 × 0.6 × 0.2 × 0.7 × 0.2 × 1.0 × 1.0 × 0.7 × 0.1 = 0.00024696 (PP attached to the NP) P(s) = P(t_1) + P(t_2) = 0.0008232 + 0.00024696 = 0.00107016
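
A quick check of this arithmetic (Python; the two factor lists are simply the rule probabilities read off each parse tree under the slide-10 grammar):

import math

# Rule probabilities used by each parse of "people fish tanks with rods".
t1 = [1.0, 0.7, 0.5, 0.4, 0.6, 0.7, 0.2, 1.0, 1.0, 0.7, 0.1]       # PP attached to the VP
t2 = [1.0, 0.7, 0.5, 0.6, 0.6, 0.2, 0.7, 0.2, 1.0, 1.0, 0.7, 0.1]  # PP attached to the NP

p_t1 = math.prod(t1)            # ≈ 0.0008232
p_t2 = math.prod(t2)            # ≈ 0.00024696
print(p_t1, p_t2, p_t1 + p_t2)  # P(s) = P(t1) + P(t2) ≈ 0.00107016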

15 EXAMPLE - 2

16

17

18

19 All rules are of the form X → Y Z or X → w, where X, Y, Z ∈ N (non-terminals) and w ∈ T (terminals). A transformation to this form doesn't change the generative capacity of a CFG: it recognizes the same language, but maybe with different trees. Empties and unaries are removed recursively; n-ary rules (n > 2) are divided by introducing new non-terminals, as sketched below.
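
A sketch of the binarization step in Python (the rule representation and the @-style names for the new non-terminals are my choices, not from the slides):

def binarize(rules):
    # Split n-ary rules (n > 2) into binary ones by introducing new non-terminals,
    # e.g. VP -> V NP PP becomes VP -> V @VP_NP_PP and @VP_NP_PP -> NP PP.
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_sym = "@" + lhs + "_" + "_".join(rhs[1:])
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', '@VP_NP_PP')), ('@VP_NP_PP', ('NP', 'PP'))]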

20

21

22

23

24 Observation likelihood: To classify and order sentences. Most likely derivation: To determine the most likely parse tree for a sentence. Maximum likelihood training: To train a PCFG to fit empirical training data. 24

25 There is an analog to the Viterbi algorithm to efficiently determine the most probable derivation (parse tree) for a sentence. Grammar (English): S → NP VP; S → VP; NP → Det A N; NP → NP PP; NP → PropN; A → ε; A → Adj A; PP → Prep NP; VP → V NP; VP → VP PP. Input: "John liked the dog in the pen." (Figure: a candidate parse in which the PP "in the pen" attaches to the VP, marked as rejected by the PCFG parser.)

26 There is an analog to the Viterbi algorithm to efficiently determine the most probable derivation (parse tree) for a sentence. Grammar (English): S → NP VP; S → VP; NP → Det A N; NP → NP PP; NP → PropN; A → ε; A → Adj A; PP → Prep NP; VP → V NP; VP → VP PP. Input: "John liked the dog in the pen." (Figure: the parse returned by the PCFG parser, in which the PP "in the pen" attaches to the NP "the dog".)

27 CKY can be modified for PCFG parsing by including in each cell a probability for each non-terminal. Cell[i,j] must retain the most probable derivation of each constituent (non-terminal) covering words i +1 through j together with its associated probability. When transforming the grammar to CNF, must set production probabilities to preserve the probability of derivations.
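
A compact runnable sketch of this idea (Python; binary and lexical rules only, i.e. a grammar already in CNF; the data layout is an assumption of this example, not the slides' code):

from collections import defaultdict

def viterbi_cky(words, lexical, binary):
    # lexical: word -> list of (A, prob) for rules A -> word
    # binary:  list of (A, B, C, prob) for rules A -> B C
    # score[(i, j, A)] is the best probability of A covering words i+1 .. j.
    n = len(words)
    score = defaultdict(float)
    back = {}

    for i, w in enumerate(words):                      # lexical cells
        for A, p in lexical.get(w, []):
            if p > score[(i, i + 1, A)]:
                score[(i, i + 1, A)] = p

    for span in range(2, n + 1):                       # longer spans, bottom up
        for begin in range(n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for A, B, C, p in binary:
                    prob = score[(begin, split, B)] * score[(split, end, C)] * p
                    if prob > score[(begin, end, A)]:  # keep the max (Viterbi)
                        score[(begin, end, A)] = prob
                        back[(begin, end, A)] = (split, B, C)
    return score, back

# score[(0, len(words), "S")] then holds the probability of the most likely parse,
# and back can be followed to reconstruct the tree itself.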

28 Original Grammar: S → NP VP; S → Aux NP VP; S → VP; NP → Pronoun; NP → Proper-Noun; NP → Det Nominal; Nominal → Noun; Nominal → Nominal Noun; Nominal → Nominal PP; VP → Verb; VP → Verb NP; VP → VP PP; PP → Prep NP. Chomsky Normal Form: S → NP VP; S → X1 VP; X1 → Aux NP; S → book | include | prefer; S → Verb NP; S → VP PP; NP → I | he | she | me; NP → Houston | NWA; NP → Det Nominal; Nominal → book | flight | meal | money; Nominal → Nominal Noun; Nominal → Nominal PP; VP → book | include | prefer; VP → Verb NP; VP → VP PP; PP → Prep NP.

29 Book the flight through Houston (probabilistic CKY chart, filled bottom-up). Lexical cells: Book → S: .01, VP: .1, Verb: .5, Nominal: .03, Noun: .1; the → Det: .6; flight → Nominal: .15, Noun: .5. First binary cell, "the flight": NP = .6 × .6 × .15 = .054

30 "Book the flight": VP = .5 × .5 × .054 = .0135

31 "Book the flight": S = .05 × .5 × .054 = .00135

32 through → Prep: .2

33 Houston → NP: .16, PropNoun: .8. "through Houston": PP = 1.0 × .2 × .16 = .032

34 "flight through Houston": Nominal = .5 × .15 × .032 = .0024

35 "the flight through Houston": NP = .6 × .6 × .0024 = .000864

36 "Book the flight through Houston": S = .05 × .5 × .000864 = .0000216

37 A second derivation of S for the full sentence: S = .03 × .0135 × .032 = .00001296

38 Pick the most probable parse, i.e. take the max to combine probabilities of multiple derivations of each constituent in each cell (here the best S is .0000216).

39 S → NP VP; VP → V NP; VP → V NP PP; NP → NP NP; NP → NP PP; NP → N; NP → e; PP → P NP; N → people | fish | tanks | rods; V → people | fish | tanks; P → with. Epsilon removal.

40 S → NP VP; S → VP; VP → V NP; VP → V; VP → V NP PP; VP → V PP; NP → NP NP; NP → NP; NP → NP PP; NP → PP; NP → N; PP → P NP; PP → P; N → people | fish | tanks | rods; V → people | fish | tanks; P → with. Remove unary rules (remove S → VP).

41 S → NP VP; VP → V NP; S → V NP; VP → V; S → V; VP → V NP PP; S → V NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP; NP → NP PP; NP → PP; NP → N; PP → P NP; PP → P; N → people | fish | tanks | rods; V → people | fish | tanks; P → with. Remove unary rules (remove S → V).

42 S → NP VP; VP → V NP; S → V NP; VP → V; VP → V NP PP; S → V NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP; NP → NP PP; NP → PP; NP → N; PP → P NP; PP → P; N → people | fish | tanks | rods; V → people, S → people; V → fish, S → fish; V → tanks, S → tanks; P → with. Remove unary rules (remove VP → V).

43 S → NP VP; VP → V NP; S → V NP; VP → V NP PP; S → V NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP; NP → NP PP; NP → PP; NP → N; PP → P NP; PP → P; N → people | fish | tanks | rods; V → people, S → people, VP → people; V → fish, S → fish, VP → fish; V → tanks, S → tanks, VP → tanks; P → with. Remove unary rules (remove NP → NP, NP → N, PP → P).

44 S → NP VP; VP → V NP; S → V NP; VP → V NP PP; S → V NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP PP; NP → P NP; PP → P NP; NP → people | fish | tanks | rods; V → people, S → people, VP → people; V → fish, S → fish, VP → fish; V → tanks, S → tanks, VP → tanks; P → with, PP → with. Binarize now.

45 S → NP VP; VP → V NP; S → V NP; VP → V @VP_V; @VP_V → NP PP; S → V @S_V; @S_V → NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP PP; NP → P NP; PP → P NP; NP → people | fish | tanks | rods; V → people, S → people, VP → people; V → fish, S → fish, VP → fish; V → tanks, S → tanks, VP → tanks; P → with, PP → with (@VP_V and @S_V are the new non-terminals introduced by binarizing the ternary rules). Chomsky Normal Form!

46 S → NP VP; VP → V NP; VP → V NP PP; NP → NP NP; NP → NP PP; NP → N; NP → e; PP → P NP; N → people | fish | tanks | rods; V → people | fish | tanks; P → with. Initial grammar.

47 S → NP VP; VP → V NP; S → V NP; VP → V @VP_V; @VP_V → NP PP; S → V @S_V; @S_V → NP PP; VP → V PP; S → V PP; NP → NP NP; NP → NP PP; NP → P NP; PP → P NP; NP → people | fish | tanks | rods; V → people, S → people, VP → people; V → fish, S → fish, VP → fish; V → tanks, S → tanks, VP → tanks; P → with, PP → with. Final CNF grammar.

48 You should think of this as a transformation for efficient parsing. Binarization is crucial for cubic-time CFG parsing. The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker.

49 S → NP VP 0.9; S → VP 0.1; VP → V NP 0.5; VP → V 0.1; VP → V @VP_V 0.3; VP → V PP 0.1; @VP_V → NP PP 1.0; NP → NP NP 0.1; NP → NP PP 0.2; NP → N 0.7; PP → P NP 1.0; N → people 0.5; N → fish 0.2; N → tanks 0.2; N → rods 0.1; V → people 0.1; V → fish 0.6; V → tanks 0.3; P → with 1.0 (the non-terminal @VP_V comes from binarizing VP → V NP PP).

50 Sentence: fish people fish tanks. The chart has one cell score[i][j] for each span of words i+1 through j (0 ≤ i < j ≤ 4): score[0][1], score[0][2], score[0][3], score[0][4], score[1][2], score[1][3], score[1][4], score[2][3], score[2][4], score[3][4].

51 (Grammar and lexicon as on slide 49.) Sentence: fish people fish tanks. Fill the lexical cells:
for i = 0; i < #(words); i++
  for A in nonterms
    if A -> words[i] in grammar
      score[i][i+1][A] = P(A -> words[i])

52 (Grammar and lexicon as on slide 49.) After the lexical pass the diagonal cells hold: fish → N 0.2, V 0.6; people → N 0.5, V 0.1; fish → N 0.2, V 0.6; tanks → N 0.2, V 0.1. Then handle unaries in each cell:
// handle unaries
boolean added = true
while added
  added = false
  for A, B in nonterms
    if score[i][i+1][B] > 0 && A -> B in grammar
      prob = P(A -> B) * score[i][i+1][B]
      if prob > score[i][i+1][A]
        score[i][i+1][A] = prob
        back[i][i+1][A] = B
        added = true
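
The same unary step as a runnable Python sketch (the function name and cell representation are mine); the printed values match the "fish" cell on the next slide:

def apply_unaries(cell_score, cell_back, unary):
    # Unary closure for one chart cell: repeatedly apply A -> B rules
    # until no entry in the cell improves any more.
    added = True
    while added:
        added = False
        for A, B, p in unary:
            if cell_score.get(B, 0.0) > 0.0:
                prob = p * cell_score[B]
                if prob > cell_score.get(A, 0.0):
                    cell_score[A] = prob
                    cell_back[A] = B
                    added = True

cell, back = {"N": 0.2, "V": 0.6}, {}
apply_unaries(cell, back, [("NP", "N", 0.7), ("VP", "V", 0.1), ("S", "VP", 0.1)])
print(cell)  # ≈ {'N': 0.2, 'V': 0.6, 'NP': 0.14, 'VP': 0.06, 'S': 0.006} (up to floating point)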

53 (Grammar and lexicon as on slide 49.) After the unary pass each diagonal cell also holds NP → N, VP → V and S → VP entries: fish → NP 0.14, VP 0.06, S 0.006; people → NP 0.35, VP 0.01, S 0.001; fish → NP 0.14, VP 0.06, S 0.006; tanks → NP 0.14, VP 0.03, S 0.003. Then handle binaries:
// handle binaries
prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
if prob > score[begin][end][A]
  score[begin][end][A] = prob
  back[begin][end][A] = new Triple(split, B, C)

54 (Grammar and lexicon as on slide 49.) Binary rules now fill the length-2 cells with NP → NP NP, VP → V NP and S → NP VP entries (probabilities shown in the chart). Unaries are then handled again for these new cells:
// handle unaries
boolean added = true
while added
  added = false
  for A, B in nonterms
    prob = P(A -> B) * score[begin][end][B]
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = B
      added = true

55 (Grammar and lexicon as on slide 49.) For longer spans, every split point is tried:
for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

56 (Grammar and lexicon as on slide 49.) The length-3 cells are filled the same way, keeping the best split for each constituent:
for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

57 (Grammar and lexicon as on slide 49.) Finally the top cell, covering the whole sentence, is filled:
for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

58 (Grammar and lexicon as on slide 49.) With the chart complete, call buildTree(score, back) to follow the backpointers from the top S cell and recover the best parse.

59 There is an analog to the Forward algorithm for HMMs, called the Inside algorithm, for efficiently determining how likely a string is to be produced by a PCFG. Can use a PCFG as a language model to choose between alternative sentences for speech recognition or machine translation. Grammar (English): S → NP VP; S → VP; NP → Det A N; NP → NP PP; NP → PropN; A → ε; A → Adj A; PP → Prep NP; VP → V NP; VP → VP PP. Example: O1 = "The dog big barked." O2 = "The big dog barked." Is P(O2 | English) > P(O1 | English)?

60 Use CKY probabilistic parsing algorithm but combine probabilities of multiple derivations of any constituent using addition instead of max. 60
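
Concretely, only the combination step of the Viterbi-CKY sketch above changes: accumulate with addition instead of keeping the max (Python; same assumed data layout as before, with "S" assumed as the start symbol):

from collections import defaultdict

def inside_cky(words, lexical, binary, start="S"):
    # Inside algorithm over a CNF PCFG: score[(i, j, A)] sums the probabilities
    # of all derivations of A covering words i+1 .. j.
    n = len(words)
    score = defaultdict(float)
    for i, w in enumerate(words):
        for A, p in lexical.get(w, []):
            score[(i, i + 1, A)] += p
    for span in range(2, n + 1):
        for begin in range(n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for A, B, C, p in binary:
                    score[(begin, end, A)] += score[(begin, split, B)] * score[(split, end, C)] * p
    return score[(0, n, start)]  # P(s): total probability of the sentence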

61 Book the flight through Houston (same chart, now summing). The top cell keeps both S derivations: S = .05 × .5 × .000864 = .0000216 and S = .03 × .0135 × .032 = .00001296; the other cells are as before (VP = .0135, NP = .054, NP = .000864, Nominal = .0024, PP = .032, NP = .16, PropNoun = .8, Prep = .2, Det = .6, Nominal = .15, Noun = .5).

62 Sum the probabilities of multiple derivations of each constituent in each cell: S = .0000216 + .00001296 = .00003456.

63 If parse trees are provided for training sentences, a grammar and its parameters can all be estimated directly from counts accumulated from the treebank (with appropriate smoothing). (Figure: a treebank of parse trees for sentences such as "John put the dog in the pen" feeds Supervised PCFG Training, which outputs the English grammar: S → NP VP; S → VP; NP → Det A N; NP → NP PP; NP → PropN; A → ε; A → Adj A; PP → Prep NP; VP → V NP; VP → VP PP.)

64 Set of production rules can be taken directly from the set of rewrites in the treebank. Parameters can be directly estimated from frequency counts in the treebank: P(α → β | α) = count(α → β) / Σ_γ count(α → γ) = count(α → β) / count(α)
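
A sketch of this relative-frequency estimation (Python, no smoothing; productions are assumed to already be read off the treebank trees as (lhs, rhs) pairs):

from collections import Counter

def estimate_pcfg(productions):
    # P(a -> b | a) = count(a -> b) / count(a), i.e. relative frequency per non-terminal.
    rule_counts = Counter(productions)
    lhs_counts = Counter(lhs for lhs, _ in productions)
    return {(lhs, rhs): c / lhs_counts[lhs] for (lhs, rhs), c in rule_counts.items()}

# Toy treebank: six productions read off a couple of tiny trees.
prods = [("S", ("NP", "VP")), ("NP", ("N",)), ("NP", ("Det", "N")),
         ("S", ("NP", "VP")), ("NP", ("N",)), ("NP", ("N",))]
print(estimate_pcfg(prods))
# {('S', ('NP', 'VP')): 1.0, ('NP', ('N',)): 0.75, ('NP', ('Det', 'N')): 0.25}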

65 Given a set of sentences, induce a grammar that maximizes the probability that this data was generated from this grammar. Assume the number of non-terminals in the grammar is specified. Only need to have an unannotated set of sequences generated from the model. Does not need correct parse trees for these sentences. In this sense, it is unsupervised. 65

66 Training Sentences: John ate the apple. A dog bit Mary. Mary hit the dog. John gave Mary the cat. ... (Figure: the sentences feed PCFG Training, which outputs the English grammar: S → NP VP; S → VP; NP → Det A N; NP → NP PP; NP → PropN; A → ε; A → Adj A; PP → Prep NP; VP → V NP; VP → VP PP.)

67 The Inside-Outside algorithm is a version of EM for unsupervised learning of a PCFG, analogous to Baum-Welch (forward-backward) for HMMs. Given the number of non-terminals, construct all possible CNF productions with these non-terminals and the observed terminal symbols. Use EM to iteratively train the probabilities of these productions to locally maximize the likelihood of the data. See the Manning and Schütze text for details. Experimental results are not impressive, but recent work imposes additional constraints to improve unsupervised grammar learning.

68 Specialized productions can be generated by including the head word and its POS of each non-terminal as part of that non-terminal's symbol. (Figure: lexicalized parse tree for "John liked the dog in the pen", with nodes such as S[liked-VBD], VP[liked-VBD], NP[dog-NN], Nominal[dog-NN], PP[in-IN], NP[pen-NN].)

69 (Figure: lexicalized parse tree for "John put the dog in the pen", with nodes such as S[put-VBD], VP[put-VBD], NP[dog-NN], PP[in-IN], NP[pen-NN].)

70 Accurately estimating parameters on such a large number of very specialized productions could require enormous amounts of treebank data. Need some way of estimating parameters for lexicalized productions that makes reasonable independence assumptions so that accurate probabilities for very specific rules can be learned.

71 English Penn Treebank: Standard corpus for testing syntactic parsing consists of 1.2 M words of text from the Wall Street Journal (WSJ). Typical to train on about 40,000 parsed sentences and test on an additional standard disjoint test set of 2,416 sentences. Chinese Penn Treebank: 100K words from the Xinhua news service. 71

72 ( (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken) ) (, ,) (ADJP (NP (CD 61) (NNS years) ) (JJ old) ) (, ,) ) (VP (MD will) (VP (VB join) (NP (DT the) (NN board) ) (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director) )) (NP-TMP (NNP Nov.) (CD 29) ))) (. .) ))

73 Eryiğit et al., Multiword Expressions in Statistical Dependency Parsing, SPMRL. The treebank's sentences use the CoNLL format.

74 PARSEVAL metrics measure the fraction of the constituents that match between the computed and human parse trees. If P is the system's parse tree and T is the human parse tree (the "gold standard"): Recall = (# correct constituents in P) / (# constituents in T) Precision = (# correct constituents in P) / (# constituents in P) F1 is the harmonic mean of precision and recall.
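
A minimal sketch of these metrics (Python; representing each constituent as a (label, start, end) span is my assumption):

from collections import Counter

def parseval(pred, gold):
    # Labeled precision, recall and F1 over constituent spans (label, start, end).
    correct = sum((Counter(pred) & Counter(gold)).values())
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# With 10 of 12 constituents matching in both trees (the example on the next slide),
# precision = recall = F1 = 10/12 ≈ 83.3%.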

75 (Figure: the correct tree T and the computed tree P for "book the flight through Houston"; in T the PP "through Houston" attaches to the Nominal "flight", in P it attaches to the VP.) # Constituents in T: 12. # Constituents in P: 12. # Correct Constituents: 10. Recall = 10/12 = 83.3%. Precision = 10/12 = 83.3%. F1 = 83.3%.

76 Statistical models such as PCFGs allow for probabilistic resolution of ambiguities. PCFGs can be easily learned from treebanks. Lexicalization is required to effectively resolve many ambiguities. Current statistical parsers are quite accurate but not yet at the level of human-expert agreement. 76

77 Raymond Mooney, Statistical Parsing, University of Texas. Grant Schindler, PCFGs. Julia Hockenmaier, More on PCFG Parsing.
