CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lectures 23, 24: Parsing Algorithms; Parsing in Case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 8th and 10th March, 2011 (Lectures 21 and 22 were on Sentiment Analysis by Aditya Joshi)
A note on Language Modeling

Example sentence: ^ The tortoise beat the hare in the race.

Guided by frequency -- N-gram (n=3):
  ^ the tortoise       5*10^-3
  the tortoise beat    3*10^-2
  tortoise beat the    7*10^-5
  beat the hare        5*10^-6

Guided by language knowledge -- CFG:
  S -> NP VP
  NP -> DT N
  VP -> V NP PP
  PP -> P NP

Guided by language knowledge -- Probabilistic CFG:
  S -> NP VP       1.0
  NP -> DT N       0.5
  VP -> V NP PP    0.4
  PP -> P NP       1.0

Guided by world knowledge -- Dependency Grammar:
  Semantic roles: agt, obj, scn, etc.
  Semantic relations are always between heads.
  Prob. DG: semantic roles with probabilities.
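The frequency-guided column can be made concrete. Below is a minimal sketch of chain-rule sentence scoring under a trigram model, using the illustrative probabilities from this slide (the numbers are the slide's examples, not estimates from a real corpus):

```python
# Trigram (n=3) sentence scoring by the chain rule. "^" marks the
# sentence boundary, as on the slide.
trigram_prob = {
    ("^", "the", "tortoise"):    5e-3,
    ("the", "tortoise", "beat"): 3e-2,
    ("tortoise", "beat", "the"): 7e-5,
    ("beat", "the", "hare"):     5e-6,
}

def sentence_prob(words, unseen=1e-9):
    """P(w1..wn) ~ product over i of P(w_i | w_{i-2}, w_{i-1})."""
    padded = ["^"] + words
    p = 1.0
    for i in range(len(padded) - 2):
        # back off to a tiny constant for trigrams we have no number for
        p *= trigram_prob.get(tuple(padded[i:i + 3]), unseen)
    return p

print(sentence_prob("the tortoise beat the hare".split()))
```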
Parse Tree

S
  NP
    DT  The
    N   tortoise
  VP
    V   beat
    NP
      DT  the
      N   hare
    PP
      P   in
      NP
        DT  the
        N   race
UNL Expression

beat@past
  agt -> tortoise@def
  obj -> hare@def
  scn (scene) -> race@def
Purpose of LM
Prediction of the next word (speech processing).
Language identification (for the same script).
Belongingness check (parsing).
P(NP -> DT N) asks: what is the probability that the yield of the non-terminal NP is DT N, i.e., that NP expands to DT N?
Need for Deep Parsing
Sentences are linear structures.
But there is a hierarchy, a tree, hidden behind the linear structure.
There are constituents and branches.
PPs are at the same level: flat with respect to the head word "book". No distinction in terms of dominance or c-command.

NP
  The
  AP   big
  book
  PP   of poems
  PP   with the blue cover

[The big book of poems with the blue cover] is on the table.
Constituency test of Replacement runs into problems.
One-replacement: I bought the big [book of poems with the blue cover], not the small [one]. One-replacement targets "book of poems with the blue cover".
Another one-replacement: I bought the big [book of poems] with the blue cover, not the small [one] with the red cover. One-replacement targets "book of poems".
More deeply embedded structure

NP
  The
  N1
    AP  big
    N2
      N3
        N   book
        PP  of poems
      PP  with the blue cover
Grammar and Parsing Algorithms
A simplified grammar
  S  -> NP VP
  NP -> DT N | N
  VP -> V ADV | V
Example Sentence
  1 People 2 laugh 3
(The numbers are positions.)
Lexicon:
  people - N, V
  laugh  - N, V
This indicates that both noun and verb are possible categories for the word "people".
Top-Down Parsing

Step  State            Backup State   Action
1     ((S) 1)          --             --
2     ((NP VP) 1)      --             --
3a    ((DT N VP) 1)    ((N VP) 1)     --
3b    ((N VP) 1)       --             --
4     ((VP) 2)         --             Consume "people"
5a    ((V ADV) 2)      ((V) 2)        --
6     ((ADV) 3)        ((V) 2)        Consume "laugh"
5b    ((V) 2)          --             --
6     (( ) 3)          --             Consume "laugh"

(The number after each symbol list is the position of the input pointer.)
Termination condition: all input consumed, no symbols remaining.
Note: input symbols can be pushed back.
Discussion of Top-Down Parsing
This kind of searching is goal driven.
Gives importance to textual precedence (rule precedence).
No regard for data, a priori (useless expansions are made).
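The trace above can be reproduced in a few lines of code. The following is a minimal top-down backtracking recognizer for the simplified grammar, a sketch rather than a full parser (it reports acceptance, not the tree); recursion plays the role of the "backup states" in the trace:

```python
# Top-down backtracking recognizer. GRAMMAR maps each non-terminal to
# its expansions, tried in rule order (textual/rule precedence).
GRAMMAR = {
    "S":  [("NP", "VP")],
    "NP": [("DT", "N"), ("N",)],
    "VP": [("V", "ADV"), ("V",)],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}}

def top_down(symbols, pos, words):
    """Try to rewrite `symbols` so that they cover words[pos:]."""
    if not symbols:
        return pos == len(words)       # success: input over, no symbols left
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:               # non-terminal: try each expansion
        return any(top_down(exp + rest, pos, words) for exp in GRAMMAR[first])
    # pre-terminal: consume the next word if its lexical category matches
    return (pos < len(words)
            and first in LEXICON[words[pos]]
            and top_down(rest, pos + 1, words))

print(top_down(("S",), 0, ["people", "laugh"]))   # True
```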
Bottom-Up Parsing
Some conventions:
  N12 represents the symbol N spanning positions 1 to 2.
  S1? -> NP12 VP2?   (end position unknown)
In such a rule, the constituents whose spans are fully known (NP12) are complete; the work on the RHS symbols with unknown end positions (VP2?) remains.
Bottom-Up Parsing (pictorial representation)

  1 People 2 laugh 3

  N12, V12 (people)     N23, V23 (laugh)
  NP12 -> N12           NP23 -> N23
  VP12 -> V12           VP23 -> V23
  S1?  -> NP12 VP2?
  S    -> NP12 VP23
Problem with Top-Down Parsing: Left Recursion
Suppose you have the rule A -> A B. Then the expansion proceeds as:
  ((A) k) -> ((A B) k) -> ((A B B) k) -> ...
The parser keeps expanding without ever consuming any input.
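A tiny demonstration of why this loops: with A -> A B, the leftmost symbol after every expansion is again A, so a top-down parser keeps predicting forever without consuming a word (a sketch, cut off after a few steps):

```python
# Left recursion: expanding A -> A B never removes A from the front of
# the symbol list, so a naive top-down parser predicts without end.
def expand(symbols, steps=5):
    print(symbols)
    if steps == 0:
        print("... grows without bound; no word is ever consumed")
    elif symbols[0] == "A":
        expand(("A", "B") + symbols[1:], steps - 1)

expand(("A",))
# ('A',) -> ('A', 'B') -> ('A', 'B', 'B') -> ...
```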
Combining top-down and bottom-up strategies
Top-Down Bottom-Up Chart Parsing
Combines the advantages of top-down and bottom-up parsing.
Does not work in case of left recursion.
Example: "People laugh"
  people -> noun, verb
  laugh  -> noun, verb
Grammar:
  S  -> NP VP
  NP -> DT N | N
  VP -> V ADV | V
Transitive Closure

  1 People 2 laugh 3

At 1:  S -> • NP VP      NP -> • N      NP -> • DT N
At 2:  NP -> N •         S -> NP • VP   VP -> • V      VP -> • V ADV
At 3:  VP -> V •         S -> NP VP •  (success)
Arcs in Parsing
Each arc represents a chart entry, which records:
  Completed work (to the left of •)
  Expected work (to the right of •)
Example

  1 People 2 laugh 3 loudly 4

At 1:  S -> • NP VP      NP -> • N      NP -> • DT N
At 2:  NP -> N •         S -> NP • VP   VP -> • V      VP -> • V ADV
At 3:  VP -> V •         VP -> V • ADV  S -> NP VP •
At 4:  VP -> V ADV •     S -> NP VP •  (success)
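The charts above can be built mechanically. Below is a compact Earley-style recognizer, a sketch of the combined strategy: top-down prediction plus bottom-up scanning and completion. An arc is a dotted rule, LHS -> done • todo, over a span:

```python
from collections import namedtuple

# An arc (chart edge): lhs -> done • todo, spanning [start, end)
Arc = namedtuple("Arc", "lhs done todo start end")

GRAMMAR = {
    "S":  [("NP", "VP")],
    "NP": [("DT", "N"), ("N",)],
    "VP": [("V", "ADV"), ("V",)],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}, "loudly": {"ADV"}}

def earley(words):
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(Arc("S'", (), ("S",), 0, 0))        # dummy start arc
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            arc = agenda.pop()
            if arc.todo:
                nxt = arc.todo[0]
                for rhs in GRAMMAR.get(nxt, []):     # predict (top-down)
                    new = Arc(nxt, (), rhs, i, i)
                    if new not in chart[i]:
                        chart[i].add(new); agenda.append(new)
                if i < n and nxt in LEXICON[words[i]]:   # scan (bottom-up)
                    chart[i + 1].add(Arc(arc.lhs, arc.done + (nxt,),
                                         arc.todo[1:], arc.start, i + 1))
            else:                                    # complete: extend waiters
                for w in list(chart[arc.start]):
                    if w.todo and w.todo[0] == arc.lhs:
                        new = Arc(w.lhs, w.done + (arc.lhs,),
                                  w.todo[1:], w.start, i)
                        if new not in chart[i]:
                            chart[i].add(new); agenda.append(new)
    return any(a.lhs == "S'" and not a.todo for a in chart[n])

print(earley(["people", "laugh", "loudly"]))         # True
```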
Dealing With Structural Ambiguity
Multiple parses for a sentence:
  The man saw the boy with a telescope.
  The man saw the mountain with a telescope.
  The man saw the boy with the ponytail.
At the level of syntax, all these sentences are ambiguous, but semantics can disambiguate the 2nd and 3rd sentences.
Prepositional Phrase (PP) Attachment Problem
  V NP1 P NP2   (here P means preposition)
Does NP2 attach to NP1, or does NP2 attach to V?
Parse Trees for a Structurally Ambiguous Sentence
Let the grammar be:
  S  -> NP VP
  NP -> DT N | DT N PP
  PP -> P NP
  VP -> V NP PP | V NP
For the sentence "I saw a boy with a telescope":
Parse Tree 1

S
  NP
    N  I
  VP
    V  saw
    NP
      Det  a
      N    boy
    PP
      P  with
      NP
        Det  a
        N    telescope
Parse Tree 2

S
  NP
    N  I
  VP
    V  saw
    NP
      Det  a
      N    boy
      PP
        P  with
        NP
          Det  a
          N    telescope
Parsing Structural Ambiguity
Parsing for Structurally Ambiguous Sentences
Sentence: I saw a boy with a telescope
Grammar:
  S    -> NP VP
  NP   -> ART N | ART N PP | PRON
  VP   -> V NP PP | V NP
  ART  -> a | an | the
  N    -> boy | telescope
  PRON -> I
  V    -> saw
Ambiguous Parses
Two possible parses:
PP attached to the verb (i.e., I used a telescope to see):
  (S (NP (PRON I)) (VP (V saw) (NP (ART a) (N boy)) (PP (P with) (NP (ART a) (N telescope)))))
PP attached to the noun (i.e., the boy had a telescope):
  (S (NP (PRON I)) (VP (V saw) (NP (ART a) (N boy) (PP (P with) (NP (ART a) (N telescope))))))
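If NLTK is available, both parses can be obtained directly from this grammar. A sketch; note that a lexical rule P -> 'with' is added here, since the example needs it even though the rule list above omits it:

```python
import nltk

# The slide's grammar, with P -> 'with' added so the sentence parses.
grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> ART N | ART N PP | PRON
    VP -> V NP PP | V NP
    PP -> P NP
    ART -> 'a' | 'an' | 'the'
    N -> 'boy' | 'telescope'
    PRON -> 'I'
    V -> 'saw'
    P -> 'with'
""")
parser = nltk.ChartParser(grammar)
for tree in parser.parse("I saw a boy with a telescope".split()):
    print(tree)    # prints both the verb- and noun-attachment parses
```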
Top Down Parse

Step  State             Backup State                Action / Comments
1     ((S) 1)           --                          Use S -> NP VP
2     ((NP VP) 1)       --                          Use NP -> ART N | ART N PP | PRON
3     ((ART N VP) 1)    (a) ((ART N PP VP) 1)       ART does not match "I";
                        (b) ((PRON VP) 1)           backup state (b) used
3B    ((PRON VP) 1)     --                          --
4     ((VP) 2)          --                          Consumed "I"
5     ((V NP PP) 2)     ((V NP) 2)                  Verb attachment rule used
6     ((NP PP) 3)       --                          Consumed "saw"
7     ((ART N PP) 3)    (a) ((ART N PP PP) 3)       --
                        (b) ((PRON PP) 3)
8     ((N PP) 4)        --                          Consumed "a"
9     ((PP) 5)          --                          Consumed "boy"
10    ((P NP) 5)        --                          --
11    ((NP) 6)          --                          Consumed "with"
12    ((ART N) 6)       (a) ((ART N PP) 6)          --
                        (b) ((PRON) 6)
13    ((N) 7)           --                          Consumed "a"
14    (( ) 8)           --                          Consumed "telescope"; finish parsing
Top Down Parsing - Observations
Top-down parsing gave us the verb attachment parse tree (i.e., I used a telescope).
To obtain the alternate parse tree, the backup state in step 5 will have to be invoked.
Is there an efficient way to obtain all parses?
Bottom Up Parse

  1 I 2 saw 3 a 4 boy 5 with 6 a 7 telescope 8

Colour scheme: blue for normal parse, green for verb attachment parse, purple for noun attachment parse, red for invalid parse.

Chart entries, in order of creation:
  NP12 -> PRON12
  S1?  -> NP12 VP2?
  VP2? -> V23 NP3?
  VP2? -> V23 NP3? PP??
  NP35 -> ART34 N45
  NP3? -> ART34 N45 PP5?
  VP25 -> V23 NP35
  S15  -> NP12 VP25
  VP2? -> V23 NP35 PP5?
  PP5? -> P56 NP6?
  NP68 -> ART67 N78
  NP6? -> ART67 N78 PP8?
  PP58 -> P56 NP68
  NP38 -> ART34 N45 PP58
  VP28 -> V23 NP35 PP58   (verb attachment)
  VP28 -> V23 NP38        (noun attachment)
  S18  -> NP12 VP28
Bottom Up Parsing - Observations
Both the noun attachment and the verb attachment parses are obtained by simply, systematically applying the rules.
The numbers in subscript help in verifying the parse and getting chunks from the parse.
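The subscript bookkeeping maps directly onto a span-indexed chart. The following is a minimal bottom-up recognizer over (start, end) position pairs, a sketch that records which non-terminals cover which spans (it does not keep backpointers for extracting the trees):

```python
# Bottom-up span parser: chart[(i, j)] holds the non-terminals covering
# positions i..j (1-based, j exclusive), mirroring the NPij notation.
RULES = [
    ("S",  ("NP", "VP")),
    ("NP", ("ART", "N")), ("NP", ("ART", "N", "PP")), ("NP", ("PRON",)),
    ("VP", ("V", "NP", "PP")), ("VP", ("V", "NP")),
    ("PP", ("P", "NP")),
]
LEXICON = {"I": "PRON", "saw": "V", "a": "ART", "boy": "N",
           "with": "P", "telescope": "N"}

def tiles(chart, rhs, i, j):
    """Can the symbols in rhs tile the span [i, j) consecutively?"""
    if not rhs:
        return i == j
    for k in range(i + 1, j + 1):
        if rhs[0] in chart.get((i, k), set()) and tiles(chart, rhs[1:], k, j):
            return True
    return False

def bottom_up(words):
    n = len(words)
    chart = {(i, i + 1): {LEXICON[w]} for i, w in enumerate(words, start=1)}
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length
            cell = chart.setdefault((i, j), set())
            changed = True
            while changed:           # iterate for unary chains (NP -> PRON)
                changed = False
                for lhs, rhs in RULES:
                    if lhs not in cell and tiles(chart, rhs, i, j):
                        cell.add(lhs)
                        changed = True
    return chart

chart = bottom_up("I saw a boy with a telescope".split())
print(chart[(1, 8)])    # {'S'}: positions 1-8 form a sentence
```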
Exercise For the sentence, The man saw the boy with a telescope & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.
Start of Probabilistic Parsing
Example of Sentence Labeling: Parsing

[S1 [S [S [VP [VB Come] [NP [NNP July]]]]
       [, ,]
       [CC and]
       [S [NP [DT the] [JJ IIT] [NN campus]]
          [VP [AUX is]
              [ADJP [JJ abuzz]
                    [PP [IN with]
                        [NP [ADJP [JJ new] [CC and] [VBG returning]]
                            [NNS students]]]]]]
       [. .]]]
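With NLTK, such a labeled bracketing can be loaded and displayed. A sketch; nltk.Tree.fromstring expects round brackets rather than the square ones above:

```python
from nltk import Tree

# The slide's labeled bracketing, written with round brackets.
s = """(S1 (S (S (VP (VB Come) (NP (NNP July))))
           (, ,) (CC and)
           (S (NP (DT the) (JJ IIT) (NN campus))
              (VP (AUX is)
                  (ADJP (JJ abuzz)
                        (PP (IN with)
                            (NP (ADJP (JJ new) (CC and) (VBG returning))
                                (NNS students))))))
           (. .)))"""
tree = Tree.fromstring(s)
tree.pretty_print()               # draws the tree in ASCII
print(" ".join(tree.leaves()))    # Come July , and the IIT campus is ...
```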
Noisy Channel Modeling

  Source sentence -> Noisy Channel -> Target parse

  T* = argmax_T P(T | S)
     = argmax_T P(T) * P(S | T)
     = argmax_T P(T),
since given the parse the sentence is completely determined, and P(S | T) = 1.
Corpus
A collection of text, called a corpus, is used for collecting various language data.
With annotation: more information, but manual-labor intensive.
Practice: label automatically, correct manually.
The famous Brown Corpus contains 1 million tagged words.
Switchboard: a very famous corpus of 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.
Discriminative vs. Generative Model

  W* = argmax_W P(W | SS)

Discriminative model: compute P(W | SS) directly.
Generative model: compute it from P(W) * P(SS | W).
Language Models
N-grams: sequences of n consecutive words/characters.
Probabilistic / Stochastic Context-Free Grammars:
  Simple probabilistic models capable of handling recursion.
  A CFG with probabilities attached to rules.
  Rule probabilities: how likely is it that a particular rewrite rule is used?
PCFGs
Why PCFGs?
  Intuitive probabilistic models for tree-structured languages.
  Algorithms are extensions of HMM algorithms.
  Better than the n-gram model for language modeling.
Formal Definition of PCFG
A PCFG consists of:
  A set of terminals {w^k}, k = 1, ..., V
    e.g., {w^k} = {child, teddy, bear, played, ...}
  A set of non-terminals {N^i}, i = 1, ..., n
    e.g., {N^i} = {NP, VP, DT, ...}
  A designated start symbol N^1
  A set of rules {N^i -> ζ^j}, where ζ^j is a sequence of terminals and non-terminals
    e.g., NP -> DT NN
  A corresponding set of rule probabilities
Rule Probabilities
Rule probabilities are such that, for all i:
  Σ_j P(N^i -> ζ^j) = 1
E.g.,
  P(NP -> DT NN) = 0.2
  P(NP -> NN)    = 0.5
  P(NP -> NP PP) = 0.3
P(NP -> DT NN) = 0.2 means that 20% of the parses in the training data use the rule NP -> DT NN.
Probabilistic Context Free Grammar

  S   -> NP VP    1.0        DT  -> the       1.0
  NP  -> DT NN    0.5        NN  -> gunman    0.5
  NP  -> NNS      0.3        NN  -> building  0.5
  NP  -> NP PP    0.2        VBD -> sprayed   1.0
  PP  -> P NP     1.0        NNS -> bullets   1.0
  VP  -> VP PP    0.6
  VP  -> VBD NP   0.4
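One way to encode this grammar, as a sketch: a map from rules to probabilities, with a check of the sum-to-1 constraint from the Rule Probabilities slide. The lexical rule P -> with is assumed, since the example sentence needs it although the table above omits it:

```python
from collections import defaultdict

# The slide's PCFG as rule -> probability. P -> with is assumed.
PCFG = {
    ("S",   ("NP", "VP")):  1.0,
    ("NP",  ("DT", "NN")):  0.5,
    ("NP",  ("NNS",)):      0.3,
    ("NP",  ("NP", "PP")):  0.2,
    ("PP",  ("P", "NP")):   1.0,
    ("VP",  ("VP", "PP")):  0.6,
    ("VP",  ("VBD", "NP")): 0.4,
    ("DT",  ("the",)):      1.0,
    ("NN",  ("gunman",)):   0.5,
    ("NN",  ("building",)): 0.5,
    ("VBD", ("sprayed",)):  1.0,
    ("NNS", ("bullets",)):  1.0,
    ("P",   ("with",)):     1.0,   # assumed lexical rule
}

# For every non-terminal, the probabilities of its rules must sum to 1.
totals = defaultdict(float)
for (lhs, _), p in PCFG.items():
    totals[lhs] += p
assert all(abs(t - 1.0) < 1e-9 for t in totals.values()), dict(totals)
```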
Example Parse t1
The gunman sprayed the building with bullets.

(S 1.0
  (NP 0.5 (DT 1.0 The) (NN 0.5 gunman))
  (VP 0.6
    (VP 0.4 (VBD 1.0 sprayed)
            (NP 0.5 (DT 1.0 the) (NN 0.5 building)))
    (PP 1.0 (P 1.0 with)
            (NP 0.3 (NNS 1.0 bullets)))))

P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045
Another Parse t2
The gunman sprayed the building with bullets.

(S 1.0
  (NP 0.5 (DT 1.0 The) (NN 0.5 gunman))
  (VP 0.4 (VBD 1.0 sprayed)
    (NP 0.2
      (NP 0.5 (DT 1.0 the) (NN 0.5 building))
      (PP 1.0 (P 1.0 with)
              (NP 0.3 (NNS 1.0 bullets))))))

P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
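Both computations can be checked mechanically: the probability of a parse tree is the product of the probabilities of all rules used in it. A sketch, with trees encoded as nested (label, children) pairs and the rule probabilities from the grammar above (P -> with again assumed):

```python
# P(t) = product of rule probabilities over the rules used in t.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0, ("NP", ("DT", "NN")): 0.5,
    ("NP", ("NNS",)): 0.3, ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0, ("VP", ("VP", "PP")): 0.6,
    ("VP", ("VBD", "NP")): 0.4, ("DT", ("the",)): 1.0,
    ("NN", ("gunman",)): 0.5, ("NN", ("building",)): 0.5,
    ("VBD", ("sprayed",)): 1.0, ("NNS", ("bullets",)): 1.0,
    ("P", ("with",)): 1.0,          # assumed lexical rule
}

def tree_prob(tree):
    label, children = tree
    if isinstance(children, str):   # pre-terminal over a word
        return RULE_PROB[(label, (children,))]
    rhs = tuple(child[0] for child in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p

the_building = ("NP", [("DT", "the"), ("NN", "building")])
with_bullets = ("PP", [("P", "with"), ("NP", [("NNS", "bullets")])])
t1 = ("S", [("NP", [("DT", "the"), ("NN", "gunman")]),
            ("VP", [("VP", [("VBD", "sprayed"), the_building]),
                    with_bullets])])                  # PP attached to VP
t2 = ("S", [("NP", [("DT", "the"), ("NN", "gunman")]),
            ("VP", [("VBD", "sprayed"),
                    ("NP", [the_building, with_bullets])])])  # PP under NP
print(tree_prob(t1), tree_prob(t2))   # ~0.0045 ~0.0015
```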
Probability of a Sentence

Notation:
  w_ab : the subsequence w_a ... w_b
  N^j dominates w_a ... w_b, i.e., yield(N^j) = w_a ... w_b
  (e.g., an NP dominating "the sweet teddy bear")

Probability of a sentence:
  P(w_1m) = Σ_t P(w_1m, t)
          = Σ_t P(t) P(w_1m | t)
          = Σ_{t : yield(t) = w_1m} P(t)
where t is a parse tree of the sentence, since P(w_1m | t) = 1 if t is a parse tree for the sentence w_1m.
Assumptions of the PCFG Model

Place invariance: P(NP -> DT NN) is the same in locations 1 and 2.
Context-free: P(NP -> DT NN | anything outside "the child") = P(NP -> DT NN)
Ancestor-free: at node 2, P(NP -> DT NN | its ancestor is VP) = P(NP -> DT NN)

  (S (NP-1 The child) (VP ... (NP-2 The toy) ...))
Probability of a Parse Tree

Domination: we say N^j dominates the words from k to l, written N^j_{k,l}, if w_{k..l} is derived from N^j.

P(tree | sentence) = P(tree | S_{1,l}), where S_{1,l} means that the start symbol S dominates the word sequence w_{1,l}.

P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).
Probability of a Parse Tree (cont.)

For the tree
  (S_{1,l} (NP_{1,2} (DT_{1,1} w_1) (N_{2,2} w_2))
           (VP_{3,l} (V_{3,3} w_3)
                     (PP_{4,l} (P_{4,4} w_4) (NP_{5,l} w_{5..l}))))

P(t | s) = P(t | S_{1,l})
         = P(NP_{1,2}, DT_{1,1}, w_1, N_{2,2}, w_2,
             VP_{3,l}, V_{3,3}, w_3,
             PP_{4,l}, P_{4,4}, w_4, NP_{5,l}, w_{5..l} | S_{1,l})
         = P(NP_{1,2}, VP_{3,l} | S_{1,l}) * P(DT_{1,1}, N_{2,2} | NP_{1,2})
           * P(w_1 | DT_{1,1}) * P(w_2 | N_{2,2})
           * P(V_{3,3}, PP_{4,l} | VP_{3,l}) * P(w_3 | V_{3,3})
           * P(P_{4,4}, NP_{5,l} | PP_{4,l}) * P(w_4 | P_{4,4})
           * P(w_{5..l} | NP_{5,l})

(using the chain rule, context-freeness, and ancestor-freeness)
HMM ↔ PCFG

  O  observed sequence   ↔   w_1m  sentence
  X  state sequence      ↔   t     parse tree
  μ  model               ↔   G     grammar

Three fundamental questions:
HMM ↔ PCFG

How likely is a certain observation given the model? ↔ How likely is a sentence given the grammar?
  P(O | μ)  ↔  P(w_1m | G)

How to choose a state sequence which best explains the observations? ↔ How to choose a parse which best supports the sentence?
  argmax_X P(X | O, μ)  ↔  argmax_t P(t | w_1m, G)
HMM ↔ PCFG

How to choose the model parameters that best explain the observed data? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences?
  argmax_μ P(O | μ)  ↔  argmax_G P(w_1m | G)
Interesting Probabilities

  The gunman sprayed the building with bullets
  1   2      3       4   5        6    7

Inside probability: what is the probability of having an NP at this position such that it will derive "the building"? -- β_NP(4,5)

Outside probability: what is the probability of starting from N^1 and deriving "The gunman sprayed", an NP, and "with bullets"? -- α_NP(4,5)
Interesting Probabilities (cont.)
Random variables to be considered:
  The non-terminal being expanded, e.g., NP.
  The word-span covered by the non-terminal, e.g., (4,5) refers to the words "the building".
While calculating probabilities, consider:
  The rule to be used for expansion, e.g., NP -> DT NN.
  The probabilities associated with the RHS non-terminals, e.g., the DT subtree's and the NN subtree's inside/outside probabilities.
Outside Probability

α_j(p,q): the probability of beginning with N^1 and generating the non-terminal N^j_{pq} and all words outside w_p ... w_q:

  α_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} | G)

  w_1 ... w_{p-1}  [ N^j : w_p ... w_q ]  w_{q+1} ... w_m
Inside Probability

β_j(p,q): the probability of generating the words w_p ... w_q starting with the non-terminal N^j_{pq}:

  β_j(p,q) = P(w_{pq} | N^j_{pq}, G)

(α covers everything outside the N^j subtree; β covers everything inside it.)
Outside and Inside Probabilities: Example

For "the building" in "The gunman sprayed the building with bullets":

  α_NP(4,5) = P(The gunman sprayed, NP_{4,5}, with bullets | G)
  β_NP(4,5) = P(the building | NP_{4,5}, G)
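β can be computed bottom-up with dynamic programming (the inside algorithm, the PCFG analogue of the HMM forward procedure). A sketch for the grammar above, with binary, unary, and lexical rules handled separately, and P -> with assumed as before:

```python
from collections import defaultdict

# Inside algorithm for the slide's PCFG.
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
          ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}
UNARY = {("NP", "NNS"): 0.3}
LEX = {("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
       ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0,
       ("P", "with"): 1.0}          # assumed lexical rule

def inside(words):
    n = len(words)
    beta = defaultdict(float)       # beta[(N, p, q)] over words p..q
    for p, w in enumerate(words, start=1):
        for (label, word), prob in LEX.items():
            if word == w:
                beta[(label, p, p)] += prob
        for (a, b), prob in UNARY.items():   # one unary pass (no chains here)
            beta[(a, p, p)] += prob * beta[(b, p, p)]
    for span in range(2, n + 1):
        for p in range(1, n - span + 2):
            q = p + span - 1
            for (a, b, c), prob in BINARY.items():
                for d in range(p, q):        # split: b covers p..d, c covers d+1..q
                    beta[(a, p, q)] += prob * beta[(b, p, d)] * beta[(c, d + 1, q)]
            for (a, b), prob in UNARY.items():
                beta[(a, p, q)] += prob * beta[(b, p, q)]
    return beta

beta = inside("the gunman sprayed the building with bullets".split())
print(beta[("NP", 4, 5)])   # beta_NP(4,5) for "the building" = 0.25
print(beta[("S", 1, 7)])    # P(sentence | G) = 0.0045 + 0.0015 = 0.006
```

Summing the inside probability of S over the whole span recovers exactly P(w_1m | G) from the Probability of a Sentence slide: the sum of the probabilities of both parse trees.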