CS460/626 : Natural Language



1 CS460/626: Natural Language Processing/Speech, NLP and the Web (Lectures 23, 24: Parsing Algorithms; Parsing in Case of Ambiguity; Probabilistic Parsing) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 8th and 10th March, 2011 (Lectures 21 and 22 were on Sentiment Analysis by Aditya Joshi)

2 A note on Language Modeling
Example sentence: ^ The tortoise beat the hare in the race.
Models may be guided by frequency, by language knowledge, or by world knowledge.
N-gram (n=3): ^ the tortoise (5*10^-3); the tortoise beat (3*); tortoise beat the (7*10^-5); beat the hare (5*)
CFG: S -> NP VP; NP -> DT N; VP -> V NP PP; PP -> P NP
Probabilistic CFG: S -> NP VP; NP -> DT N; VP -> V NP PP (0.4); PP -> P NP
Dependency Grammar: semantic roles agt, obj, scn, etc.; semantic rules are always between heads
Prob. DG: semantic roles with probabilities

3 Parse Tree
(S (NP (DT The) (N Tortoise)) (VP (V beat) (NP (DT the) (N hare)) (PP (P in) (NP (DT the) (N race)))))

4 UNL Expression agt obj scn (scene)

5 Purpose of LM
Prediction of next word (Speech Processing)
Language Identification (for same script)
Belongingness check (parsing)
P(NP -> DT N) is the probability that the YIELD of the non-terminal NP is DT N.

6 Need for Deep Parsing Sentences are linear structures, but there is a hierarchy (a tree) hidden behind the linear structure. There are constituents and branches.

7 PPs are at the same level: flat with respect to the head word "book". No distinction in terms of dominance or c-command.
(NP The (AP big) book (PP of poems) (PP with the blue cover))
[The big book of poems with the blue cover] is on the table.

8 Constituency test of Replacement runs into problems
One-replacement: I bought the big [book of poems with the blue cover], not the small [one]. One-replacement targets "book of poems with the blue cover".
Another one-replacement: I bought the big [book of poems] with the blue cover, not the small [one] with the red cover. One-replacement targets "book of poems".

9 More deeply embedded structure
(NP The (N1 (AP big) (N2 (N3 (N book) (PP of poems)) (PP with the blue cover))))

10 Grammar and Parsing Algorithms

11 A simplified grammar
S -> NP VP
NP -> DT N | N
VP -> V ADV | V

12 Example Sentence
1 People 2 laugh 3 (the numbers are positions)
Lexicon: People - N, V; Laugh - N, V
This indicates that both noun and verb are possible for the word "People".

13 Top-Down Parsing
State | Backup State | Action
1. ((S) 1) | - | - (the number is the position of the input pointer)
2. ((NP VP) 1) | - | -
3a. ((DT N VP) 1) | ((N VP) 1) | -
3b. ((N VP) 1) | - | -
4. ((VP) 2) | - | Consume "People"
5a. ((V ADV) 2) | ((V) 2) | -
6. ((ADV) 3) | ((V) 2) | Consume "laugh"
5b. ((V) 2) | - | -
7. ((.) 3) | - | Consume "laugh"
Termination condition: all inputs over, no symbols remaining.
Note: input symbols can be pushed back.
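The trace above can be sketched as a small depth-first recognizer. This is only an illustration under stated assumptions: the grammar and lexicon come from the slides, while the function name `top_down_recognize`, the data layout and the use of a single stack for backup states are choices made here, not the lecture's implementation.

```python
from typing import Dict, List, Set, Tuple

# Grammar and lexicon from the slides; names and layout are illustrative.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["DT", "N"], ["N"]],
    "VP": [["V", "ADV"], ["V"]],
}
LEXICON: Dict[str, Set[str]] = {"people": {"N", "V"}, "laugh": {"N", "V"}}

State = Tuple[Tuple[str, ...], int]  # (symbols still to be found, input position)


def top_down_recognize(words: List[str]) -> bool:
    """Depth-first top-down search; untried alternatives wait on a backup stack."""
    words = [w.lower() for w in words]
    backups: List[State] = [(("S",), 1)]  # positions are 1-based, as on the slide
    while backups:
        symbols, pos = backups.pop()
        if not symbols:
            if pos == len(words) + 1:  # no symbols remaining and all input consumed
                return True
            continue
        first, rest = symbols[0], symbols[1:]
        if first in GRAMMAR:
            # Push alternatives in reverse so the first rule is tried first.
            for rhs in reversed(GRAMMAR[first]):
                backups.append((tuple(rhs) + rest, pos))
        elif pos <= len(words) and first in LEXICON.get(words[pos - 1], set()):
            backups.append((rest, pos + 1))  # consume one word
        # Otherwise this state is a dead end and the next backup state is used.
    return False


print(top_down_recognize(["People", "laugh"]))
```

Like the hand trace, this sketch first tries NP -> DT N, fails on "People", and falls back to NP -> N. It would loop forever on a left-recursive grammar.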

14 Discussion for Top-Down Parsing This kind of searching is goal driven. Gives importance to textual precedence (rule precedence). No regard for data, a priori (useless expansions made).

15 Bottom-Up Parsing Some conventions: in N 12 the subscript represents positions (N spans positions 1 to 2). In S 1? -> NP 12 VP 2? the end position is unknown: the work to the left of the dot is done, while the work to its right remains.

16 Bottom-Up Parsing (pictorial representation)
Input: People (1-2) Laugh (2-3)
N 12, V 12; N 23, V 23
NP 12 -> N 12; NP 23 -> N 23
VP 12 -> V 12; VP 23 -> V 23
S 1? -> NP 12 VP 2?
S -> NP 12 VP 23

17 Problem with Top-Down Parsing: Left Recursion. Suppose the grammar has a rule A -> A B. Then the expansion never consumes any input: ((A) K) -> ((A B) K) -> ((A B B) K) ...
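The runaway expansion can be reproduced with a few lines of code. This is a minimal sketch: `expand_leftmost` and the `max_steps` cut-off are hypothetical names introduced here so that the demonstration terminates; a naive top-down parser has no such cut-off.

```python
from typing import Dict, List, Tuple


def expand_leftmost(symbols: List[str], rules: Dict[str, List[List[str]]],
                    max_steps: int = 4) -> List[Tuple[str, ...]]:
    """Rewrite the leftmost symbol with its first rule, as a naive top-down parser would."""
    trace = [tuple(symbols)]
    for _ in range(max_steps):
        head = symbols[0]
        if head not in rules:
            break  # a terminal or preterminal would be matched against the input here
        symbols = rules[head][0] + symbols[1:]
        trace.append(tuple(symbols))
    return trace


rules = {"A": [["A", "B"]]}  # the left-recursive rule A -> A B
trace = expand_leftmost(["A"], rules)
for state in trace:
    print(state)
```

Every step adds another B while the leftmost symbol stays A, so the input pointer K never moves.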

18 Combining top-down and bottom-up strategies

19 Top-Down Bottom-Up Chart Parsing Combines advantages of top-down & bottom-up parsing. Does not work in case of left recursion. E.g. "People laugh". Lexicon: People - noun, verb; Laugh - noun, verb. Grammar: S -> NP VP; NP -> DT N | N; VP -> V ADV | V

20 Transitive Closure [Figure: chart for "People laugh" with arcs for S -> NP VP, NP -> N, NP -> DT N, VP -> V and VP -> V ADV, ending in success]

21 Arcs in Parsing Each arc represents a chart which records completed work (left of the dot) and expected work (right of the dot).

22 Example: People laugh loudly [Figure: chart with arcs for S -> NP VP, NP -> N, NP -> DT N, VP -> V and VP -> V ADV]

23 Dealing With Structural Ambiguity Multiple parses for a sentence: The man saw the boy with a telescope. The man saw the mountain with a telescope. The man saw the boy with the ponytail. At the level of syntax, all these sentences are ambiguous. But semantics can disambiguate the 2nd and 3rd sentences.

24 Prepositional Phrase (PP) Attachment Problem V NP 1 P NP 2 (Here P means preposition) NP 2 attaches to NP 1? or NP 2 attaches to V?

25 Parse Trees for a Structurally Ambiguous Sentence
Let the grammar be
S -> NP VP
NP -> DT N | DT N PP
PP -> P NP
VP -> V NP PP | V NP
For the sentence: I saw a boy with a telescope

26 Parse Tree - 1
(S (NP (N I)) (VP (V saw) (NP (Det a) (N boy) (PP (P with) (NP (Det a) (N telescope))))))

27 Parse Tree - 2
(S (NP (N I)) (VP (V saw) (NP (Det a) (N boy)) (PP (P with) (NP (Det a) (N telescope)))))

28 Parsing Structural Ambiguity

29 Parsing for Structurally Ambiguous Sentences
Sentence: I saw a boy with a telescope
Grammar:
S -> NP VP
NP -> ART N | ART N PP | PRON
VP -> V NP PP | V NP
ART -> a | an | the
N -> boy | telescope
PRON -> I
V -> saw

30 Ambiguous Parses
Two possible parses:
PP attached with Verb (i.e. I used a telescope to see):
(S (NP (PRON I)) (VP (V saw) (NP (ART a) (N boy)) (PP (P with) (NP (ART a) (N telescope)))))
PP attached with Noun (i.e. boy had a telescope):
(S (NP (PRON I)) (VP (V saw) (NP (ART a) (N boy) (PP (P with) (NP (ART a) (N telescope))))))

31-45 Top Down Parse
State | Backup State | Action | Comments
1 ((S) 1) | - | - | Use S -> NP VP
2 ((NP VP) 1) | - | - | Use NP -> ART N | ART N PP | PRON
3 ((ART N VP) 1) | (a) ((ART N PP VP) 1), (b) ((PRON VP) 1) | - | ART does not match "I", backup state (b) used
3B ((PRON VP) 1) | - | - | -
4 ((VP) 2) | - | Consumed "I" | -
5 ((V NP PP) 2) | ((V NP) 2) | - | Verb Attachment Rule used
6 ((NP PP) 3) | - | Consumed "saw" | -
7 ((ART N PP) 3) | (a) ((ART N PP PP) 3), (b) ((PRON PP) 3) | - | -
8 ((N PP) 4) | - | Consumed "a" | -
9 ((PP) 5) | - | Consumed "boy" | -
10 ((P NP) 5) | - | - | -
11 ((NP) 6) | - | Consumed "with" | -
12 ((ART N) 6) | (a) ((ART N PP) 6), (b) ((PRON) 6) | - | -
13 ((N) 7) | - | Consumed "a" | -
14 (( ) 8) | - | Consume "telescope" | Finish Parsing

46 Top Down Parsing - Observations Top down parsing gave us the Verb Attachment Parse Tree (i.e., I used a telescope). To obtain the alternate parse tree, the backup state in step 5 will have to be invoked. Is there an efficient way to obtain all parses?

47 Bottom Up Parse I saw a boy with a telescope Colour Scheme : Blue for Normal Parse Green for Verb Attachment Parse Purple for Noun Attachment Parse Red for Invalid Parse

48-59 Bottom Up Parse I saw a boy with a telescope
Positions: I (1-2) saw (2-3) a (3-4) boy (4-5) with (5-6) a (6-7) telescope (7-8)
Arcs, in the order in which they are added:
NP 12 -> PRON 12
S 1? -> NP 12 VP 2?
VP 2? -> V 23 NP 3?
VP 2? -> V 23 NP 3? PP ??
NP 35 -> ART 34 N 45
NP 3? -> ART 34 N 45 PP 5?
VP 25 -> V 23 NP 35
S 15 -> NP 12 VP 25
VP 2? -> V 23 NP 35 PP 5?
PP 5? -> P 56 NP 6?
NP 68 -> ART 67 N 7?
NP 6? -> ART 67 N 78 PP 8?
NP 68 -> ART 67 N 78
PP 58 -> P 56 NP 68
NP 38 -> ART 34 N 45 PP 58
VP 28 -> V 23 NP 35 PP 58
VP 28 -> V 23 NP 38
S 18 -> NP 12 VP 28

60 Bottom Up Parsing - Observations Both Noun Attachment and Verb Attachment Parses obtained by simply systematically applying the rules Numbers in subscript help in verifying the parse and getting chunks from the parse
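To see both attachments fall out of systematic rule application, here is a hedged sketch that enumerates every parse of the example sentence. It is a memoized, span-based enumeration in the spirit of a chart, not the arc-by-arc procedure of the slides. The rule PP -> P NP is taken from the earlier grammar slide and the lexicon entry for "with" is an assumption, since neither appears in the rule list for this example; `all_parses`, `spans` and `sequences` are names introduced here.

```python
from functools import lru_cache
from typing import List, Tuple

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("ART", "N"), ("ART", "N", "PP"), ("PRON",)],
    "VP": [("V", "NP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],  # from the earlier grammar slide
}
LEXICON = {
    "I": "PRON", "saw": "V", "a": "ART", "an": "ART", "the": "ART",
    "boy": "N", "telescope": "N",
    "with": "P",  # assumption: the preposition is not listed on the slide
}


def all_parses(sentence: List[str]) -> List[str]:
    """Return every bracketed parse of the sentence rooted in S."""
    words = tuple(sentence)

    @lru_cache(maxsize=None)
    def spans(symbol: str, i: int, j: int) -> Tuple[str, ...]:
        """All bracketed trees for `symbol` covering words[i:j]."""
        if symbol not in GRAMMAR:  # preterminal: must match exactly one word
            if j - i == 1 and LEXICON.get(words[i]) == symbol:
                return (f"({symbol} {words[i]})",)
            return ()
        trees = []
        for rhs in GRAMMAR[symbol]:
            for children in sequences(rhs, i, j):
                trees.append(f"({symbol} {' '.join(children)})")
        return tuple(trees)

    @lru_cache(maxsize=None)
    def sequences(rhs: Tuple[str, ...], i: int, j: int) -> Tuple[Tuple[str, ...], ...]:
        """All ways for the symbols in `rhs` to cover words[i:j] exactly."""
        if not rhs:
            return ((),) if i == j else ()
        results = []
        # Leave at least one word for each remaining symbol.
        for k in range(i + 1, j - len(rhs) + 2):
            for left in spans(rhs[0], i, k):
                for rest in sequences(rhs[1:], k, j):
                    results.append((left,) + rest)
        return tuple(results)

    return list(spans("S", 0, len(words)))


parses = all_parses("I saw a boy with a telescope".split())
for p in parses:
    print(p)
```

Because sub-results are cached per (symbol, start, end), shared constituents such as the NP over "a boy" are built once and reused by both parses, which is the main saving over repeated backtracking.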

61 Exercise For the sentence, The man saw the boy with a telescope & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.

62 Start of Probabilistic Parsing

63 Example of Sentence labeling: Parsing
[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ IIT] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]]

64 Noisy Channel Modeling
Source sentence -> Noisy Channel -> Target parse
$T^* = \arg\max_T P(T \mid S) = \arg\max_T P(T)\,P(S \mid T) = \arg\max_T P(T)$, since given the parse the sentence is completely determined and $P(S \mid T) = 1$.

65 Corpus A collection of text, called a corpus, is used for collecting various language data. With annotation: more information, but manual labor intensive. Practice: label automatically; correct manually. The famous Brown Corpus contains 1 million tagged words. Switchboard, another very famous corpus: 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.

66 Discriminative vs. Generative Model
$W^* = \arg\max_W P(W \mid SS)$
Discriminative Model: compute directly from $P(W \mid SS)$
Generative Model: compute from $P(W)\,P(SS \mid W)$

67 Language Models N-grams: sequences of n consecutive words/characters. Probabilistic / Stochastic Context Free Grammars: simple probabilistic models capable of handling recursion; a CFG with probabilities attached to rules. Rule probabilities: how likely is it that a particular rewrite rule is used?
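As a small illustration of the n-gram definition, the sketch below lists the trigrams of the tortoise sentence used earlier in the lecture. The helper name `ngrams` and the whitespace tokenization are assumptions made for this example; no probabilities are estimated, since the slides give no corpus counts.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# "^" marks the sentence start, as on the language-modeling slide.
tokens = "^ the tortoise beat the hare in the race .".split()
trigrams = ngrams(tokens, 3)
counts = Counter(trigrams)  # over a corpus, these counts would feed the probability estimates

for gram in trigrams[:4]:
    print(" ".join(gram))
```

A sentence of 10 tokens yields 8 trigrams, and none of them can express that "the hare" and "in the race" are separate constituents, which is where a CFG with rule probabilities comes in.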

68 PCFGs Why PCFGs? Intuitive probabilistic models for tree-structured languages Algorithms are extensions of HMM algorithms Better than the n-gram model for language modeling.

69 Formal Definition of PCFG
A PCFG consists of:
A set of terminals $\{w^k\}$, $k = 1, \ldots, V$, e.g. $\{w^k\}$ = {child, teddy, bear, played}
A set of non-terminals $\{N^i\}$, $i = 1, \ldots, n$, e.g. $\{N^i\}$ = {NP, VP, DT}
A designated start symbol $N^1$
A set of rules $\{N^i \to \zeta^j\}$, where $\zeta^j$ is a sequence of terminals & non-terminals, e.g. NP -> DT NN
A corresponding set of rule probabilities

70 Rule Probabilities
Rule probabilities are such that, for every $i$, $\sum_j P(N^i \to \zeta^j) = 1$
E.g., P(NP -> DT NN) = 0.2, P(NP -> NN) = 0.5, P(NP -> NP PP) = 0.3
P(NP -> DT NN) = 0.2 means 20 % of the training data parses use the rule NP -> DT NN
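One common way to obtain such probabilities, sketched here under assumptions, is relative frequency: the count of a rule divided by the total count of rules with the same left-hand side, which guarantees the sum-to-one constraint. The counts below are hypothetical, chosen only so that the result matches the figures on the slide; `estimate_rule_probs` is a name introduced for this example.

```python
from collections import defaultdict
from typing import Dict, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (left-hand side, right-hand side)


def estimate_rule_probs(counts: Dict[Rule, int]) -> Dict[Rule, float]:
    """Relative-frequency estimate: count(rule) / count(all rules with the same LHS)."""
    totals: Dict[str, int] = defaultdict(int)
    for (lhs, _), c in counts.items():
        totals[lhs] += c
    return {rule: c / totals[rule[0]] for rule, c in counts.items()}


# Hypothetical treebank counts for the three NP rules on the slide.
counts = {
    ("NP", ("DT", "NN")): 20,
    ("NP", ("NN",)): 50,
    ("NP", ("NP", "PP")): 30,
}
probs = estimate_rule_probs(counts)
for (lhs, rhs), p in probs.items():
    print(f"P({lhs} -> {' '.join(rhs)}) = {p}")
```

Under this reading, 0.2 is the share of NP expansions that use NP -> DT NN, so the probabilities for each left-hand side sum to 1 by construction.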

71 Probabilistic Context Free Grammars
S -> NP VP 1.0
NP -> DT NN 0.5
NP -> NNS 0.3
NP -> NP PP 0.2
PP -> P NP 1.0
VP -> VP PP 0.6
VP -> VBD NP 0.4
DT -> the 1.0
NN -> gunman 0.5
NN -> building 0.5
VBD -> sprayed 1.0
NNS -> bullets 1.0

72 Example Parse t1
The gunman sprayed the building with bullets.
(S 1.0 (NP 0.5 (DT 1.0 The) (NN 0.5 gunman)) (VP 0.6 (VP 0.4 (VBD 1.0 sprayed) (NP 0.5 (DT 1.0 the) (NN 0.5 building))) (PP 1.0 (P 1.0 with) (NP 0.3 (NNS 1.0 bullets)))))
P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045

73 Another Parse t2
The gunman sprayed the building with bullets.
(S 1.0 (NP 0.5 (DT 1.0 The) (NN 0.5 gunman)) (VP 0.4 (VBD 1.0 sprayed) (NP 0.2 (NP 0.5 (DT 1.0 the) (NN 0.5 building)) (PP 1.0 (P 1.0 with) (NP 0.3 (NNS 1.0 bullets))))))
P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
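The two products can be checked mechanically by multiplying the probability of every rule used in a tree. The sketch below assumes a nested-tuple tree encoding and the helper name `tree_prob`, both introduced here; the rule probabilities are those of the gunman grammar, with P -> with 1.0 read off the parse tree, and words are lowercased.

```python
from typing import Dict, Tuple

RULE_PROB: Dict[Tuple[str, Tuple[str, ...]], float] = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.5,
    ("NP", ("NNS",)): 0.3,
    ("NP", ("NP", "PP")): 0.2,
    ("PP", ("P", "NP")): 1.0,
    ("VP", ("VP", "PP")): 0.6,
    ("VP", ("VBD", "NP")): 0.4,
    ("DT", ("the",)): 1.0,
    ("NN", ("gunman",)): 0.5,
    ("NN", ("building",)): 0.5,
    ("VBD", ("sprayed",)): 1.0,
    ("NNS", ("bullets",)): 1.0,
    ("P", ("with",)): 1.0,  # read off the parse tree
}


def tree_prob(tree) -> float:
    """tree is (label, child, ...); each child is a subtree or a word string."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)
    return p


np_gunman = ("NP", ("DT", "the"), ("NN", "gunman"))
np_building = ("NP", ("DT", "the"), ("NN", "building"))
pp = ("PP", ("P", "with"), ("NP", ("NNS", "bullets")))

t1 = ("S", np_gunman, ("VP", ("VP", ("VBD", "sprayed"), np_building), pp))  # verb attachment
t2 = ("S", np_gunman, ("VP", ("VBD", "sprayed"), ("NP", np_building, pp)))  # noun attachment

p1, p2 = tree_prob(t1), tree_prob(t2)
print(round(p1, 6), round(p2, 6))
```

With these numbers the verb-attachment tree t1 is three times as probable as t2, so a parser that picks the most probable tree would choose t1.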

74 Probability of a sentence
Notation: $w_{ab}$ is the subsequence $w_a \ldots w_b$. $N^j$ dominates $w_a \ldots w_b$, or $\mathrm{yield}(N^j) = w_a \ldots w_b$, e.g. an NP dominating "the sweet teddy bear".
Probability of a sentence $= P(w_{1m})$:
$P(w_{1m}) = \sum_t P(w_{1m}, t) = \sum_t P(t)\,P(w_{1m} \mid t) = \sum_{t:\ \mathrm{yield}(t) = w_{1m}} P(t)$
where $t$ is a parse tree of the sentence. The last step holds because $P(w_{1m} \mid t) = 1$: if $t$ is a parse tree for the sentence $w_{1m}$, this will be 1.

75 Assumptions of the PCFG model
Place invariance: P(NP -> DT NN) is the same in locations 1 and 2.
Context-free: P(NP -> DT NN | anything outside "The child") = P(NP -> DT NN).
Ancestor free: at 2, P(NP -> DT NN | its ancestor is VP) = P(NP -> DT NN).
[Figure: S with children NP (location 1, "The child") and VP; the VP contains NP (location 2, "The toy")]

76 Probability of a parse tree
Domination: we say $N^j$ dominates from $k$ to $l$, symbolized as $N^j_{k,l}$, if $W_{k,l}$ is derived from $N^j$.
$P(\text{tree} \mid \text{sentence}) = P(\text{tree} \mid S_{1,l})$, where $S_{1,l}$ means that the start symbol S dominates the word sequence $W_{1,l}$.
$P(t \mid s)$ approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).

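77 Probability of a parse tree (cont.)
Tree: $S_{1,l} \to NP_{1,2}\ VP_{3,l}$; $NP_{1,2} \to DT_{1,1}\ N_{2,2}$; $VP_{3,l} \to V_{3,3}\ PP_{4,l}$; $PP_{4,l} \to P_{4,4}\ NP_{5,l}$; the leaves are $w_1, w_2, w_3, w_4, w_5 \ldots w_l$.
$P(t \mid s) = P(t \mid S_{1,l})$
$= P(NP_{1,2}, DT_{1,1}, w_1, N_{2,2}, w_2, VP_{3,l}, V_{3,3}, w_3, PP_{4,l}, P_{4,4}, w_4, NP_{5,l}, w_{5 \ldots l} \mid S_{1,l})$
$= P(NP_{1,2}, VP_{3,l} \mid S_{1,l}) \cdot P(DT_{1,1}, N_{2,2} \mid NP_{1,2}) \cdot P(w_1 \mid DT_{1,1}) \cdot P(w_2 \mid N_{2,2}) \cdot P(V_{3,3}, PP_{4,l} \mid VP_{3,l}) \cdot P(w_3 \mid V_{3,3}) \cdot P(P_{4,4}, NP_{5,l} \mid PP_{4,l}) \cdot P(w_4 \mid P_{4,4}) \cdot P(w_{5 \ldots l} \mid NP_{5,l})$
(Using Chain Rule, Context Freeness and Ancestor Freeness)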

78 HMM vs. PCFG
HMM: $O$ observed sequence; $X$ state sequence; $\mu$ model
PCFG: $w_{1m}$ sentence; $t$ parse tree; $G$ grammar
Three fundamental questions

79 HMM vs. PCFG
HMM: How likely is a certain observation given the model? $P(O \mid \mu)$
PCFG: How likely is a sentence given the grammar? $P(w_{1m} \mid G)$
HMM: How to choose a state sequence which best explains the observations? $\arg\max_X P(X \mid O, \mu)$
PCFG: How to choose a parse which best supports the sentence? $\arg\max_t P(t \mid w_{1m}, G)$

80 HMM vs. PCFG
HMM: How to choose the model parameters that best explain the observed data? $\arg\max_\mu P(O \mid \mu)$
PCFG: How to choose rule probabilities which maximize the probabilities of the observed sentences? $\arg\max_G P(w_{1m} \mid G)$

81 Interesting Probabilities
Sentence: The gunman sprayed the building with bullets
Inside probability: what is the probability of having an NP at this position such that it will derive "the building"? $\beta_{NP}(4,5)$
Outside probability: what is the probability of starting from $N^1$ and deriving "The gunman sprayed", an NP and "with bullets"? $\alpha_{NP}(4,5)$

82 Interesting Probabilities
Random variables to be considered:
The non-terminal being expanded, e.g. NP
The word-span covered by the non-terminal, e.g. (4,5) refers to the words "the building"
While calculating probabilities, consider:
The rule to be used for expansion, e.g. NP -> DT NN
The probabilities associated with the RHS non-terminals, e.g. the DT subtree's inside/outside probabilities & the NN subtree's inside/outside probabilities

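83 Outside Probability
$\alpha_j(p,q)$: the probability of beginning with $N^1$ and generating the non-terminal $N^j_{pq}$ and all words outside $w_p \ldots w_q$
$\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)$
[Figure: $N^1$ at the root; $N^j$ spans $w_p \ldots w_q$, with $w_1 \ldots w_{p-1}$ to its left and $w_{q+1} \ldots w_m$ to its right]

84 Inside Probabilities
$\beta_j(p,q)$: the probability of generating the words $w_p \ldots w_q$ starting with the non-terminal $N^j_{pq}$
$\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)$
[Figure: $N^1$ at the root; $\alpha$ covers everything outside $N^j$, $\beta$ covers $w_p \ldots w_q$ below $N^j$]

85 Outside & Inside Probabilities: example
$\alpha_{NP}(4,5)$ for "the building" $= P(\text{The gunman sprayed}, NP_{4,5}, \text{with bullets} \mid G)$
$\beta_{NP}(4,5)$ for "the building" $= P(\text{the building} \mid NP_{4,5}, G)$
[Figure: $N^1$ over "The gunman sprayed the building with bullets", with $NP_{4,5}$ over "the building"]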
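Inside probabilities can be computed bottom-up, shortest spans first. The following is a minimal sketch for the gunman grammar, assuming lowercased words, 1-based inclusive positions as on the slides, and a single unary step (NP -> NNS) applied only to one-word spans, which is enough for this grammar but not for grammars with unary chains. The function name `inside` and the split of the grammar into three tables are choices made here; P -> with 1.0 is read off the parse tree.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BINARY = {  # A -> B C
    ("S", "NP", "VP"): 1.0,
    ("NP", "DT", "NN"): 0.5,
    ("NP", "NP", "PP"): 0.2,
    ("PP", "P", "NP"): 1.0,
    ("VP", "VP", "PP"): 0.6,
    ("VP", "VBD", "NP"): 0.4,
}
UNARY = {("NP", "NNS"): 0.3}  # A -> B
LEXICAL = {  # A -> word
    ("DT", "the"): 1.0, ("NN", "gunman"): 0.5, ("NN", "building"): 0.5,
    ("VBD", "sprayed"): 1.0, ("NNS", "bullets"): 1.0,
    ("P", "with"): 1.0,  # read off the parse tree
}


def inside(words: List[str]) -> Dict[Tuple[str, int, int], float]:
    """beta[(A, p, q)] = P(w_p..w_q | A spans p..q), 1-based inclusive positions."""
    m = len(words)
    beta: Dict[Tuple[str, int, int], float] = defaultdict(float)
    for p, w in enumerate(words, start=1):
        for (a, word), prob in LEXICAL.items():
            if word == w:
                beta[(a, p, p)] += prob
        for (a, b), prob in UNARY.items():  # one unary step over a single word
            beta[(a, p, p)] += prob * beta[(b, p, p)]
    for length in range(2, m + 1):
        for p in range(1, m - length + 2):
            q = p + length - 1
            for (a, b, c), prob in BINARY.items():
                for d in range(p, q):  # B covers p..d, C covers d+1..q
                    beta[(a, p, q)] += prob * beta[(b, p, d)] * beta[(c, d + 1, q)]
    return beta


beta = inside("the gunman sprayed the building with bullets".split())
print(round(beta[("NP", 4, 5)], 6))  # inside probability of NP over "the building"
print(round(beta[("S", 1, 7)], 6))   # probability of the whole sentence
```

The value for S over the whole sentence is the sum over all parse trees, so here it equals P(t1) + P(t2) = 0.0045 + 0.0015 = 0.006, while the NP over "the building" gets 0.5 * 1.0 * 0.5 = 0.25.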

Processing/Speech, NLP and the Web

Processing/Speech, NLP and the Web CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 25 Probabilistic Parsing) Pushpak Bhattacharyya CSE Dept., IIT Bombay 14 th March, 2011 Bracketed Structure: Treebank Corpus [ S1[

More information

CS626: NLP, Speech and the Web. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012

CS626: NLP, Speech and the Web. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012 CS626: NLP, Speech and the Web Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 14: Parsing Algorithms 30 th August, 2012 Parsing Problem Semantics Part of Speech Tagging NLP Trinity Morph Analysis

More information

Artificial Intelligence

Artificial Intelligence CS344: Introduction to Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 20-21 Natural Language Parsing Parsing of Sentences Are sentences flat linear structures? Why tree? Is

More information

CS : Speech, NLP and the Web/Topics in AI

CS : Speech, NLP and the Web/Topics in AI CS626-449: Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-17: Probabilistic parsing; insideoutside probabilities Probability of a parse tree (cont.) S 1,l NP 1,2

More information

CS460/626 : Natural Language

CS460/626 : Natural Language CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 27 SMT Assignment; HMM recap; Probabilistic Parsing cntd) Pushpak Bhattacharyya CSE Dept., IIT Bombay 17 th March, 2011 CMU Pronunciation

More information

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Natural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):

More information

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09

Natural Language Processing : Probabilistic Context Free Grammars. Updated 5/09 Natural Language Processing : Probabilistic Context Free Grammars Updated 5/09 Motivation N-gram models and HMM Tagging only allowed us to process sentences linearly. However, even simple sentences require

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Maschinelle Sprachverarbeitung

Maschinelle Sprachverarbeitung Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other

More information

Probabilistic Context-free Grammars

Probabilistic Context-free Grammars Probabilistic Context-free Grammars Computational Linguistics Alexander Koller 24 November 2017 The CKY Recognizer S NP VP NP Det N VP V NP V ate NP John Det a N sandwich i = 1 2 3 4 k = 2 3 4 5 S NP John

More information

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford)

Parsing. Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) Parsing Based on presentations from Chris Manning s course on Statistical Parsing (Stanford) S N VP V NP D N John hit the ball Levels of analysis Level Morphology/Lexical POS (morpho-synactic), WSD Elements

More information

Probabilistic Context-Free Grammar

Probabilistic Context-Free Grammar Probabilistic Context-Free Grammar Petr Horáček, Eva Zámečníková and Ivana Burgetová Department of Information Systems Faculty of Information Technology Brno University of Technology Božetěchova 2, 612

More information

Advanced Natural Language Processing Syntactic Parsing

Advanced Natural Language Processing Syntactic Parsing Advanced Natural Language Processing Syntactic Parsing Alicia Ageno ageno@cs.upc.edu Universitat Politècnica de Catalunya NLP statistical parsing 1 Parsing Review Statistical Parsing SCFG Inside Algorithm

More information


LECTURER: BURCU CAN Spring LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can

More information

Probabilistic Context-Free Grammars. Michael Collins, Columbia University

Probabilistic Context-Free Grammars. Michael Collins, Columbia University Probabilistic Context-Free Grammars Michael Collins, Columbia University Overview Probabilistic Context-Free Grammars (PCFGs) The CKY Algorithm for parsing with PCFGs A Probabilistic Context-Free Grammar

More information

Natural Language Processing

Natural Language Processing SFU NatLangLab Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class Simon Fraser University September 27, 2018 0 Natural Language Processing Anoop Sarkar anoopsarkar.github.io/nlp-class

More information

CS460/626 : Natural Language Processing/Speech, NLP and the Web

CS460/626 : Natural Language Processing/Speech, NLP and the Web CS460/626 : Natural Language Processing/Speech, NLP and the Web Lecture 23: Binding Theory Pushpak Bhattacharyya CSE Dept., IIT Bombay 8 th Oct, 2012 Parsing Problem Semantics Part of Speech Tagging NLP

More information

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016

CKY & Earley Parsing. Ling 571 Deep Processing Techniques for NLP January 13, 2016 CKY & Earley Parsing Ling 571 Deep Processing Techniques for NLP January 13, 2016 No Class Monday: Martin Luther King Jr. Day CKY Parsing: Finish the parse Recognizer à Parser Roadmap Earley parsing Motivation:

More information

Statistical Methods for NLP

Statistical Methods for NLP Statistical Methods for NLP Stochastic Grammars Joakim Nivre Uppsala University Department of Linguistics and Philology joakim.nivre@lingfil.uu.se Statistical Methods for NLP 1(22) Structured Classification

More information

Parsing with Context-Free Grammars

Parsing with Context-Free Grammars Parsing with Context-Free Grammars Berlin Chen 2005 References: 1. Natural Language Understanding, chapter 3 (3.1~3.4, 3.6) 2. Speech and Language Processing, chapters 9, 10 NLP-Berlin Chen 1 Grammars

More information

PCFGs 2 L645 / B659. Dept. of Linguistics, Indiana University Fall PCFGs 2. Questions. Calculating P(w 1m ) Inside Probabilities

PCFGs 2 L645 / B659. Dept. of Linguistics, Indiana University Fall PCFGs 2. Questions. Calculating P(w 1m ) Inside Probabilities 1 / 22 Inside L645 / B659 Dept. of Linguistics, Indiana University Fall 2015 Inside- 2 / 22 for PCFGs 3 questions for Probabilistic Context Free Grammars (PCFGs): What is the probability of a sentence

More information

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering

Review. Earley Algorithm Chapter Left Recursion. Left-Recursion. Rule Ordering. Rule Ordering Review Earley Algorithm Chapter 13.4 Lecture #9 October 2009 Top-Down vs. Bottom-Up Parsers Both generate too many useless trees Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up

More information

DT2118 Speech and Speaker Recognition

DT2118 Speech and Speaker Recognition DT2118 Speech and Speaker Recognition Language Modelling Giampiero Salvi KTH/CSC/TMH giampi@kth.se VT 2015 1 / 56 Outline Introduction Formal Language Theory Stochastic Language Models (SLM) N-gram Language

More information

Lecture 12: Algorithms for HMMs

Lecture 12: Algorithms for HMMs Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 26 February 2018 Recap: tagging POS tagging is a sequence labelling task.

More information

Lecture 12: Algorithms for HMMs

Lecture 12: Algorithms for HMMs Lecture 12: Algorithms for HMMs Nathan Schneider (some slides from Sharon Goldwater; thanks to Jonathan May for bug fixes) ENLP 17 October 2016 updated 9 September 2017 Recap: tagging POS tagging is a

More information

ACS Introduction to NLP Lecture 2: Part of Speech (POS) Tagging

ACS Introduction to NLP Lecture 2: Part of Speech (POS) Tagging ACS Introduction to NLP Lecture 2: Part of Speech (POS) Tagging Stephen Clark Natural Language and Information Processing (NLIP) Group sc609@cam.ac.uk The POS Tagging Problem 2 England NNP s POS fencers

More information

Probabilistic Context Free Grammars. Many slides from Michael Collins
