Alessandro Mazzei MASTER DI SCIENZE COGNITIVE GENOVA 2005


Alessandro Mazzei, Dipartimento di Informatica, Università di Torino. MASTER DI SCIENZE COGNITIVE, GENOVA 2005, 04-11-05. Natural Language Grammars and Parsing

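Natural Language Syntax
Syntactic parsing: deriving a syntactic structure from the word sequence.
Example: Paolo ama Francesca ("Paolo loves Francesca")
Constituency structure: [S [NP [N Paolo]] [VP [V ama] [N Francesca]]]
Dependency structure: ama --sub--> Paolo, ama --obj--> Francesca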

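Syntax and Semantics
"Paolo ama Francesca" and "Francesca ama Paolo" contain the same words, but syntactic parsing assigns them different structures:
Paolo ama Francesca: [S [NP [N Paolo]] [VP [V ama] [N Francesca]]]
Francesca ama Paolo: [S [NP [N Francesca]] [VP [V ama] [N Paolo]]]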

Dependency and PCFG
Summary:
Dependency relations
Dependency grammars and parsers
Lexicalized PCFG

Anatomy of a Parser
(1) Grammar: context-free, ...
(2) Algorithm
I. Search strategy: top-down, bottom-up, left-to-right, ...
II. Memory organization: back-tracking, dynamic programming, ...
(3) Oracle: probabilistic, rule-based, ...

Generative Grammars and Natural Languages
Generative grammars can model a natural language as a formal language.
The derivation tree can model the syntactic structure of sentences.

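Generative grammar
$G = (\Sigma, V, S, P)$
$\Sigma$ = alphabet (terminal symbols)
$V = \{A, B, \ldots\}$ (non-terminal symbols)
$S \in V$ (start symbol)
$P = \{\Psi \to \theta, \ldots\}$ (production rules)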

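Grammar 3
$G_4 = (\Sigma_4, \{S, NP, VP, V_1, V_2\}, S, P_4)$
$\Sigma_4 = \{\text{I}, \text{Anna}, \text{John}, \text{Harry}, \text{saw}, \text{see}, \text{swimming}\}$
$P_4 = \{S \to NP\ VP,\ VP \to V_1\ S,\ VP \to V_2,\ NP \to \text{I} \mid \text{John} \mid \text{Harry} \mid \text{Anna},\ V_1 \to \text{saw} \mid \text{see},\ V_2 \to \text{swimming}\}$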

Grammar 3: Derivation
Rules: $S \to NP\ VP$, $VP \to V_1\ S$, $VP \to V_2$, $NP \to \text{I} \mid \text{John} \mid \text{Harry} \mid \text{Anna}$, $V_1 \to \text{saw} \mid \text{see}$, $V_2 \to \text{swimming}$
Derivation:
$S \Rightarrow NP\ VP \Rightarrow \text{I}\ VP \Rightarrow \text{I}\ V_1\ S \Rightarrow \text{I saw}\ S \Rightarrow \text{I saw}\ NP\ VP \Rightarrow \text{I saw Harry}\ VP \Rightarrow \text{I saw Harry}\ V_2 \Rightarrow \text{I saw Harry swimming}$
Derivation tree: [S [NP I] [VP [V1 saw] [S [NP Harry] [VP [V2 swimming]]]]]
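The derivation of "I saw Harry swimming" can be replayed mechanically. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary and that each derivation step is encoded as the index of the alternative chosen for the leftmost non-terminal; the names `GRAMMAR`, `leftmost_derivation` and the index encoding are illustrative choices, not part of the slides.

```python
# Grammar 3 as a dict: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V1", "S"], ["V2"]],
    "NP": [["I"], ["John"], ["Harry"], ["Anna"]],
    "V1": [["saw"], ["see"]],
    "V2": [["swimming"]],
}


def leftmost_derivation(choices, start="S"):
    """Replay a leftmost derivation.

    `choices` gives, for each step, the index of the alternative used to
    rewrite the leftmost non-terminal. Returns every sentential form.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Find the leftmost symbol that is still a non-terminal.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][choice] + form[pos + 1:]
        steps.append(" ".join(form))
    return steps


# Hypothetical encoding of the rule choices for "I saw Harry swimming".
STEPS = leftmost_derivation([0, 0, 0, 0, 0, 2, 1, 0])
for sentential_form in STEPS:
    print(sentential_form)
```

Each printed line is one sentential form, from the start symbol down to the final word sequence. Changing an index (for example choosing another NP alternative) yields a different sentence of the same language.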


A different syntactic structure: Dependency
Constituent structure represents the grouping relations among the words.
Dependency structure represents the dependency relations among the words.
Constituency: [S [NP [N Paolo]] [VP [V ama] [N Francesca]]]
Dependency: ama --sub--> Paolo, ama --obj--> Francesca

Dependency relation
A relation between two words:
Head: the dominant word
Dependent: the dominated word
The head selects its dependents and determines their properties.
Example: the verb determines the number of its arguments.

Dependency relation
Head (dominant word): ama, in "Paolo ama Francesca"
Dependents (dominated words): Paolo, Francesca
A dependent is either an argument or a modifier:
Arguments: ama --arg--> Paolo, ama --arg--> Francesca
Argument and modifier: corre --arg--> Paolo, corre --mod--> velocemente ("Paolo corre velocemente", "Paolo runs fast")
Modifiers: cane --mod--> il, cane --mod--> giallo ("il cane giallo", "the yellow dog")

Constituency and Dependency
The constituency relation captures the dependency relation in X-bar theory.
(Figure: X-bar schema over X'', X' and the head, with branches labelled arg, arg and mod.)
Example: [S [NP Paolo] [VP [VP [V ama] [N Francesca]] [ADV dolcemente]]] ("Paolo ama Francesca dolcemente", "Paolo loves Francesca tenderly")
Problem: languages with free word order.

Constituency and Dependency
Dependency structure with word positions: ama:2 --sub--> Paolo:1, ama:2 --obj--> Francesca:3, ama:2 --mod--> dolcemente:4
Constituency structure: [S [NP Paolo] [VP [VP [V ama] [N Francesca]] [ADV dolcemente]]]
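A dependency structure with word positions can be stored as a list of labelled arcs. The sketch below is one possible encoding, assuming 1-based positions and (head, label, dependent) triples; the names `WORDS`, `ARCS`, `find_root` and `dependents_of` are illustrative and do not come from the slides.

```python
# Words with their 1-based positions in the sentence.
WORDS = {1: "Paolo", 2: "ama", 3: "Francesca", 4: "dolcemente"}

# Each arc is (head position, label, dependent position).
ARCS = [(2, "sub", 1), (2, "obj", 3), (2, "mod", 4)]


def find_root(words, arcs):
    """Return the position of the only word that is never a dependent."""
    dependents = {dep for _, _, dep in arcs}
    roots = [pos for pos in words if pos not in dependents]
    if len(roots) != 1:
        raise ValueError("a dependency tree needs exactly one root")
    return roots[0]


def dependents_of(head, arcs):
    """Return (label, dependent position) pairs of one head, left to right."""
    pairs = [(label, dep) for h, label, dep in arcs if h == head]
    return sorted(pairs, key=lambda pair: pair[1])


ROOT = find_root(WORDS, ARCS)
print(WORDS[ROOT])
for label, dep in dependents_of(ROOT, ARCS):
    print(f"  {label}: {WORDS[dep]}")
```

Because linear order lives only in the position numbers, the arcs themselves stay the same if the words are reordered, which is the property that makes this representation attractive for free word order.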

Turin University Treebank
Dependency treebank: 1800 sentences, ~40000 words
Various genres: newspaper, civil law, Albanian, miscellaneous
Augmented Relational Structure (ARS): morpho-syntactic, syntactic-functional, semantic

Turin University Treebank
************** FRASE ALB-4 **************
1 Il (IL ART DEF M SING) [5;VERB-SUBJ]
2 Governo (GOVERNO NOUN COMMON M SING) [1;DET+DEF-ARG]
3 di (DI PREP MONO) [2;PREP-RMOD]
4 Berisha ( Berisha NOUN PROPER) [3;PREP-ARG]
5 appare (APPARIRE VERB MAIN IND PRES INTRANS 3 SING) [0;TOP-VERB]
6 in (IN PREP MONO) [5;VERB-PREDCOMPL+SUBJ]
7 difficolta' ( difficolta` NOUN COMMON F ALLVAL) [6;PREP-ARG]
8 . (#\. PUNCT) [5;END]
("The Berisha government appears to be in difficulty.")
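Each annotated line has the shape "position word (lemma and features) [head;relation]". As a rough sketch, assuming that layout holds for every token line, a regular expression is enough to read it; the `Token` fields and the `parse_line` helper are hypothetical names, and the sample lines spell the labels in full (SING, SUBJ, PRES, INTRANS).

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    position: int   # 1-based index of the word in the sentence
    form: str       # the word as it appears in the text
    features: list  # lemma followed by morpho-syntactic features
    head: int       # position of the head word, 0 for the root
    relation: str   # label of the arc from the head to this word


# position, word form, "(features)", "[head;relation]"
LINE = re.compile(r"^(\d+)\s+(\S+)\s+\((.*)\)\s+\[(\d+);([^\]]+)\]$")


def parse_line(line):
    """Parse one token line of the treebank format shown above."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    pos, form, feats, head, rel = match.groups()
    return Token(int(pos), form, feats.split(), int(head), rel)


SAMPLE = [
    "1 Il (IL ART DEF M SING) [5;VERB-SUBJ]",
    "3 di (DI PREP MONO) [2;PREP-RMOD]",
    "5 appare (APPARIRE VERB MAIN IND PRES INTRANS 3 SING) [0;TOP-VERB]",
]
TOKENS = [parse_line(line) for line in SAMPLE]
for token in TOKENS:
    print(token.position, token.form, "<-", token.head, token.relation)
```

The head index 0 marks the root of the sentence, here the verb "appare".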


Generative Grammars and Natural Languages
Generative grammars model the generation of sentences.
The derivation tree can model the constituency structure of sentences.
Representation vs. generation


Dependency grammars and parsers
How can we generate a dependency structure? With a dependency grammar.
How can we build the dependency structure of a sentence? With a dependency parser.

Dependency grammars
In the constituency paradigm: generative grammars, based on rewriting rules.
In the dependency paradigm: constraint grammars, based on constraints.

Dependency parsers: Turin University Parser
A rule-based dependency parser that uses subcategorization frames
Chunk parser (~bottom-up)
ARS annotation: morpho-syntactic, syntactic-functional, semantic

Turin University Parser
1) Non-verbal rules:
(ADJ-QUALIF BEFORE (ADV (TYPE MANNER)) ADVMOD-MANNER)
If an adverb of subcategory (TYPE) MANNER immediately precedes a qualifying adjective, then it can depend on it via an arc labelled ADVMOD-MANNER.
Example: "... davvero veloce ..." ("really fast") gives veloce --ADVMOD-MANNER--> davvero

Turin University Parser
2) Verbal rules, based on a taxonomy of subcategorization classes:
VERB
TRANS ...
INTRANS ...
INTRANS-INDOBJ-PRED (e.g. "La casa gli sembra bella", "The house seems beautiful to him") ...

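Turin University Parser
Example: Paolo è davvero veloce ("Paolo is really fast")
1) Non-verbal rules (NVR): veloce --ADVMOD-MANNER--> davvero; Paolo and è remain unattached.
2) Verbal rules (VR): è --VERB-SUBJ--> Paolo, è --VERB-PREDCOMPL--> veloce, veloce --ADVMOD-MANNER--> davvero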
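The two passes on "Paolo è davvero veloce" can be imitated with a toy program. This is only a sketch under strong assumptions: the category tags are placeholders, the verbal step uses a single made-up frame (subject before the verb, predicative complement after it), and none of this reflects how TUP is actually implemented.

```python
# Tokens as (word, category) pairs; the category names are placeholders.
SENTENCE = [
    ("Paolo", "NOUN"),
    ("è", "VERB"),
    ("davvero", "ADV-MANNER"),
    ("veloce", "ADJ-QUALIF"),
]


def apply_advmod_manner(tokens):
    """Non-verbal rule: a manner adverb depends on the adjective right after it.

    Returns arcs as (head index, label, dependent index), 0-based.
    """
    arcs = []
    for i in range(len(tokens) - 1):
        category, next_category = tokens[i][1], tokens[i + 1][1]
        if category == "ADV-MANNER" and next_category == "ADJ-QUALIF":
            arcs.append((i + 1, "ADVMOD-MANNER", i))
    return arcs


def apply_verbal_rule(tokens, attached):
    """Toy verbal rule: attach the still-free words to the verb."""
    verb = next(i for i, (_, cat) in enumerate(tokens) if cat == "VERB")
    arcs = []
    for i in range(len(tokens)):
        if i == verb or i in attached:
            continue
        # Assumed frame: subject precedes the verb, complement follows it.
        label = "VERB-SUBJ" if i < verb else "VERB-PREDCOMPL"
        arcs.append((verb, label, i))
    return arcs


NVR_ARCS = apply_advmod_manner(SENTENCE)
ATTACHED = {dep for _, _, dep in NVR_ARCS}
VR_ARCS = apply_verbal_rule(SENTENCE, ATTACHED)
for head, label, dep in NVR_ARCS + VR_ARCS:
    print(f"{SENTENCE[head][0]} --{label}--> {SENTENCE[dep][0]}")
```

Running the non-verbal pass first mirrors the chunk-then-attach order of the example: the adverb is already attached to the adjective when the verb looks for its dependents.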

Anatomy of the TUP
(1) Grammar: dependency grammar (constraint), ...
(2) Algorithm
I. Search strategy: top-down, ~bottom-up, left-to-right, ...
II. Memory organization: depth-first, back-tracking, dynamic programming, ...
(3) Oracle: probabilistic, rule-based, ...


Probabilistic CFG
$G = (\Sigma, V, S, P)$
Each rule carries a probability: $A \to \beta\ [p]$, with $p \in (0,1)$

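PCFG
The probability of a parse tree is the product of the probabilities of the rules used in it:
$P(T_a) = .15 \times .4 \times .05 \times .05 \times .35 \times .75 \times .4 \times .4 \times .4 \times .3 \times .4 \times .5 \approx 1.5 \times 10^{-7}$
$P(T_b) = .15 \times .4 \times .4 \times .05 \times .05 \times .75 \times .4 \times .4 \times .4 \times .3 \times .4 \times .5 \approx 1.7 \times 10^{-7}$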
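The comparison between the two parses is a plain product of rule probabilities. The sketch below multiplies the twelve factors exactly as listed for each tree; the variable and function names are illustrative. Multiplying the factors as listed gives values on the order of $10^{-7}$ for both trees, with $T_b$ slightly more probable.

```python
from math import prod

# Rule probabilities used in each parse tree, in the order listed above.
T_A = [.15, .40, .05, .05, .35, .75, .40, .40, .40, .30, .40, .50]
T_B = [.15, .40, .40, .05, .05, .75, .40, .40, .40, .30, .40, .50]


def tree_probability(rule_probs):
    """Probability of a parse tree: the product of its rule probabilities."""
    return prod(rule_probs)


P_A = tree_probability(T_A)
P_B = tree_probability(T_B)
BEST = "T_b" if P_B > P_A else "T_a"

print(f"P(T_a) = {P_A:.3e}")
print(f"P(T_b) = {P_B:.3e}")
print(f"preferred parse: {BEST}")
```

The two lists differ in a single factor (.35 versus .40), so the choice between the parses rests entirely on that one rule.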

Problem with PCFG
Independence assumption: each rule is applied independently of its context, so a PCFG captures no structural or lexical preferences.


Lexicalized PCFG
Each CF rule is augmented with information about the heads of the constituents involved:
$A \to B\ C$ becomes $A(head_A) \to B(head_B)\ C(head_C)$
A middle point between the dependency and constituency paradigms.

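Lexicalized PCFG
$VP \to VBD\ NP\ PP$
$VP(\text{dumped}) \to VBD(\text{dumped})\ NP(\text{sacks})\ PP(\text{into})\ [3 \times 10^{-10}]$
$VP(\text{dumped}) \to VBD(\text{dumped})\ NP(\text{cats})\ PP(\text{into})\ [8 \times 10^{-11}]$
$VP(\text{dumped}) \to VBD(\text{dumped})\ NP(\text{hats})\ PP(\text{into})\ [4 \times 10^{-10}]$
$VP(\text{dumped}) \to VBD(\text{dumped})\ NP(\text{sacks})\ PP(\text{above})\ [1 \times 10^{-12}]$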
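Because every head combination has its own probability, a lexicalized PCFG can prefer one attachment over another on lexical grounds. A minimal sketch, assuming the four rules are stored in a dictionary keyed by the (VP head, NP head, PP head) triple; the table layout and the `rank_expansions` helper are illustrative only.

```python
# (VP head, NP head, PP head) -> rule probability, copied from the rules above.
LEXICALIZED_VP_RULES = {
    ("dumped", "sacks", "into"): 3e-10,
    ("dumped", "cats", "into"): 8e-11,
    ("dumped", "hats", "into"): 4e-10,
    ("dumped", "sacks", "above"): 1e-12,
}


def rank_expansions(verb_head, rules):
    """Sort the expansions of VP(verb_head) from most to least probable."""
    candidates = [(heads, p) for heads, p in rules.items()
                  if heads[0] == verb_head]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


RANKED = rank_expansions("dumped", LEXICALIZED_VP_RULES)
for (verb, noun, prep), probability in RANKED:
    print(f"VP({verb}) -> VBD({verb}) NP({noun}) PP({prep})  [{probability:.0e}]")
```

With these figures, "dumped ... into" is far more probable than "dumped ... above" for the same object, a lexical preference that a plain PCFG with a single $VP \to VBD\ NP\ PP$ probability cannot express.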


Conclusions
Syntactic structure: constituency and dependency relations
Parsing: generative and constraint paradigms
Lexicalized probabilistic CFGs
Treebank

References
D. Jurafsky and J.H. Martin, Speech and Language Processing, Prentice Hall, 2000.
R.D. Van Valin, An Introduction to Syntax, Cambridge, 2001.
TUT and TUP: http://www.di.unito.it/~gull