[Overview diagram. Nodes: Dependency grammar, Morphology, Word order, Transition-based neural parsing, Word representations, Recurrent neural networks. Edge labels: Informs, Models.]
Dependency Grammar

- Modern theories of dependency grammar originate with Lucien Tesnière.
- Reference: Lucien Tesnière (1959). Éléments de syntaxe structurale, Klincksieck, Paris. ISBN 2-252-01861-5
- The underlying ideas date back to Panini and his system of karakas.
- There are different contemporary frameworks of dependency grammar, including the Prague School's Functional Generative Description, Mel'čuk's Meaning-Text Theory, and Hudson's Word Grammar.
Dependency Grammar

The sentence is an organized whole, whose constituent elements are words. [1.2]
Every word that belongs to a sentence ceases by itself to be isolated as in the dictionary. Between the word and its neighbors, the mind perceives connections, the totality of which forms the structure of the sentence. [1.3]
The structural connections establish dependency relations between the words. Each connection in principle unites a superior term and an inferior term. [2.1]
The superior term receives the name governor. The inferior term receives the name subordinate. Thus, in the sentence Alfred parle [...], parle is the governor and Alfred the subordinate.

from: Tesnière (1959)
Advantages of Dependency Grammars

- A completely word-based framework (no phrasal projections).
- Most dependency grammar frameworks are non-derivational and mono-stratal.
- Allows for a surface-level syntactic account of languages with flexible word order and of syntactic constructions with discontinuous elements.

However, these syntactic phenomena also raise challenging questions about the dependency grammar formalism and the notion of projectivity of dependency structures.
Parsing with Dependency Grammars

- Parsing a sentence is not a goal in itself, but ultimately needs to help provide an adequate answer to the question: Who did what to whom, when, where, and why? In other words: syntactic structure needs to be linked in a systematic fashion to semantic representation/interpretation.
- Dependency grammar offers a direct interface between syntax and semantics: dependency relations between a governor (lexical head) and its lexical dependents can link lexical representations of the main participants of an event or state of affairs with lexical representations of the circumstances under which they occurred or hold.
- Parsing with dependency grammars benefits from the lexicalist character of dependency relations. This is beneficial, inter alia, for parsing long-distance dependencies and coordinations (see Kübler and Prokic 2006).
UD English treebank: treatment of nominal arguments

Form     Lemma    UPOS   Deprel
you      you      PRON   nsubj
should   should   AUX    aux
get      get      VERB   root
a        a        DET    det
cocker   cocker   NOUN   compound
spaniel  spaniel  NOUN   obj
.        .        PUNCT  punct
UD English treebank: treatment of PP adjuncts

Form       Lemma     UPOS   Deprel
He         he        PRON   nsubj
announced  announce  VERB   root
this       this      PRON   obj
in         in        ADP    case
January    January   PROPN  obl
:          :         PUNCT  punct
UD English treebank: treatment of clausal subjects

Form   Lemma  UPOS   Deprel
Great  great  ADJ    root
to     to     PART   mark
have   have   VERB   csubj
you    you    PRON   obj
on     on     ADP    case
board  board  NOUN   obl
!      !      PUNCT  punct
UD English treebank: treatment of relative clauses

Form        Lemma       UPOS   Deprel
Every       every       DET    det
move        move        NOUN   nsubj
Google      Google      PROPN  nsubj
makes       make        VERB   acl:relcl
brings      bring       VERB   root
this        this        DET    det
particular  particular  ADJ    amod
future      future      NOUN   obj
closer      closer      ADV    advmod
.           .           PUNCT  punct
UD English treebank: treatment of relative clauses (free relative)

Form    Lemma   UPOS   Deprel
Malach  Malach  PROPN  vocative
,       ,       PUNCT  punct
What    what    PRON   nsubj
you     you     PRON   nsubj
say     say     VERB   acl:relcl
makes   make    VERB   root
sense   sense   NOUN   obj
.       .       PUNCT  punct
UD English treebank: treatment of direct questions

Form      Lemma     UPOS   Deprel
Why       why       ADV    advmod
were      be        AUX    auxpass
they      they      PRON   nsubjpass
suddenly  suddenly  ADV    advmod
acted     act       VERB   root
on        on        ADP    nmod
Saturday  Saturday  PROPN  nmod:tmod
?         ?         PUNCT  punct
Heads and Dependents

Tests for identifying a head H and a dependent D in a syntactic construction C:

1. H determines the syntactic category of C and can often replace C.
2. H determines the semantic category of C; D gives semantic specification.
3. H is obligatory; D may be optional.
4. H selects D and determines whether D is obligatory or optional.
5. The form of D depends on H (agreement or government).
6. The linear position of D is specified with reference to H.

from: Kübler et al. (2009), p. 3f.
Heads and Dependents: Some Unclear Cases

- Auxiliary-main-verb constructions
- Determiner-adjective-noun constructions
- Prepositional phrases
- Coordination structures

The answer often depends on the different purposes that the dependency structure is put to use for.
Case Study: Strong and Weak Adjectives in Dutch

(1) a. de   bruine         beer
       the  brown[weak]    bear[masc]
       'the brown bear'
    b. een  bruine         beer
       a    brown[strong]  bear[masc]
       'a brown bear'
    c. het  bruine         beest
       the  brown[weak]    animal[neut]
       'the brown animal'
    d. een  bruin          beest
       a    brown[strong]  animal[neut]
       'a brown animal'
Universal Dependency Initiative

- Objective: develop cross-linguistically consistent treebank annotation for many languages.
- Goal: facilitate multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective.
- Strategy: provide a universal inventory of categories and guidelines to facilitate consistent annotation of similar constructions across languages, while allowing language-specific extensions when necessary.
UD annotations across languages: some examples
Universal Dependency Relations
Universal Tagset

Open class words  Closed class words  Other
ADJ               ADP                 PUNCT
ADV               AUX                 SYM
INTJ              CCONJ               X
NOUN              DET
PROPN             NUM
VERB              PART
                  PRON
                  SCONJ
Some Definitions: Sentence and Arc Labels

Definition 2.1. A sentence is a sequence of tokens denoted by $S = w_0 w_1 \ldots w_n$, where $w_0 = \text{root}$.

Definition 2.2. Let $R = \{r_1, \ldots, r_m\}$ be a finite set of possible dependency relation types that can hold between any two words in a sentence. A relation type $r \in R$ is additionally called an arc label.

Acknowledgement: Definitions 2.1-2.4 and 2.16-2.18 and Notation 2.6-2.9 are all taken from Kübler, McDonald, and Nivre (2009), chap. 2.
Dependency Structures and Dependency Trees

Definition 2.3. A dependency graph $G = (V, A)$ is a labeled directed graph (digraph) in the standard graph-theoretic sense and consists of nodes, $V$, and arcs, $A$, such that for sentence $S = w_0 w_1 \ldots w_n$ and label set $R$ the following holds:

1. $V \subseteq \{w_0, w_1, \ldots, w_n\}$
2. $A \subseteq V \times R \times V$
3. if $(w_i, r, w_j) \in A$ then $(w_i, r', w_j) \notin A$ for all $r' \neq r$

The spanning node set $V_S = \{w_0, w_1, \ldots, w_n\}$ contains all and only the words of a sentence, including $w_0 = \text{root}$.
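The three conditions of Definition 2.3 can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes that words are represented by their indices 0..n (so condition 1 holds by construction) and that arcs are (head, label, dependent) triples. The function name and the label "nsubj" in the example are illustrative choices.

```python
from typing import Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)


def is_dependency_graph(n: int, labels: Set[str], arcs: Set[Arc]) -> bool:
    """Check conditions 2 and 3 of Definition 2.3 for a sentence w_0 ... w_n."""
    seen_pairs = set()
    for head, label, dep in arcs:
        # Condition 2: every arc is drawn from V x R x V.
        if not (0 <= head <= n and 0 <= dep <= n and label in labels):
            return False
        # Condition 3: an ordered node pair carries at most one label.
        # (arcs is a set, so a repeated pair implies a different label.)
        if (head, dep) in seen_pairs:
            return False
        seen_pairs.add((head, dep))
    return True


# Tesnière's example "Alfred parle": 0 = root, 1 = Alfred, 2 = parle.
labels = {"root", "nsubj"}
arcs = {(0, "root", 2), (2, "nsubj", 1)}
print(is_dependency_graph(2, labels, arcs))
print(is_dependency_graph(2, labels, arcs | {(2, "root", 1)}))
```

The second call adds a second label to the pair (parle, Alfred) and therefore violates condition 3.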
Dependency Trees

Definition 2.4. A well-formed dependency graph $G = (V, A)$ for an input sentence $S$ and dependency relation set $R$ is any dependency graph that is a directed tree originating out of node $w_0$ and has the spanning node set $V = V_S$. We call such dependency graphs dependency trees.
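As a rough illustration of Definition 2.4, the sketch below tests whether a set of arcs forms a directed tree rooted in $w_0$ that spans all words. It assumes the same index-based encoding as before (0 is the root, arcs are (head, label, dependent) triples); the function name is hypothetical. The check relies on a standard fact: if no word has two heads, the root has no head, and every word is reachable from the root, the graph is a directed tree.

```python
from typing import Dict, List, Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)


def is_dependency_tree(n: int, arcs: Set[Arc]) -> bool:
    """True if arcs form a directed tree out of node 0 spanning 0..n."""
    heads: Dict[int, int] = {}
    children: Dict[int, List[int]] = {i: [] for i in range(n + 1)}
    for head, _label, dep in arcs:
        if not (0 <= head <= n and 1 <= dep <= n):
            return False  # index out of range, or an arc entering the root
        if dep in heads:
            return False  # the dependent already has a head
        heads[dep] = head
        children[head].append(dep)
    # Every word must be reachable from the root along directed arcs.
    seen = {0}
    agenda = [0]
    while agenda:
        node = agenda.pop()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                agenda.append(child)
    return len(seen) == n + 1


print(is_dependency_tree(2, {(0, "root", 2), (2, "nsubj", 1)}))
print(is_dependency_tree(2, {(1, "x", 2), (2, "x", 1)}))
```

The second call describes a cycle that is detached from the root, so it is rejected.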
Unique Head Property

Remark: Dependency trees rule out the following dependency configuration, in which one dependent has two heads:

head --arc 1--> dep <--arc 2-- head

A putative counterexample: in cases of VP coordination, as in Sandy listened and smiled, it appears at least plausible to establish a dependency relation between each verbal head and the nominal dependent.
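The unique head property is easy to test on a set of arcs. The sketch below is an illustration under assumptions: the token indices and the labels (nsubj, conj, cc) for Sandy listened and smiled are one possible analysis chosen for this example and are not taken from the slides.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)


def multi_headed(arcs: Set[Arc]) -> Dict[int, Set[int]]:
    """Map each dependent that has more than one head to its set of heads."""
    heads = defaultdict(set)
    for head, _label, dep in arcs:
        heads[dep].add(head)
    return {dep: hs for dep, hs in heads.items() if len(hs) > 1}


# 0 = root, 1 = Sandy, 2 = listened, 3 = and, 4 = smiled (assumed analysis).
tree = {(0, "root", 2), (2, "nsubj", 1), (2, "conj", 4), (4, "cc", 3)}
print(multi_headed(tree))

# Linking Sandy to smiled as well gives Sandy two heads.
graph = tree | {(4, "nsubj", 1)}
print(multi_headed(graph))
```

With the extra arc the structure is still a dependency graph in the sense of Definition 2.3, but no longer a dependency tree.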
Some Notation

Notation 2.6. The notation $w_i \to w_j$ indicates the unlabeled dependency relation (or dependency relation for short) in a tree $G = (V, A)$. That is, $w_i \to w_j$ if and only if $(w_i, r, w_j) \in A$ for some $r \in R$.

Notation 2.7. The notation $w_i \to^* w_j$ indicates the reflexive transitive closure of the dependency relation in a tree $G = (V, A)$. That is, $w_i \to^* w_j$ if and only if $i = j$ (reflexive) or both $w_i \to^* w_{i'}$ and $w_{i'} \to w_j$ hold (for some $w_{i'} \in V$).

Notation 2.8. The notation $w_i \leftrightarrow w_j$ indicates the undirected dependency relation in a tree $G = (V, A)$. That is, $w_i \leftrightarrow w_j$ if and only if either $w_i \to w_j$ or $w_j \to w_i$.

Notation 2.9. The notation $w_i \leftrightarrow^* w_j$ indicates the reflexive transitive closure of the undirected dependency relation in a tree $G = (V, A)$. That is, $w_i \leftrightarrow^* w_j$ if and only if $i = j$ (reflexive) or both $w_i \leftrightarrow^* w_{i'}$ and $w_{i'} \leftrightarrow w_j$ hold (for some $w_{i'} \in V$).
Connectedness

A dependency tree $G = (V, A)$ satisfies the connectedness property, which states that for all $w_i, w_j \in V$ it is the case that $w_i \leftrightarrow^* w_j$. That is, there is a path connecting every two words in a dependency tree when the direction of the arc (dependency relation) is ignored.
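Connectedness amounts to reachability in the undirected version of the graph. A minimal sketch, assuming index-based words 0..n with all arc endpoints in that range and (head, label, dependent) triples; the function name and the example labels are illustrative:

```python
from collections import deque
from typing import Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)


def is_connected(n: int, arcs: Set[Arc]) -> bool:
    """True if every pair of words 0..n is linked when arc direction is ignored."""
    neighbours = {i: set() for i in range(n + 1)}
    for head, _label, dep in arcs:
        neighbours[head].add(dep)
        neighbours[dep].add(head)
    # Breadth-first search from the root over undirected edges.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for other in neighbours[node] - seen:
            seen.add(other)
            queue.append(other)
    return len(seen) == n + 1


# 0 = root, 1 = Sandy, 2 = listened, 3 = and, 4 = smiled (assumed analysis).
arcs = {(0, "root", 2), (2, "nsubj", 1), (2, "conj", 4), (4, "cc", 3)}
print(is_connected(4, arcs))
print(is_connected(4, arcs - {(2, "conj", 4)}))
```

Removing the arc between listened and smiled cuts words 3 and 4 off from the rest, so the second call reports a disconnected graph.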
(Non-)Projective Dependency Trees

Definition 2.16. An arc $(w_i, r, w_j) \in A$ in a dependency tree $G = (V, A)$ is projective if and only if $w_i \to^* w_k$ for all $i < k < j$ when $i < j$, or $j < k < i$ when $j < i$.

Definition 2.17. A dependency tree $G = (V, A)$ is a projective dependency tree if (1) it is a dependency tree (Definition 2.4), and (2) all $(w_i, r, w_j) \in A$ are projective.

Definition 2.18. A dependency tree $G = (V, A)$ is a non-projective dependency tree if (1) it is a dependency tree (Definition 2.4), and (2) it is not projective.
Converting Non-projective to Projective Dependency Trees

[Figure: two dependency trees for the sentence "A hearing is scheduled on the issue today."
Non-projective tree, arc labels: root, SBJ, VC, ATT, ATT, DET, PC, TMP, PU.
Projectivized tree, arc labels: root, SBJ, VC, SBJ:ATT, VC:TMP, DET, ATT, PC, PU.]
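Definitions 2.16-2.18 translate directly into a projectivity test. The sketch below works on unlabeled trees given as a map from each dependent to its head, since labels play no role in projectivity. The head assignments for the example sentence are an assumed reading of the figure (on attached to hearing and today attached to scheduled in the non-projective tree, both re-attached to is in the projectivized tree); they are not stated in the text itself. The input is assumed to be a well-formed tree, otherwise the upward walk may not terminate.

```python
from typing import Dict


def dominates(heads: Dict[int, int], i: int, k: int) -> bool:
    """True if w_i ->* w_k: walk from k towards the root, looking for i."""
    node = k
    while True:
        if node == i:
            return True
        if node == 0:
            return False
        node = heads[node]


def is_projective_arc(heads: Dict[int, int], head: int, dep: int) -> bool:
    """Definition 2.16: the head dominates every word between head and dep."""
    lo, hi = min(head, dep), max(head, dep)
    return all(dominates(heads, head, k) for k in range(lo + 1, hi))


def is_projective_tree(heads: Dict[int, int]) -> bool:
    """Definition 2.17: every arc of the tree is projective."""
    return all(is_projective_arc(heads, h, d) for d, h in heads.items())


# 1 A, 2 hearing, 3 is, 4 scheduled, 5 on, 6 the, 7 issue, 8 today, 9 "."
# Assumed heads: dependent -> head, with 0 as the artificial root.
non_projective = {1: 2, 2: 3, 3: 0, 4: 3, 5: 2, 6: 7, 7: 5, 8: 4, 9: 3}
projectivized = {1: 2, 2: 3, 3: 0, 4: 3, 5: 3, 6: 7, 7: 5, 8: 3, 9: 3}

print(is_projective_tree(non_projective))
print(is_projective_tree(projectivized))
```

Under these assumptions the arc from hearing to on fails the test because hearing does not dominate the intervening words is and scheduled; lifting on and today to is removes both violations.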
A dependency grammar treebank

A dependency grammar treebank consists of pairs of sentences $S$ and their corresponding dependency trees $G$:

$\mathcal{T} = \{(S_d, G_d)\}_{d=0}^{|\mathcal{T}|}$

The dependency trees $G$ can be obtained by

- manual annotation by one or more human annotators
- automatic annotation by a parser
- automatic derivation by a conversion algorithm from a constituent grammar treebank
Tübingen Treebank of Written German (TüBa-D/Z)

- Developed by my research group at the Seminar für Sprachwissenschaft at the University of Tübingen since 1999.
- Language data taken from the German newspaper die tageszeitung (taz).
- Largest manually annotated treebank for German.
- Total of 104,787 sentences.
- Average sentence length: 18.7 words per sentence.
- Total number of tokens: 1,959,474.
Tübingen Treebank of Written German (TüBa-D/Z)

- Originally annotated for constituent structure.
- Now also available in dependency structure format.
- The annotation guidelines are published in the Stylebook for the Tübingen Treebank of Written German (TüBa-D/Z): http://www.sfs.uni-tuebingen.de/fileadmin/user_upload/ascl/tuebadz-stylebook-1508.pdf
- Information on how to obtain the data can be found at: http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html
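Transition-based Parsing: Configurations and Transitions

Given a sentence $S = w_0 w_1 \ldots w_n$, a configuration is a triple $c = \langle \sigma, \beta, A \rangle$, where

- $\sigma$ is a stack of words $w_i \in V_S$,
- $\beta$ is a buffer of words $w_i \in V_S$,
- $A$ is a set of dependency arcs $(w_i, r, w_j)$.

The initial configuration for $S$ is $c_0(S) = ([w_0], [w_1, \ldots, w_n], \emptyset)$. A terminal configuration has the form $([w_0], [\,], A)$, where $w_0 = \text{root}$.

Notation: $[\sigma \mid w_i]$ is a stack with top word $w_i$ and remainder $\sigma$; $[w_i \mid \beta]$ is a buffer with first word $w_i$ and remainder $\beta$.

Transitions (arc-standard system):

Shift: $(\sigma, [w_i \mid \beta], A) \Rightarrow ([\sigma \mid w_i], \beta, A)$
Left-Arc$_r$: $([\sigma \mid w_i \mid w_j], \beta, A) \Rightarrow ([\sigma \mid w_j], \beta, A \cup \{(w_j, r, w_i)\})$, provided $i \neq 0$
Right-Arc$_r$: $([\sigma \mid w_i \mid w_j], \beta, A) \Rightarrow ([\sigma \mid w_i], \beta, A \cup \{(w_i, r, w_j)\})$

A transition sequence for $S$ is a sequence of configurations $C_{0,m} = (c_0, c_1, \ldots, c_m)$ such that $c_0 = c_0(S)$, $c_m$ is a terminal configuration, and for every $1 \leq i \leq m$ there is a transition $t$ with $c_i = t(c_{i-1})$. The sequence defines the dependency graph $G_{c_m} = \langle V, A_{c_m} \rangle$.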
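To make the mechanics concrete, here is a small sketch of a stack-based transition system with the three transitions Shift, Left-Arc, and Right-Arc, where both arc transitions operate on the two topmost stack items. It is an illustration under assumptions rather than the lecture's implementation: words are indices with 0 as the root, configurations are (stack, buffer, arcs) tuples, and the transition sequence is supplied by hand instead of being predicted by a classifier. The example sentence He announced this and its labels nsubj and obj follow the UD example earlier in these slides.

```python
from typing import List, Optional, Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)
Config = Tuple[List[int], List[int], Set[Arc]]  # (stack, buffer, arcs)


def initial(n: int) -> Config:
    """c_0(S) = ([w_0], [w_1 ... w_n], {})"""
    return ([0], list(range(1, n + 1)), set())


def is_terminal(config: Config) -> bool:
    stack, buffer, _arcs = config
    return stack == [0] and not buffer


def shift(config: Config) -> Config:
    stack, buffer, arcs = config
    if not buffer:
        raise ValueError("Shift needs a non-empty buffer")
    return (stack + [buffer[0]], buffer[1:], arcs)


def left_arc(config: Config, label: str) -> Config:
    """Attach the second-topmost stack word to the topmost one and remove it."""
    stack, buffer, arcs = config
    if len(stack) < 2 or stack[-2] == 0:
        raise ValueError("Left-Arc needs two stack words, the lower not the root")
    wi, wj = stack[-2], stack[-1]
    return (stack[:-2] + [wj], buffer, arcs | {(wj, label, wi)})


def right_arc(config: Config, label: str) -> Config:
    """Attach the topmost stack word to the second-topmost one and pop it."""
    stack, buffer, arcs = config
    if len(stack) < 2:
        raise ValueError("Right-Arc needs two stack words")
    wi, wj = stack[-2], stack[-1]
    return (stack[:-1], buffer, arcs | {(wi, label, wj)})


def run(n: int, steps: List[Tuple[str, Optional[str]]]) -> Config:
    """Apply a hand-written transition sequence to the initial configuration."""
    config = initial(n)
    for name, label in steps:
        if name == "shift":
            config = shift(config)
        elif name == "left_arc":
            config = left_arc(config, label)
        elif name == "right_arc":
            config = right_arc(config, label)
        else:
            raise ValueError(f"unknown transition: {name}")
    return config


# 0 = root, 1 = He, 2 = announced, 3 = this
steps = [("shift", None), ("shift", None), ("left_arc", "nsubj"),
         ("shift", None), ("right_arc", "obj"), ("right_arc", "root")]
final = run(3, steps)
print(final)
print(is_terminal(final))
```

Each word is shifted exactly once and removed from the stack exactly once, so a complete sequence for a sentence with n words has 2n transitions (six in this example).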
[Figure: step-by-step parse trace showing the stack $\sigma$, the buffer $\beta$, and the incrementally built arc sets $A_1, \ldots, A_4$.]
[Figure: a transition table with four transitions over configurations $(\sigma, \beta, A)$, followed by a step-by-step parse trace showing the stack $\sigma$, the buffer $\beta$, and the arc sets $A_1$ and $A_2$.]
[Figure: step-by-step parse trace showing the stack $\sigma$, the buffer $\beta$, and the incrementally built arc sets $A_1, \ldots, A_5$.]
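Transitions (arc-eager system)

Configurations are triples $(\sigma, \beta, A)$ of a stack, a buffer, and a set of arcs; $[\sigma \mid w_i]$ is a stack with top word $w_i$ and $[w_j \mid \beta]$ a buffer with first word $w_j$.

Shift: $(\sigma, [w_i \mid \beta], A) \Rightarrow ([\sigma \mid w_i], \beta, A)$
Right-Arc$_r$: $([\sigma \mid w_i], [w_j \mid \beta], A) \Rightarrow ([\sigma \mid w_i \mid w_j], \beta, A \cup \{(w_i, r, w_j)\})$
Left-Arc$_r$: $([\sigma \mid w_i], [w_j \mid \beta], A) \Rightarrow (\sigma, [w_j \mid \beta], A \cup \{(w_j, r, w_i)\})$, provided $i \neq 0$ and there is no arc $(w_k, r', w_i) \in A$
Reduce: $([\sigma \mid w_i], \beta, A) \Rightarrow (\sigma, \beta, A)$, provided there is an arc $(w_k, r', w_i) \in A$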
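A transition system with Shift, Right-Arc, Left-Arc, and Reduce, in which arcs are drawn between the top of the stack and the first word of the buffer, can be sketched as follows. This is a hedged illustration, not the lecture's code: words are indices with 0 as the root, configurations are (stack, buffer, arcs) tuples, and all function names are hypothetical. The example reuses He announced this with the labels nsubj and obj from the UD example earlier in these slides.

```python
from typing import List, Set, Tuple

Arc = Tuple[int, str, int]  # (head index, arc label, dependent index)
Config = Tuple[List[int], List[int], Set[Arc]]  # (stack, buffer, arcs)


def has_head(arcs: Set[Arc], word: int) -> bool:
    return any(dep == word for _head, _label, dep in arcs)


def shift(config: Config) -> Config:
    stack, buffer, arcs = config
    if not buffer:
        raise ValueError("Shift needs a non-empty buffer")
    return (stack + [buffer[0]], buffer[1:], arcs)


def right_arc(config: Config, label: str) -> Config:
    """Attach the first buffer word to the stack top and push it."""
    stack, buffer, arcs = config
    if not stack or not buffer:
        raise ValueError("Right-Arc needs a stack top and a buffer front")
    wi, wj = stack[-1], buffer[0]
    return (stack + [wj], buffer[1:], arcs | {(wi, label, wj)})


def left_arc(config: Config, label: str) -> Config:
    """Attach the stack top to the first buffer word and pop it."""
    stack, buffer, arcs = config
    if not stack or not buffer:
        raise ValueError("Left-Arc needs a stack top and a buffer front")
    wi, wj = stack[-1], buffer[0]
    if wi == 0 or has_head(arcs, wi):
        raise ValueError("Left-Arc: stack top is the root or already has a head")
    return (stack[:-1], buffer, arcs | {(wj, label, wi)})


def reduce_top(config: Config) -> Config:
    """Pop the stack top, which must already have a head."""
    stack, buffer, arcs = config
    if not stack or not has_head(arcs, stack[-1]):
        raise ValueError("Reduce: stack top has no head yet")
    return (stack[:-1], buffer, arcs)


# 0 = root, 1 = He, 2 = announced, 3 = this
config: Config = ([0], [1, 2, 3], set())
config = shift(config)             # ([0, 1], [2, 3], {})
config = left_arc(config, "nsubj")  # announced -> He
config = right_arc(config, "root")  # root -> announced
config = right_arc(config, "obj")   # announced -> this
print(config)
```

Compared with the variant that waits until both words are on the stack, right dependents are attached as soon as their head is on top of the stack, which is why a separate Reduce transition is needed to pop words that have already received a head.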
[Figure: step-by-step parse trace showing the stack $\sigma$, the buffer $\beta$, and the incrementally built arc sets $A_1, \ldots, A_5$.]