Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan Institute of Computer Science and Technology Peking University September 5, 2017 1 of 41

Outline: The Problem, First-order Algorithm, Second-order Algorithm, Experiments

Semantic dependency parsing
Example: [Figure: the sentence "The company that Mark wants to buy" annotated with dependency arcs labeled arg1]
- Predicate argument analysis, bi-lexical relations
- Long-distance dependencies
- Graph-structured representations, many crossing arcs
- Not a tree: single-headed ( ), cycle-free ( )

Maximum Subgraph
Input: a directed graph $G = (V, A)$.
Output: a subgraph $G' = (V, A' \subseteq A)$ with maximum total weight such that $G'$ belongs to $\mathcal{G}$:
$$G'(s) = \arg\max_{H \in \mathcal{G}(s, G)} \sum_{p \in H} \mathrm{ScorePart}(s, p)$$
Example: when $\mathcal{G}$ is the class of trees, Maximum Subgraph = Maximum Spanning Tree.
Complexity: $\mathcal{G}$ and the order of ScorePart determine the complexity of inference.
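To make the arg max above concrete, here is a minimal brute-force sketch for first-order scores. It is not one of the algorithms discussed in the talk: it enumerates every arc subset, so it is exponential and only useful as a reference on tiny inputs. The names `max_subgraph`, `is_noncrossing` and the toy weights are assumptions made for illustration, and the noncrossing class stands in for $\mathcal{G}$.

```python
from itertools import combinations

def crosses(a, b):
    """True if arcs a and b cross when drawn above the sentence (direction ignored)."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j

def is_noncrossing(arcs):
    """Membership test for the (assumed) target class: no two arcs cross."""
    return not any(crosses(a, b) for a, b in combinations(arcs, 2))

def max_subgraph(weights, in_class):
    """Exhaustively search arc subsets for the best-scoring member of the class."""
    arcs = list(weights)
    best, best_score = frozenset(), 0.0  # the empty graph scores 0
    for r in range(1, len(arcs) + 1):
        for subset in combinations(arcs, r):
            if in_class(subset):
                score = sum(weights[a] for a in subset)
                if score > best_score:
                    best, best_score = frozenset(subset), score
    return best, best_score

# Hypothetical first-order arc weights; (1, 3) and (2, 4) cross each other.
weights = {(1, 3): 2.0, (2, 4): 1.5, (3, 4): 1.0, (1, 2): -0.5}
best, score = max_subgraph(weights, is_noncrossing)
print(sorted(best), score)
```

Swapping `is_noncrossing` for another membership test changes the class $\mathcal{G}$ without touching the search, which mirrors how the class and the score order jointly determine the difficulty of inference.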

Complexity
Graph class G                                Order O   Complexity   Algorithm
Arbitrary                                    1         O(n^2)
Arbitrary                                    2         NP-hard      (Du et al., 2015)
Acyclic                                      1         NP-hard      (Kuhlmann and Jonsson, 2015)
Noncrossing                                  1         O(n^3)       (Kuhlmann and Jonsson, 2015)
Noncrossing                                  2         O(n^4)       (Sun et al., 2017)
1-endpoint-crossing                          1         O(n^5)       Ongoing work
1-endpoint-crossing, pagenumber-2            1         O(n^5)       (Cao et al., 2017)
1-endpoint-crossing, pagenumber-2, C-free    1         O(n^4)       (Cao et al., 2017)
1-endpoint-crossing, pagenumber-2, C-free    2         O(n^4)       This paper

1-Endpoint-Crossing Graphs
Definition: A dependency graph is 1-Endpoint-Crossing (1EC) if, for any edge e, all edges that cross e share an endpoint p, called the pencil point.
[Figure: the arcs of "The company that Mark wants to buy", shown over several animation steps]
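The definition can be checked directly on a small graph. The sketch below is a straightforward quadratic-per-arc test written for illustration; the function names and the toy arc lists are assumptions, and arcs are treated as undirected pairs of word positions.

```python
def crosses(a, b):
    """True if arcs a and b cross when drawn above the sentence (direction ignored)."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j

def pencil_points(arc, arcs):
    """Endpoints shared by every arc that crosses `arc`; None if nothing crosses it."""
    crossing = [b for b in arcs if crosses(arc, b)]
    if not crossing:
        return None
    shared = set(crossing[0])
    for b in crossing[1:]:
        shared &= set(b)
    return shared

def is_1ec(arcs):
    """A graph is 1EC if every crossed arc has at least one pencil point."""
    for a in arcs:
        points = pencil_points(a, arcs)
        if points is not None and not points:
            return False
    return True

# (1, 3) is crossed by (2, 4) and (2, 5), which share endpoint 2.
print(is_1ec([(1, 3), (2, 4), (2, 5)]))
# (1, 4) is crossed by (2, 5) and (3, 6), which share no endpoint.
print(is_1ec([(1, 4), (2, 5), (3, 6)]))
```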

Pagenumber-K Graphs
Definition: A dependency graph G is a pagenumber-k graph if G consists of at most k subgraphs called pages. Each page contains all vertices but only a subset of the arcs, and no two arcs on the same page cross.

Pagenumber-K Graphs: Example
[Figure: the sentence "The company that Mark wants to buy" with its arg1 arcs distributed over pages; the two animation steps are captioned "A Pagenumber-2 Graph" and "A Pagenumber-3 Graph"]
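For a fixed word order, deciding whether two pages suffice amounts to 2-coloring the conflict graph whose nodes are arcs and whose edges join crossing arcs. The following is a minimal sketch of that check, assuming undirected arcs given as position pairs; `two_page_assignment` is a hypothetical helper name, not something taken from the talk.

```python
from itertools import combinations

def crosses(a, b):
    """True if arcs a and b cross when drawn above the sentence (direction ignored)."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j

def two_page_assignment(arcs):
    """Return {arc: page} with pages 0 and 1, or None if two pages are not enough."""
    arcs = list(arcs)
    conflicts = {a: [] for a in arcs}
    for a, b in combinations(arcs, 2):
        if crosses(a, b):
            conflicts[a].append(b)
            conflicts[b].append(a)
    page = {}
    for start in arcs:
        if start in page:
            continue
        page[start] = 0
        stack = [start]
        while stack:  # depth-first 2-coloring of one conflict component
            a = stack.pop()
            for b in conflicts[a]:
                if b not in page:
                    page[b] = 1 - page[a]
                    stack.append(b)
                elif page[b] == page[a]:
                    return None  # odd cycle of crossings: needs a third page
    return page

print(two_page_assignment([(1, 3), (2, 4), (4, 5)]))
print(two_page_assignment([(1, 4), (2, 5), (3, 6)]))
```

Three pairwise-crossing arcs, as in the second call, form a triangle in the conflict graph and therefore cannot be placed on two pages.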

Coverage
PN 2    1EC     EnjuBank   DeepBank   PCEDT     CCGBank
Yes     Both    99.53%     99.69%     98.39%    98.09%
Both    Yes     97.28%     97.67%     97.53%    95.73%
Yes     Yes     97.28%     97.67%     97.53%    95.68%
No      Yes     0.0%       0.0%       0.0%      0.05%
Yes     No      2.25%      2.02%      0.86%     2.41%
Sentences       100%       100%       100%      100%
Most semantic dependency graphs are 1EC/P2 graphs.
Theorem: The pagenumber of a 1EC graph is at most 3.

Previous Work (1)
Graph class G                                Order O   Complexity   Algorithm
Arbitrary                                    1         O(n^2)
Arbitrary                                    2         NP-hard      (Du et al., 2015)
Acyclic                                      1         NP-hard      (Kuhlmann and Jonsson, 2015)
Noncrossing                                  1         O(n^3)       (Kuhlmann and Jonsson, 2015)
Noncrossing                                  2         O(n^4)       (Sun et al., 2017)
1-endpoint-crossing                          1         O(n^5)       Ongoing work
1-endpoint-crossing, pagenumber-2            1         O(n^5)       (Cao et al., 2017)
1-endpoint-crossing, pagenumber-2, C-free    1         O(n^4)       (Cao et al., 2017)
1-endpoint-crossing, pagenumber-2, C-free    2         O(n^4)       This paper

Previous Work (2)
Key observation: every subgraph of a 1EC/P2 graph is still a 1EC/P2 graph.
A dynamic programming algorithm, gchsw:
- In each construction step, more than one arc can usually be constructed.
- Whether such arcs are created depends on their arc weights.
- We are able to get a maximal 1EC/P2 graph, but we only keep the subgraph of it that consists of its positive-weight arcs.
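Because the class is closed under taking subgraphs, dropping a non-positive arc never leaves the class and never lowers the first-order score. A single construction step can therefore be sketched as a filter over the arcs that the step permits. This is only an illustration of the idea stated above: `arcs_to_create`, the candidate list and the weights are hypothetical, and the actual bookkeeping of gchsw is not shown.

```python
def arcs_to_create(candidates, weight):
    """Keep the arcs permitted in one construction step whose weight is positive.

    `candidates` lists the arcs the step is allowed to build (an assumption
    about how a step is represented); `weight` maps arcs to first-order scores.
    """
    chosen = [a for a in candidates if weight.get(a, 0.0) > 0]
    gain = sum(weight[a] for a in chosen)
    return chosen, gain

# Hypothetical step that may create three arcs at once.
candidates = [(1, 3), (2, 4), (1, 4)]
weight = {(1, 3): 0.5, (2, 4): -0.25, (1, 4): 0.25}
chosen, gain = arcs_to_create(candidates, weight)
print(chosen, gain)
```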

Challenge of High-order Factorization (1)
A single step in gchsw: [Figure: vertices i, l, k, j] $e_{(i,k)}$, $e_{(l,j)}$ and $e_{(i,j)}$ can be created at the same time.
Eisner's algorithm: in a single step, which arc is created is deterministic!

Challenge of High-order Factorization (2)
It is very difficult to enumerate all high-order features for crossing arcs.
[Figure: a structure over the vertices $x$, $r_x$, $i$, $r_i$, $k$, $j$, $l_j$]
It is hard to cover sibling features between $e_{(x,k)}$ and $e_{(x,r_x)}$.

Challenge of High-order Factorization (3)
Pitler (2014): It is still possible to build accurate tree parsers by considering only higher-order features of noncrossing arcs.
[Figure: the sentence "The company that Mark wants to buy" with its arg1 arcs, shown over several animation steps]
Good news: most arcs are noncrossing, even in crossing graphs.

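Previous Work (3)
Sub-problems $O[s, e]$ and $C[s, e, l]$ over a span from s to e.
[Figure: two decomposition rules; one reduces the span (s, e) to the span (s+1, e), the other splits it at a vertex k into the spans (s, k) and (k, e)]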
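The span-splitting rule can be illustrated with a small first-order dynamic program for noncrossing graphs. This is a simplified sketch written for this transcript, not the cited algorithm: it only tracks whether the arc between the two span boundaries is still available, and the names `max_noncrossing`, `opened` and `closed` are assumptions. It runs in cubic time because every span tries every split point k.

```python
from functools import lru_cache

def max_noncrossing(n, weight):
    """Max total weight of a noncrossing arc set over vertices 0..n-1."""
    def gain(i, j):
        # Each direction is an independent arc; take it only if it helps.
        return max(weight.get((i, j), 0.0), 0.0) + max(weight.get((j, i), 0.0), 0.0)

    @lru_cache(maxsize=None)
    def closed(i, j):
        # Best score on span [i, j]; the arc between i and j is optional.
        return opened(i, j) + gain(i, j)

    @lru_cache(maxsize=None)
    def opened(i, j):
        # Best score on span [i, j] with no arc between i and j.
        if j - i <= 1:
            return 0.0
        # No arc may pass over the split point k, so the two halves are independent.
        return max(closed(i, k) + closed(k, j) for k in range(i + 1, j))

    return closed(0, n - 1) if n > 1 else 0.0

# Hypothetical weights: (0, 2) and (1, 3) cross, so only one of them can be kept.
weight = {(0, 2): 2.0, (1, 3): 1.5, (2, 3): 1.0, (0, 1): -0.5}
print(max_noncrossing(4, weight))
```

Every noncrossing arc set without the boundary arc has some vertex k that no arc passes over (the farthest neighbour of i inside the span, or i+1 if there is none), which is why trying all split points is enough.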

Outline: The Problem, First-order Algorithm, Second-order Algorithm, Experiments

Sub-problems of C-free 1EC/P2
- Open structures: $\mathrm{Int}_O[i, j]$, $LR[i, j, x]$, $N_O[i, j, x]$, $L_O[i, j, x]$, $R_O[i, j, x]$
- Closed structures: $\mathrm{Int}_C[i, j]$, $N_C[i, j, x]$, $L_C[i, j, x]$, $R_C[i, j, x]$
[Figure: one diagram per sub-problem over the span from i to j, with an external vertex x where applicable]
An open structure can be transformed into a closed structure if the red arc exists.

Decomposition of $\mathrm{Int}_C$
Decompose $\mathrm{Int}_C$ by considering the farthest arc from i:
1. No arc
2. Noncrossing edge
3. Crossing edge with outer pencil point
4. Crossing edge with inner pencil point

Decomposition of $\mathrm{Int}_C$, case (a): there is no arc from i to any vertex in (i, j).
[Figure: the span from i to j reduces to the span from i+1 to j]

Decomposition of $\mathrm{Int}_C$, case (b): there is a noncrossing arc from i to a vertex k in (i, j).
[Figure: the span from i to j splits into the span from i to k plus the span from k to j]

Decomposition of $\mathrm{Int}_C$, case (c): a crossing arc $e_{(i,k)}$ with outer pencil point $pt(i,k) = x$.
The rule depends on whether the dashed edge exists.
[Figure: vertices in the order i, k, x, j; sub-cases (c.1) and (c.2) each split the structure into three parts over i, k, x and j]

Decomposition of $\mathrm{Int}_C$, case (d): a crossing arc $e_{(i,k)}$ with inner pencil point $pt(i,k) = x$.
The rule depends on whether the dashed edge exists.
[Figure: vertices in the order i, x, k, j; sub-cases (d.1) and (d.2) each split the structure into three parts over i, x, k and j]

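C-free LR Decomposition
[Figure: an LR structure over the span from i to j with external vertex x]
If there exists a vertex k dividing [i, j] into two independent spans, the structure splits into a part over [i, k] and a part over [k, j], both with external vertex x.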

C-free LR Decomposition
For each vertex k, there must be edges from [i, k) to (k, j].
- [Figure: vertices $x$, $i$, $b_1$, $a_1$, $b_2$, $a_2$, $j$] If $b_3 = j$, there exists only $e_{x,b_1}$ or $e_{x,a_2}$.
- [Figure: vertices $x$, $i$, $b_1$, $a_1$, $b_2$, $a_2$, $b_3$, $j$] If $a_3 = j$, there exist both $e_{x,b_1}$ and $e_{x,b_3}$.

Example: "The company that Mark wants to buy" (words numbered 1 to 7), starting from $\mathrm{Int}_O[1, 7]$
$\mathrm{Int}_O[1, 7] = \mathrm{Int}_C[1, 2] + \mathrm{Int}_O[2, 7]$
$\mathrm{Int}_O[2, 7] = \mathrm{Int}_C[2, 7]$
$\mathrm{Int}_C[2, 7] = \mathrm{Int}_C[2, 3] + \mathrm{Int}_O[3, 7]$
$\mathrm{Int}_O[3, 7] = R_O[3, 4; 5] + \mathrm{Int}_O[4, 5] + L_O[5, 7; 4]$
$R_O[3, 4; 5] = \mathrm{Int}_O[3, 4]$
$\mathrm{Int}_O[4, 5] = \mathrm{Int}_C[4, 5]$
$L_O[5, 7; 4] = L_C[5, 7; 4]$
$L_C[5, 7; 4] = \mathrm{Int}_O[5, 6] + L_O[6, 7; 5]$
$L_O[6, 7; 5] = L_C[6, 7; 5]$
$L_C[6, 7; 5] = \mathrm{Int}_O[6, 7] = \mathrm{Int}_C[6, 7]$
Get all arcs.

Spurious Ambiguity
A cross-type subproblem allows crossing arcs to be built, but it does not necessarily create them.
[Figure: a graph over the vertices a, b, c, d, e]
Derivation 1: $\mathrm{Int}_C[a, e] \to \mathrm{Int}_C[a, c] + \mathrm{Int}_O[c, e]$.
Derivation 2: $\mathrm{Int}_C[a, e] \to LR[a, c; d] + \mathrm{Int}_O[k, d] + L_O[d, e; c]$; $LR[a, c; d] \to L_O[a, b; d] + R_O[b, c; d] \to \mathrm{Int}_O[a, b] + \mathrm{Int}_O[b, c]$.

Outline: The Problem, First-order Algorithm, Second-order Algorithm, Experiments

Crossing-sensitive Single-side Second-order Algorithm
$$G'(s) = \arg\max_{G^*} \sum_{e \in \mathrm{Edge}(G^*)} \mathrm{Score}_1(e) + \sum_{s \in \mathrm{Sib}(G^*)} \max(\mathrm{Score}_2(s), 0)$$
Both sibling arcs are noncrossing.
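The objective can be evaluated for one candidate graph as follows. This is a minimal sketch under stated assumptions: the sibling factors are passed in explicitly as pairs of arcs, because the exact definition of Sib is not spelled out on the slide, and the function name and the toy scores are hypothetical.

```python
def quasi_second_order_score(edges, siblings, score1, score2):
    """First-order arc scores plus sibling scores clipped at zero."""
    first = sum(score1[e] for e in edges)
    # A negative second-order score contributes 0 instead of a penalty.
    second = sum(max(score2.get(s, 0.0), 0.0) for s in siblings)
    return first + second

edges = [(2, 7), (2, 3)]
siblings = [((2, 7), (2, 3))]  # one sibling factor over two arcs sharing vertex 2
score1 = {(2, 7): 1.0, (2, 3): 0.5}

helpful = quasi_second_order_score(edges, siblings, score1, {siblings[0]: 0.25})
harmful = quasi_second_order_score(edges, siblings, score1, {siblings[0]: -2.0})
print(helpful, harmful)
```

The second call shows the clipping: a sibling score of -2.0 leaves the total at the first-order sum.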

Second-order Factorization
[Figure: four decomposition rules for a span from s to e, using the split points s+1, $r_s$ and $l_e$]

Second-order Factorization
Noncrossing sibling features can only be captured by decomposing $\mathrm{Int}_C$.
[Figure: nine decomposition rules for the span from i to j, labeled (a.1) to (a.3), (b.1) to (b.3) and (c.1) to (c.3), using the split points i+1, $r_i$, $r_j$ and $l_j$]

Example: "The company that Mark wants to buy" (words numbered 1 to 7)
$\mathrm{Int}_C[2, 7] = \mathrm{Int}_C[2, 3] + \mathrm{Int}_O[3, 7] + \mathrm{sib}(e_{(2,7)}, e_{(2,3)})$

Spurious Ambiguity (1)
This model is somewhat inadequate, because the second-order score function cannot penalize a bad factor. When a negative score is assigned to a second-order factor, our algorithm treats it as 0.
[Figure: a graph over the vertices a, b, c, d, e]
Derivation 1: $\mathrm{Int}_C[a, e] \to \mathrm{Int}_C[a, c] + \mathrm{Int}_O[c, e] + S_{sib}(e_{(a,e)}, e_{(a,c)})$.
Derivation 2: $\mathrm{Int}_C[a, e] \to LR[a, c; d] + \mathrm{Int}_O[k, d] + L_O[d, e; c]$; $LR[a, c; d] \to L_O[a, b; d] + R_O[b, c; d] \to \mathrm{Int}_O[a, b] + \mathrm{Int}_O[b, c]$.

Spurious Ambiguity (2)
$$G'(s) = \arg\max_{G^*} \sum_{e \in \mathrm{Edge}(G^*)} \mathrm{Score}_1(e) + \sum_{s \in \mathrm{Sib}(G^*)} \max(\mathrm{Score}_2(s), 0)$$
- $\mathrm{Score}_2(s) \ge 0$: our algorithm selects the derivation that takes s into account, since it increases the total score.
- $\mathrm{Score}_2(s) < 0$: our algorithm avoids including s by selecting other paths. In other words, it treats this score as 0.

Outline: The Problem, First-order Algorithm, Second-order Algorithm, Experiments

Results: Without Tree
[Figure: F-score (axis from 84 to 92) of the first-order and second-order models on DM, PAS, CCG and PCEDT]

Results: Syntax Tree
[Figure: F-score (axis from 90 to 94) of the first-order and second-order models on DM, PAS, CCG and PCEDT]

Conclusion
Our contributions:
- A new dynamic programming algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2, C-free graphs.
- A new quasi-second-order extension.
Lesson learned: crossing-sensitive second-order features are helpful.

Game Over. Questions? Comments?

References

Junjie Cao, Sheng Huang, Weiwei Sun, and Xiaojun Wan. 2017. Parsing to 1-endpoint-crossing, pagenumber-2 graphs. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2110-2120. Association for Computational Linguistics, Vancouver, Canada. URL http://aclweb.org/anthology/p17-1193.

Yantao Du, Weiwei Sun, and Xiaojun Wan. 2015. A data-driven, factorization parser for CCG dependency structures. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1545-1555. Association for Computational Linguistics, Beijing, China. URL http://www.aclweb.org/anthology/p15-1149.

Marco Kuhlmann and Peter Jonsson. 2015. Parsing to noncrossing dependency graphs. Transactions of the Association for Computational Linguistics, 3:559-570.

Emily Pitler. 2014. A crossing-sensitive third-order factorization for dependency parsing. TACL, 2:41-54. URL http://www.transacl.org/wp-content/uploads/2014/02/39.pdf.

Weiwei Sun, Junjie Cao, and Xiaojun Wan. 2017. Semantic dependency parsing via book embedding. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 828-838. Association for Computational Linguistics, Vancouver, Canada. URL http://aclweb.org/anthology/p17-1077.