Jointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction

Feng Qian, Lei Sha, Baobao Chang, Zhifang Sui
Institute of Computational Linguistics, Peking University
{nickqian, shalei, chbb, szf}@pku.edu.cn
November 29, 2017

Table of Contents
1 Introduction
2 Motivations
3 Dependency bridges
4 Tensor for various arg-arg relationships
5 Experiments
6 Conclusion

Introduction
Event extraction is important for acquiring knowledge from large amounts of news text.
The results of event extraction can be used to construct knowledge bases, which can be applied to question answering, dialogue systems, etc.
This paradigm is ubiquitous in daily life:
Knowledge graphs
Structured summaries in search engines
Wikipedia infoboxes

Applications of Event Extraction
Figure: The Google search result for the September 11 attacks.

Applications of Event Extraction
Figure: The Wikipedia infobox for the September 11 attacks.

Applications of Event Extraction
Extracted events can be converted into triples and stored in knowledge graphs.
Downstream applications can then make use of these knowledge graphs.

Event Extraction
What's an event?
Event type: Business
Trigger: Release
Arguments:
Company: Microsoft
Product: Surface Pro
Place: USA
Figure: Microsoft releases Surface Pro in USA.

Event Extraction
What's an event?
Event type: Attack
Trigger: Crash
Arguments:
Attacker: Five hijackers
Target: World Trade Center's North Tower
Instrument: American Airlines Flight 11
Time: September 11th
Figure: On September 11th, five hijackers crashed American Airlines Flight 11 into the World Trade Center's North Tower.
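The event structure in the two examples above, and the conversion into triples mentioned earlier, can be sketched in a few lines of Python. This is only an illustration: the `Event` class, the `to_triples` helper, and the `(event id, role, value)` triple layout are hypothetical names and choices, not something the slides define.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical container for one extracted event."""
    event_type: str
    trigger: str
    # Maps each role name to the argument text that fills it.
    arguments: dict = field(default_factory=dict)


def to_triples(event, event_id):
    """Flatten an event into (subject, predicate, object) triples.

    The triple layout is an assumption made for illustration.
    """
    triples = [
        (event_id, "type", event.event_type),
        (event_id, "trigger", event.trigger),
    ]
    triples += [(event_id, role, arg) for role, arg in event.arguments.items()]
    return triples


# The attack example from the slide above.
attack = Event(
    event_type="Attack",
    trigger="Crash",
    arguments={
        "Attacker": "Five hijackers",
        "Target": "World Trade Center's North Tower",
        "Instrument": "American Airlines Flight 11",
        "Time": "September 11th",
    },
)

for triple in to_triples(attack, "event-1"):
    print(triple)
```

Keeping roles as dictionary keys means each role maps to a single argument, which matches the examples shown but would need a list per role if an event had several arguments with the same role.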

Event Extraction from News Text
What should we do?
Extract trigger
Identify arguments
Classify roles
Event type: Die
Trigger: Die
Arguments:
Victim: cameraman
Place: Baghdad
Instrument: American tank

Motivation
Challenges of event extraction left open by previous solutions:
Using syntax information as a feature -> using syntax information as architecture
Capturing two kinds of argument-argument relationship (Pos & Neg) -> capturing a large number of argument-argument relationships

Event Extraction from News Text
Motivation 1: Dependency relation -> Dependency bridge
By the definition of dependency relations, dependency edges usually carry information about temporal, consequential, conditional, or purpose relationships.
Figure: Example of dependency parse tree.

Event Extraction from News Text
We add dependency bridges to the conventional LSTM-RNN architecture.
Bidirectionality:
Forward: Set all dependency bridges as forward.
Backward: Set all dependency bridges as backward.
Example sentence: A cameraman died when tank fired on the hotel
Figure: Dependency bridge on LSTM. In addition to the previous LSTM cell, each cell also receives information from earlier, syntactically related cells.
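One way to read the forward and backward settings is as a re-orientation of every dependency edge so that information always flows in the reading direction of the corresponding RNN pass. The sketch below assumes that reading. The `build_bridges` helper, the `(head, dependent, relation)` edge format, and the parse of the example sentence are all hypothetical and given only for illustration.

```python
def build_bridges(edges, forward=True):
    """Orient dependency edges as bridges for one RNN direction.

    edges: list of (head_index, dependent_index, relation) tuples.
    Returns a dict mapping each target position t to the list of
    (source_index, relation) bridges that feed into it.
    Forward pass: every bridge runs from the earlier token to the later one.
    Backward pass: every bridge runs from the later token to the earlier one.
    """
    bridges = {}
    for head, dep, rel in edges:
        lo, hi = min(head, dep), max(head, dep)
        src, dst = (lo, hi) if forward else (hi, lo)
        bridges.setdefault(dst, []).append((src, rel))
    return bridges


tokens = ["A", "cameraman", "died", "when", "tank", "fired", "on", "the", "hotel"]

# Hypothetical parse of the example sentence (not taken from the slides).
edges = [
    (1, 0, "det"),     # cameraman -> A
    (2, 1, "nsubj"),   # died -> cameraman
    (2, 5, "advcl"),   # died -> fired
    (5, 3, "advmod"),  # fired -> when
    (5, 4, "nsubj"),   # fired -> tank
    (5, 8, "nmod"),    # fired -> hotel
    (8, 6, "case"),    # hotel -> on
    (8, 7, "det"),     # hotel -> the
]

forward_bridges = build_bridges(edges, forward=True)
for t in sorted(forward_bridges):
    sources = [(tokens[i], rel) for i, rel in forward_bridges[t]]
    print(tokens[t], "<-", sources)
```

Bridges between adjacent tokens duplicate the ordinary recurrent connection; this sketch keeps them because the relation type may still matter, which is a design choice rather than something the slides state.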

Event Extraction from News Text
Details of dependency bridge
We add a new gate d_t and change the calculation of the hidden state:
\[ h_t = o_t \odot \tanh(c_t) + d_t \odot \left( \frac{1}{|S_{in}|} \sum_{(i,p) \in S_{in}} a_p h_i \right) \]
Figure: The calculation detail of dependency bridge.
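The modified hidden-state update can be sketched numerically as follows. Several things here are assumptions rather than statements from the slides: S_in is read as the set of (position, relation type) pairs for the bridges entering position t, a_p as a scalar weight for relation type p, and the products with o_t and d_t as element-wise. How d_t itself is computed is not shown on the slide, so the function simply takes it as an input. The name `bridge_hidden_state` is hypothetical.

```python
import math


def bridge_hidden_state(o_t, c_t, d_t, s_in, a):
    """Compute h_t = o_t * tanh(c_t) + d_t * mean(a_p * h_i over S_in).

    o_t, c_t, d_t: lists of floats of equal length (gate and cell vectors).
    s_in: list of (h_i, p) pairs, where h_i is the hidden vector of a
          bridged position and p is the relation type of the bridge.
    a: dict mapping relation type p to its scalar weight a_p (assumed scalar).
    """
    # Standard LSTM part of the hidden state.
    base = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    if not s_in:
        # No incoming bridges: fall back to the plain LSTM hidden state.
        return base
    # Weighted average of the bridged hidden states.
    avg = [0.0] * len(base)
    for h_i, p in s_in:
        for k, value in enumerate(h_i):
            avg[k] += a[p] * value
    avg = [v / len(s_in) for v in avg]
    # The bridge gate d_t controls how much bridged information is added.
    return [b + d * v for b, d, v in zip(base, d_t, avg)]


h_t = bridge_hidden_state(
    o_t=[1.0, 1.0],
    c_t=[0.0, 0.0],
    d_t=[0.5, 1.0],
    s_in=[([2.0, 4.0], "nsubj"), ([0.0, 2.0], "advcl")],
    a={"nsubj": 1.0, "advcl": 0.5},
)
print(h_t)
```

With c_t set to zero the LSTM term vanishes, so the printed vector shows only the gated bridge contribution.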

Event Extraction from News Text
Motivation 2:
We represent each arg-arg relationship by a vector.
We use a tensor to represent all kinds of arg-arg relationships in a sentence.
Figure: The calculation detail of tensor layer (the DNN representation of candidate arguments feeds the tensor layer).
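The slide does not give the function that produces each relationship vector. As a rough illustration only, the sketch below assumes a bilinear form, a common choice for tensor layers: component k of the vector for the pair (i, j) is a_i^T W_k a_j. The name `interaction_tensor` and the weight slices `W` are hypothetical.

```python
def interaction_tensor(args, W):
    """Build an n x n x d tensor of argument-argument relationship vectors.

    args: list of n candidate-argument vectors, each of length m.
    W: list of d weight matrices, each m x m (assumed bilinear slices).
    Entry [i][j][k] is args[i]^T W[k] args[j].
    """
    def bilinear(u, M, v):
        return sum(
            u[p] * M[p][q] * v[q]
            for p in range(len(u))
            for q in range(len(v))
        )

    n = len(args)
    return [
        [[bilinear(args[i], M, args[j]) for M in W] for j in range(n)]
        for i in range(n)
    ]


# Two toy argument vectors and a single weight slice (d = 1).
args = [[1.0, 0.0], [0.0, 1.0]]
W = [[[1.0, 2.0], [3.0, 4.0]]]

tensor = interaction_tensor(args, W)
print(tensor)
```

Because the toy arguments are unit vectors, each entry of the result simply picks out one cell of the weight slice, which makes the indexing easy to check by hand.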

Event Extraction from News Text
The whole architecture...
The tensor layer is applied to the hidden layer of the dependency-bridge RNN.
We then apply max-pooling over arguments to find the most important interactive features for the arguments.
Figure: The whole architecture on the example sentence "A cameraman died when tank fired on the hotel": the candidate trigger and candidate arguments pass through the DBRNN layer (DBLSTM-RNN), the tensor layer, the pooling layer, and the output layer.
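Max-pooling over arguments can be sketched as an element-wise maximum across the relationship vectors that one argument has with all the others. The pooling axis is an assumption, since the slide only says pooling is done over arguments, and `max_pool_over_arguments` is a hypothetical name.

```python
def max_pool_over_arguments(tensor):
    """Element-wise max over the second argument axis.

    tensor[i][j] is the relationship vector between arguments i and j.
    Returns one pooled feature vector per argument i.
    """
    return [[max(column) for column in zip(*row)] for row in tensor]


# Toy 2 x 2 x 2 interaction tensor with made-up values.
toy_tensor = [
    [[1.0, 5.0], [3.0, 2.0]],
    [[0.0, -1.0], [4.0, -2.0]],
]

pooled = max_pool_over_arguments(toy_tensor)
print(pooled)
```

Each pooled vector keeps, per feature, the strongest interaction the argument has with any other candidate, which is one way to obtain a fixed-size feature regardless of how many candidates a sentence contains.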

Weights of each dependency relation
Figure: Visualization of the trained weights of each dependency relation.

Overall performance
Figure: Performance of various approaches on the ACE 2005 dataset.

Ablation tests: dependency bridge
Binary DB: the weight of each DB is restricted to {0, 1}.
Typed DB: the weight of each DB can be any floating-point number.

Method                  Trigger (id+cl)   Argument (id)   Argument (id+cl)
Our model without DB    69.0              62.7            54.6
+ binary DB             71.2              63.9            56.8
+ typed DB (full)       71.9              64.4            57.2

Table: Comparison after adding dependency bridges (DB). The numbers are F1 scores. We compare with two baselines: no dependency bridges considered and only binary dependency bridges.
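For reference, the F1 score reported in the table is the harmonic mean of precision and recall. The short sketch below shows the standard formula; the function name `f1_score` and the precision and recall values in the example are made up for illustration and are not taken from the experiments.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        # Conventionally defined as 0 when both are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, chosen only to exercise the formula.
print(round(100 * f1_score(0.75, 0.60), 1))
```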

Ablation tests: tensor layer
dbrnn-sma: the whole model with only SMA removed
dbrnn-mp: the model with the max-pooling feature matrix removed
dbrnn-tl: dbrnn without the tensor layer
dbrnn: the full model

Task                            Method      1/1     1/N     All
Argument Identification         dbrnn-sma   59.5    67.0    64.1
                                dbrnn-mp    59.7    64.8    62.0
                                dbrnn-tl    59.6    55.8    58.2
                                dbrnn       59.9    69.5    67.7
Argument Role Classification    dbrnn-sma   54.6    56.5    56.0
                                dbrnn-mp    54.7    55.8    55.2
                                dbrnn-tl    54.9    52.3    53.1
                                dbrnn       54.6    60.9    58.7

Table: Comparison between different models. Here, we report the argument performance since the tensor layer is only applied to argument extraction.

Conclusion
In this paper:
We propose adding dependency bridges to a sequential architecture.
We propose adding a tensor layer to capture various argument relationships.
The weights of the dependency bridges after training illuminate the importance of each dependency type in the event extraction task.
The full model achieves high performance on all three evaluation metrics: trigger classification, argument identification, and role classification.

Thank you. Any questions?