Jointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction
|
|
- Beverley Terry
- 5 years ago
- Views:
Transcription
1 Jointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction Feng Qian,LeiSha, Baobao Chang, Zhifang Sui Institute of Computational Linguistics, Peking University {nickqian, shalei, chbb, November 29, 2017 Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and1 Tensor-B / 27
2 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and2 Tensor-B / 27
3 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and3 Tensor-B / 27
4 Introduction Event extraction is important for knowledge acquisition from large amounts of news text. The result of event extraction can be used to construct knowledge base, which can be applied to question answering, dialogue system, etc. Its paradigm is ubiquitous in our daily life: Knowledge Graph Structured summary of search engine Wikipedia infobox Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and4 Tensor-B / 27
5 Applications of Event Extraction The Google search result of September 11 attacks: Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and5 Tensor-B / 27
6 Applications of Event Extraction The Wikipedia infobox of September 11 attacks: Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and6 Tensor-B / 27
7 Applications of Event Extraction The extracted events can be transferred into triples and store in the knowledge graphs. The knowledge graphs can be leveraged by upper applications. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and7 Tensor-B / 27
8 Event Extraction What s an event? Event Type: Trigger Argument Business Release Company Microsoft Product Surface Pro Place USA Figure: Microsoft releases surface Pro in USA. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and8 Tensor-B / 27
9 Event Extraction What s an event? Event Type: Trigger Argument Attack Crash Attacker Five hijackers Target World Trade Center s North Tower Instrument American Airlines Flight 11 Time September 11th Figure: On September 11th, five hijackers crashed American Airlines Flight 11 into the World Trade Center s North Tower. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and9 Tensor-B / 27
10 Event Extraction from News Text What should we do? Extract trigger Identify arguments Classify roles Event Type: Trigger Argument Victim Place Instrument Die Die cameraman Baghdad American tank Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and10 Tensor-B / 27
11 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and11 Tensor-B / 27
12 Motivation Challenges of event extraction by the previous solutions p Using syntax information as feature Using syntax information as architecture p Capture two kinds of argument-argument relationship (Pos & Neg) Capture large amount of argument-argument relationship Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and12 Tensor-B / 27
13 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and13 Tensor-B / 27
14 Event Extraction from News Text Motivation 1: Dependency relation! Dependency bridge According to definition of dependency relation, dependency edges usually contain some information about temporal, consequence, conditional or purpose. Figure: Example of dependency parse tree. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and14 Tensor-B / 27
15 Event Extraction from News Text We add dependency bridges to conventional LSTM-RNN architecture. Bidirectionality: Forward: Set all dependency bridges as forward. Backward: Set all dependency bridges as backward. A cameraman died when tank fired on the hotel Figure: Dependency bridge on LSTM. Apart from the last LSTM cell, each cell also receives information from former syntactically related cells. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and15 Tensor-B / 27
16 Event Extraction from News Text Details of dependency bridge We add a new gate d t and change the calculation of hidden state. h t = o t tanh(c t )+d 1 t S in P(i,p)2S in a p h i h? h > :;<=> h # - #01 h #01 + ~ + #, # - # / # $ $ %&'h $ h # - # %&'h h # - #01 h #01 + ~ + # A #, # - # / # $ $ $ %&'h $ %&'h + - # h # " # " # Figure: The calculation detail of dependency bridge. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and16 Tensor-B / 27
17 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and17 Tensor-B / 27
18 Event Extraction from News Text Motivation 2: We represent each arg-arg relationship by a vector We use a tensor to represent all kinds of arg-arg relationships in a sentence DNN Representation of Candidate Arguments Tensor layer Figure: The calculation detail of tensor layer. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and18 Tensor-B / 27
19 Event Extraction from News Text The whole architecture... Tensor layer is applied to the hidden layer of the dependency bridge RNN Then we apply max-pooling over arguments to find the most important interactive features for the arguments Candidate trigger A cameraman died when tank fired on the hotel Candidate Arguments DBLSTM-RNN DNN DBRNN layer Tensor layer Pooling layer Output layer Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and19 Tensor-B / 27
20 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and20 Tensor-B / 27
21 Weights of each dependency relation Figure: The visualization of trained weights of each dependency relations. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and21 Tensor-B / 27
22 Overall performance Figure: Performances of various approaches on ACE 2005 dataset. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and22 Tensor-B / 27
23 Ablation tests dependency bridge Binary DB: The weight of DB belongs to 0, 1 Typed DB: The weight of DB can be any float numbers Method Trigger Argument Argument id+cl id id+cl Our model without DB binarydb typeddb(full) Table: Comparison after adding dependency bridges (DB). The numbers are F 1 scores. We compare with two baselines: no dependency bridges considered and only binary dependency bridges. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and23 Tensor-B / 27
24 Ablation tests Tensor layer dbrnn-sma: only cast SMA away from the whole model dbrnn-mp: means cast the max-pooling feature matrix away dbrnn-tl: dbrnn without tensor layer dbrnn full model Method 1/1 1/N All dbrnn-sma Argument dbrnn-mp Identification dbrnn-tl dbrnn dbrnn-sma Argument Role dbrnn-mp Classification dbrnn-tl dbrnn Table: Comparison between di erent models. Here, we report the argument performance since the tensor layer is only applied to argument extraction. Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and24 Tensor-B / 27
25 Table of Contents 1 Introduction 2 Motivations 3 Dependency bridges 4 Tensor for various arg-arg relationships 5 Experiments 6 Conclusion Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and25 Tensor-B / 27
26 Conclusion In this paper: We propose to add dependency bridges to sequential architecture We propose to add tensor layer for capturing various of argument relationships The weights of dependency bridges after training illuminates the importance of each dependency type in event extraction task The full model achieves high performance in all the three evaluation metrics, trigger classification, argument identification and role classification Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and26 Tensor-B / 27
27 Thank you. Any questions? Jointly Extracting Event Triggers and Arguments by Dependency-Bridge November 29, 2017 RNN and27 Tensor-B / 27
Capturing Argument Relationships for Chinese Semantic Role Labeling
Capturing Argument Relationships for Chinese Semantic Role abeling ei Sha, Tingsong Jiang, Sujian i, Baobao Chang, Zhifang Sui Key aboratory of Computational inguistics, Ministry of Education School of
More informationDeep Learning for Natural Language Processing. Sidharth Mudgal April 4, 2017
Deep Learning for Natural Language Processing Sidharth Mudgal April 4, 2017 Table of contents 1. Intro 2. Word Vectors 3. Word2Vec 4. Char Level Word Embeddings 5. Application: Entity Matching 6. Conclusion
More informationRecurrent Neural Networks (Part - 2) Sumit Chopra Facebook
Recurrent Neural Networks (Part - 2) Sumit Chopra Facebook Recap Standard RNNs Training: Backpropagation Through Time (BPTT) Application to sequence modeling Language modeling Applications: Automatic speech
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationCIKM 18, October 22-26, 2018, Torino, Italy
903 Session 6B: Knowledge Modelling 1.1 Entity Typing by Structured Features Existing systems [16, 19, 27] cast entity typing as a single-instance multi-label classification problem. For each entity e,
More informationChunking with Support Vector Machines
NAACL2001 Chunking with Support Vector Machines Graduate School of Information Science, Nara Institute of Science and Technology, JAPAN Taku Kudo, Yuji Matsumoto {taku-ku,matsu}@is.aist-nara.ac.jp Chunking
More informationNeural Architectures for Image, Language, and Speech Processing
Neural Architectures for Image, Language, and Speech Processing Karl Stratos June 26, 2018 1 / 31 Overview Feedforward Networks Need for Specialized Architectures Convolutional Neural Networks (CNNs) Recurrent
More informationRandom Coattention Forest for Question Answering
Random Coattention Forest for Question Answering Jheng-Hao Chen Stanford University jhenghao@stanford.edu Ting-Po Lee Stanford University tingpo@stanford.edu Yi-Chun Chen Stanford University yichunc@stanford.edu
More informationTracking the World State with Recurrent Entity Networks
Tracking the World State with Recurrent Entity Networks Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun Task At each timestep, get information (in the form of a sentence) about the
More informationTask-Oriented Dialogue System (Young, 2000)
2 Review Task-Oriented Dialogue System (Young, 2000) 3 http://rsta.royalsocietypublishing.org/content/358/1769/1389.short Speech Signal Speech Recognition Hypothesis are there any action movies to see
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More informationDeep Learning Sequence to Sequence models: Attention Models. 17 March 2018
Deep Learning Sequence to Sequence models: Attention Models 17 March 2018 1 Sequence-to-sequence modelling Problem: E.g. A sequence X 1 X N goes in A different sequence Y 1 Y M comes out Speech recognition:
More informationCSC321 Lecture 16: ResNets and Attention
CSC321 Lecture 16: ResNets and Attention Roger Grosse Roger Grosse CSC321 Lecture 16: ResNets and Attention 1 / 24 Overview Two topics for today: Topic 1: Deep Residual Networks (ResNets) This is the state-of-the
More informationarxiv: v2 [cs.cl] 20 Apr 2017
Syntax Aware LSM Model for Chinese Semantic Role Labeling Feng Qian 2, Lei Sha 1, Baobao Chang 1, Lu-chen Liu 2, Ming Zhang 2 1 Key Laboratory of Computational Linguistics, Ministry of Education, Peking
More informationGoogle s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Google s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Y. Wu, M. Schuster, Z. Chen, Q.V. Le, M. Norouzi, et al. Google arxiv:1609.08144v2 Reviewed by : Bill
More informationLong-Short Term Memory
Long-Short Term Memory Sepp Hochreiter, Jürgen Schmidhuber Presented by Derek Jones Table of Contents 1. Introduction 2. Previous Work 3. Issues in Learning Long-Term Dependencies 4. Constant Error Flow
More informationLecture 11 Recurrent Neural Networks I
Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor niversity of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks
More informationLecture 11 Recurrent Neural Networks I
Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks
More informationarxiv: v3 [cs.lg] 14 Jan 2018
A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation Gang Chen Department of Computer Science and Engineering, SUNY at Buffalo arxiv:1610.02583v3 [cs.lg] 14 Jan 2018 1 abstract We describe
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)
More informationParts 3-6 are EXAMPLES for cse634
1 Parts 3-6 are EXAMPLES for cse634 FINAL TEST CSE 352 ARTIFICIAL INTELLIGENCE Fall 2008 There are 6 pages in this exam. Please make sure you have all of them INTRODUCTION Philosophical AI Questions Q1.
More informationPrediction and Uncertainty Quantification of Daily Airport Flight Delays
Proceedings of Machine Learning Research 82 (2017) 45-51, 4th International Conference on Predictive Applications and APIs Prediction and Uncertainty Quantification of Daily Airport Flight Delays Thomas
More informationRecurrent Neural Networks. Jian Tang
Recurrent Neural Networks Jian Tang tangjianpku@gmail.com 1 RNN: Recurrent neural networks Neural networks for sequence modeling Summarize a sequence with fix-sized vector through recursively updating
More informationOn the use of Long-Short Term Memory neural networks for time series prediction
On the use of Long-Short Term Memory neural networks for time series prediction Pilar Gómez-Gil National Institute of Astrophysics, Optics and Electronics ccc.inaoep.mx/~pgomez In collaboration with: J.
More informationDynamic Data Modeling, Recognition, and Synthesis. Rui Zhao Thesis Defense Advisor: Professor Qiang Ji
Dynamic Data Modeling, Recognition, and Synthesis Rui Zhao Thesis Defense Advisor: Professor Qiang Ji Contents Introduction Related Work Dynamic Data Modeling & Analysis Temporal localization Insufficient
More informationTowards Universal Sentence Embeddings
Towards Universal Sentence Embeddings Towards Universal Paraphrastic Sentence Embeddings J. Wieting, M. Bansal, K. Gimpel and K. Livescu, ICLR 2016 A Simple But Tough-To-Beat Baseline For Sentence Embeddings
More informationLaconic: Label Consistency for Image Categorization
1 Laconic: Label Consistency for Image Categorization Samy Bengio, Google with Jeff Dean, Eugene Ie, Dumitru Erhan, Quoc Le, Andrew Rabinovich, Jon Shlens, and Yoram Singer 2 Motivation WHAT IS THE OCCLUDED
More informationApplied Natural Language Processing
Applied Natural Language Processing Info 256 Lecture 20: Sequence labeling (April 9, 2019) David Bamman, UC Berkeley POS tagging NNP Labeling the tag that s correct for the context. IN JJ FW SYM IN JJ
More informationLECTURER: BURCU CAN Spring
LECTURER: BURCU CAN 2017-2018 Spring Regular Language Hidden Markov Model (HMM) Context Free Language Context Sensitive Language Probabilistic Context Free Grammar (PCFG) Unrestricted Language PCFGs can
More informationMachine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016
Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Language Models. Tobias Scheffer
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Language Models Tobias Scheffer Stochastic Language Models A stochastic language model is a probability distribution over words.
More informationTwo-Stream Bidirectional Long Short-Term Memory for Mitosis Event Detection and Stage Localization in Phase-Contrast Microscopy Images
Two-Stream Bidirectional Long Short-Term Memory for Mitosis Event Detection and Stage Localization in Phase-Contrast Microscopy Images Yunxiang Mao and Zhaozheng Yin (B) Computer Science, Missouri University
More informationPart-of-Speech Tagging + Neural Networks 3: Word Embeddings CS 287
Part-of-Speech Tagging + Neural Networks 3: Word Embeddings CS 287 Review: Neural Networks One-layer multi-layer perceptron architecture, NN MLP1 (x) = g(xw 1 + b 1 )W 2 + b 2 xw + b; perceptron x is the
More informationStatistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.
http://goo.gl/jv7vj9 Course website KYOTO UNIVERSITY Statistical Machine Learning Theory From Multi-class Classification to Structured Output Prediction Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT
More informationDriving Semantic Parsing from the World s Response
Driving Semantic Parsing from the World s Response James Clarke, Dan Goldwasser, Ming-Wei Chang, Dan Roth Cognitive Computation Group University of Illinois at Urbana-Champaign CoNLL 2010 Clarke, Goldwasser,
More informationEE-559 Deep learning Recurrent Neural Networks
EE-559 Deep learning 11.1. Recurrent Neural Networks François Fleuret https://fleuret.org/ee559/ Sun Feb 24 20:33:31 UTC 2019 Inference from sequences François Fleuret EE-559 Deep learning / 11.1. Recurrent
More informationLecture 15: Exploding and Vanishing Gradients
Lecture 15: Exploding and Vanishing Gradients Roger Grosse 1 Introduction Last lecture, we introduced RNNs and saw how to derive the gradients using backprop through time. In principle, this lets us train
More informationStatistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.
http://goo.gl/xilnmn Course website KYOTO UNIVERSITY Statistical Machine Learning Theory From Multi-class Classification to Structured Output Prediction Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT
More informationModels, Data, Learning Problems
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Models, Data, Learning Problems Tobias Scheffer Overview Types of learning problems: Supervised Learning (Classification, Regression,
More informationLong-Short Term Memory and Other Gated RNNs
Long-Short Term Memory and Other Gated RNNs Sargur Srihari srihari@buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Sequence Modeling
More informationGenerating Sequences with Recurrent Neural Networks
Generating Sequences with Recurrent Neural Networks Alex Graves University of Toronto & Google DeepMind Presented by Zhe Gan, Duke University May 15, 2015 1 / 23 Outline Deep recurrent neural network based
More informationLecture 5 Neural models for NLP
CS546: Machine Learning in NLP (Spring 2018) http://courses.engr.illinois.edu/cs546/ Lecture 5 Neural models for NLP Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office hours: Tue/Thu 2pm-3pm
More informationCSC321 Lecture 15: Exploding and Vanishing Gradients
CSC321 Lecture 15: Exploding and Vanishing Gradients Roger Grosse Roger Grosse CSC321 Lecture 15: Exploding and Vanishing Gradients 1 / 23 Overview Yesterday, we saw how to compute the gradient descent
More informationMultimodal context analysis and prediction
Multimodal context analysis and prediction Valeria Tomaselli (valeria.tomaselli@st.com) Sebastiano Battiato Giovanni Maria Farinella Tiziana Rotondo (PhD student) Outline 2 Context analysis vs prediction
More informationsmart reply and implicit semantics Matthew Henderson and Brian Strope Google AI
smart reply and implicit semantics Matthew Henderson and Brian Strope Google AI collaborators include: Rami Al-Rfou, Yun-hsuan Sung Laszlo Lukacs, Ruiqi Guo, Sanjiv Kumar Balint Miklos, Ray Kurzweil and
More informationIntroduction to Deep Neural Networks
Introduction to Deep Neural Networks Presenter: Chunyuan Li Pattern Classification and Recognition (ECE 681.01) Duke University April, 2016 Outline 1 Background and Preliminaries Why DNNs? Model: Logistic
More informationConditional Language modeling with attention
Conditional Language modeling with attention 2017.08.25 Oxford Deep NLP 조수현 Review Conditional language model: assign probabilities to sequence of words given some conditioning context x What is the probability
More informationIntroduction to Machine Learning Midterm Exam
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
More informationDeep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści
Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, 2017 Spis treści Website Acknowledgments Notation xiii xv xix 1 Introduction 1 1.1 Who Should Read This Book?
More informationContents. (75pts) COS495 Midterm. (15pts) Short answers
Contents (75pts) COS495 Midterm 1 (15pts) Short answers........................... 1 (5pts) Unequal loss............................. 2 (15pts) About LSTMs........................... 3 (25pts) Modular
More informationChinese Character Handwriting Generation in TensorFlow
Chinese Character Handwriting Generation in TensorFlow Heri Zhao, Jiayu Ye, Ke Xu Abstract Recurrent neural network(rnn) has been proved to be successful in several sequence generation task. RNN can generate
More informationInformation Extraction from Text
Information Extraction from Text Jing Jiang Chapter 2 from Mining Text Data (2012) Presented by Andrew Landgraf, September 13, 2013 1 What is Information Extraction? Goal is to discover structured information
More informationwith Local Dependencies
CS11-747 Neural Networks for NLP Structured Prediction with Local Dependencies Xuezhe Ma (Max) Site https://phontron.com/class/nn4nlp2017/ An Example Structured Prediction Problem: Sequence Labeling Sequence
More informationDeep Learning for NLP
Deep Learning for NLP Instructor: Wei Xu Ohio State University CSE 5525 Many slides from Greg Durrett Outline Motivation for neural networks Feedforward neural networks Applying feedforward neural networks
More informationIntroduction to RNNs!
Introduction to RNNs Arun Mallya Best viewed with Computer Modern fonts installed Outline Why Recurrent Neural Networks (RNNs)? The Vanilla RNN unit The RNN forward pass Backpropagation refresher The RNN
More informationInformal Definition: Telling things apart
9. Decision Trees Informal Definition: Telling things apart 2 Nominal data No numeric feature vector Just a list or properties: Banana: longish, yellow Apple: round, medium sized, different colors like
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other
More informationMaschinelle Sprachverarbeitung
Maschinelle Sprachverarbeitung Parsing with Probabilistic Context-Free Grammar Ulf Leser Content of this Lecture Phrase-Structure Parse Trees Probabilistic Context-Free Grammars Parsing with PCFG Other
More informationLecture 17: Neural Networks and Deep Learning
UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions
More informationDeep Learning Recurrent Networks 2/28/2018
Deep Learning Recurrent Networks /8/8 Recap: Recurrent networks can be incredibly effective Story so far Y(t+) Stock vector X(t) X(t+) X(t+) X(t+) X(t+) X(t+5) X(t+) X(t+7) Iterated structures are good
More informationSegmental Recurrent Neural Networks for End-to-end Speech Recognition
Segmental Recurrent Neural Networks for End-to-end Speech Recognition Liang Lu, Lingpeng Kong, Chris Dyer, Noah Smith and Steve Renals TTI-Chicago, UoE, CMU and UW 9 September 2016 Background A new wave
More informationTuning as Linear Regression
Tuning as Linear Regression Marzieh Bazrafshan, Tagyoung Chung and Daniel Gildea Department of Computer Science University of Rochester Rochester, NY 14627 Abstract We propose a tuning method for statistical
More informationMachine Learning for Structured Prediction
Machine Learning for Structured Prediction Grzegorz Chrupa la National Centre for Language Technology School of Computing Dublin City University NCLT Seminar Grzegorz Chrupa la (DCU) Machine Learning for
More informationarxiv: v2 [quant-ph] 16 Nov 2018
aaacxicdvhlsgmxfe3hv62vvswncwelkrmikdlgi7cqc1yfwyro+mthmasibkjlgg+wk3u/s2/wn8wfsxs1qsjh3nuosckjhcuhb8fry5+yxfpejyawv1bx2jxnm8tto1hftcs23ui7aohciwcjnxieojjf/xphvrdcxortlqhqykdgj6u6ako5kjmwo5gtsc0fi/qtgbvtaxmolcnxuap7gqihlspyhdblqgbicil5q1atid3qkfhfqqo+1ki6e5f+cyrt/txh1f/oj9+skd2npbhlnngojzmpd8k9tyjdw0kykioniem9jfmxflvtjmjlaseio9n9llpk/ahkfldycthdga3aj3t58/gwfolthsqx2olgidl87cdyigsjusbud182x0/7nbjs9utoacgfz/g1uj2phuaubx9u6fyy7kljdts8owchowj1dsarmc6qvbi39l78ta8bw9nvoovjv1tsanx9rbsmy8zw==
More informationQuasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs
Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan Institute of Computer Science and Technology Peking University September 5, 2017
More informationRecurrent Neural Network
Recurrent Neural Network Xiaogang Wang xgwang@ee..edu.hk March 2, 2017 Xiaogang Wang (linux) Recurrent Neural Network March 2, 2017 1 / 48 Outline 1 Recurrent neural networks Recurrent neural networks
More informationConvolutional Dictionary Learning and Feature Design
1 Convolutional Dictionary Learning and Feature Design Lawrence Carin Duke University 16 September 214 1 1 Background 2 Convolutional Dictionary Learning 3 Hierarchical, Deep Architecture 4 Convolutional
More informationStructured Neural Networks (I)
Structured Neural Networks (I) CS 690N, Spring 208 Advanced Natural Language Processing http://peoplecsumassedu/~brenocon/anlp208/ Brendan O Connor College of Information and Computer Sciences University
More informationDeep Learning (CNNs)
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deep Learning (CNNs) Deep Learning Readings: Murphy 28 Bishop - - HTF - - Mitchell
More informationHidden Markov Models Part 1: Introduction
Hidden Markov Models Part 1: Introduction CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Modeling Sequential Data Suppose that
More informationRecurrent Neural Networks. COMP-550 Oct 5, 2017
Recurrent Neural Networks COMP-550 Oct 5, 2017 Outline Introduction to neural networks and deep learning Feedforward neural networks Recurrent neural networks 2 Classification Review y = f( x) output label
More informationlecture 6: modeling sequences (final part)
Natural Language Processing 1 lecture 6: modeling sequences (final part) Ivan Titov Institute for Logic, Language and Computation Outline After a recap: } Few more words about unsupervised estimation of
More informationHomework 3 COMS 4705 Fall 2017 Prof. Kathleen McKeown
Homework 3 COMS 4705 Fall 017 Prof. Kathleen McKeown The assignment consists of a programming part and a written part. For the programming part, make sure you have set up the development environment as
More informationMachine learning: lecture 20. Tommi S. Jaakkola MIT CSAIL
Machine learning: lecture 20 ommi. Jaakkola MI CAI tommi@csail.mit.edu opics Representation and graphical models examples Bayesian networks examples, specification graphs and independence associated distribution
More information(2pts) What is the object being embedded (i.e. a vector representing this object is computed) when one uses
Contents (75pts) COS495 Midterm 1 (15pts) Short answers........................... 1 (5pts) Unequal loss............................. 2 (15pts) About LSTMs........................... 3 (25pts) Modular
More informationSlide credit from Hung-Yi Lee & Richard Socher
Slide credit from Hung-Yi Lee & Richard Socher 1 Review Recurrent Neural Network 2 Recurrent Neural Network Idea: condition the neural network on all previous words and tie the weights at each time step
More informationA Bayesian Model of Diachronic Meaning Change
A Bayesian Model of Diachronic Meaning Change Lea Frermann and Mirella Lapata Institute for Language, Cognition, and Computation School of Informatics The University of Edinburgh lea@frermann.de www.frermann.de
More informationMaking Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation
Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation Dr. Yanjun Qi Department of Computer Science University of Virginia Tutorial @ ACM BCB-2018 8/29/18 Yanjun Qi / UVA
More informationMemory-Augmented Attention Model for Scene Text Recognition
Memory-Augmented Attention Model for Scene Text Recognition Cong Wang 1,2, Fei Yin 1,2, Cheng-Lin Liu 1,2,3 1 National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy of Sciences
More informationDay-ahead time series forecasting: application to capacity planning
Day-ahead time series forecasting: application to capacity planning Colin Leverger 1,2, Vincent Lemaire 1 Simon Malinowski 2, Thomas Guyet 3, Laurence Rozé 4 1 Orange Labs, 35000 Rennes France 2 Univ.
More informationAsaf Bar Zvi Adi Hayat. Semantic Segmentation
Asaf Bar Zvi Adi Hayat Semantic Segmentation Today s Topics Fully Convolutional Networks (FCN) (CVPR 2015) Conditional Random Fields as Recurrent Neural Networks (ICCV 2015) Gaussian Conditional random
More informationRecurrent and Recursive Networks
Neural Networks with Applications to Vision and Language Recurrent and Recursive Networks Marco Kuhlmann Introduction Applications of sequence modelling Map unsegmented connected handwriting to strings.
More informationNatural Language Processing
Natural Language Processing Global linear models Based on slides from Michael Collins Globally-normalized models Why do we decompose to a sequence of decisions? Can we directly estimate the probability
More informationYNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model
YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional -CRF Model Quanlei Liao, Jin Wang, Jinnan Yang and Xuejie Zhang School of Information Science and Engineering
More informationCaesar s Taxi Prediction Services
1 Caesar s Taxi Prediction Services Predicting NYC Taxi Fares, Trip Distance, and Activity Paul Jolly, Boxiao Pan, Varun Nambiar Abstract In this paper, we propose three models each predicting either taxi
More informationNatural Language Processing
Natural Language Processing Pushpak Bhattacharyya CSE Dept, IIT Patna and Bombay LSTM 15 jun, 2017 lgsoft:nlp:lstm:pushpak 1 Recap 15 jun, 2017 lgsoft:nlp:lstm:pushpak 2 Feedforward Network and Backpropagation
More informationGeometric View of Machine Learning Nearest Neighbor Classification. Slides adapted from Prof. Carpuat
Geometric View of Machine Learning Nearest Neighbor Classification Slides adapted from Prof. Carpuat What we know so far Decision Trees What is a decision tree, and how to induce it from data Fundamental
More informationLearning Recurrent Neural Networks with Hessian-Free Optimization: Supplementary Materials
Learning Recurrent Neural Networks with Hessian-Free Optimization: Supplementary Materials Contents 1 Pseudo-code for the damped Gauss-Newton vector product 2 2 Details of the pathological synthetic problems
More informationHelping the Red Cross predict flooding in Togo
Helping the Red Cross predict flooding in Togo Abstract- This project aims to help the Red Cross predict flooding occurrences in Togo due to overflow in the Nangbeto Dam. Flooding is a result of both high
More informationOverview Today: From one-layer to multi layer neural networks! Backprop (last bit of heavy math) Different descriptions and viewpoints of backprop
Overview Today: From one-layer to multi layer neural networks! Backprop (last bit of heavy math) Different descriptions and viewpoints of backprop Project Tips Announcement: Hint for PSet1: Understand
More informationFeature selection. Micha Elsner. January 29, 2014
Feature selection Micha Elsner January 29, 2014 2 Using megam as max-ent learner Hal Daume III from UMD wrote a max-ent learner Pretty typical of many classifiers out there... Step one: create a text file
More informationPredicting flight on-time performance
1 Predicting flight on-time performance Arjun Mathur, Aaron Nagao, Kenny Ng I. INTRODUCTION Time is money, and delayed flights are a frequent cause of frustration for both travellers and airline companies.
More informationAttention Based Joint Model with Negative Sampling for New Slot Values Recognition. By: Mulan Hou
Attention Based Joint Model with Negative Sampling for New Slot Values Recognition By: Mulan Hou houmulan@bupt.edu.cn CONTE NTS 1 2 3 4 5 6 Introduction Related work Motivation Proposed model Experiments
More informationNatural Language Processing CS Lecture 06. Razvan C. Bunescu School of Electrical Engineering and Computer Science
Natural Language Processing CS 6840 Lecture 06 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Statistical Parsing Define a probabilistic model of syntax P(T S):
More informationNeural Networks 2. 2 Receptive fields and dealing with image inputs
CS 446 Machine Learning Fall 2016 Oct 04, 2016 Neural Networks 2 Professor: Dan Roth Scribe: C. Cheng, C. Cervantes Overview Convolutional Neural Networks Recurrent Neural Networks 1 Introduction There
More informationUniversità di Pisa A.A Data Mining II June 13th, < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D} > t=0 t=1 t=2 t=3 t=4 t=5 t=6 t=7
Università di Pisa A.A. 2016-2017 Data Mining II June 13th, 2017 Exercise 1 - Sequential patterns (6 points) a) (3 points) Given the following input sequence < {A} {B,F} {E} {A,B} {A,C,D} {F} {B,E} {C,D}
More informationTasks ADAS. Self Driving. Non-machine Learning. Traditional MLP. Machine-Learning based method. Supervised CNN. Methods. Deep-Learning based
UNDERSTANDING CNN ADAS Tasks Self Driving Localizati on Perception Planning/ Control Driver state Vehicle Diagnosis Smart factory Methods Traditional Deep-Learning based Non-machine Learning Machine-Learning
More informationCS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes
CS 6501: Deep Learning for Computer Graphics Basics of Neural Networks Connelly Barnes Overview Simple neural networks Perceptron Feedforward neural networks Multilayer perceptron and properties Autoencoders
More informationPresented By: Omer Shmueli and Sivan Niv
Deep Speaker: an End-to-End Neural Speaker Embedding System Chao Li, Xiaokong Ma, Bing Jiang, Xiangang Li, Xuewei Zhang, Xiao Liu, Ying Cao, Ajay Kannan, Zhenyao Zhu Presented By: Omer Shmueli and Sivan
More informationUNSUPERVISED LEARNING
UNSUPERVISED LEARNING Topics Layer-wise (unsupervised) pre-training Restricted Boltzmann Machines Auto-encoders LAYER-WISE (UNSUPERVISED) PRE-TRAINING Breakthrough in 2006 Layer-wise (unsupervised) pre-training
More information