Attention Based Joint Model with Negative Sampling for New Slot Values Recognition. By: Mulan Hou
1 Attention Based Joint Model with Negative Sampling for New Slot Values Recognition By: Mulan Hou
2 Contents: Introduction, Related work, Motivation, Proposed model, Experiments, Conclusion
3 Chapter 1: Introduction
4 Introduction. Dialogue system components: User input -> Natural Language Understanding -> Dialogue Manager -> Natural Language Generation -> System output
6 Introduction. User input -> Natural Language Understanding -> Dialogue Manager
7 Introduction. User input -> Natural Language Understanding -> Slot-value -> Dialogue Manager
8 Introduction. User input -> Natural Language Understanding -> Slot-value -> Dialogue Manager. E.g., for the user input "Can I have a video chat on my phone?" the NLU outputs Function = FaceTime (Slot = Value), which is passed to the DM.
9 Introduction. Natural Language Understanding -> NEW Slot-value -> Dialogue Manager. The Dialogue Manager queries a database that lists the standard values of each SLOT (Standard Value0, Standard Value1).
10 Introduction. E.g., for "Can I scroll the screen to have a screenshot on my phone?" the NLU outputs Function = Smart Screenshot (Slot = Value). This value is not in the predefined set of FUNCTION values in the database (FaceTime, WeChat).
11 Introduction. Sequence labeling works on raw text in utterances; classification outputs standard slot values. Both have problems recognizing new slot values because of the lack of training data. Proposed: an attention based joint model with negative sampling, whose NLU output separates new values from standard values before the Dialogue Manager queries the database (SLOT: Standard Value0, Standard Value1).
12 Chapter 2: Related work (sequence labeling, pipeline based, classification based)
13 Related work.
Sequence labeling: each word w_t of the utterance receives a label l_t.
    Can I have a video chat on my phone ?
    O   O B    I I     I    O  O  O     O
Pipeline based methods: (1) extract the raw text from the utterance ("have a video chat"), then (2) normalize the text into a standard slot value (FaceTime).
Classification based methods: map the utterance "Can I have a video chat on my phone?" directly to a standard slot value (FaceTime).
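The first pipeline step on this slide, recovering the raw value text from a label sequence, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `extract_spans` is hypothetical, and the only input is the slide's own example with its B/I/O labels.

```python
from typing import List, Tuple


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, text) for every span tagged B followed by I tags."""
    spans = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        if tag.startswith("B"):
            if start is not None:  # a new B closes the previous span
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag.startswith("I"):
            if start is None:  # tolerate an I tag that has no preceding B
                start = i
        else:  # an O tag closes any open span
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None
    if start is not None:  # span that runs to the end of the utterance
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


tokens = "Can I have a video chat on my phone ?".split()
tags = "O O B I I I O O O O".split()
spans = extract_spans(tokens, tags)
print(spans)
```

The extracted text ("have a video chat") is still a raw surface form; the pipeline needs a second step to turn it into the standard value FaceTime.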
14 Related work. Sequence labeling: needs extra normalization operations. Pipeline based methods: prone to accumulating errors. Classification based methods: unable to deal with new slot values, and lose local information.
15 Related work: references.
Sequence labeling:
- Xuezhe Ma and Eduard Hovy. End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. 2016.
- Gokhan Tur, Dilek Hakkani-Tur et al. Sentence simplification for spoken language understanding. 2011.
- Kaisheng Yao, Baolin Peng et al. Recurrent neural networks for language understanding.
- Kaisheng Yao, Baolin Peng et al. Spoken language understanding using long short-term memory neural networks.
Pipeline based methods:
- F. Lefèvre. Dynamic Bayesian networks and discriminative classifiers for multi-stage semantic interpretation. 2007.
- Peter Z. Yeh, Benjamin Douglas et al. A speech-driven second screen application for TV program discovery.
Classification based methods:
- Rahul Bhagat, Anton Leuski, and Eduard Hovy. Statistical shallow semantic parsing despite little training data.
- François Mairesse, Milica Gasic et al. Spoken language understanding from unaligned data using discriminative classification models.
- Ana Mendes Pedro Mota, Lu Natural language understanding as a classification process: report of initial experiments and results. 2012.
16 Chapter 3: Motivation
17 Motivation. Sequence labeling: takes local information into consideration. Classification based methods: obtain normalized slot values directly. Combining the two (+): an attention based joint model with negative sampling, able to deal with new slot values.
18 Chapter 4: Proposed model (attention based joint model, negative sampling)
19-20 Proposed model: Attention based joint model. utterance -> Model -> SLOT label, one of UNK, NULL, Standard Value0, Standard Value1, Standard Value2.
21 Proposed model: Attention based joint model. Components of the model between the utterance and the SLOT label (UNK, NULL, Standard Value0, Standard Value1, Standard Value2): Embedding layer, Bidirectional layer, Sequence tagger, Attention layer, Classifier.
22-23 Proposed model: Attention based joint model. [Figure: model architecture. Embedding layer over the words w_{t-1}, w_t, w_{t+1}; Bidirectional layer with hidden states h_{t-1}, h_t, h_{t+1} and final state h_T; Sequence tagger with outputs s_{t-1}, s_t, s_{t+1}; Attention layer with vectors e_t, v_t and q_t; Classifier with weights W and output y. Legend: softmax function, align function, concatenate.]
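The figure's legend names three operations: concatenate, an align function, and a softmax function. The sketch below shows how these three typically combine in attention pooling. It is not the slide's exact wiring, which the transcript does not preserve: the dot-product align function against h_T, the choice of which vectors are concatenated, and the function name `attention_pool` are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def attention_pool(e: List[List[float]], h: List[List[float]],
                   h_T: List[float]) -> Tuple[List[float], List[float]]:
    """Assumed wiring: v_t = concat(e_t, h_t); score_t = dot(h_t, h_T)."""
    v = [e_t + h_t for e_t, h_t in zip(e, h)]            # concatenate
    scores = [sum(a * b for a, b in zip(h_t, h_T))       # align function (assumed dot product)
              for h_t in h]
    alpha = softmax(scores)                              # softmax function
    dim = len(v[0])
    context = [sum(alpha[t] * v[t][d] for t in range(len(v))) for d in range(dim)]
    return alpha, context


# Toy vectors, invented for this example only.
e = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
h = [[0.1, 0.2], [0.9, 0.8], [0.0, 0.1]]
h_T = [1.0, 1.0]
alpha, context = attention_pool(e, h, h_T)
print(alpha, context)
```

The weights `alpha` sum to one, so the pooled vector is a weighted average of the per-step vectors that a classifier could consume.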
24 Proposed model: Attention based joint model. Training objective, shown next to the architecture figure (Sequence tagger outputs s_t, Classifier output y):
L_{tagging} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{T_i} \sum_{t=1}^{T_i} L(s_t^i, \hat{s}_t^i)
L_{classification} = \frac{1}{N} \sum_{i=1}^{N} L(y^i, \hat{y}^i)
L = \gamma L_{tagging} + (1 - \gamma) L_{classification}
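The objective on this slide averages a per-token tagging loss within each utterance, averages over utterances, and mixes the result with a per-utterance classification loss through a single weight. A minimal sketch of that arithmetic follows. The slide does not say which per-item loss L is used; cross-entropy is assumed here, and the names `joint_loss` and `gamma` are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def cross_entropy(probs: Sequence[float], gold: int) -> float:
    """Per-item loss; cross-entropy is an assumption, the slide only writes L."""
    return -math.log(probs[gold])


def joint_loss(tag_probs: List[List[Sequence[float]]], tag_gold: List[List[int]],
               cls_probs: List[Sequence[float]], cls_gold: List[int],
               gamma: float = 0.5) -> Tuple[float, float, float]:
    """Return (combined loss, tagging loss, classification loss)."""
    n = len(tag_gold)  # number of utterances N
    l_tag = 0.0
    for probs_i, gold_i in zip(tag_probs, tag_gold):
        t_i = len(gold_i)  # number of tokens T_i in utterance i
        l_tag += sum(cross_entropy(p, g) for p, g in zip(probs_i, gold_i)) / t_i
    l_tag /= n
    l_cls = sum(cross_entropy(p, g) for p, g in zip(cls_probs, cls_gold)) / n
    return gamma * l_tag + (1.0 - gamma) * l_cls, l_tag, l_cls


# One toy utterance with two tokens; all probabilities are invented.
tag_probs = [[[0.5, 0.5], [0.5, 0.5]]]
tag_gold = [[0, 0]]
cls_probs = [[0.25, 0.75]]
cls_gold = [1]
total, l_tag, l_cls = joint_loss(tag_probs, tag_gold, cls_probs, cls_gold, gamma=0.5)
print(total, l_tag, l_cls)
```

Setting the weight to 1 trains the tagger alone and setting it to 0 trains the classifier alone; values in between trade the two tasks off.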
25-26 Proposed model: Negative sampling.
Models will fail to recognize new slot values without corresponding training data.
Construct negative samples of the existing slot values to simulate new ones.
Existing slot value: "Can I have a video chat on my phone?" Slot = FaceTime
New slot value: "Can I scroll the screen to have a screenshot on my phone?" Slot = Smart Screenshot
Shared context: "Can I ... on my phone?"
Negative sampling: keep the shared context and replace the value with random words (a non-value): "Can I [random words] on my phone?"
27 Proposed model: Negative sampling.
Words distribution: U(word) = \frac{count(word)}{|Data|}
Old values serve as templates, e.g. "Can I have a video chat on my phone?" and "Chinese food".
Sample from the vocabulary according to the words distribution to fill the templates.
Original: "Can I have a video chat on my phone?", tags O O B-func I-func I-func I-func O O O O, label FaceTime.
Negative sample: the sampled words "You use battery let" give "Can I You use battery let on my phone?", tags O O B-func I-func I-func I-func O O O O, label UNK.
A second negative sample (utterance text missing) has tags O O O O O O B-func I-func I-func I-func I-func I-func I-func O O and label UNK.
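The template-filling procedure on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: |Data| is taken to be the total token count, every value token is replaced by one sampled word so the tag sequence keeps its length, and the helper names `word_distribution` and `make_negative_sample` are hypothetical. The tiny corpus is invented for the example.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def word_distribution(corpus: List[List[str]]) -> Dict[str, float]:
    """U(word) = count(word) / total number of tokens (assumed reading of |Data|)."""
    counts = Counter(w for sentence in corpus for w in sentence)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def make_negative_sample(tokens: List[str], tags: List[str], dist: Dict[str, float],
                         rng: random.Random) -> Tuple[List[str], List[str], str]:
    """Keep the context (tag O), resample every value token, relabel as UNK."""
    words = list(dist)
    weights = [dist[w] for w in words]
    new_tokens = [
        tok if tag == "O" else rng.choices(words, weights=weights, k=1)[0]
        for tok, tag in zip(tokens, tags)
    ]
    return new_tokens, list(tags), "UNK"


corpus = [
    "Can I have a video chat on my phone ?".split(),
    "You can use the battery saver".split(),
]
dist = word_distribution(corpus)
tokens = corpus[0]
tags = "O O B-func I-func I-func I-func O O O O".split()
neg_tokens, neg_tags, neg_label = make_negative_sample(tokens, tags, dist, random.Random(0))
print(" ".join(neg_tokens), neg_tags, neg_label)
```

A sampled filler can by chance reproduce a real value; a fuller version would probably reject such draws before labelling the sample UNK.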
28 Chapter 5: Experiments (results, analyses)
29 Experiments: Results. Datasets: DSTC (English) and Service (Chinese).
DSTC: an English dataset from a public contest; we use DSTC2 and DSTC3 together. It collects 5510 dialogues about hotel and restaurant booking. Only the slot food ...
Service: a Chinese dialogue dataset, mainly about consultation for cell phones, which contains a single slot named ...
[Table: Statistics of the two datasets. Columns: DSTC (train, dev, test), Service (train, dev, test). Rows: original data (old, new, null), negative samples, overall size. Cell values are missing.]
[Table: Statistics of value types. Columns: DSTC (train, dev, test), Service (train, dev, test). Rows: old, new. Cell values are missing.]
30 Experiments: Results. Baselines: _FM and _C, without negative samples.
_FM: pipeline based method. The model labels the words with slot value tags, and the tagged text is then normalized into standard values by Fuzzy Matching.
_C: classification based method. The model encodes the utterance, and a fully connected layer then serves as the classifier.
[Figure: _FM (tags s_t predicted from hidden states h_t, then Fuzzy Matching yields y) and _C (hidden states h_t, then weights W yield y).]
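The normalization step of the pipeline baseline can be sketched with the standard library. The slide does not say which fuzzy matching algorithm was used; `difflib.get_close_matches` is swapped in here as a stand-in, and the alias table, the cutoff of 0.6, and the function name `normalize` are all hypothetical.

```python
import difflib
from typing import Optional

# Hypothetical surface forms mapped to standard slot values.
ALIASES = {
    "video chat": "FaceTime",
    "facetime": "FaceTime",
    "wechat": "WeChat",
}


def normalize(raw_text: str, cutoff: float = 0.6) -> Optional[str]:
    """Map extracted raw text to a standard value, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw_text.lower(), list(ALIASES), n=1, cutoff=cutoff)
    return ALIASES[matches[0]] if matches else None


print(normalize("have a video chat"))
print(normalize("zzzz qqqq"))
```

Because the tagger and the matcher run in sequence, a tagging error changes the text handed to `normalize`, which is one way the accumulated errors of pipeline methods arise.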
31-34 Experiments: Results.
[Table (a): F1 scores of classification for different models. Columns: DSTC (all, NEW, OLD, NULL), Service (all, NEW, OLD, NULL). Rows: _FM, _C, AJM_NS (ours). Cell values are missing.]
[Table (b): F1 scores of sequence labeling. Columns: DSTC (all, NEW, OLD), Service (all, NEW, OLD). Rows: _FM, AJM_NS (ours). Cell values are missing.]
35 Experiments: Results. Comparison inside the model: attention mechanism (AJM) and negative sampling (JM_NS). [Table: columns DSTC (all, NEW, OLD, NULL), Service (all, NEW, OLD, NULL); rows Full (AJM_NS), -Attention only (JM_NS), -NS only (AJM). Cell values are missing.]
36 Experiments: Results. Comparison inside the model: Full (AJM_NS) versus -NS only (AJM). [Tables: confusion matrices of AJM and AJM_NS on DSTC and on Service, with rows and columns NEW, OLD, NULL. Cell values are missing.]
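A confusion matrix over the three value types, and the per-type F1 scores that the result tables report, can be computed as in the sketch below. The gold and predicted labels are invented toy data, since the slide's cell values are not available, and the helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

TYPES = ["NEW", "OLD", "NULL"]


def confusion_matrix(gold: List[str], pred: List[str]) -> List[List[int]]:
    """Rows are gold types, columns are predicted types, both in TYPES order."""
    counts = Counter(zip(gold, pred))
    return [[counts[(g, p)] for p in TYPES] for g in TYPES]


def f1_per_type(matrix: List[List[int]]) -> Dict[str, float]:
    """F1 = 2*TP / (2*TP + FP + FN) for each type."""
    scores = {}
    for k, name in enumerate(TYPES):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                   # gold k, predicted something else
        fp = sum(row[k] for row in matrix) - tp    # predicted k, gold something else
        denom = 2 * tp + fp + fn
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


# Toy labels, invented for this example only.
gold = ["NEW", "NEW", "OLD", "OLD", "NULL", "NULL"]
pred = ["NEW", "OLD", "OLD", "OLD", "NULL", "NEW"]
matrix = confusion_matrix(gold, pred)
scores = f1_per_type(matrix)
print(matrix, scores)
```

Reading the NEW row shows where unseen values end up: mass in the OLD column means new values are being forced onto known standard values.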
37 Experiments: Results. Comparison inside the model: Full (AJM_NS) versus -NS only (AJM). [Table: classification results based on negative samples. Columns: DSTC (all, NEW, OLD, NULL), Service (all, NEW, OLD, NULL). Rows: _FM_NS, _C_NS, JM_NS. Cell values are missing.]
38 Experiments: Results. Comparison inside the model: Full (AJM_NS) versus -Attention only (JM_NS). [Figure: attention mechanism, with vectors v_t, e_t and h_t at steps t-1, t and t+1.]
39 Experiments: Results. Comparison inside the model: Full (AJM_NS) versus -Attention (JM_NS). [Figure: attention heatmaps with true and predicted tags on DSTC and Service. DSTC example: "i want an indonesian restaurant in the north part of town", tags O O O B-food O O O O O O O, value indonesian. Service example: value A7, tags made of B-func and I-func; the utterance text is missing.]
40 Chapter 6: Conclusion
41 Conclusion. We propose an attention based joint model with negative sampling. It maps the utterance into standard slot values directly, without extra normalization operations. Negative sampling over the existing values of a certain slot S enables our model to recognize new slot values effectively. The joint model, working together with the attention mechanism, improves performance. Experimental results demonstrate that our model achieves impressive improvements on new slot values with less damage on other sub-datasets.
42 THANK YOU
A Generative Score Space for Statistical Dialog Characterization in Social Signalling 1 S t-1 1 S t 1 S t+4 2 S t-1 2 S t 2 S t+4 Anna Pesarin, Paolo Calanca, Vittorio Murino, Marco Cristani Istituto Italiano
More informationStatistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.
http://goo.gl/xilnmn Course website KYOTO UNIVERSITY Statistical Machine Learning Theory From Multi-class Classification to Structured Output Prediction Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT
More informationComputational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2014 1 HMM Lecture Notes Dannie Durand and Rose Hoberman November 6th Introduction In the last few lectures, we have focused on three problems related
More informationSequence Models. Ji Yang. Department of Computing Science, University of Alberta. February 14, 2018
Sequence Models Ji Yang Department of Computing Science, University of Alberta February 14, 2018 This is a note mainly based on Prof. Andrew Ng s MOOC Sequential Models. I also include materials (equations,
More informationArtificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen
Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition
More information{ Jurafsky & Martin Ch. 6:! 6.6 incl.
N-grams Now Simple (Unsmoothed) N-grams Smoothing { Add-one Smoothing { Backo { Deleted Interpolation Reading: { Jurafsky & Martin Ch. 6:! 6.6 incl. 1 Word-prediction Applications Augmentative Communication
More informationNatural Language Understanding. Kyunghyun Cho, NYU & U. Montreal
Natural Language Understanding Kyunghyun Cho, NYU & U. Montreal 2 Machine Translation NEURAL MACHINE TRANSLATION 3 Topics: Statistical Machine Translation log p(f e) =log p(e f) + log p(f) f = (La, croissance,
More informationJointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction
Jointly Extracting Event Triggers and Arguments by Dependency-Bridge RNN and Tensor-Based Argument Interaction Feng Qian,LeiSha, Baobao Chang, Zhifang Sui Institute of Computational Linguistics, Peking
More informationTALP at GeoQuery 2007: Linguistic and Geographical Analysis for Query Parsing
TALP at GeoQuery 2007: Linguistic and Geographical Analysis for Query Parsing Daniel Ferrés and Horacio Rodríguez TALP Research Center Software Department Universitat Politècnica de Catalunya {dferres,horacio}@lsi.upc.edu
More informationDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing Dylan Drover, Borui Ye, Jie Peng University of Waterloo djdrover@uwaterloo.ca borui.ye@uwaterloo.ca July 8, 2015 Dylan Drover, Borui Ye, Jie Peng (University
More informationChapter 3: Basics of Language Modelling
Chapter 3: Basics of Language Modelling Motivation Language Models are used in Speech Recognition Machine Translation Natural Language Generation Query completion For research and development: need a simple
More informationPersonal Project: Shift-Reduce Dependency Parsing
Personal Project: Shift-Reduce Dependency Parsing 1 Problem Statement The goal of this project is to implement a shift-reduce dependency parser. This entails two subgoals: Inference: We must have a shift-reduce
More informationImproving Sequence-to-Sequence Constituency Parsing
Improving Sequence-to-Sequence Constituency Parsing Lemao Liu, Muhua Zhu and Shuming Shi Tencent AI Lab, Shenzhen, China {redmondliu,muhuazhu, shumingshi}@tencent.com Abstract Sequence-to-sequence constituency
More informationarxiv: v2 [cs.cl] 1 Jan 2019
Variational Self-attention Model for Sentence Representation arxiv:1812.11559v2 [cs.cl] 1 Jan 2019 Qiang Zhang 1, Shangsong Liang 2, Emine Yilmaz 1 1 University College London, London, United Kingdom 2
More informationConditional Random Field
Introduction Linear-Chain General Specific Implementations Conclusions Corso di Elaborazione del Linguaggio Naturale Pisa, May, 2011 Introduction Linear-Chain General Specific Implementations Conclusions
More informationSpoken Language Understanding in a Latent Topic-based Subspace
INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Spoken Language Understanding in a Latent Topic-based Subspace Mohamed Morchid 1, Mohamed Bouaziz 1,3, Waad Ben Kheder 1, Killian Janod 1,2, Pierre-Michel
More informationSound Recognition in Mixtures
Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems
More information