Slide credit from Hung-Yi Lee & Richard Socher
- Annabel Gibson
- 5 years ago
Transcription
Review: Recurrent Neural Network
Recurrent Neural Network
- Idea: condition the neural network on all previous words and tie the weights at each time step
- Assumption: temporal information matters
RNN Language Modeling
- [Figure: an RNN unrolled over time. Input layer: vector of START, vector of "wreck", vector of "a", vector of "nice". Hidden layer: context vector. Output layer: word probability distributions P(next word is "wreck"), P(next word is "a"), P(next word is "nice"), P(next word is "beach").]
- Idea: pass the information from the previous hidden layer to leverage all contexts
RNNLM Formulation
- At each time step, the input is the vector of the current word and the output is the probability of the next word
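One common way to write this formulation is sketched below. It is an assumption, not the slide's own equations: the symbols $x_t$, $s_t$ and $o_t$ follow the BPTT slides, while the weight names $U$, $W$, $V$ and the word index $w_t$ are introduced only for illustration.

```latex
\begin{aligned}
s_t &= \sigma(W s_{t-1} + U x_t) \\
o_t &= \mathrm{softmax}(V s_t) \\
P(w_{t+1} = j \mid w_1, \dots, w_t) &= o_{t,j}
\end{aligned}
```

Here $x_t$ is the vector of the current word, $s_t$ is the hidden state that carries the context, and $o_t$ is the probability distribution over the next word.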
Recurrent Neural Network Definition
- Activation function: tanh, ReLU
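A minimal sketch of one recurrent step may help make the definition concrete. It assumes a standard formulation (new state from the previous state and the current input, softmax output); the names `rnn_step`, `U`, `W`, `V` and the toy sizes are hypothetical and not taken from the slides.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def relu(v):
    """Rectified linear unit, the alternative activation named on the slide."""
    return max(0.0, v)

def rnn_step(x, s_prev, U, W, V, act=math.tanh):
    """One time step: return the new state s_t and the distribution o_t."""
    pre = [a + b for a, b in zip(matvec(W, s_prev), matvec(U, x))]
    s = [act(p) for p in pre]
    o = softmax(matvec(V, s))
    return s, o

# Toy sizes (assumed): 3 one-hot words, 2 hidden units.
U = [[0.5, -0.3, 0.1], [0.2, 0.4, -0.6]]    # input -> hidden
W = [[0.1, 0.2], [-0.2, 0.3]]               # hidden -> hidden, shared by all steps
V = [[0.3, -0.1], [0.0, 0.5], [-0.4, 0.2]]  # hidden -> output scores

state = [0.0, 0.0]  # plays the role of "init"
for word in ([1, 0, 0], [0, 1, 0], [0, 0, 1]):
    state, probs = rnn_step(word, state, U, W, V)
    print([round(p, 3) for p in probs])
```

The same `U`, `W` and `V` are reused at every step, which is what tying the weights means. Passing `act=relu` swaps the activation.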
Model Training
- All model parameters can be updated by gradient descent, comparing the predicted output with the target ($y_{t-1}$, $y_t$, $y_{t+1}$) at each time step
Outline
- Language Modeling
  - N-gram Language Model
  - Feed-Forward Neural Language Model
  - Recurrent Neural Network Language Model (RNNLM)
- Recurrent Neural Network
  - Definition
  - Training via Backpropagation through Time (BPTT)
  - Training Issue
- Applications
  - Sequential Input
  - Sequential Output
    - Aligned Sequential Pairs (Tagging)
    - Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder)
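Backpropagation
- $\frac{\partial C}{\partial w_{ij}^{l}} = \frac{\partial z_i^{l}}{\partial w_{ij}^{l}} \cdot \frac{\partial C}{\partial z_i^{l}}$, where $w_{ij}^{l}$ connects unit $j$ of layer $l-1$ to unit $i$ of layer $l$
- Forward pass: $\frac{\partial z_i^{l}}{\partial w_{ij}^{l}} = a_j^{l-1}$ for $l > 1$, and $x_j$ for $l = 1$
- Backward pass: error signal $\delta_i^{l} = \frac{\partial C}{\partial z_i^{l}}$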
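Backpropagation
- Backward pass: error signal $\delta_i^{l}$
- [Figure: the backward pass drawn as a network run in reverse. Starting from $\partial C / \partial y_1, \dots, \partial C / \partial y_n$ at layer $L$, the error signals pass through $(W^{L})^{T}$ to layer $L-1$ and on through $(W^{l+1})^{T}$ to layer $l$, giving $\delta^{l}$.]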
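The backward pass is usually written as the recursion below. This is the standard textbook form, given here as an assumption about what the figure shows; $\sigma'$ (the derivative of the activation) and $\odot$ (the elementwise product) are notation introduced for illustration.

```latex
\begin{aligned}
\delta^{L} &= \sigma'(z^{L}) \odot \nabla_{y} C \\
\delta^{l} &= \sigma'(z^{l}) \odot (W^{l+1})^{T} \delta^{l+1} \\
\frac{\partial C}{\partial w_{ij}^{l}} &= a_j^{l-1} \, \delta_i^{l}
\end{aligned}
```

The last line combines the forward-pass term $a_j^{l-1}$ with the backward-pass error signal $\delta_i^{l}$.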
Backpropagation through Time (BPTT)
- Unfold the network through time
- Input: init, $x_1, x_2, \dots, x_t$
- Output: $o_t$
- Target: $y_t$
- [Figure: the unfolded network init -> $s_1$ -> ... -> $s_{t-2}$ -> $s_{t-1}$ -> $s_t$ -> $o_t$, with input $x_k$ feeding state $s_k$ at each step. The cost $C(y)$ compares $o_t$ with $y_t$, and the error signal is propagated backward from $s_t$ through $s_{t-1}$, $s_{t-2}$, ..., $s_1$ to init.]
- Weights are tied together: the copies of a weight at different time steps are pointers to the same memory
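BPTT
- Forward pass: compute $s_1, s_2, s_3, s_4$
- Backward pass: for $C^{(4)}$, for $C^{(3)}$, for $C^{(2)}$, for $C^{(1)}$
- [Figure: init -> $s_1$ -> $s_2$ -> $s_3$ -> $s_4$ with inputs $x_1, \dots, x_4$, outputs $o_1, \dots, o_4$, targets $y_1, \dots, y_4$ and costs $C^{(1)}, \dots, C^{(4)}$.]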
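The forward and backward passes can be sketched for a scalar RNN as follows. Several choices here are assumptions made to keep the example short, not details from the slides: a tanh state, a linear output $o_t = v s_t$, a squared-error cost per step, and the names `forward`, `cost` and `bptt`. The sketch accumulates the gradients of all $C^{(t)}$ in a single backward sweep, which gives the same total as running one backward pass per cost.

```python
import math

def forward(params, xs, init=0.0):
    """Forward pass: return states [init, s_1, ..., s_T] and outputs [o_1, ..., o_T]."""
    u, w, v = params
    states, outputs = [init], []
    for x in xs:
        states.append(math.tanh(w * states[-1] + u * x))
        outputs.append(v * states[-1])
    return states, outputs

def cost(params, xs, ys):
    """Total cost: sum over time of the squared error C^(t)."""
    _, outputs = forward(params, xs)
    return sum(0.5 * (o - y) ** 2 for o, y in zip(outputs, ys))

def bptt(params, xs, ys):
    """Backward pass: gradients of the total cost with respect to (u, w, v)."""
    u, w, v = params
    states, outputs = forward(params, xs)
    du = dw = dv = 0.0
    carry = 0.0  # error signal arriving at s_t from time step t+1
    for t in range(len(xs), 0, -1):
        s, s_prev = states[t], states[t - 1]
        do = outputs[t - 1] - ys[t - 1]  # dC^(t)/do_t
        dv += do * s
        ds = do * v + carry              # from the output and from step t+1
        dz = ds * (1.0 - s * s)          # back through tanh
        du += dz * xs[t - 1]             # the tied weights accumulate over time
        dw += dz * s_prev
        carry = dz * w                   # error signal sent back to s_{t-1}
    return du, dw, dv

# Compare against central finite differences on made-up data.
params = (0.7, -0.4, 1.2)
xs, ys = [1.0, -0.5, 0.3, 0.8], [0.2, 0.1, -0.3, 0.5]
grads = bptt(params, xs, ys)
eps = 1e-6
for i, g in enumerate(grads):
    up, down = list(params), list(params)
    up[i] += eps
    down[i] -= eps
    numeric = (cost(up, xs, ys) - cost(down, xs, ys)) / (2 * eps)
    print(round(g, 6), round(numeric, 6))
```

Because the weights are tied, each of `du`, `dw` and `dv` is a sum of contributions from every time step.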
RNN Training Issue
- The gradient is a product of Jacobian matrices, each associated with a step in the forward computation
- The same matrix is multiplied at each time step during backprop
- The gradient becomes very small or very large quickly: the vanishing or exploding gradient problem
- Bengio et al., Learning long-term dependencies with gradient descent is difficult, IEEE Trans. of Neural Networks
- Pascanu et al., On the difficulty of training recurrent neural networks, in ICML
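A small numerical sketch of the repeated multiplication is given below. It assumes a linear recurrence, so that the Jacobian at every step is exactly the recurrent matrix, and it picks that matrix as a scaled rotation so the effect of the scale is easy to read. The function name `backprop_norms` and all values are hypothetical.

```python
import math

def backprop_norms(scale, steps, angle=0.3):
    """Norm of the error signal after each backward step through a linear RNN
    whose recurrent matrix W is `scale` times a 2x2 rotation."""
    c, s = scale * math.cos(angle), scale * math.sin(angle)
    W = [[c, -s], [s, c]]
    g = [1.0, 0.0]  # error signal at the last time step
    norms = []
    for _ in range(steps):
        # Multiply by W^T, the transposed Jacobian of s_t with respect to s_{t-1}.
        g = [W[0][0] * g[0] + W[1][0] * g[1],
             W[0][1] * g[0] + W[1][1] * g[1]]
        norms.append(math.hypot(g[0], g[1]))
    return norms

shrinking = backprop_norms(0.9, 50)
growing = backprop_norms(1.1, 50)
print(shrinking[-1], growing[-1])
```

With a scale below 1 the norm shrinks geometrically, and with a scale above 1 it grows geometrically. With a tanh activation the Jacobian also includes the factor $1 - s_t^2 \le 1$, which pushes further toward vanishing.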
Rough Error Surface
- [Figure: cost plotted over the weights $w_1$ and $w_2$.]
- The error surface is either very flat or very steep
- Bengio et al., Learning long-term dependencies with gradient descent is difficult, IEEE Trans. of Neural Networks
- Pascanu et al., On the difficulty of training recurrent neural networks, in ICML
Possible Solutions: Recurrent Neural Network
Exploding Gradient: Clipping
- Idea: control the gradient value to avoid exploding
- Parameter setting: clipping thresholds from half to ten times the average gradient norm can still yield convergence
- [Figure: cost surface over $w_1$ and $w_2$, showing the clipped gradient.]
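One common variant, clipping by the gradient norm, can be sketched as follows. The slide does not say which variant it means, so treating it as norm clipping is an assumption, and the function name `clip_by_norm` and the threshold value are hypothetical.

```python
import math

def clip_by_norm(grad, threshold):
    """Rescale grad so that its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)  # small gradients pass through unchanged
    scale = threshold / norm
    return [g * scale for g in grad]

print(clip_by_norm([3.0, 4.0], 1.0))  # norm 5 is rescaled to norm 1
print(clip_by_norm([0.3, 0.4], 1.0))  # norm 0.5 is left as it is
```

Clipping keeps the direction of the gradient and limits only its length, so one step on a steep part of the error surface cannot throw the parameters far away.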
Vanishing Gradient: Initialization + ReLU
- IRNN
  - initialize all W as identity matrix I
  - use ReLU for activation functions
- Le et al., A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, arXiv
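The effect of this initialization can be sketched with a toy recurrence. The zero bias, the zeroed input weights and the names `irnn_step` and `identity` are assumptions chosen to isolate the recurrent part; they are not taken from the slide.

```python
def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)

def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def irnn_step(x, s_prev, U, W, b):
    """One step of s_t = ReLU(W s_{t-1} + U x_t + b)."""
    out = []
    for i in range(len(s_prev)):
        rec = sum(W[i][j] * s_prev[j] for j in range(len(s_prev)))
        inp = sum(U[i][k] * x[k] for k in range(len(x)))
        out.append(relu(rec + inp + b[i]))
    return out

n = 2
W = identity(n)       # recurrent weights start as the identity
U = [[0.0], [0.0]]    # input weights zeroed here to isolate the recurrence
b = [0.0] * n
state = [0.5, 2.0]
for _ in range(100):
    state = irnn_step([0.0], state, U, W, b)
print(state)
```

With the identity matrix and ReLU, a non-negative state is copied unchanged from step to step when there is no input, so the Jacobian of each step is also the identity and the error signal neither shrinks nor grows at initialization.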
Vanishing Gradient: Gating Mechanism
- RNN models temporal sequence information and can handle long-term dependencies in theory
- Example: "I grew up in France ... I speak fluent French."
- Issue: RNN cannot handle such long-term dependencies in practice due to vanishing gradient
- Apply the gating mechanism to directly encode the long-distance information
Extension: Recurrent Neural Network
Bidirectional RNN
- $h = [\overrightarrow{h}; \overleftarrow{h}]$ represents (summarizes) the past and future around a single token
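A minimal sketch of the concatenation is shown below, using a scalar tanh RNN in each direction. The weights and the names `run_rnn` and `bidirectional` are hypothetical, and real models use vector states rather than scalars.

```python
import math

def run_rnn(xs, w_in, w_rec):
    """Scalar tanh RNN: return the hidden state after each input."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_rec * h + w_in * x)
        states.append(h)
    return states

def bidirectional(xs, fwd=(0.8, 0.5), bwd=(-0.6, 0.4)):
    """Pair the left-to-right state with the right-to-left state per token."""
    forward = run_rnn(xs, *fwd)
    # Run on the reversed sequence, then flip back to align with the tokens.
    backward = run_rnn(xs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))

for pair in bidirectional([1.0, 0.0, -1.0]):
    print(pair)
```

For each token, the first component summarizes that token and everything before it, and the second component summarizes that token and everything after it.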
Deep Bidirectional RNN
- Each memory layer passes an intermediate representation to the next
Concluding Remarks
- Recurrent Neural Networks
  - Definition
  - Issue: Vanishing/Exploding Gradient
  - Solution:
    - Exploding Gradient: Clipping
    - Vanishing Gradient: Initialization, ReLU, Gated RNNs
- Extension
  - Bidirectional
  - Deep RNN