Recurrent Neural Network


1 Recurrent Neural Network The Deepest of All Deep Learning Slides by Chen Liang

2 Deep Learning

3 Deep Learning

4 Deep learning works like the human brain? Demystify Deep Learning

5 Deep Learning: Building Blocks

6 Deep Learning: Deep Composition

7 Deep Learning: Gradient Descent

8 Deep Learning: Demo TensorFlow Playground

9 Deep Learning: Weight Sharing

10 Recurrent Neural Network Deepest of Deep Learning? Can be infinitely deep Equation RNN, LSTM illustrations from Christopher Olah's blog
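The equation on this slide did not survive transcription. The sketch below assumes it was the standard vanilla RNN recurrence, h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h); the names `rnn_forward`, `W_xh`, `W_hh`, and `b_h` are illustrative and not taken from the slides. It shows why an RNN can be arbitrarily deep: the same weights are reused at every time step, so the unrolled depth grows with the sequence length.

```python
import numpy as np

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence, reusing the same weights at every step."""
    h = h0
    hs = []
    for x in xs:  # one unrolled "layer" per time step
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return hs

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 3, 5  # illustrative sizes
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)
xs = [rng.normal(size=input_size) for _ in range(seq_len)]

hs = rnn_forward(xs, np.zeros(hidden_size), W_xh, W_hh, b_h)
print(len(hs), hs[-1].shape)  # 5 (3,)
```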

11 BPTT: Backpropagation Through Time

12 RNN is Turing complete, but... Exploding/vanishing gradient Short term dependency Long term dependency
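A toy illustration of the exploding/vanishing gradient problem, using a linear scalar recurrence h_t = w * h_{t-1} as a simplifying assumption (a real RNN has a weight matrix and a nonlinearity). Backpropagating through T steps multiplies the gradient by w once per step, so it scales like w^T. The function name `gradient_through_time` is hypothetical.

```python
def gradient_through_time(w, steps):
    """Return d h_T / d h_0 for the linear scalar recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule: one factor of w per time step
    return grad

for w in (0.5, 1.0, 1.5):
    print(w, gradient_through_time(w, 50))
```

With |w| < 1 the gradient shrinks toward zero over 50 steps, which makes long-term dependencies hard to learn; with |w| > 1 it grows very large.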

13 LSTM: Long Short-Term Memory

14 LSTM: Long Short-Term Memory Let the gradient flow to earlier steps
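The LSTM diagrams are not part of this transcript. As a rough sketch, one step of the commonly used LSTM formulation (the one illustrated in Christopher Olah's blog) can be written in NumPy as below; the packing of all four gates into one matrix `W` and the names `lstm_step`, `W`, and `b` are assumptions made for brevity. The key line is the additive cell update `c = f * c_prev + i * g`: the gradient of `c` with respect to `c_prev` is just the forget gate `f`, which gives the gradient a more direct path to earlier steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (4 * hidden, hidden + input), b: (4 * hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])  # forget gate
    i = sigmoid(z[1 * hidden:2 * hidden])  # input gate
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell values
    c = f * c_prev + i * g                 # additive update of the cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3  # illustrative sizes
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for _ in range(10):
    h, c = lstm_step(rng.normal(size=input_size), h, c, W, b)
print(h.shape, c.shape)  # (3,) (3,)
```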

15 RNN: A General Framework Image Caption Generation Sentiment Analysis Image Recognition Machine Translation Speech recognition Language Modeling

16 Char-RNN How does it work? Vocabulary: [h, e, l, o] Training sequence: hello
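A small sketch of how the training sequence could be turned into model inputs and targets, assuming the usual character-level language-modeling setup: each character is one-hot encoded over the vocabulary, and the target at each step is the next character. The helper names are illustrative.

```python
vocab = ["h", "e", "l", "o"]
char_to_index = {ch: i for i, ch in enumerate(vocab)}

def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec

text = "hello"
# Inputs are all characters except the last; targets are shifted by one.
inputs = [one_hot(char_to_index[ch], len(vocab)) for ch in text[:-1]]
targets = [char_to_index[ch] for ch in text[1:]]

for ch, x, t in zip(text[:-1], inputs, targets):
    print(ch, x, "->", vocab[t])
# h [1, 0, 0, 0] -> e
# e [0, 1, 0, 0] -> l
# l [0, 0, 1, 0] -> l
# l [0, 0, 1, 0] -> o
```

Note that the input "l" appears twice with different targets ("l", then "o"), so the network has to rely on its hidden state, not just the current character, to predict correctly.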

17 Char-RNN Linux, LaTeX, Wikipedia, Music Check out the blog post: The Unreasonable Effectiveness of Recurrent Neural Networks

18 What does the Neuron do?

19 Seq2seq: sequence-to-sequence learning

20 Sequence is even longer now => Attention Again, let the gradient flow to earlier steps
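The slide does not say which attention variant is meant. As one possible sketch, dot-product attention for a single decoder step is shown below: the decoder state is scored against every encoder state, the scores are normalized with a softmax, and the context vector is the weighted sum of encoder states. The scoring function and the names `attend`, `encoder_states`, and `decoder_state` are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """Return attention weights and the context vector for one decoder step."""
    scores = encoder_states @ decoder_state  # one score per source position
    weights = softmax(scores)                # non-negative, sums to 1
    context = weights @ encoder_states       # weighted sum of encoder states
    return weights, context

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 3))  # 6 source positions, hidden size 3
decoder_state = rng.normal(size=3)
weights, context = attend(decoder_state, encoder_states)
print(weights.round(3), context.shape)
```

Because the context vector connects each decoder step directly to every encoder position, gradients no longer have to travel through the whole recurrent chain to reach early inputs.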

21 More Seq2seq fun: Chatbots

22 More Seq2seq fun: Programmers

23 Summary Deep Learning is like Lego blocks => Compositionality and Backpropagation. Creative ways to combine the blocks => New applications. When you have a problem => Get new blocks (LSTM, attention) to let the gradient flow. Food for thought: is our brain just a bunch of Lego blocks?

24 TensorFlow TensorFlow is an open source software library for numerical computation using data flow graphs.

25 TensorFlow: Computation Graph Import TensorFlow and NumPy

    import tensorflow as tf
    import numpy as np

    # Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
    x_data = np.random.rand(100).astype(np.float32)
    y_data = x_data * 0.1 + 0.3

    # Try to find values for W and b that compute y_data = W * x_data + b
    # (We know that W should be 0.1 and b 0.3, but TensorFlow will
    # figure that out for us.)
    W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
    b = tf.Variable(tf.zeros([1]))
    y = W * x_data + b

    # Minimize the mean squared errors.
    loss = tf.reduce_mean(tf.square(y - y_data))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)

26 TensorFlow: Computation Graph Synthesize some noisy data from a linear model

27 TensorFlow: Computation Graph [Graph diagram: W, x_data, *, b]

28 TensorFlow: Computation Graph [Graph diagram: W, x_data, *, b, +, y_data, Loss, Optimizer]

29 TensorFlow: Session

    # Before starting, initialize the variables. We will 'run' this first.
    # (Later TensorFlow 1.x releases call this tf.global_variables_initializer().)
    init = tf.initialize_all_variables()

    # Launch the graph.
    sess = tf.Session()
    sess.run(init)

    # Fit the line.
    for step in range(201):
        sess.run(train)
        if step % 20 == 0:
            print(step, sess.run(W), sess.run(b))

    # Learns best fit is W: [0.1], b: [0.3]

30 TensorFlow: Session
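To make explicit what the graph and the session loop compute, here is a framework-free NumPy sketch of the same fit. It assumes the same setup as the slide code (mean squared error, plain gradient descent, learning rate 0.5, 201 steps); the gradient formulas are derived by hand rather than by TensorFlow's automatic differentiation, and the random seed is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
x_data = rng.random(100)
y_data = x_data * 0.1 + 0.3

W = rng.uniform(-1.0, 1.0)  # random initial slope
b = 0.0                     # initial intercept
learning_rate = 0.5

for step in range(201):
    error = W * x_data + b - y_data
    # Gradients of mean(error ** 2) with respect to W and b.
    grad_W = 2.0 * np.mean(error * x_data)
    grad_b = 2.0 * np.mean(error)
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b

print(W, b)  # should end up close to 0.1 and 0.3
```

In the TensorFlow version, `optimizer.minimize(loss)` builds the equivalent gradient and update operations into the graph, and each `sess.run(train)` executes one such update.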

31 Tensorboard Demo

32 Tensorboard Demo

33 Tensorboard Demo

34 Now the part that everybody hates...

35 Homework Part 1: Backpropagation and gradient check NumPy Part 2: Char-RNN Undergrad/Grad Descent Gradient descent => graduate descent Systematic search of hyperparameters Do something fun with it! Use gradient to find the best parameters Use graduate student to find the best hyperparameters
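The assignment's actual network and loss are not given in this transcript. As a hedged illustration of what a gradient check involves, the sketch below compares a hand-derived gradient against a central-difference estimate for a toy loss; `loss`, `loss_grad`, and `numerical_gradient` are stand-in names, not the homework's API.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + eps
        f_plus = f(x)
        x[idx] = old - eps
        f_minus = f(x)
        x[idx] = old  # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

def loss(w):
    """Toy loss standing in for a network's loss."""
    return np.sum(np.tanh(w) ** 2)

def loss_grad(w):
    """Analytic gradient of the toy loss, as backpropagation would produce it."""
    t = np.tanh(w)
    return 2 * t * (1 - t ** 2)

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 2))
analytic = loss_grad(w)
numeric = numerical_gradient(loss, w)
rel_error = np.max(
    np.abs(analytic - numeric) / (np.abs(analytic) + np.abs(numeric) + 1e-12)
)
print(rel_error)
```

A very small relative error suggests the analytic gradient is implemented correctly; a large one usually points to a bug in the backward pass.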


41 References Christopher Olah's blog; Andrej Karpathy's blog; David Silver's talk; Geoffrey Hinton's Coursera talk


Sequence Modeling with Neural Networks

Sequence Modeling with Neural Networks Sequence Modeling with Neural Networks Harini Suresh y 0 y 1 y 2 s 0 s 1 s 2... x 0 x 1 x 2 hat is a sequence? This morning I took the dog for a walk. sentence medical signals speech waveform Successes

More information

Stephen Scott.

Stephen Scott. 1 / 35 (Adapted from Vinod Variyam and Ian Goodfellow) sscott@cse.unl.edu 2 / 35 All our architectures so far work on fixed-sized inputs neural networks work on sequences of inputs E.g., text, biological

More information

APPLIED DEEP LEARNING PROF ALEXIEI DINGLI

APPLIED DEEP LEARNING PROF ALEXIEI DINGLI APPLIED DEEP LEARNING PROF ALEXIEI DINGLI TECH NEWS TECH NEWS HOW TO DO IT? TECH NEWS APPLICATIONS TECH NEWS TECH NEWS NEURAL NETWORKS Interconnected set of nodes and edges Designed to perform complex

More information

Deep Learning Recurrent Networks 2/28/2018

Deep Learning Recurrent Networks 2/28/2018 Deep Learning Recurrent Networks /8/8 Recap: Recurrent networks can be incredibly effective Story so far Y(t+) Stock vector X(t) X(t+) X(t+) X(t+) X(t+) X(t+5) X(t+) X(t+7) Iterated structures are good

More information

CSC321 Lecture 16: ResNets and Attention

CSC321 Lecture 16: ResNets and Attention CSC321 Lecture 16: ResNets and Attention Roger Grosse Roger Grosse CSC321 Lecture 16: ResNets and Attention 1 / 24 Overview Two topics for today: Topic 1: Deep Residual Networks (ResNets) This is the state-of-the

More information

CSC321 Lecture 15: Exploding and Vanishing Gradients

CSC321 Lecture 15: Exploding and Vanishing Gradients CSC321 Lecture 15: Exploding and Vanishing Gradients Roger Grosse Roger Grosse CSC321 Lecture 15: Exploding and Vanishing Gradients 1 / 23 Overview Yesterday, we saw how to compute the gradient descent

More information

Lecture 11 Recurrent Neural Networks I

Lecture 11 Recurrent Neural Networks I Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor niversity of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks

More information

Tensor Flow. Tensors: n-dimensional arrays Vector: 1-D tensor Matrix: 2-D tensor

Tensor Flow. Tensors: n-dimensional arrays Vector: 1-D tensor Matrix: 2-D tensor Tensor Flow Tensors: n-dimensional arrays Vector: 1-D tensor Matrix: 2-D tensor Deep learning process are flows of tensors A sequence of tensor operations Can represent also many machine learning algorithms

More information

Speech and Language Processing

Speech and Language Processing Speech and Language Processing Lecture 5 Neural network based acoustic and language models Information and Communications Engineering Course Takahiro Shinoaki 08//6 Lecture Plan (Shinoaki s part) I gives

More information

Lecture 11 Recurrent Neural Networks I

Lecture 11 Recurrent Neural Networks I Lecture 11 Recurrent Neural Networks I CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 01, 2017 Introduction Sequence Learning with Neural Networks Some Sequence Tasks

More information

Sequence Models. Ji Yang. Department of Computing Science, University of Alberta. February 14, 2018

Sequence Models. Ji Yang. Department of Computing Science, University of Alberta. February 14, 2018 Sequence Models Ji Yang Department of Computing Science, University of Alberta February 14, 2018 This is a note mainly based on Prof. Andrew Ng s MOOC Sequential Models. I also include materials (equations,

More information

EE-559 Deep learning Recurrent Neural Networks

EE-559 Deep learning Recurrent Neural Networks EE-559 Deep learning 11.1. Recurrent Neural Networks François Fleuret https://fleuret.org/ee559/ Sun Feb 24 20:33:31 UTC 2019 Inference from sequences François Fleuret EE-559 Deep learning / 11.1. Recurrent

More information

Recurrent Neural Networks Deep Learning Lecture 5. Efstratios Gavves

Recurrent Neural Networks Deep Learning Lecture 5. Efstratios Gavves Recurrent Neural Networks Deep Learning Lecture 5 Efstratios Gavves Sequential Data So far, all tasks assumed stationary data Neither all data, nor all tasks are stationary though Sequential Data: Text

More information

Lecture 17: Neural Networks and Deep Learning

Lecture 17: Neural Networks and Deep Learning UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions

More information

Spatial Transformer. Ref: Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, Spatial Transformer Networks, NIPS, 2015

Spatial Transformer. Ref: Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, Spatial Transformer Networks, NIPS, 2015 Spatial Transormer Re: Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu, Spatial Transormer Networks, NIPS, 2015 Spatial Transormer Layer CNN is not invariant to scaling and rotation

More information

Based on the original slides of Hung-yi Lee

Based on the original slides of Hung-yi Lee Based on the original slides of Hung-yi Lee New Activation Function Rectified Linear Unit (ReLU) σ z a a = z Reason: 1. Fast to compute 2. Biological reason a = 0 [Xavier Glorot, AISTATS 11] [Andrew L.

More information

Overview Today: From one-layer to multi layer neural networks! Backprop (last bit of heavy math) Different descriptions and viewpoints of backprop

Overview Today: From one-layer to multi layer neural networks! Backprop (last bit of heavy math) Different descriptions and viewpoints of backprop Overview Today: From one-layer to multi layer neural networks! Backprop (last bit of heavy math) Different descriptions and viewpoints of backprop Project Tips Announcement: Hint for PSet1: Understand

More information

SGD and Deep Learning

SGD and Deep Learning SGD and Deep Learning Subgradients Lets make the gradient cheating more formal. Recall that the gradient is the slope of the tangent. f(w 1 )+rf(w 1 ) (w w 1 ) Non differentiable case? w 1 Subgradients

More information

Artificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino

Artificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as

More information

TensorFlow: A Framework for Scalable Machine Learning

TensorFlow: A Framework for Scalable Machine Learning TensorFlow: A Framework for Scalable Machine Learning You probably Outline want to know... What is TensorFlow? Why did we create TensorFlow? How does Tensorflow Work? Example: Linear Regression Example:

More information

introduction to convolutional networks using tensorflow

introduction to convolutional networks using tensorflow introduction to convolutional networks using tensorflow Jesús Fernández Bes, jfbes@ing.uc3m.es 8 de febrero de 2016 contents Install What is Tensorflow? Implementing Softmax Regression Deep Convolutional

More information

Convolutional Neural Networks II. Slides from Dr. Vlad Morariu

Convolutional Neural Networks II. Slides from Dr. Vlad Morariu Convolutional Neural Networks II Slides from Dr. Vlad Morariu 1 Optimization Example of optimization progress while training a neural network. (Loss over mini-batches goes down over time.) 2 Learning rate

More information

Recurrent Neural Networks (Part - 2) Sumit Chopra Facebook

Recurrent Neural Networks (Part - 2) Sumit Chopra Facebook Recurrent Neural Networks (Part - 2) Sumit Chopra Facebook Recap Standard RNNs Training: Backpropagation Through Time (BPTT) Application to sequence modeling Language modeling Applications: Automatic speech

More information

Slide credit from Hung-Yi Lee & Richard Socher

Slide credit from Hung-Yi Lee & Richard Socher Slide credit from Hung-Yi Lee & Richard Socher 1 Review Recurrent Neural Network 2 Recurrent Neural Network Idea: condition the neural network on all previous words and tie the weights at each time step

More information

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Vanishing and Exploding Gradients. ReLUs. Xavier Initialization

TTIC 31230, Fundamentals of Deep Learning David McAllester, April Vanishing and Exploding Gradients. ReLUs. Xavier Initialization TTIC 31230, Fundamentals of Deep Learning David McAllester, April 2017 Vanishing and Exploding Gradients ReLUs Xavier Initialization Batch Normalization Highway Architectures: Resnets, LSTMs and GRUs Causes

More information

Recurrent Neural Networks. Jian Tang

Recurrent Neural Networks. Jian Tang Recurrent Neural Networks Jian Tang tangjianpku@gmail.com 1 RNN: Recurrent neural networks Neural networks for sequence modeling Summarize a sequence with fix-sized vector through recursively updating

More information

Machine Learning. Boris

Machine Learning. Boris Machine Learning Boris Nadion boris@astrails.com @borisnadion @borisnadion boris@astrails.com astrails http://astrails.com awesome web and mobile apps since 2005 terms AI (artificial intelligence)

More information

From perceptrons to word embeddings. Simon Šuster University of Groningen

From perceptrons to word embeddings. Simon Šuster University of Groningen From perceptrons to word embeddings Simon Šuster University of Groningen Outline A basic computational unit Weighting some input to produce an output: classification Perceptron Classify tweets Written

More information

I zm ir I nstiute of Technology CS Lecture Notes are based on the CS 101 notes at the University of I llinois at Urbana-Cham paign

I zm ir I nstiute of Technology CS Lecture Notes are based on the CS 101 notes at the University of I llinois at Urbana-Cham paign I zm ir I nstiute of Technology CS - 1 0 2 Lecture 1 Lecture Notes are based on the CS 101 notes at the University of I llinois at Urbana-Cham paign I zm ir I nstiute of Technology W hat w ill I learn

More information

CS 343: Artificial Intelligence

CS 343: Artificial Intelligence CS 343: Artificial Intelligence Deep Learning Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein, Pieter Abbeel, Anca Dragan for CS188 Intro to AI at UC Berkeley.

More information

Gianluca Pollastri, Head of Lab School of Computer Science and Informatics and. University College Dublin

Gianluca Pollastri, Head of Lab School of Computer Science and Informatics and. University College Dublin Introduction ti to Neural Networks Gianluca Pollastri, Head of Lab School of Computer Science and Informatics and Complex and Adaptive Systems Labs University College Dublin gianluca.pollastri@ucd.ie Credits

More information

Lecture 15: Exploding and Vanishing Gradients

Lecture 15: Exploding and Vanishing Gradients Lecture 15: Exploding and Vanishing Gradients Roger Grosse 1 Introduction Last lecture, we introduced RNNs and saw how to derive the gradients using backprop through time. In principle, this lets us train

More information

Senility Degree. Our machine derives APG waveform after 2 nd differential of arterial pulse wave in order to measure

Senility Degree. Our machine derives APG waveform after 2 nd differential of arterial pulse wave in order to measure Senility Degree Our machine derives APG waveform after 2 nd differential of arterial pulse wave in order to measure 1. Cardiac Eccentric Constriction Power 2. Arterial Elasticity 3. Remaining blood volume

More information

Recurrent and Recursive Networks

Recurrent and Recursive Networks Neural Networks with Applications to Vision and Language Recurrent and Recursive Networks Marco Kuhlmann Introduction Applications of sequence modelling Map unsegmented connected handwriting to strings.

More information

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition

More information

Non-Linearity. CS 188: Artificial Intelligence. Non-Linear Separators. Non-Linear Separators. Deep Learning I

Non-Linearity. CS 188: Artificial Intelligence. Non-Linear Separators. Non-Linear Separators. Deep Learning I Non-Linearity CS 188: Artificial Intelligence Deep Learning I Instructors: Pieter Abbeel & Anca Dragan --- University of California, Berkeley [These slides were created by Dan Klein, Pieter Abbeel, Anca

More information

Recurrent Neural Networks. deeplearning.ai. Why sequence models?

Recurrent Neural Networks. deeplearning.ai. Why sequence models? Recurrent Neural Networks deeplearning.ai Why sequence models? Examples of sequence data The quick brown fox jumped over the lazy dog. Speech recognition Music generation Sentiment classification There

More information

Deep Learning Recurrent Networks 10/11/2017

Deep Learning Recurrent Networks 10/11/2017 Deep Learning Recurrent Networks 10/11/2017 1 Which open source project? Related math. What is it talking about? And a Wikipedia page explaining it all The unreasonable effectiveness of recurrent neural

More information

A Tutorial On Backward Propagation Through Time (BPTT) In The Gated Recurrent Unit (GRU) RNN

A Tutorial On Backward Propagation Through Time (BPTT) In The Gated Recurrent Unit (GRU) RNN A Tutorial On Backward Propagation Through Time (BPTT In The Gated Recurrent Unit (GRU RNN Minchen Li Department of Computer Science The University of British Columbia minchenl@cs.ubc.ca Abstract In this

More information

TTIC 31230, Fundamentals of Deep Learning, Winter David McAllester. The Fundamental Equations of Deep Learning

TTIC 31230, Fundamentals of Deep Learning, Winter David McAllester. The Fundamental Equations of Deep Learning TTIC 31230, Fundamentals of Deep Learning, Winter 2019 David McAllester The Fundamental Equations of Deep Learning 1 Early History 1943: McCullock and Pitts introduced the linear threshold neuron. 1962:

More information

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai ECE521 W17 Tutorial 1 Renjie Liao & Min Bai Schedule Linear Algebra Review Matrices, vectors Basic operations Introduction to TensorFlow NumPy Computational Graphs Basic Examples Linear Algebra Review

More information

attention mechanisms and generative models

attention mechanisms and generative models attention mechanisms and generative models Master's Deep Learning Sergey Nikolenko Harbour Space University, Barcelona, Spain November 20, 2017 attention in neural networks attention You re paying attention

More information

Neural Networks: Backpropagation

Neural Networks: Backpropagation Neural Networks: Backpropagation Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others

More information

Gated Recurrent Networks

Gated Recurrent Networks Gated Recurrent Networks Davide Bacciu Dipartimento di Informatica Università di Pisa Intelligent Systems for Pattern Recognition (ISPR) Outline Recurrent Neural Networks Gradient Issues Dealing with Sequences

More information

Long-Short Term Memory and Other Gated RNNs

Long-Short Term Memory and Other Gated RNNs Long-Short Term Memory and Other Gated RNNs Sargur Srihari srihari@buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Sequence Modeling

More information

Neural Turing Machine. Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve)

Neural Turing Machine. Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve) Neural Turing Machine Author: Alex Graves, Greg Wayne, Ivo Danihelka Presented By: Tinghui Wang (Steve) Introduction Neural Turning Machine: Couple a Neural Network with external memory resources The combined

More information

CSC321 Lecture 5: Multilayer Perceptrons

CSC321 Lecture 5: Multilayer Perceptrons CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3

More information

CSC321 Lecture 15: Recurrent Neural Networks

CSC321 Lecture 15: Recurrent Neural Networks CSC321 Lecture 15: Recurrent Neural Networks Roger Grosse Roger Grosse CSC321 Lecture 15: Recurrent Neural Networks 1 / 26 Overview Sometimes we re interested in predicting sequences Speech-to-text and

More information

Lecture 8: Recurrent Neural Networks

Lecture 8: Recurrent Neural Networks Lecture 8: Recurrent Neural Networks André Martins Deep Structured Learning Course, Fall 2018 André Martins (IST) Lecture 8 IST, Fall 2018 1 / 85 Announcements The deadline for the project midterm report

More information

Neural Networks. Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016

Neural Networks. Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016 Neural Networks Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016 Outline Part 1 Introduction Feedforward Neural Networks Stochastic Gradient Descent Computational Graph

More information

Modelling Time Series with Neural Networks. Volker Tresp Summer 2017

Modelling Time Series with Neural Networks. Volker Tresp Summer 2017 Modelling Time Series with Neural Networks Volker Tresp Summer 2017 1 Modelling of Time Series The next figure shows a time series (DAX) Other interesting time-series: energy prize, energy consumption,

More information

>TensorFlow and deep learning_

>TensorFlow and deep learning_ >TensorFlow and deep learning_ without a PhD deep Science! #Tensorflow deep Code... @martin_gorner Hello World: handwritten digits classification - MNIST? MNIST = Mixed National Institute of Standards

More information

(

( Class 15 - Long Short-Term Memory (LSTM) Study materials http://colah.github.io/posts/2015-08-understanding-lstms/ (http://colah.github.io/posts/2015-08-understanding-lstms/) http://karpathy.github.io/2015/05/21/rnn-effectiveness/

More information

Recurrent Neural Networks (RNN) and Long-Short-Term-Memory (LSTM) Yuan YAO HKUST

Recurrent Neural Networks (RNN) and Long-Short-Term-Memory (LSTM) Yuan YAO HKUST 1 Recurrent Neural Networks (RNN) and Long-Short-Term-Memory (LSTM) Yuan YAO HKUST Summary We have shown: Now First order optimization methods: GD (BP), SGD, Nesterov, Adagrad, ADAM, RMSPROP, etc. Second

More information

TensorFlow. Dan Evans

TensorFlow. Dan Evans TensorFlow Presentation references material from https://www.tensorflow.org/get_started/get_started and Data Science From Scratch by Joel Grus, 25, O Reilly, Ch. 8 Dan Evans TensorFlow www.tensorflow.org

More information

Lecture 15: Recurrent Neural Nets

Lecture 15: Recurrent Neural Nets Lecture 15: Recurrent Neural Nets Roger Grosse 1 Introduction Most of the prediction tasks we ve looked at have involved pretty simple kinds of outputs, such as real values or discrete categories. But

More information

Machine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016

Machine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text

More information

INF 5860 Machine learning for image classification. Lecture 5 : Introduction to TensorFlow Tollef Jahren February 14, 2018

INF 5860 Machine learning for image classification. Lecture 5 : Introduction to TensorFlow Tollef Jahren February 14, 2018 INF 5860 Machine learning for image classification Lecture 5 : Introduction to TensorFlow Tollef Jahren February 14, 2018 OUTLINE Deep learning frameworks TensorFlow TensorFlow graphs TensorFlow session

More information

Pytorch Tutorial. Xiaoyong Yuan, Xiyao Ma 2018/01

Pytorch Tutorial. Xiaoyong Yuan, Xiyao Ma 2018/01 (Li Lab) National Science Foundation Center for Big Learning (CBL) Department of Electrical and Computer Engineering (ECE) Department of Computer & Information Science & Engineering (CISE) Pytorch Tutorial

More information

Introduction to RNNs!

Introduction to RNNs! Introduction to RNNs Arun Mallya Best viewed with Computer Modern fonts installed Outline Why Recurrent Neural Networks (RNNs)? The Vanilla RNN unit The RNN forward pass Backpropagation refresher The RNN

More information

Introduction to TensorFlow

Introduction to TensorFlow Large Scale Data Analysis Using Deep Learning (Prof. U Kang) Introduction to TensorFlow 2017.04.17 Beunguk Ahn ( beunguk.ahn@gmail.com) 1 What is TensorFlow? Consturction Phase Execution Phase Examples

More information

Deep Structured Prediction in Handwriting Recognition Juan José Murillo Fuentes, P. M. Olmos (Univ. Carlos III) and J.C. Jaramillo (Univ.

Deep Structured Prediction in Handwriting Recognition Juan José Murillo Fuentes, P. M. Olmos (Univ. Carlos III) and J.C. Jaramillo (Univ. Deep Structured Prediction in Handwriting Recognition Juan José Murillo Fuentes, P. M. Olmos (Univ. Carlos III) and J.C. Jaramillo (Univ. Sevilla) Computational and Biological Learning Lab Dep. of Engineering

More information

Identifying QCD transition using Deep Learning

Identifying QCD transition using Deep Learning Identifying QCD transition using Deep Learning Kai Zhou Long-Gang Pang, Nan Su, Hannah Peterson, Horst Stoecker, Xin-Nian Wang Collaborators: arxiv:1612.04262 Outline 2 What is deep learning? Artificial

More information

Deep Neural Networks (3) Computational Graphs, Learning Algorithms, Initialisation

Deep Neural Networks (3) Computational Graphs, Learning Algorithms, Initialisation Deep Neural Networks (3) Computational Graphs, Learning Algorithms, Initialisation Steve Renals Machine Learning Practical MLP Lecture 5 16 October 2018 MLP Lecture 5 / 16 October 2018 Deep Neural Networks

More information

@SoyGema GEMA PARREÑO PIQUERAS

@SoyGema GEMA PARREÑO PIQUERAS @SoyGema GEMA PARREÑO PIQUERAS WHAT IS AN ARTIFICIAL NEURON? WHAT IS AN ARTIFICIAL NEURON? Image Recognition Classification using Softmax Regressions and Convolutional Neural Networks Languaje Understanding

More information

CSC321 Lecture 10 Training RNNs

CSC321 Lecture 10 Training RNNs CSC321 Lecture 10 Training RNNs Roger Grosse and Nitish Srivastava February 23, 2015 Roger Grosse and Nitish Srivastava CSC321 Lecture 10 Training RNNs February 23, 2015 1 / 18 Overview Last time, we saw

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)

More information

(Artificial) Neural Networks in TensorFlow

(Artificial) Neural Networks in TensorFlow (Artificial) Neural Networks in TensorFlow By Prof. Seungchul Lee Industrial AI Lab http://isystems.unist.ac.kr/ POSTECH Table of Contents I. 1. Recall Supervised Learning Setup II. 2. Artificial Neural

More information

Introduction to TensorFlow

Introduction to TensorFlow Introduction to TensorFlow Oliver Dürr Datalab-Lunch Seminar Series Winterthur, 17 Nov, 2016 1 Abstract Introduc)on to TensorFlow TensorFlow is a mul/purpose open source so2ware library for numerical computa/on

More information

Natural Language Processing and Recurrent Neural Networks

Natural Language Processing and Recurrent Neural Networks Natural Language Processing and Recurrent Neural Networks Pranay Tarafdar October 19 th, 2018 Outline Introduction to NLP Word2vec RNN GRU LSTM Demo What is NLP? Natural Language? : Huge amount of information

More information

arxiv: v3 [cs.lg] 14 Jan 2018

arxiv: v3 [cs.lg] 14 Jan 2018 A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation Gang Chen Department of Computer Science and Engineering, SUNY at Buffalo arxiv:1610.02583v3 [cs.lg] 14 Jan 2018 1 abstract We describe

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Pushpak Bhattacharyya CSE Dept, IIT Patna and Bombay LSTM 15 jun, 2017 lgsoft:nlp:lstm:pushpak 1 Recap 15 jun, 2017 lgsoft:nlp:lstm:pushpak 2 Feedforward Network and Backpropagation

More information

Deep Recurrent Neural Networks

Deep Recurrent Neural Networks Deep Recurrent Neural Networks Artem Chernodub e-mail: a.chernodub@gmail.com web: http://zzphoto.me ZZ Photo IMMSP NASU 2 / 28 Neuroscience Biological-inspired models Machine Learning p x y = p y x p(x)/p(y)

More information

Lecture 5 Neural models for NLP

Lecture 5 Neural models for NLP CS546: Machine Learning in NLP (Spring 2018) http://courses.engr.illinois.edu/cs546/ Lecture 5 Neural models for NLP Julia Hockenmaier juliahmr@illinois.edu 3324 Siebel Center Office hours: Tue/Thu 2pm-3pm

More information

Deep Learning: a gentle introduction

Jamal Atif, jamal.atif@dauphine.fr, PSL, Université Paris-Dauphine, LAMSADE. February 8, 2016. Why

Recurrent Neural Network

Xiaogang Wang, xgwang@ee..edu.hk, March 2, 2017. Outline: 1 Recurrent neural networks

100 inference steps doesn't seem like enough. Many neuron-like threshold switching units. Many weighted interconnections among units

Connectionist Models. Consider humans: Neuron switching time ~ 0.001 second. Number of neurons ~ 10^10. Connections per neuron ~ 10^4 to 10^5. Scene recognition time ~ 0.1 second. 100 inference steps doesn't seem like

Understanding How ConvNets See

Slides from Andrej Karpathy. Springerberg et al, Striving for Simplicity: The All Convolutional Net (ICLR 2015 workshops). CSC321: Intro to Machine Learning and Neural Networks,

Deep Learning Sequence to Sequence models: Attention Models. 17 March 2018

Sequence-to-sequence modelling. Problem: E.g. a sequence X_1 ... X_N goes in; a different sequence Y_1 ... Y_M comes out. Speech recognition:

Approximate Q-Learning. Dan Weld / University of Washington

[Many slides taken from Dan Klein and Pieter Abbeel / CS188 Intro to AI at UC Berkeley; materials available at http://ai.berkeley.edu.] Q Learning

Regression Adjustment with Artificial Neural Networks

Age of Big Data: data comes at a rate and in a variety of types that exceed our ability to analyse it. Texts, image, speech, video. Real motivation:

Lecture 35: Optimization and Neural Nets

CS 4670/5670, Sean Bell. DeepDream [Google, Inceptionism: Going Deeper into Neural Networks, blog 2015]. Aside: CNN vs ConvNet. Note: There are many papers that use

CSE 446 Dimensionality Reduction, Sequences

Administrative: Final review this week. Practice exam questions will come out Wed. Final exam next week Wed 8:30 am. Today: Dimensionality reduction examples. Sequence

CS224N: Natural Language Processing with Deep Learning Winter 2018 Midterm Exam

This examination consists of 17 printed sides, 5 questions, and 100 points. The exam accounts for 20% of your total grade.

Franchisee application form

Kindly fill in all the applicable information in the spaces provided and submit to us before the stipulated deadline. The information you provide will be held

Differentiable Functional Programming

Atılım Güneş Baydin, University of Oxford, http://www.robots.ox.ac.uk/~gunes/ F#unctional Londoners Meetup, April 28, 2016. About me: Current (from 11 April 2016): Postdoctoral

Introduction to Deep Neural Networks

Presenter: Chunyuan Li. Pattern Classification and Recognition (ECE 681.01), Duke University, April, 2016. Outline: 1 Background and Preliminaries. Why DNNs? Model: Logistic

Animals and Behaviors. Templeton Biology

Nueral- Nami. Explain the behaviors you see in the video. Would this be a group or individual feeding behavior? Why? Video of Prairie Dog Communication

Making Deep Learning Understandable for Analyzing Sequential Data about Gene Regulation

Dr. Yanjun Qi, Department of Computer Science, University of Virginia. Tutorial @ ACM BCB-2018, 8/29/18

Lecture 8: Introduction to Deep Learning: Part 2 (More on backpropagation, and ConvNets)

COS 402 Machine Learning and Artificial Intelligence, Fall 2016. Sanjeev Arora, Elad Hazan. Recap: Structure of a deep

Deep Learning Recurrent Networks 10/16/2017

Which open source project? Related math. What is it talking about? And a Wikipedia page explaining it all. The unreasonable effectiveness of recurrent neural

Autoregressive Neural Models for Statistical Parametric Speech Synthesis

シンワン Xin WANG, 2018-01-11. Contact: wangxin@nii.ac.jp. We welcome critical comments, suggestions, and discussion. https://www.slideshare.net/kotarotanahashi/deep-learning-library-coyotecnn

Recurrent Neural Networks

Charu C. Aggarwal, IBM T J Watson Research Center, Yorktown Heights, NY. Neural Networks and Deep Learning, Springer, 2018, Chapter 7.1-7.2. The Challenges of Processing Sequences. Conventional

Page 51 of Report PB4/09

JART also concluded that although the intent of Nelson's rehabilitation plan is to enhance conne

Neural Networks Language Models

Philipp Koehn, 10 October 2017. N-Gram Backoff Language Model. Previously, we approximated p(W) = p(w_1, w_2, ..., w_n) by applying the chain rule: p(W) = ∏_i p(w_i | w_1, ...,

Deep Learning Tutorial. 李宏毅 Hung-yi Lee

Outline: Part I: Introduction of Deep Learning. Part II: Why Deep? Part III: Tips for Training Deep Neural Network. Part IV: Neural Network with Memory. Part I: Introduction

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler

Linear Classifiers (Perceptrons). A linear classifier is a mapping which partitions

What Do Neural Networks Do? MLP Lecture 3: Multi-layer networks

Multi-layer networks. Steve Renals, Machine Learning Practical, MLP Lecture 3, 7 October 2015. What Do Single

Index. Santanu Pattanayak 2017 S. Pattanayak, Pro Deep Learning with TensorFlow,

Index. A. Activation functions, neuron/perceptron: binary threshold activation function, 102-103; linear activation function, 102; rectified linear unit, 106; sigmoid activation function, 103-104; SoftMax activation

Introduction to Deep Learning

A. G. Schwing & S. Fidler, University of Toronto, 2015. CSC420: Intro to Image Understanding. Outline: 1 Universality of Neural Networks