Deep Learning: a gentle introduction

Size: px
Start display at page:

Download "Deep Learning: a gentle introduction"

Transcription

1 Deep Learning: a gentle introduction Jamal Atif jamal.atif@dauphine.fr PSL, Université Paris-Dauphine, LAMSADE February 8, 206 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, 206 /

2 Why a talk about deep learning? Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

3 Why a talk about deep learning? Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

4 Why a talk about deep learning? Convolutional Networks (Yann Le Cun): challenge ILSVRC : 000 categories and images. Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

5 Neuron s basic anatomy Figure: A neuron s basic anatomy consists of four parts: a soma (cell body), dendrites, an axon, and nerve terminals. Information is received by dendrites, gets collected in the cell body, and flows down the axon. Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

6 Artificial neuron x w x2 w2 x3 w3 n i= wixi + b Axon Pre-activation Activation Cell body wd b xn Dendrites xn+ = Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

7 Perceptron Rosenblatt 957 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

8 Perceptron learning Input: a sample S = {(x (), y ),, (x (n), y n )} Initialize a parameter t to 0 Initialize the weights w i with random values. Repeat Pick an example x (k) = [x (k),, x(k) d ]T from S Let y (k) be the target value and ỹ (k) the computed value using the current perceptron If ỹ (k) y (k) then Update the weights: w i (t + ) = w i (t) + w i (t) where w i (t) = (y (k) ỹ (k) )x (k) i End If t = t + Until all the examples in S were visited and no change occurs in the weights Output: a perceptron for linear discrimination of S Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

9 Perceptron learning Example: learning the OR function Initialization: w (0) = w 2 (0) =, w 3 (0) = t w (t) w 2 (t) w 3 (t) x (k) wi x k i ỹ (k) y (k) w (t) w 2 (t) w 3 (t) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

10 Perceptron illustration x () = [0, 0] T t = 0 0 w = 0 w 2 = w 3 = if a > 0 a = ỹ = 0 i wixi 0 elsewhere y = 0 w i = w i + (y ỹ)x i w = w 2 = w 3 = + 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

11 Perceptron illustration x (2) = [0, ] T t = 0 w = w 2 = w 3 = if a > 0 a = 0 ỹ = 0 i wixi 0 elsewhere y = w i = w i + (y ỹ)x i w = + 0 = w 2 = + = 2 w 3 = + = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, 206 /

12 Perceptron illustration x (3) = [, 0] T t = 2 w = 0 w 2 = 2 w 3 = 0 if a > 0 a = ỹ = i wixi 0 elsewhere y = w i = w i + (y ỹ)x i w = = w 2 = = 2 w 3 = = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

13 Perceptron illustration x (4) = [, ] T t = 3 w = w 2 = 2 w 3 = 0 if a > 0 a = 3 ỹ = i wixi 0 elsewhere y = w i = w i + (y ỹ)x i w = = w 2 = = 2 w 3 = = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

14 Perceptron illustration x () = [0, 0] T t = 4 0 w = 0 w 2 = 2 w 3 = 0 if a > 0 a = 0 ỹ = 0 i wixi 0 elsewhere y = 0 w i = w i + (y ỹ)x i w = = w 2 = = 2 w 3 = = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

15 Perceptron illustration x (2) = [0, ] T t = 5 0 w = w 2 = 2 w 3 = 0 if a > 0 a = 2 ỹ = i wixi 0 elsewhere y = w i = w i + (y ỹ)x i w = = w 2 = = 2 w 3 = = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

16 Perceptron capacity OR(x, x 2 ) AND(x, x 2 ) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

17 Perceptron autumn 0 0 XOR(x, x 2 ) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

18 Link with Logistic regression x w x2 w2 a(x) = w3 d i= wixi + b x3 h(x) = +e a(x) Stochastic gradient update rule: wd wj = wj + λ(y (i) h(x (i) ))x (i) j The same as the perceptron update rule! b xd xd+ = Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

19 But! XOR(x, x 2 ) AND(x, x2) 0 0 AND( x, x 2 ) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

20 Multilayer Perceptron b Paul Werbos, 84. David Rumelhart, 86 x x2 o x3 om Output layer xd Input Layer Hidden layers Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

21 Layer Perceptron b x w w2 h w 2 b 2 w2 w3 x2 w3 w22 h2 w32 w 2 2 w23 x3 w33 h3 w 2 3 w2d wn2 wn3 Output Layer f(x) = o(b 2 + d i= w2 hi(x)) wn w3d wd xn w 2 d Input Layer wnd hd Hidden Layer hj(x) = g(b + n i= wijxi) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

22 MLP training: backpropagation XOR example x i w + ixi b a h h(x) = +e a(x) w w 2 w 2 k w2 k hk + ao b2 ho(x) = +e a(x) ỹ x2 w 2 2 i w i2 x2 + b 2 a2 h2 h(x) = +e a(x) w 2 2 w 2 22 b b 2 b 2 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

23 MLP training: backpropagation XOR example: initialisation x w = i w + ixi b a h h(x) = +e a(x) x2 w 2 = w 2 = w 22 = b = b 2 = i w i2 x2 + b 2 a2 h2 h(x) = +e a(x) w 2 = w 2 2 = b 2 = k w2 k hk + ao b2 ho(x) = +e a(x) ỹ Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

24 MLP training: backpropagation XOR example: Feed-forward x 0 w = a = i w + ixi b h(x) = +e a (x) 0.27 x2 0 w 2 = w 2 = w 22 = b = b 2 = a2 = i w i2 x2 + b 2 h2(x) = +e a 2 (x) 0.27 w 2 = w 2 2 = b 2 = k 0.46 ao = ỹ = 0.39 w2 k hk + ho(x) = +e ao(x) y = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

25 MLP training: backpropagation XOR example: Backpropagation w 2 k = w2 k + ηδk δk = (y ỹ)( ỹ)ỹhk x 0 w = a = i w + ixi b h(x) = +e a (x) 0.27 = Ew ỹ ỹ ao ao w 2 k x2 0 w 2 = w 2 = w 22 = b = b 2 = a2 = i w i2 x2 + b 2 h2(x) = +e a 2 (x) 0.27 w 2 = w 2 2 = b 2 = k 0.46 ao = ỹ = 0.39 w2 k hk + ho(x) = +e ao(x) y = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

26 MLP training: backpropagation XOR example: Backpropagation w ij = w ij ηδij δij = Ew wij = Ew ho ho ao δij = (y ỹ)( ỹ)ỹwj 2 hj( hj)xi ao hj hj aj aj wij w 2 k = w2 k ηδk δk = Ew, w 2 Ew = 2 (y ỹ)2 k x 0 w = a = i w + ixi b h(x) = +e a (x) 0.27 = Ew ỹ ỹ ao ao w 2 k δk = (y ỹ)( ỹ)ỹhk x2 w 2 = w 2 = 0 w22 = b =.98 b 2 =.98 a2 = i w i2 x2 + b 2 h2(x) = +e a 2 (x) 0.27 w = + (0.39)(.39) (.27) 0 w22 = + (0.39)(.39) (.27) 0. b = + (0.39)(.39) (.27) k w 2 = ao = ỹ = 0.39 w2 k hk + ho(x) = +e ao(x) w 2 2 = 0.98 b 2 =.07 w 2 = w 2 + (0.39)(.39) w 2 2 = w (0.39)(.39) b 2 = b 2 + (0.39)(.39).3 y = 0 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

27 MLP training: backpropagation XOR example: Feed-forward x 0 w = a = i w + ixi b.02 h(x) = +e a (x) 0.5 x2 w 2 = w 2 = w 22 = b =.98 b 2 =.98 a2 =.02 i w i2 x2 + b 2 h2(x) = +e a 2 (x) 0.5 k w 2 = ao = ỹ 0.5 w2 k hk + ho(x) = +e ao(x) w 2 2 = 0.98 b 2 =.07 y = Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

28 Multilayer Perceptron as a deep NN MLP with more than 2/3 layers is a deep network b x x2 o x3 om Output layer xd Input Layer Hidden layers Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

29 Training issues Setting the hyperparameters Initialization Number of iterations Learning rate η Activation function Early stopping criterion Overfitting/Underfitting Number of hidden layers Number of neurons Optimization Vanishing gradient problem Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

30 Overfitting/Underfitting Source: Bishop, PRML Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

31 Overfitting/Underfitting Source: Bishop, PRML Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

32 What s new! Before 2006, training deep architectures was unsuccessful! Underfitting New optimization techniques Overfitting Bengio, Hinton, LeCun Unsupervised pre-training Stochastic dropout" training Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

33 Unsupervised pre-training Main idea Initialize the network in an unsupervised way. Consequences Network layers encode successive invariant features (latent structure of the data) Better optimization and hence better generalization Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

34 Unsupervised pre-training How to? Proceed greedily, layer by layer. x x x x2 x2 x2 x3 x3 x3 xd xd xd Input Layer = Input Layer = Input Layer Unsupervised NN learning techniques Restricted Boltzmann Machines (RBM) Auto-encoders and many many variants since 2006 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

35 Unsupervised pre-training Autoencoder Credit: Hugo Larochelle Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

36 Unsupervised pre-training Sparse Autoencoders Credit: Y. Bengio Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

37 Fine tuning How to? Add the output layer Initialize its weights with random values initialize the hidden layers with the values from the pre-training Update the layers by Backpropagation x x2 x3 xd Input Layer Output Layer Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

38 Drop out Intuition Regularize the network by dropping out stochastically some hidden units. Procedure Assign to each hidden unit a value 0 with probability p (common choice:.5) x x2 x3 xd Input Layer Output Layer Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

39 Some applications Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

40 Computer vision Convolutional Neural Networks Lecun, 89 State of the art in digit recognition Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

41 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

42 Modern CNN A. Krizhevsky, I. Sutskever, and G. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS hidden layers, 650,000 units, 60,000,000 parameters Drop out 0 6 images GPU implementation Activation function: f(x) = max(0, x) (ReLu) Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

43 Computer vision Convolutional Neural Networks LeCun, Hinton, Bengio, Nature 205 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

44 What is deep in deep learning! Old vs New ML paradigms Decision Decision Classifier Hand-crafted features Classifier Learned features Representation learning Raw data Raw data Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

45 Softwares TensorFlow (google), Theano (Python CPU/GPU) mathematical and deep learning library http ://deeplearning.net/software/theano. Can do automatic, symbolic differentiation. Senna: NLP by Collobert et al. http ://ronan.collobert.com/senna/ State-of-the-art performance on many tasks 3500 lines of C, extremely fast and using very little memory. Torch ML Library (C+Lua) : http :// Recurrent Neural Network Language Model http :// imikolov/rnnlm Recursive Neural Net and RAE models for paraphrase detection, sentiment analysis, relation classiffcation Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, /

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

CSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer

CSE446: Neural Networks Spring Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer CSE446: Neural Networks Spring 2017 Many slides are adapted from Carlos Guestrin and Luke Zettlemoyer Human Neurons Switching time ~ 0.001 second Number of neurons 10 10 Connections per neuron 10 4-5 Scene

More information

Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses about the label (Top-5 error) No Bounding Box

Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses about the label (Top-5 error) No Bounding Box ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton Motivation Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses

More information

Jakub Hajic Artificial Intelligence Seminar I

Jakub Hajic Artificial Intelligence Seminar I Jakub Hajic Artificial Intelligence Seminar I. 11. 11. 2014 Outline Key concepts Deep Belief Networks Convolutional Neural Networks A couple of questions Convolution Perceptron Feedforward Neural Network

More information

Lecture 17: Neural Networks and Deep Learning

Lecture 17: Neural Networks and Deep Learning UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions

More information

SGD and Deep Learning

SGD and Deep Learning SGD and Deep Learning Subgradients Lets make the gradient cheating more formal. Recall that the gradient is the slope of the tangent. f(w 1 )+rf(w 1 ) (w w 1 ) Non differentiable case? w 1 Subgradients

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

10. Artificial Neural Networks

10. Artificial Neural Networks Foundations of Machine Learning CentraleSupélec Fall 217 1. Artificial Neural Networks Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr Learning

More information

Convolutional Neural Networks II. Slides from Dr. Vlad Morariu

Convolutional Neural Networks II. Slides from Dr. Vlad Morariu Convolutional Neural Networks II Slides from Dr. Vlad Morariu 1 Optimization Example of optimization progress while training a neural network. (Loss over mini-batches goes down over time.) 2 Learning rate

More information

Neural networks and optimization

Neural networks and optimization Neural networks and optimization Nicolas Le Roux Criteo 18/05/15 Nicolas Le Roux (Criteo) Neural networks and optimization 18/05/15 1 / 85 1 Introduction 2 Deep networks 3 Optimization 4 Convolutional

More information

Neural Networks (Part 1) Goals for the lecture

Neural Networks (Part 1) Goals for the lecture Neural Networks (Part ) Mark Craven and David Page Computer Sciences 760 Spring 208 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed

More information

Learning Deep Architectures for AI. Part II - Vijay Chakilam

Learning Deep Architectures for AI. Part II - Vijay Chakilam Learning Deep Architectures for AI - Yoshua Bengio Part II - Vijay Chakilam Limitations of Perceptron x1 W, b 0,1 1,1 y x2 weight plane output =1 output =0 There is no value for W and b such that the model

More information

Neural Networks and Deep Learning

Neural Networks and Deep Learning Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost

More information

Neural networks. Chapter 20. Chapter 20 1

Neural networks. Chapter 20. Chapter 20 1 Neural networks Chapter 20 Chapter 20 1 Outline Brains Neural networks Perceptrons Multilayer networks Applications of neural networks Chapter 20 2 Brains 10 11 neurons of > 20 types, 10 14 synapses, 1ms

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

More information

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17 3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural

More information

Machine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016

Machine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text

More information

Introduction to Deep Learning

Introduction to Deep Learning Introduction to Deep Learning Some slides and images are taken from: David Wolfe Corne Wikipedia Geoffrey A. Hinton https://www.macs.hw.ac.uk/~dwcorne/teaching/introdl.ppt Feedforward networks for function

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6

Machine Learning for Large-Scale Data Analysis and Decision Making A. Neural Networks Week #6 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Neural Networks Week #6 Today Neural Networks A. Modeling B. Fitting C. Deep neural networks Today s material is (adapted)

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

More information

<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation)

<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation) Learning for Deep Neural Networks (Back-propagation) Outline Summary of Previous Standford Lecture Universal Approximation Theorem Inference vs Training Gradient Descent Back-Propagation

More information

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen

Artificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition

More information

Neural networks. Chapter 19, Sections 1 5 1

Neural networks. Chapter 19, Sections 1 5 1 Neural networks Chapter 19, Sections 1 5 Chapter 19, Sections 1 5 1 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 19, Sections 1 5 2 Brains 10

More information

DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY

DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY 1 On-line Resources http://neuralnetworksanddeeplearning.com/index.html Online book by Michael Nielsen http://matlabtricks.com/post-5/3x3-convolution-kernelswith-online-demo

More information

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of

More information

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler

Machine Learning and Data Mining. Multi-layer Perceptrons & Neural Networks: Basics. Prof. Alexander Ihler + Machine Learning and Data Mining Multi-layer Perceptrons & Neural Networks: Basics Prof. Alexander Ihler Linear Classifiers (Perceptrons) Linear Classifiers a linear classifier is a mapping which partitions

More information

Feature Design. Feature Design. Feature Design. & Deep Learning

Feature Design. Feature Design. Feature Design. & Deep Learning Artificial Intelligence and its applications Lecture 9 & Deep Learning Professor Daniel Yeung danyeung@ieee.org Dr. Patrick Chan patrickchan@ieee.org South China University of Technology, China Appropriately

More information

Artificial Neural Networks. Historical description

Artificial Neural Networks. Historical description Artificial Neural Networks Historical description Victor G. Lopez 1 / 23 Artificial Neural Networks (ANN) An artificial neural network is a computational model that attempts to emulate the functions of

More information

Neural Networks. Learning and Computer Vision Prof. Olga Veksler CS9840. Lecture 10

Neural Networks. Learning and Computer Vision Prof. Olga Veksler CS9840. Lecture 10 CS9840 Learning and Computer Vision Prof. Olga Veksler Lecture 0 Neural Networks Many slides are from Andrew NG, Yann LeCun, Geoffry Hinton, Abin - Roozgard Outline Short Intro Perceptron ( layer NN) Multilayer

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Neural networks Daniel Hennes 21.01.2018 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Logistic regression Neural networks Perceptron

More information

Deep Learning Autoencoder Models

Deep Learning Autoencoder Models Deep Learning Autoencoder Models Davide Bacciu Dipartimento di Informatica Università di Pisa Intelligent Systems for Pattern Recognition (ISPR) Generative Models Wrap-up Deep Learning Module Lecture Generative

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

CSC242: Intro to AI. Lecture 21

CSC242: Intro to AI. Lecture 21 CSC242: Intro to AI Lecture 21 Administrivia Project 4 (homeworks 18 & 19) due Mon Apr 16 11:59PM Posters Apr 24 and 26 You need an idea! You need to present it nicely on 2-wide by 4-high landscape pages

More information

Introduction to Neural Networks

Introduction to Neural Networks CUONG TUAN NGUYEN SEIJI HOTTA MASAKI NAKAGAWA Tokyo University of Agriculture and Technology Copyright by Nguyen, Hotta and Nakagawa 1 Pattern classification Which category of an input? Example: Character

More information

From perceptrons to word embeddings. Simon Šuster University of Groningen

From perceptrons to word embeddings. Simon Šuster University of Groningen From perceptrons to word embeddings Simon Šuster University of Groningen Outline A basic computational unit Weighting some input to produce an output: classification Perceptron Classify tweets Written

More information

Deep Learning Architectures and Algorithms

Deep Learning Architectures and Algorithms Deep Learning Architectures and Algorithms In-Jung Kim 2016. 12. 2. Agenda Introduction to Deep Learning RBM and Auto-Encoders Convolutional Neural Networks Recurrent Neural Networks Reinforcement Learning

More information

How to do backpropagation in a brain

How to do backpropagation in a brain How to do backpropagation in a brain Geoffrey Hinton Canadian Institute for Advanced Research & University of Toronto & Google Inc. Prelude I will start with three slides explaining a popular type of deep

More information

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) Human Brain Neurons Input-Output Transformation Input Spikes Output Spike Spike (= a brief pulse) (Excitatory Post-Synaptic Potential)

More information

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA 1/ 21

Neural Networks. Chapter 18, Section 7. TB Artificial Intelligence. Slides from AIMA   1/ 21 Neural Networks Chapter 8, Section 7 TB Artificial Intelligence Slides from AIMA http://aima.cs.berkeley.edu / 2 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural

More information

Introduction to Deep Neural Networks

Introduction to Deep Neural Networks Introduction to Deep Neural Networks Presenter: Chunyuan Li Pattern Classification and Recognition (ECE 681.01) Duke University April, 2016 Outline 1 Background and Preliminaries Why DNNs? Model: Logistic

More information

Neural Networks. Intro to AI Bert Huang Virginia Tech

Neural Networks. Intro to AI Bert Huang Virginia Tech Neural Networks Intro to AI Bert Huang Virginia Tech Outline Biological inspiration for artificial neural networks Linear vs. nonlinear functions Learning with neural networks: back propagation https://en.wikipedia.org/wiki/neuron#/media/file:chemical_synapse_schema_cropped.jpg

More information

Reading Group on Deep Learning Session 2

Reading Group on Deep Learning Session 2 Reading Group on Deep Learning Session 2 Stephane Lathuiliere & Pablo Mesejo 10 June 2016 1/39 Chapter Structure Introduction. 5.1. Feed-forward Network Functions. 5.2. Network Training. 5.3. Error Backpropagation.

More information

Neural Networks and Deep Learning.

Neural Networks and Deep Learning. Neural Networks and Deep Learning www.cs.wisc.edu/~dpage/cs760/ 1 Goals for the lecture you should understand the following concepts perceptrons the perceptron training rule linear separability hidden

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Neural Networks: A brief touch Yuejie Chi Department of Electrical and Computer Engineering Spring 2018 1/41 Outline

More information

CSE446: Neural Networks Winter 2015

CSE446: Neural Networks Winter 2015 CSE446: Neural Networks Winter 2015 Luke Ze

More information

Reading Group on Deep Learning Session 1

Reading Group on Deep Learning Session 1 Reading Group on Deep Learning Session 1 Stephane Lathuiliere & Pablo Mesejo 2 June 2016 1/31 Contents Introduction to Artificial Neural Networks to understand, and to be able to efficiently use, the popular

More information

Links between Perceptrons, MLPs and SVMs

Links between Perceptrons, MLPs and SVMs Links between Perceptrons, MLPs and SVMs Ronan Collobert Samy Bengio IDIAP, Rue du Simplon, 19 Martigny, Switzerland Abstract We propose to study links between three important classification algorithms:

More information

Neural Networks. Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016

Neural Networks. Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016 Neural Networks Yan Shao Department of Linguistics and Philology, Uppsala University 7 December 2016 Outline Part 1 Introduction Feedforward Neural Networks Stochastic Gradient Descent Computational Graph

More information

Neural Networks and the Back-propagation Algorithm

Neural Networks and the Back-propagation Algorithm Neural Networks and the Back-propagation Algorithm Francisco S. Melo In these notes, we provide a brief overview of the main concepts concerning neural networks and the back-propagation algorithm. We closely

More information

Neural Networks biological neuron artificial neuron 1

Neural Networks biological neuron artificial neuron 1 Neural Networks biological neuron artificial neuron 1 A two-layer neural network Output layer (activation represents classification) Weighted connections Hidden layer ( internal representation ) Input

More information

Machine Learning (CSE 446): Neural Networks

Machine Learning (CSE 446): Neural Networks Machine Learning (CSE 446): Neural Networks Noah Smith c 2017 University of Washington nasmith@cs.washington.edu November 6, 2017 1 / 22 Admin No Wednesday office hours for Noah; no lecture Friday. 2 /

More information

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation

Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Master Recherche IAC TC2: Apprentissage Statistique & Optimisation Alexandre Allauzen Anne Auger Michèle Sebag LIMSI LRI Oct. 4th, 2012 This course Bio-inspired algorithms Classical Neural Nets History

More information

Artificial neural networks

Artificial neural networks Artificial neural networks B. Mehlig, Department of Physics, University of Gothenburg, Sweden FFR35/FIM70 Artificial Neural Networks Chalmers/Gothenburg University, 7.5 credits 3 Course home page Teachers

More information

Machine Learning. Boris

Machine Learning. Boris Machine Learning Boris Nadion boris@astrails.com @borisnadion @borisnadion boris@astrails.com astrails http://astrails.com awesome web and mobile apps since 2005 terms AI (artificial intelligence)

More information

WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY,

WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY, WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY, WITH IMPLICATIONS FOR TRAINING Sanjeev Arora, Yingyu Liang & Tengyu Ma Department of Computer Science Princeton University Princeton, NJ 08540, USA {arora,yingyul,tengyu}@cs.princeton.edu

More information

Machine Learning Lecture 10

Machine Learning Lecture 10 Machine Learning Lecture 10 Neural Networks 26.11.2018 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Today s Topic Deep Learning 2 Course Outline Fundamentals Bayes

More information

Artificial Neural Networks. MGS Lecture 2

Artificial Neural Networks. MGS Lecture 2 Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation

More information

Intelligent Systems Discriminative Learning, Neural Networks

Intelligent Systems Discriminative Learning, Neural Networks Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm

More information

CSC321 Lecture 5: Multilayer Perceptrons

CSC321 Lecture 5: Multilayer Perceptrons CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3

More information

UNSUPERVISED LEARNING

UNSUPERVISED LEARNING UNSUPERVISED LEARNING Topics Layer-wise (unsupervised) pre-training Restricted Boltzmann Machines Auto-encoders LAYER-WISE (UNSUPERVISED) PRE-TRAINING Breakthrough in 2006 Layer-wise (unsupervised) pre-training

More information

Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści

Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, 2017 Spis treści Website Acknowledgments Notation xiii xv xix 1 Introduction 1 1.1 Who Should Read This Book?

More information

Supervised Learning. George Konidaris

Supervised Learning. George Konidaris Supervised Learning George Konidaris gdk@cs.brown.edu Fall 2017 Machine Learning Subfield of AI concerned with learning from data. Broadly, using: Experience To Improve Performance On Some Task (Tom Mitchell,

More information

Neural networks. Chapter 20, Section 5 1

Neural networks. Chapter 20, Section 5 1 Neural networks Chapter 20, Section 5 Chapter 20, Section 5 Outline Brains Neural networks Perceptrons Multilayer perceptrons Applications of neural networks Chapter 20, Section 5 2 Brains 0 neurons of

More information

Neural Networks: A Very Brief Tutorial

Neural Networks: A Very Brief Tutorial Neural Networks: A Very Brief Tutorial Chloé-Agathe Azencott Machine Learning & Computational Biology MPIs for Developmental Biology & for Intelligent Systems Tübingen (Germany) cazencott@tue.mpg.de October

More information

RegML 2018 Class 8 Deep learning

RegML 2018 Class 8 Deep learning RegML 2018 Class 8 Deep learning Lorenzo Rosasco UNIGE-MIT-IIT June 18, 2018 Supervised vs unsupervised learning? So far we have been thinking of learning schemes made in two steps f(x) = w, Φ(x) F, x

More information

Neural Networks: Backpropagation

Neural Networks: Backpropagation Neural Networks: Backpropagation Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others

More information

Feedforward Neural Nets and Backpropagation

Feedforward Neural Nets and Backpropagation Feedforward Neural Nets and Backpropagation Julie Nutini University of British Columbia MLRG September 28 th, 2016 1 / 23 Supervised Learning Roadmap Supervised Learning: Assume that we are given the features

More information

Demystifying deep learning. Artificial Intelligence Group Department of Computer Science and Technology, University of Cambridge, UK

Demystifying deep learning. Artificial Intelligence Group Department of Computer Science and Technology, University of Cambridge, UK Demystifying deep learning Petar Veličković Artificial Intelligence Group Department of Computer Science and Technology, University of Cambridge, UK London Data Science Summit 20 October 2017 Introduction

More information

Introduction to Convolutional Neural Networks (CNNs)

Introduction to Convolutional Neural Networks (CNNs) Introduction to Convolutional Neural Networks (CNNs) nojunk@snu.ac.kr http://mipal.snu.ac.kr Department of Transdisciplinary Studies Seoul National University, Korea Jan. 2016 Many slides are from Fei-Fei

More information

Lecture 5: Logistic Regression. Neural Networks

Lecture 5: Logistic Regression. Neural Networks Lecture 5: Logistic Regression. Neural Networks Logistic regression Comparison with generative models Feed-forward neural networks Backpropagation Tricks for training neural networks COMP-652, Lecture

More information

Multilayer Neural Networks

Multilayer Neural Networks Multilayer Neural Networks Introduction Goal: Classify objects by learning nonlinearity There are many problems for which linear discriminants are insufficient for minimum error In previous methods, the

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.5. Spring 2010 Instructor: Dr. Masoud Yaghini Outline How the Brain Works Artificial Neural Networks Simple Computing Elements Feed-Forward Networks Perceptrons (Single-layer,

More information

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks

ARTIFICIAL INTELLIGENCE. Artificial Neural Networks INFOB2KI 2017-2018 Utrecht University The Netherlands ARTIFICIAL INTELLIGENCE Artificial Neural Networks Lecturer: Silja Renooij These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html

More information

2018 EE448, Big Data Mining, Lecture 5. (Part II) Weinan Zhang Shanghai Jiao Tong University

2018 EE448, Big Data Mining, Lecture 5. (Part II) Weinan Zhang Shanghai Jiao Tong University 2018 EE448, Big Data Mining, Lecture 5 Supervised Learning (Part II) Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html Content of Supervised Learning

More information

y(x n, w) t n 2. (1)

y(x n, w) t n 2. (1) Network training: Training a neural network involves determining the weight parameter vector w that minimizes a cost function. Given a training set comprising a set of input vector {x n }, n = 1,...N,

More information

Multilayer Neural Networks

Multilayer Neural Networks Multilayer Neural Networks Multilayer Neural Networks Discriminant function flexibility NON-Linear But with sets of linear parameters at each layer Provably general function approximators for sufficient

More information

Lecture 12. Talk Announcement. Neural Networks. This Lecture: Advanced Machine Learning. Recap: Generalized Linear Discriminants

Lecture 12. Talk Announcement. Neural Networks. This Lecture: Advanced Machine Learning. Recap: Generalized Linear Discriminants Advanced Machine Learning Lecture 2 Neural Networks 24..206 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de Talk Announcement Yann LeCun (NYU & FaceBook AI) 28..

More information

Introduction to Deep Learning

Introduction to Deep Learning Introduction to Deep Learning A. G. Schwing & S. Fidler University of Toronto, 2015 A. G. Schwing & S. Fidler (UofT) CSC420: Intro to Image Understanding 2015 1 / 39 Outline 1 Universality of Neural Networks

More information

Machine Learning. Neural Networks

Machine Learning. Neural Networks Machine Learning Neural Networks Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 Biological Analogy Bryan Pardo, Northwestern University, Machine Learning EECS 349 Fall 2007 THE

More information

Revision: Neural Network

Revision: Neural Network Revision: Neural Network Exercise 1 Tell whether each of the following statements is true or false by checking the appropriate box. Statement True False a) A perceptron is guaranteed to perfectly learn

More information

Neural Networks. Nicholas Ruozzi University of Texas at Dallas

Neural Networks. Nicholas Ruozzi University of Texas at Dallas Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify

More information

Introduction to (Convolutional) Neural Networks

Introduction to (Convolutional) Neural Networks Introduction to (Convolutional) Neural Networks Philipp Grohs Summer School DL and Vis, Sept 2018 Syllabus 1 Motivation and Definition 2 Universal Approximation 3 Backpropagation 4 Stochastic Gradient

More information

Deep Learning (CNNs)

Deep Learning (CNNs) 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deep Learning (CNNs) Deep Learning Readings: Murphy 28 Bishop - - HTF - - Mitchell

More information

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

NONLINEAR CLASSIFICATION AND REGRESSION. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition NONLINEAR CLASSIFICATION AND REGRESSION Nonlinear Classification and Regression: Outline 2 Multi-Layer Perceptrons The Back-Propagation Learning Algorithm Generalized Linear Models Radial Basis Function

More information

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann

(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann (Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for

More information

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau

Last update: October 26, Neural networks. CMSC 421: Section Dana Nau Last update: October 26, 207 Neural networks CMSC 42: Section 8.7 Dana Nau Outline Applications of neural networks Brains Neural network units Perceptrons Multilayer perceptrons 2 Example Applications

More information

Deep Learning & Neural Networks Lecture 2

Deep Learning & Neural Networks Lecture 2 Deep Learning & Neural Networks Lecture 2 Kevin Duh Graduate School of Information Science Nara Institute of Science and Technology Jan 16, 2014 2/45 Today s Topics 1 General Ideas in Deep Learning Motivation

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks Oliver Schulte - CMPT 310 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of biological plausibility We will focus on

More information

Statistical NLP for the Web

Statistical NLP for the Web Statistical NLP for the Web Neural Networks, Deep Belief Networks Sameer Maskey Week 8, October 24, 2012 *some slides from Andrew Rosenberg Announcements Please ask HW2 related questions in courseworks

More information

Neural Networks. William Cohen [pilfered from: Ziv; Geoff Hinton; Yoshua Bengio; Yann LeCun; Hongkak Lee - NIPs 2010 tutorial ]

Neural Networks. William Cohen [pilfered from: Ziv; Geoff Hinton; Yoshua Bengio; Yann LeCun; Hongkak Lee - NIPs 2010 tutorial ] Neural Networks William Cohen 10-601 [pilfered from: Ziv; Geoff Hinton; Yoshua Bengio; Yann LeCun; Hongkak Lee - NIPs 2010 tutorial ] WHAT ARE NEURAL NETWORKS? William s notation Logis;c regression + 1

More information

Machine Learning

Machine Learning Machine Learning 10-315 Maria Florina Balcan Machine Learning Department Carnegie Mellon University 03/29/2019 Today: Artificial neural networks Backpropagation Reading: Mitchell: Chapter 4 Bishop: Chapter

More information

Pattern Classification

Pattern Classification Pattern Classification All materials in these slides were taen from Pattern Classification (2nd ed) by R. O. Duda,, P. E. Hart and D. G. Stor, John Wiley & Sons, 2000 with the permission of the authors

More information

What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1

What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 What Do Neural Networks Do? MLP Lecture 3 Multi-layer networks 1 Multi-layer networks Steve Renals Machine Learning Practical MLP Lecture 3 7 October 2015 MLP Lecture 3 Multi-layer networks 2 What Do Single

More information

COMP9444 Neural Networks and Deep Learning 2. Perceptrons. COMP9444 c Alan Blair, 2017

COMP9444 Neural Networks and Deep Learning 2. Perceptrons. COMP9444 c Alan Blair, 2017 COMP9444 Neural Networks and Deep Learning 2. Perceptrons COMP9444 17s2 Perceptrons 1 Outline Neurons Biological and Artificial Perceptron Learning Linear Separability Multi-Layer Networks COMP9444 17s2

More information

AI Programming CS F-20 Neural Networks

AI Programming CS F-20 Neural Networks AI Programming CS662-2008F-20 Neural Networks David Galles Department of Computer Science University of San Francisco 20-0: Symbolic AI Most of this class has been focused on Symbolic AI Focus or symbols

More information

Deep Learning for NLP Part 2

Deep Learning for NLP Part 2 Deep Learning for NLP Part 2 CS224N Christopher Manning (Many slides borrowed from ACL 2012/NAACL 2013 Tutorials by me, Richard Socher and Yoshua Bengio) 2 Part 1.3: The Basics Word Representations The

More information

Lecture 12. Neural Networks Bastian Leibe RWTH Aachen

Lecture 12. Neural Networks Bastian Leibe RWTH Aachen Advanced Machine Learning Lecture 12 Neural Networks 10.12.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de This Lecture: Advanced Machine Learning Regression

More information