Deep Learning Lab Course 2017 (Deep Learning Practical)
Labs: (Computer Vision) Thomas Brox, (Robotics) Wolfram Burgard, (Machine Learning) Frank Hutter, (Neurorobotics) Joschka Boedecker
University of Freiburg, January 2018
Andreas Eitel
Technical Issues
Location: Monday, 14:00-16:00, building 082.
Remark: We will be there for questions every week from 14:00-16:00. We expect you to work on your own. Your attendance is required during lectures/presentations.
We expect you to have basic knowledge in ML (e.g. you have heard the Machine Learning lecture).
Contact information (tutors): Aaron Klein, Artemij Amiranashvili, Mohammadreza Zolfaghari, Maria Hügle, Jingwei Zhang, Andreas Eitel
Homepage: deep learning lab course website
Schedule and outline
Phase 1
Today: Introduction to Deep Learning (lecture).
Assignment 1) Implementation of a feed-forward neural network from scratch (each person individually, 1-2 page report).
23.10: meeting, solving open questions
30.10: meeting, solving open questions
06.11: Intro to CNNs and TensorFlow (mandatory lecture), hand in Assignment 1
Assignment 2) Train a CNN in TensorFlow (each person individually, 1-2 page report)
13.11: meeting, solving open questions
Phase 2 (split into three tracks)
20.11: First meeting in tracks, hand in Assignment 2
Assignment 3) Group work, track-specific task, 1-2 page report
Assignment 4) Group work, advanced topic, 1-2 page report
Phase 3
15.01: meeting, final project topics and solving open questions
Assignment 5) Solve a challenging problem of your choice using the tools from Phase 2 (group work + 5 min presentation)
Tracks
Track 1 (Neurorobotics/Robotics): Robot navigation, Deep reinforcement learning
Track 2 (Machine Learning): Architecture search and hyperparameter optimization, Visual recognition
Track 3 (Computer Vision): Image segmentation, Autoencoders, Generative adversarial networks
Groups and Evaluation
In total we have three practical phases.
Reports: Short 1-2 page report explaining your results, typically accompanied by 1-2 figures (e.g. a learning curve). Hand in the code you used to generate these results as well; send us your github/bitbucket repo.
Presentations: Phase 3: 5 min group work presentation in the last week.
The reports and presentation after each exercise will be evaluated w.r.t. scientific quality and quality of presentation.
Requirements for passing: attending the class in the lecture and presentation weeks; active participation in the small groups, i.e., programming, testing, writing the report.
Today...
Groups: You will work on your own for the first exercise.
Assignment: You will have three weeks to build an MLP and apply it to the MNIST dataset (more on this at the end).
Lecture: Short recap of how MLPs (feed-forward neural networks) work and how to train them.
(Deep) Machine Learning Overview
Data -> Model M -> inference
1. Learn the model M from the data
2. Let the model M infer unknown quantities from data
Machine Learning Overview
What is the difference between deep learning and a standard machine learning pipeline?
Standard Machine Learning Pipeline
(1) Engineer good features (not learned)
(2) Learn model
(3) Inference, e.g. classes of unseen data
[Figure: data mapped to hand-engineered features (e.g. color, height: 4 cm, 3.5 cm, 10 cm), then to a model, then to inference]
Unsupervised Feature Learning Pipeline
(1a) Maybe engineer good features (not learned)
(1b) Learn a (deep) representation unsupervisedly
(2) Learn model
(3) Inference, e.g. classes of unseen data
[Figure: raw pixels mapped to (optionally hand-engineered) features, then to an unsupervisedly learned representation, then to a model and inference]
Supervised Deep Learning Pipeline
(1) Jointly learn everything with a deep architecture
(2) Inference, e.g. classes of unseen data
[Figure: raw pixels mapped directly to classes by a deep architecture]
Training supervised feed-forward neural networks
Let's formalize! We are given:
A dataset D = {(x_1, y_1), ..., (x_N, y_N)}
A neural network with parameters θ which implements a function f_θ(x)
We want to learn:
The parameters θ such that ∀i ∈ [1, N]: f_θ(x_i) = y_i
Training supervised feed-forward neural networks
A neural network with parameters θ which implements a function f_θ(x).
θ is given by the network weights w and bias terms b.
[Figure: feed-forward network with inputs x_1 ... x_N, hidden layers h^(1)(x), h^(2)(x), and softmax outputs f(x)]
Neural network forward-pass
Computing f_θ(x) for a neural network is a forward-pass.
Unit i activation: a_i = sum_{j=1}^{N} w_{i,j} x_j + b_i
Unit i output: h_i^(1)(x) = t(a_i), where t(.) is an activation or transfer function.
[Figure: network diagram highlighting unit i with incoming weights w_{i,j} and bias b_i]
Neural network forward-pass
Computing f_θ(x) for a neural network is a forward-pass.
Alternatively (and much faster) use vector notation:
Layer 1 activation: a^(1) = W^(1) x + b^(1)
Layer 1 output: h^(1)(x) = t(a^(1)), where t(.) is applied element-wise.
[Figure: network diagram with first-layer weight matrix W^(1)]
Neural network forward-pass
Computing f_θ(x) for a neural network is a forward-pass.
Second layer:
Layer 2 activation: a^(2) = W^(2) h^(1)(x) + b^(2)
Layer 2 output: h^(2)(x) = t(a^(2)), where t(.) is applied element-wise.
[Figure: network diagram with weight matrices W^(1), W^(2)]
Neural network forward-pass
Computing f_θ(x) for a neural network is a forward-pass.
Output layer:
Output layer activation: a^(3) = W^(3) h^(2)(x) + b^(3)
Network output: f(x) = o(a^(3)), where o(.) is the output nonlinearity.
For classification use softmax: o_i(z) = e^{z_i} / sum_j e^{z_j}
[Figure: network diagram with weight matrices W^(1), W^(2), W^(3)]
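To make the forward pass concrete, here is a minimal NumPy sketch of the three-layer network above, assuming tanh hidden units and a softmax output; the parameter names and shapes are illustrative and not taken from the assignment stub:

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability; this does not change the result
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def forward_pass(x, params, t=np.tanh):
    """Forward pass of a network with 2 hidden layers and a softmax output.

    params is a list of (W, b) tuples, one per layer; t is the hidden
    activation (transfer) function.
    """
    (W1, b1), (W2, b2), (W3, b3) = params
    a1 = W1 @ x + b1      # layer 1 activation
    h1 = t(a1)            # layer 1 output
    a2 = W2 @ h1 + b2     # layer 2 activation
    h2 = t(a2)            # layer 2 output
    a3 = W3 @ h2 + b3     # output layer activation
    return softmax(a3)    # class probabilities f(x)
```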
Training supervised feed-forward neural networks
Neural network activation functions. Typical nonlinearities for hidden layers are:
tanh(a_i), sigmoid σ(a_i) = 1 / (1 + e^{-a_i}), ReLU relu(a_i) = max(a_i, 0)
tanh and sigmoid are both squashing non-linearities; ReLU just thresholds at 0.
Why not linear?
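As a small sketch, the activation functions written out in NumPy, plus a check that answers the "why not linear?" question: stacking purely linear layers collapses into a single linear map, so depth adds no expressive power. The shapes and random weights below are placeholders:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def relu(a):
    return np.maximum(a, 0.0)

# Why not linear? Two stacked linear layers collapse into one linear map:
W1, b1 = np.random.randn(4, 3), np.random.randn(4)
W2, b2 = np.random.randn(2, 4), np.random.randn(2)
x = np.random.randn(3)
two_layer = W2 @ (W1 @ x + b1) + b2
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)
assert np.allclose(two_layer, one_layer)  # no extra expressive power
```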
Training supervised feed-forward neural networks
Train parameters θ such that ∀i ∈ [1, N]: f_θ(x_i) = y_i.
We can do this via minimizing the empirical risk on our dataset D:
min_θ L(f_θ, D) = min_θ (1/N) sum_{i=1}^{N} l(f_θ(x_i), y_i),    (1)
where l(., .) is a per-example loss.
For regression, often use the squared loss: l(f_θ(x), y) = (1/2) sum_{j=1}^{M} (f_{j,θ}(x) - y_j)^2
For M-class classification, use the negative log likelihood: l(f_θ(x), y) = - sum_{j=1}^{M} y_j log(f_{j,θ}(x))
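A sketch of the two per-example losses in NumPy, assuming the network output f_x and the target y are plain arrays and that y is one-hot encoded in the classification case:

```python
import numpy as np

def squared_loss(f_x, y):
    # regression: 1/2 * sum_j (f_j(x) - y_j)^2
    return 0.5 * np.sum((f_x - y) ** 2)

def nll_loss(f_x, y_onehot, eps=1e-12):
    # M-class classification: -sum_j y_j * log(f_j(x)); eps avoids log(0)
    return -np.sum(y_onehot * np.log(f_x + eps))
```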
Gradient descent
The simplest approach to minimizing min_θ L(f_θ, D) is gradient descent.
Gradient descent:
θ^0 <- init randomly
do
  θ^{t+1} = θ^t - γ ∂L(f_θ, D)/∂θ
while (L(f_{θ^{t+1}}, V) - L(f_{θ^t}, V))^2 > ε
Where V is a validation dataset (why not use D?).
Remember in our case: L(f_θ, D) = (1/N) sum_{i=1}^{N} l(f_θ(x_i), y_i)
We will get to computing the derivatives shortly.
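A minimal Python sketch of this loop, assuming helper functions loss(theta, data) and grad(theta, data) that compute the empirical risk and its gradient; these helpers are an assumption of the sketch, not part of the slides:

```python
def gradient_descent(theta0, grad, loss, train, val, gamma=0.1, eps=1e-8):
    """Plain (full-batch) gradient descent with a validation-based stop."""
    theta = theta0
    prev = loss(theta, val)
    while True:
        theta = theta - gamma * grad(theta, train)  # update on the full training set
        cur = loss(theta, val)
        if (cur - prev) ** 2 <= eps:                # stop when L barely changes on V
            return theta
        prev = cur
```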
Gradient descent
Gradient descent example: D = {(x_1, y_1), ..., (x_100, y_100)} with x ~ U[0, 1] and y = 3x + ε, where ε ~ N(0, 0.1).
Learn the parameter θ of the function f_θ(x) = θx using the loss l(f_θ(x), y) = (1/2) ||f_θ(x) - y||_2^2 = (1/2) (f_θ(x) - y)^2
∂L(f_θ, D)/∂θ = (1/N) sum_{i=1}^{N} ∂l(f_θ(x_i), y_i)/∂θ = (1/N) sum_{i=1}^{N} (θ x_i - y_i) x_i
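The toy problem can be reproduced in a few lines; the random seed, step size and iteration count below are illustrative choices rather than values from the slides:

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.uniform(0.0, 1.0, size=100)
y = 3.0 * x + rng.normal(0.0, 0.1, size=100)

theta, gamma = 0.0, 2.0
for _ in range(50):
    grad = np.mean((theta * x - y) * x)  # dL/dtheta = 1/N sum (theta*x_i - y_i) x_i
    theta = theta - gamma * grad
print(theta)  # ends up close to 3
```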
Gradient descent
Gradient descent example with γ = 2.
[Figure: gradient descent convergence plot]
Stochastic Gradient Descent (SGD)
There are two problems with gradient descent:
1. We have to find a good γ.
2. Computing the gradient is expensive if the training dataset is large!
Problem 2 can be attacked with online optimization (we will have a look at this).
Problem 1 remains but can be tackled via second-order methods or other advanced optimization algorithms (rprop/rmsprop, adagrad).
Gradient descent
1. We have to find a good γ (γ = 2, γ = 5).
[Figure: gradient descent with two different learning rates]
Stochastic Gradient Descent (SGD)
2. Computing the gradient is expensive if the training dataset is large!
What if we only evaluated f on parts of the data?
Stochastic Gradient Descent:
θ^0 <- init randomly
do
  (x', y') <- sample example from dataset D
  θ^{t+1} = θ^t - γ_t ∂l(f_θ(x'), y')/∂θ
while (L(f_{θ^{t+1}}, V) - L(f_{θ^t}, V))^2 > ε
where sum_{t=1}^∞ γ_t = ∞ and sum_{t=1}^∞ (γ_t)^2 < ∞ (γ should go to zero, but not too fast).
SGD can speed up optimization for large datasets but can yield very noisy updates; in practice mini-batches are used (compute l(., .) for several samples and average). We still have to find a learning rate schedule γ_t.
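A sketch of mini-batch SGD with a simple 1/t learning-rate schedule; the helper grad_example (gradient of the per-example loss averaged over a mini-batch), the batch size, and re-evaluating the validation loss after every update are assumptions chosen to mirror the pseudocode above:

```python
import numpy as np

def sgd(theta0, grad_example, loss, train, val, batch_size=32,
        gamma0=0.01, eps=1e-8, rng=np.random):
    """Mini-batch SGD with schedule gamma_t = gamma0 / t."""
    xs, ys = train
    theta, t = theta0, 1
    prev = loss(theta, val)
    while True:
        idx = rng.choice(len(xs), size=batch_size)   # sample a mini-batch
        theta = theta - (gamma0 / t) * grad_example(theta, xs[idx], ys[idx])
        t += 1
        cur = loss(theta, val)
        if (cur - prev) ** 2 <= eps:
            return theta
        prev = cur
```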
Stochastic Gradient Descent (SGD)
Same data, assuming that a gradient evaluation on all the data takes 4 times as much time as evaluating a single datapoint (gradient descent (γ = 2), stochastic gradient descent (γ_t = 0.01/t)).
[Figure: loss over time for gradient descent vs. SGD]
Neural Network backward pass
Now how do we compute the gradient of l(f(x), y) for a network?
Use the chain rule: ∂f(g(x))/∂x = (∂f(g(x))/∂g(x)) (∂g(x)/∂x)
First compute the loss on the output layer, then backpropagate to get ∂l(f(x), y)/∂W^(3) and ∂l(f(x), y)/∂a^(3).
Gradient wrt. layer 3 weights: ∂l(f(x), y)/∂W^(3) = (∂l(f(x), y)/∂a^(3)) (∂a^(3)/∂W^(3))
Assuming l is the NLL and softmax outputs, the gradient wrt. the layer 3 activation is: ∂l(f(x), y)/∂a^(3) = -(y - f(x)), where y is one-hot encoded.
Gradient of a^(3) wrt. W^(3): ∂a^(3)/∂W^(3) = (h^(2)(x))^T
Combined: ∂l(f(x), y)/∂W^(3) = -(y - f(x)) (h^(2)(x))^T
Gradient wrt. the previous layer: ∂l(f(x), y)/∂h^(2)(x) = (∂a^(3)/∂h^(2)(x))^T ∂l(f(x), y)/∂a^(3) = (W^(3))^T ∂l(f(x), y)/∂a^(3)
Recall a^(3) = W^(3) h^(2)(x) + b^(3).
[Figure: network diagram with the loss l(f(x), y) attached to the output]
Neural Network backward pass
Now how do we compute the gradient for a network?
Gradient wrt. layer 2 weights: ∂l(f(x), y)/∂W^(2) = (∂l(f(x), y)/∂h^(2)(x)) (∂h^(2)(x)/∂a^(2)) (∂a^(2)/∂W^(2))
Same schema as before; we just have to consider computing the derivative of the activation function ∂h^(2)(x)/∂a^(2), e.g. for the sigmoid σ(.): ∂h_i^(2)(x)/∂a_i^(2) = σ(a_i)(1 - σ(a_i))
And backprop even further...
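Putting the last two layers together, here is a NumPy sketch of the backward pass for the NLL/softmax output and a sigmoid hidden layer; the cache and parameter layout are assumptions chosen to match the forward-pass sketch above:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backward_pass(y_onehot, cache, params):
    """Gradients for the output layer and the last hidden layer.

    cache holds (h1, a2, h2, f_x) from the forward pass; params holds
    ((W2, b2), (W3, b3)), of which only W3 is needed for backpropagation.
    """
    (W2, b2), (W3, b3) = params
    h1, a2, h2, f_x = cache

    # output layer: NLL + softmax  =>  dl/da3 = -(y - f(x)) = f(x) - y
    d_a3 = f_x - y_onehot
    d_W3 = np.outer(d_a3, h2)   # dl/dW3 = dl/da3 * (h^(2)(x))^T
    d_b3 = d_a3

    # backpropagate to the previous layer: dl/dh2 = W3^T dl/da3
    d_h2 = W3.T @ d_a3
    d_a2 = d_h2 * sigmoid(a2) * (1.0 - sigmoid(a2))  # sigmoid derivative
    d_W2 = np.outer(d_a2, h1)
    d_b2 = d_a2
    return d_W2, d_b2, d_W3, d_b3
```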
Gradient Checking
The backward pass is just repeated application of the chain rule. However, there is huge potential for bugs... Gradient checking to the rescue (simply check your code via finite differences):
Gradient Checking:
θ = (W^(1), b^(1), ...) <- init randomly
x <- init randomly; y <- init randomly
g_analytic <- compute gradient ∂l(f_θ(x), y)/∂θ
for i in #θ:
  θ_hat = θ; θ_hat_i = θ_hat_i + ε
  g_numeric,i = (l(f_θ_hat(x), y) - l(f_θ(x), y)) / ε
  assert(|g_numeric - g_analytic| < ε)
Can also be used to test partial implementations (i.e. layers, activation functions): simply remove the loss computation and backprop ones via backprop.
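A sketch of the finite-difference check in NumPy, assuming you provide loss_fn(theta) and grad_fn(theta) from your own implementation; the step size and tolerance are illustrative:

```python
import numpy as np

def check_gradient(loss_fn, grad_fn, theta, eps=1e-6, tol=1e-4):
    """Compare an analytic gradient against a one-sided finite difference."""
    g_analytic = grad_fn(theta)
    g_numeric = np.zeros_like(theta)
    base = loss_fn(theta)
    for i in range(theta.size):
        theta_hat = theta.copy()
        theta_hat.flat[i] += eps            # perturb a single parameter
        g_numeric.flat[i] = (loss_fn(theta_hat) - base) / eps
    assert np.max(np.abs(g_numeric - g_analytic)) < tol
```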
Overfitting
If you train the parameters of a large network θ = (W^(1), b^(1), ...) you will see overfitting: L(f_θ, D) << L(f_θ, V)!
This can be at least partly conquered with regularization.
Weight decay: change the cost (and gradient) to
min_θ L(f_θ, D) = (1/N) sum_{i=1}^{N} l(f_θ(x_i), y_i) + (1/#θ) sum_{i=1}^{#θ} θ_i^2
Enforces small weights (Occam's razor).
Dropout: kill 50% of the activations in each hidden layer during the training forward pass; multiply hidden activations by 1/2 during testing.
Prevents co-adaptation / enforces robustness to noise.
Many, many more!
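Two small sketches matching the slide: the weight-decay penalty as written above, and standard (non-inverted) dropout that scales activations by 1/2 at test time. Function names and the drop-probability convention are illustrative assumptions:

```python
import numpy as np

def l2_penalty(weights):
    # weight decay term (1/#theta) * sum_i theta_i^2 added to the loss
    n = sum(W.size for W in weights)
    return sum(np.sum(W ** 2) for W in weights) / n

def dropout_forward(h, p=0.5, train=True, rng=np.random):
    """Drop a fraction p of hidden activations during training;
    at test time scale activations by the keep probability (1 - p)."""
    if train:
        mask = rng.rand(*h.shape) >= p
        return h * mask
    return h * (1.0 - p)
```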
Assignment
Implementation: Implement a simple feed-forward neural network by completing the provided stub. This includes:
- possibility to use 2-4 layers
- sigmoid/tanh and ReLU for the hidden layers
- softmax output layer
- optimization via gradient descent (gd)
- optimization via stochastic gradient descent (sgd)
- gradient checking code (!!!)
- weight initialization with random noise (!!!) (use a normal distribution with changing std. deviation for now)
Bonus points for testing some advanced ideas: implement dropout and l2 regularization; implement a different optimization scheme (rprop, rmsprop, adagrad).
Code stub:
Evaluation: Find good parameters (learning rate, number of iterations etc.) using a validation set (usually take the last 10k examples from the training set). After optimizing parameters, run on the full dataset and test once on the test set (you should be able to reach 1.6-1.8% error).
Submission: Clone our github repo and send us the link to your github/bitbucket repo including your solution code and report. Send it to eitel@cs.uni-freiburg.de, hueglem@cs.uni-freiburg.de and zhang@cs.uni-freiburg.de
Slide Information
These slides have been created by Tobias Springenberg for the DL Lab Course WS 2016/2017.