Deep Learning book, by Ian Goodfellow, Yoshua Bengio and Aaron Courville
Chapter 6: Deep Feedforward Networks
Benoit Massé, Dionyssos Kounades-Bastian
Linear regression (and classification)

- Input vector x
- Output vector y
- Parameters: weight W and bias b
- Prediction: y = Wx + b

[Diagram: inputs x_1, x_2, x_3 feed through the weights W and bias b into units u_1, u_2, giving the outputs y_1, y_2; the compact form draws a single edge labelled W, b from x to y.]
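As a concrete illustration, here is a minimal numpy sketch of the prediction y = Wx + b (the 3-input/2-output shapes follow the diagram above; the random initialization is an illustrative assumption):

```python
import numpy as np

# Linear model y = W x + b, with 3 inputs and 2 outputs as in the diagram.
W = np.random.randn(2, 3)   # weight matrix, one row per output unit
b = np.random.randn(2)      # bias vector

def predict(x):
    """Linear prediction y = W x + b."""
    return W @ x + b

x = np.array([1.0, 0.5, -2.0])
y = predict(x)              # y has shape (2,)
```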
Advantages:
- Easy to use
- Easy to train, low risk of overfitting

Drawbacks:
- Some problems are inherently non-linear
Solving XOR

Linear regressor: x_1, x_2 → (W, b) → y

There is no value of W and b such that

$$\forall (x_1, x_2) \in \{0, 1\}^2, \quad W \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + b = \mathrm{xor}(x_1, x_2)$$
What about stacking two linear layers, x → (W, b) → u → (V, c) → y?

This is strictly equivalent to a single linear layer: the composition of two linear operations is still a linear operation.
And what about inserting a non-linearity between the two layers, x → (W, b) → u → φ → h → (V, c) → y, in which φ(x) = max{0, x}?

It is possible! With

$$W = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \quad b = \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \quad V = \begin{pmatrix} 1 & -2 \end{pmatrix}, \quad c = 0,$$

we get $V \varphi(Wx + b) = \mathrm{xor}(x_1, x_2)$.
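Since the slide gives the values explicitly, we can verify them numerically; a minimal numpy check:

```python
import numpy as np

# The slide's XOR solution: y = V phi(W x + b), with phi = ReLU.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([0.0, -1.0])
V = np.array([1.0, -2.0])
c = 0.0

def phi(z):
    return np.maximum(0.0, z)   # ReLU: phi(z) = max{0, z}

for x1 in (0, 1):
    for x2 in (0, 1):
        x = np.array([x1, x2], dtype=float)
        y = V @ phi(W @ x + b) + c
        print(x1, x2, "->", y)  # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```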
Neural network with one hidden layer

Compact representation: x → (W, b, φ) → h → (V, c) → y

This is a neural network. A hidden layer with a non-linearity can represent a broader class of functions.
Universal approximation theorem

Theorem: A neural network with one hidden layer can approximate any continuous function.

More formally: given a continuous function f : C_n → R^m, where C_n is a compact subset of R^n,

$$\forall \varepsilon, \ \exists f^{\varepsilon}_{NN} : x \mapsto \sum_{i=1}^{K} v_i \, \varphi(w_i x + b_i) + c$$

such that $\forall x \in C_n, \ \|f(x) - f^{\varepsilon}_{NN}(x)\| < \varepsilon$.
Problems

Obtaining the network: the universal approximation theorem gives no information about HOW to obtain such a network:
- the size of the hidden layer h
- the values of W and b

Using the network: even if we find a way to obtain the network, the size of the hidden layer may be prohibitively large.
Deep neural network

Why deep? Let's stack l hidden layers one after the other; l is called the depth of the network.

x → (W_1, b_1, φ) → h_1 → ... → (W_l, b_l, φ) → h_l → (V, c) → y

Properties of DNNs:
- The universal approximation theorem still applies.
- Some functions can be approximated by a DNN with N hidden units yet would require O(e^N) hidden units to be represented by a shallow network.
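A minimal sketch of this stacked forward pass (the layer sizes, initialization, and ReLU choice below are illustrative assumptions):

```python
import numpy as np

def forward(x, weights, biases, V, c, phi=lambda z: np.maximum(0.0, z)):
    """Forward pass of a depth-l network: h_k = phi(W_k h_{k-1} + b_k)."""
    h = x
    for W_k, b_k in zip(weights, biases):
        h = phi(W_k @ h + b_k)
    return V @ h + c                     # linear output layer

# A depth-3 network with 4 units per hidden layer, on a 2-d input.
rng = np.random.default_rng(0)
sizes = [2, 4, 4, 4]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
V, c = rng.normal(size=(1, 4)), np.zeros(1)
y = forward(np.array([0.5, -1.0]), weights, biases, V, c)
```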
Summary

Comparison:
- Linear classifier: (+) simple; (−) limited representational power
- Shallow neural network (exactly one hidden layer): (+) unlimited representational power; (−) sometimes prohibitively wide
- Deep neural network: (+) unlimited representational power; (+) relatively small number of hidden units needed

Remaining problem: how to get this DNN?
On the path of getting my own DNN

Hyperparameters: first, we need to define the architecture of the DNN:
- the depth l
- the sizes of the hidden layers n_1, ..., n_l
- the activation function φ
- the output unit

Parameters: once the architecture is defined, we need to train the DNN, i.e. learn W_1, b_1, ..., W_l, b_l.
Hyperparameters

The depth l and the sizes of the hidden layers n_1, ..., n_l strongly depend on the problem to solve.

The activation function φ:
- ReLU g : x ↦ max{0, x}
- Sigmoid σ : x ↦ (1 + e^{−x})^{−1}
- Many others: tanh, RBF, softplus...

The output unit:
- Linear output E[y] = V h_l + c, for regression with a Gaussian distribution y ∼ N(E[y], I)
- Sigmoid output ŷ = σ(w h_l + b), for classification with a Bernoulli distribution P(y = 1 | x) = ŷ
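For reference, a minimal numpy sketch of the ReLU and sigmoid activations and of the two output units (h_l, V, c, w, b below are placeholder values, not values from the slides):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)           # g(x) = max{0, x}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))     # sigma(x) = (1 + e^{-x})^{-1}

# Output units on top of the last hidden layer h_l (placeholder values).
h_l = np.array([0.3, -1.2])
V, c = np.array([[0.5, 0.1]]), np.array([0.0])
w, b = np.array([0.5, 0.1]), 0.0

y_mean = V @ h_l + c                    # linear output: E[y] = V h_l + c
p_class1 = sigmoid(w @ h_l + b)         # sigmoid output: P(y = 1 | x)
```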
Parameters: training

Objective: let θ = (W_1, b_1, ..., W_l, b_l). Suppose we have a set of inputs X = (x_1, ..., x_N) and a set of expected outputs Y = (y_1, ..., y_N). The goal is to find a neural network f_NN such that f_NN(x_i, θ) ≈ y_i for all i.
Cost function: to evaluate the error that our current network makes, let's define a cost function L(X, Y, θ). The goal becomes to find

$$\operatorname*{argmin}_{\theta} L(X, Y, \theta)$$

The loss function should represent a combination of the distances between every y_i and the corresponding f_NN(x_i, θ):
- Mean squared error (rare)
- Cross-entropy
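A minimal sketch of the two losses named above, for a batch of predictions y_hat against targets y (the binary-target form of the cross-entropy is an assumption, matching the Bernoulli output unit earlier):

```python
import numpy as np

def mean_squared_error(y_hat, y):
    # Average squared distance between predictions and targets.
    return np.mean((y_hat - y) ** 2)

def binary_cross_entropy(y_hat, y, eps=1e-12):
    # Clip to avoid log(0) when the network saturates.
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
```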
Find the minimum: the basic idea consists in computing θ̂ such that ∇_θ L(X, Y, θ̂) = 0. This is difficult to solve analytically, e.g. when θ has millions of degrees of freedom.
Gradient descent: let's use a numerical way to optimize θ, called gradient descent (section 4.3). The idea is that

$$f(\theta - \varepsilon u) \approx f(\theta) - \varepsilon\, u^{\top} \nabla f(\theta)$$

So if we take u = ∇f(θ), we have uᵀu > 0 and then

$$f(\theta - \varepsilon u) \approx f(\theta) - \varepsilon\, u^{\top} u < f(\theta)$$

If f is a function to minimize, we thus have an update rule that improves our estimate.
Gradient descent algorithm:
1. Have an estimate θ̂ of the parameters
2. Compute ∇_θ L(X, Y, θ̂)
3. Update θ̂ ← θ̂ − ε ∇_θ L
4. Repeat steps 2-3 until ‖∇_θ L‖ < threshold

Problem: how to estimate ∇_θ L(X, Y, θ̂) efficiently? The back-propagation algorithm.
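A minimal numpy sketch of the four-step loop above; grad_loss stands in for the back-propagated gradient, and the quadratic test loss in the usage line is an illustrative assumption:

```python
import numpy as np

def gradient_descent(theta, grad_loss, eps=0.1, threshold=1e-6, max_iter=10000):
    for _ in range(max_iter):
        g = grad_loss(theta)                # step 2: compute the gradient
        if np.linalg.norm(g) < threshold:   # step 4: stopping criterion
            break
        theta = theta - eps * g             # step 3: update the estimate
    return theta

# Usage: minimize L(theta) = ||theta||^2, whose gradient is 2 * theta;
# the estimate converges to the minimizer theta = 0.
theta_hat = gradient_descent(np.array([3.0, -2.0]), lambda t: 2.0 * t)
```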
Back-propagation for Parameter Learning

Consider the architecture x → (w_1, φ) → (w_2, φ) → y, i.e. the function

$$y = \varphi\big(w_2\, \varphi(w_1 x)\big),$$

some training pairs T = {(x̂_n, ŷ_n)}, n = 1, ..., N, and an activation function φ(·).

Goal: learn w_1, w_2 so that feeding x̂_n yields ŷ_n.
Prerequisite: differentiable activation function

For learning to be possible, φ(·) has to be differentiable. Let φ′(x) = ∂φ(x)/∂x denote the derivative of φ(x). For example, when φ(x) = ReLU(x), the derivative is φ′(x) = 0 for x < 0 and φ′(x) = 1 for x > 0.
Gradient-based Learning

Minimize the loss function L(w_1, w_2, T). We will learn the weights by iterating

$$\begin{pmatrix} w_1 \\ w_2 \end{pmatrix}^{\text{updated}} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} - \gamma \begin{pmatrix} \partial L / \partial w_1 \\ \partial L / \partial w_2 \end{pmatrix}, \qquad (1)$$

where L is the loss function (it must be differentiable): in detail it is L(w_1, w_2, T), and we want to compute the gradients at w_1, w_2; γ is the learning rate (a scalar, typically known).
Back-propagation

Calculate intermediate values on all units:
1. a = w_1 x̂_n
2. b = φ(w_1 x̂_n)
3. c = w_2 φ(w_1 x̂_n)
4. d = φ(w_2 φ(w_1 x̂_n))
5. L(d) = L(φ(w_2 φ(w_1 x̂_n)))

The partial derivatives are:
- ∂L(d)/∂d = L′(d)
- ∂d/∂c = φ′(w_2 φ(w_1 x̂_n))
- ∂c/∂b = w_2
- ∂b/∂a = φ′(w_1 x̂_n)
Calculating the Gradients I

Apply the chain rule:

$$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial d} \frac{\partial d}{\partial c} \frac{\partial c}{\partial b} \frac{\partial b}{\partial a} \frac{\partial a}{\partial w_1}, \qquad \frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial d} \frac{\partial d}{\partial c} \frac{\partial c}{\partial w_2}.$$

Start the calculation from left to right: we propagate the gradients (the partial products) from the last layer towards the input.
Calculating the Gradients

And because we have N training pairs:

$$\frac{\partial L}{\partial w_1} = \sum_{n=1}^{N} \frac{\partial L(d_n)}{\partial d_n} \frac{\partial d_n}{\partial c_n} \frac{\partial c_n}{\partial b_n} \frac{\partial b_n}{\partial a_n} \frac{\partial a_n}{\partial w_1}, \qquad \frac{\partial L}{\partial w_2} = \sum_{n=1}^{N} \frac{\partial L(d_n)}{\partial d_n} \frac{\partial d_n}{\partial c_n} \frac{\partial c_n}{\partial w_2}.$$
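Putting the pieces together, here is a minimal sketch of back-propagation for this two-weight scalar network, with φ = ReLU and a squared-error loss L(d) = (d − ŷ)² as illustrative assumptions (the slides leave L unspecified):

```python
def phi(z):
    return max(0.0, z)                  # ReLU activation

def dphi(z):
    return 1.0 if z > 0 else 0.0        # its derivative (0 at z = 0 by convention)

def gradients(w1, w2, xs, ys):
    """Sum the per-pair chain-rule products over the N training pairs."""
    gw1, gw2 = 0.0, 0.0
    for x_n, y_n in zip(xs, ys):
        # Forward pass: the intermediate values a, b, c, d from the slides.
        a = w1 * x_n
        b = phi(a)
        c = w2 * b
        d = phi(c)
        # Backward pass: local derivatives, last layer first.
        dL_dd = 2.0 * (d - y_n)         # L'(d) for the squared-error loss
        dd_dc = dphi(c)
        gw2 += dL_dd * dd_dc * b                    # dL/dd * dd/dc * dc/dw2
        gw1 += dL_dd * dd_dc * w2 * dphi(a) * x_n   # full chain down to w1
    return gw1, gw2

# One update of rule (1) with learning rate gamma, on made-up data.
xs, ys = [1.0, 2.0, -1.0], [0.5, 1.0, 0.0]
w1, w2, gamma = 0.3, 0.7, 0.01
gw1, gw2 = gradients(w1, w2, xs, ys)
w1, w2 = w1 - gamma * gw1, w2 - gamma * gw2
```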
Thank you!