Deep Learning Autoencoder Models
1 Deep Learning: Autoencoder Models. Davide Bacciu, Dipartimento di Informatica, Università di Pisa. Intelligent Systems for Pattern Recognition (ISPR).
2 Generative Models Wrap-up: Generative Graphical Models. Bayesian models/nets (plate-notation diagram): unsupervised data understanding; interpretability; weak on supervised performance. Markov Random Fields (undirected graph): knowledge and constraints through feature functions; CRFs, the supervised way to be generative; computationally heavy. Dynamic models (unfolded state diagram): topology unfolds on the data structure; structured data processing; complex causal relationships.
3 Generative Models Wrap-up: Take-Home Messages. Consider using generative models when you need interpretability, need to incorporate prior knowledge, do unsupervised learning or learning with partially observable supervision, or need reusable/portable learned knowledge. Consider avoiding generative models when you have tight computational constraints or deal with raw, noisy, low-level data. Variational inference is an efficient way to learn an approximation to intractable distributions; the variational function can be a neural network.
4 Deep Learning (diagram): from AI with hard-coded expert reasoning, to ML with expert-designed features feeding a trainable predictor, to ANNs with learned features, to deep learning with a learned feature hierarchy from input to prediction.
5 Module Outline. Recurrent, recursive and contextual models (Micheli): recurrent NN training refresher, recursive NNs, graph processing. Foundational models: autoencoders and RBMs, Convolutional Neural Networks, Gated Recurrent Networks (LSTM, ...). Advanced topics: adversarial models, memory networks and attentional models, variational deep learning, deep reinforcement learning. API and application seminars.
6 Reference Book. Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press: chapters 6-10, 14 and 20, complemented by course slides and additional readings. Freely available online.
7 Module Prerequisites. Formal model of the neuron; neural networks (feed-forward, recurrent); cost function optimization (backpropagation/SGD, regularization); neural network hyper-parameters and model selection.
8 Lecture Outline. Autoencoders, a.k.a. the first and the latest deep learning model; autoencoders and dimensionality reduction; deep neural autoencoders (sparse, denoising, contractive); deep generative autoencoders (Deep Belief Networks, Deep Boltzmann Machines); application examples.
9 Basic Autoencoder (AE). An encoder $f$ maps the $D$-dimensional input $x$ to a $K$-dimensional hidden code $h$; a decoder $g$ maps $h$ back to a reconstruction $\tilde{x}$: a latent-space projection (again). Train the model to reconstruct its input while passing through some form of information bottleneck: either $K \ll D$, or $h$ sparsely active. Train by loss minimization: $L(x, \tilde{x}) = L(x, g(f(x)))$.
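As a minimal sketch of this forward pass and loss (the layer sizes, sigmoid nonlinearity and use of NumPy are illustrative assumptions, and biases are omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 64, 8                       # input and bottleneck sizes (illustrative)
W_e = rng.normal(0, 0.1, (K, D))   # encoder weights
W_d = rng.normal(0, 0.1, (D, K))   # decoder weights

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(x):
    return sigmoid(W_e @ x)        # h = f(x)

def decode(h):
    return W_d @ h                 # x_tilde = g(h)

x = rng.normal(size=D)
x_tilde = decode(encode(x))
loss = np.sum((x - x_tilde) ** 2)  # L(x, g(f(x))), squared-error choice of L
```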
10 A Very Well Known Autoencoder. Encoding-decoding: $h = f(x) = W_e x$ and $\tilde{x} = g(h) = W_d W_e x$, with $W_e \in \mathbb{R}^{K \times D}$ and $W_d \in \mathbb{R}^{D \times K}$. With tied weights (often, not always), $W_d = W_e^T = W^T$, and Euclidean loss $L(x, \tilde{x}) = \|x - W^T W x\|_2^2$. What if we take $f$ and $g$ linear and $K \ll D$? The autoencoder learns the same subspace as PCA.
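A quick numerical check of the PCA claim; this is a sketch under illustrative assumptions (random correlated Gaussian data, plain gradient descent on the tied-weight Euclidean loss):

```python
import numpy as np

rng = np.random.default_rng(1)
D, K, N = 10, 2, 500
X = rng.normal(size=(N, D)) @ rng.normal(size=(D, D))  # correlated data
X -= X.mean(axis=0)

W = rng.normal(0, 0.01, size=(K, D))  # tied weights: x_tilde = W.T @ W @ x
lr = 1e-3
for _ in range(3000):
    R = X - X @ W.T @ W                        # reconstruction residuals
    W += lr * (W @ R.T @ X + W @ X.T @ R) / N  # gradient step on sum ||R||^2

# Top-K principal directions from the covariance eigendecomposition
_, V = np.linalg.eigh(np.cov(X, rowvar=False))
Q_pca = V[:, -K:]                    # D x K orthonormal basis (PCA)
Q_ae = np.linalg.qr(W.T)[0]          # orthonormal basis of the learned subspace
# Cosines of the principal angles close to 1 => the subspaces coincide
print(np.linalg.svd(Q_ae.T @ Q_pca, compute_uv=False))
```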
11 A Probabilistic View: the Stochastic Autoencoder. Replace the deterministic maps with distributions: a stochastic encoder $P_e(h|x)$ and a stochastic decoder $P_d(\tilde{x}|h)$. A unifying view of neural and generative AEs that paves the way to Variational Autoencoders.
12 Neural Autoencoders. Generally, we would like to train nonlinear AEs, possibly with $K > D$, that do not learn the trivial identity. Regularized autoencoders: sparse AE, denoising AE, contractive AE, autoencoders with dropout layers.
13 Sparse Autoencoder. Add a term to the cost function that penalizes $h$ (we want the number of active units to be small): $J_{SAE}(\theta) = \sum_{x \in S} \left( L(x, \tilde{x}) + \lambda \Omega(h) \right)$. Typically $\Omega(h) = \Omega(f(x)) = \sum_j |h_j(x)|$.
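Continuing the NumPy sketch from above (the function name and the value of lam are illustrative), the sparse objective simply adds the L1 term to the reconstruction loss:

```python
import numpy as np

def sparse_ae_loss(x, x_tilde, h, lam=1e-3):
    """Reconstruction error plus the L1 code penalty from the slide."""
    recon = np.sum((x - x_tilde) ** 2)  # L(x, x_tilde)
    sparsity = np.sum(np.abs(h))        # Omega(h) = sum_j |h_j(x)|
    return recon + lam * sparsity       # lam plays the role of lambda
```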
14 Probabilistic Interpretation (Oh No, Again!) (from the ML course). Training with regularization is MAP inference: $\max \log P(h, x) = \max \left( \log P(x|h) + \log P(h) \right)$, likelihood times prior. A Laplace prior $P(h) = \frac{\lambda}{2} \exp(-\lambda \|h\|_1)$ yields the penalty $\Omega(h) = \lambda \|h\|_1$.
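Spelling out the one-line derivation implicit in the slide (the per-unit factorization of the prior is an assumption made explicit here):

```latex
-\log P(h)
  = -\log \prod_{j=1}^{K} \tfrac{\lambda}{2}\, e^{-\lambda |h_j|}
  = \lambda \sum_{j=1}^{K} |h_j| - K \log \tfrac{\lambda}{2}
  = \lambda \lVert h \rVert_1 + \mathrm{const}
```

So maximizing $\log P(x|h) + \log P(h)$ is the same as minimizing the reconstruction loss plus $\lambda \|h\|_1$, which is exactly the sparse AE objective.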
15 Denoising Autoencoder (DAE). Train the AE to minimize $L(x, g(f(\tilde{x})))$, where $\tilde{x}$ is a version of the original input $x$ corrupted by some noise process $C(\tilde{x}|x)$. Key intuition: learned representations should be robust to partial destruction of the input.
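A sketch of the training objective, reusing the encode/decode functions from the first snippet (the masking-noise choice and the corruption probability p are illustrative; Gaussian noise is equally common):

```python
import numpy as np

rng = np.random.default_rng(2)

def corrupt(x, p=0.3):
    """Masking noise C(x_tilde | x): zero each component with probability p."""
    return x * (rng.random(x.shape) >= p)

def dae_loss(x, encode, decode):
    x_noisy = corrupt(x)               # x_tilde ~ C(x_tilde | x)
    x_recon = decode(encode(x_noisy))  # g(f(x_tilde))
    return np.sum((x - x_recon) ** 2)  # reconstruct the CLEAN input x
```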
16 Another Interpretation (yes, exactly the one you are thinking of). The DAE learns the denoising distribution $P(x|\tilde{x})$: sample a corrupted input $\tilde{x} \sim C(\tilde{x}|x)$, then minimize $-\log P_d(x \mid h = f(\tilde{x}))$.
17 DAE as Manifold Learning. The DAE learns a vector field (the green arrows in the slide figure) approximating the gradient of the log of the unknown data-generating distribution: $g(f(\tilde{x})) - \tilde{x} \approx \frac{\partial \log p(x)}{\partial x}$ when the corruption is Gaussian, $C(\tilde{x}|x) = \mathcal{N}(\tilde{x};\, x, \sigma^2 I)$.
18 The Manifold Assumption. Assume the data lies on a lower-dimensional non-linear manifold, since the variables in the data are typically dependent. A regularized AE can afford to represent only the variations needed to reconstruct the training examples: the AE mapping is sensitive only to changes along the manifold directions. Yoshua Bengio, Learning Deep Architectures for AI, Foundations and Trends in Machine Learning, 2009.
19 Contractive Autoencoder. Penalize the encoding function for its sensitivity to the input: $J_{CAE}(\theta) = \sum_{x \in S} \left( L(x, \tilde{x}) + \lambda \Omega(h) \right)$ with $\Omega(h) = \Omega(f(x)) = \left\| \frac{\partial f(x)}{\partial x} \right\|_F^2$, the squared Frobenius norm of the encoder Jacobian. You can just as well penalize higher-order derivatives.
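For a sigmoid encoder the penalty has a cheap closed form, since the Jacobian is $\mathrm{diag}(h \odot (1-h))\, W_e$. A sketch under that assumption (names carried over from the earlier snippets):

```python
import numpy as np

def contractive_penalty(x, W_e, b_e):
    """||J||_F^2 for the encoder h = sigmoid(W_e @ x + b_e)."""
    h = 1.0 / (1.0 + np.exp(-(W_e @ x + b_e)))
    dh2 = (h * (1.0 - h)) ** 2                     # squared sigmoid derivatives
    return np.sum(dh2 * np.sum(W_e ** 2, axis=1))  # sum_j dh2_j * ||W_e[j]||^2
```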
20 Hierarchical Autoencoder (stack diagram: $x \to h^1 \to h^2 \to h^3 \to h^4$; unsupervised training of the stack, supervised learning on top). It extracts a representation of the inputs that facilitates data visualization, exploration and indexing, and the realization of a supervised task.
21 Unsupervised Layerwise Pretraining. Incremental unsupervised construction of the deep AE, using any form of AE, e.g. those shown in the previous slides. Step 1 (diagram): train a first autoencoder $x \to h^1 \to \tilde{x}$ with encoder weights $W^1$ and tied decoder weights $W^{1T}$, then keep the encoder.
22 Unsupervised Layerwise Pretraining (continued). Step 2 (diagram): freeze $W^1$, feed the codes $h^1$ as inputs to a second autoencoder $h^1 \to h^2 \to \tilde{h}^1$ with weights $W^2$ (decoder $W^{2T}$), and stack its encoder on top.
23 Unsupervised Layerwise Pretraining (continued). Repeat layer by layer (diagram): train a third autoencoder on $h^2$ to obtain $W^3$ and $h^3$, then a fourth to obtain $W^4$ and $h^4$, growing the stack $x \to h^1 \to h^2 \to h^3 \to h^4$.
24 Optional Fine-Tuning. Unfold the stack into a full deep autoencoder (encoder weights $W^1 \dots W^4$, tied decoder weights $W^{4T} \dots W^{1T}$, diagram) and fine-tune the whole model to optimize input reconstruction. You can use backpropagation, but it remains an unsupervised task. A sketch of the whole greedy procedure follows.
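A compact sketch of the greedy procedure from slides 21-24, under illustrative assumptions (tied-weight AEs with a sigmoid encoder and linear decoder, plain gradient descent, made-up sizes and hyper-parameters):

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_shallow_ae(X, K, steps=500, lr=0.01):
    """Train one tied-weight AE on data X (N x D); return W (K x D)."""
    N, D = X.shape
    W = rng.normal(0, 0.1, (K, D))
    for _ in range(steps):
        H = sigmoid(X @ W.T)                # encode
        R = X - H @ W                       # linear decode, residuals
        dH = (R @ W.T) * H * (1 - H)        # backprop through the sigmoid
        W += lr * (H.T @ R + dH.T @ X) / N  # gradient step on sum ||R||^2
    return W

def pretrain_stack(X, sizes):
    """Greedy layerwise pretraining: each AE trains on the previous codes."""
    Ws, H = [], X
    for K in sizes:
        W = train_shallow_ae(H, K)
        Ws.append(W)
        H = sigmoid(H @ W.T)                # codes become the next layer's input
    return Ws                               # then unfold and fine-tune end to end

Ws = pretrain_stack(rng.normal(size=(200, 32)), sizes=[16, 8, 4])
```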
25 Rearranging the Graphics. Does it look like something familiar? Each encoder-decoder pair in the stack is a layered Restricted Boltzmann Machine: we can use RBMs to perform the layerwise pretraining and learn the matrices $W^i$.
26 Deep Belief Network (DBN). A stack of pairwise RBMs: RBM 1 on $x$ and $h^1$, RBM 2 on $h^1$ and $h^2$, RBM 3 on $h^2$ and $h^3$, RBM 4 on $h^3$ and $h^4$. IMPORTANT NOTE: a DBN is a deep autoencoder, but it is NOT a deep RBM; it is (mostly) directed!
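Each layer of a DBN is typically pretrained with contrastive divergence. A minimal CD-1 update for a binary RBM is sketched below (biases omitted, learning rate illustrative; using probabilities rather than samples in the negative phase is a common practical choice):

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cd1_step(v0, W, lr=0.05):
    """One CD-1 update for a binary RBM; v0 is N x D, W is K x D."""
    ph0 = sigmoid(v0 @ W.T)                   # P(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0) * 1.0  # sample the hidden state
    pv1 = sigmoid(h0 @ W)                     # P(v=1 | h0), reconstruction
    ph1 = sigmoid(pv1 @ W.T)                  # P(h=1 | v1)
    # positive statistics minus negative ("model") statistics
    W += lr * (ph0.T @ v0 - ph1.T @ pv1) / v0.shape[0]
    return W
```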
27 Deep Boltzmann Machine (DBM). How do we get this, i.e. a fully undirected stack $x, h^1, h^2, h^3, h^4$? Training requires some attention because of the recurrent interactions from the higher layers back to the bottom: $P(h_j^1 = 1 \mid x, h^2) = \sigma\left( \sum_i W_{ij}^1 x_i + \sum_m W_{jm}^2 h_m^2 \right)$ and $P(x_i = 1 \mid h^1) = \sigma\left( \sum_j W_{ij}^1 h_j^1 \right)$.
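The two conditionals from the slide, as code (a sketch; the shape conventions, with W1 of size K1 x D and W2 of size K2 x K1, are assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dbm_h1_conditional(x, h2, W1, W2):
    """P(h^1 = 1 | x, h^2): h^1 gets input from below (x) AND above (h^2)."""
    return sigmoid(W1 @ x + W2.T @ h2)

def dbm_x_conditional(h1, W1):
    """P(x = 1 | h^1): the visible layer only sees h^1."""
    return sigmoid(W1.T @ h1)
```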
28 Pretraining the DBM. How do we get this? 1) (Pre)training the first layer entails fitting the model $P(x|\theta) = \sum_{h^1} P(h^1|W^1)\, P(x|h^1, W^1)$, which implies a prior over the codes, $P(h^1|W^1) = \sum_x P(h^1, x|W^1)$. 2) (Pre)training the second layer changes the $h^1$ prior to $P(h^1|W^2) = \sum_{h^2} P(h^1, h^2|W^2)$. When putting things together, we need to average between the two.
29 Pretraining the DBM: the Trick. Averaging the two models of $h^1$ can be approximated by taking half of the contribution from $W^1$ and half from $W^2$; using the full $W^1$ and $W^2$ would double-count the contribution of $x$, since $h^2$ depends on $x$ (diagram: the two pretrained RBMs merge into the single undirected stack $x, h^1, h^2$). When training with more than two RBMs, apply the trick to the first and last layers and halve the weights (in both directions) of the intermediate RBMs.
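The intermediate-layer half of the recipe is mechanical enough to sketch (how the first and last layers realize their half contributions is the pretraining modification described on the slide and is not shown here):

```python
def assemble_dbm_weights(rbm_Ws):
    """Halve every intermediate RBM's weights (both directions) before
    joint DBM training, so no layer's input is double-counted."""
    Ws = [W.copy() for W in rbm_Ws]  # rbm_Ws: list of NumPy weight matrices
    for i in range(1, len(Ws) - 1):
        Ws[i] = 0.5 * Ws[i]
    return Ws
```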
30 DBM Discriminative Fine-Tuning. The pretrained DBM matrices can be used to initialize a deep autoencoder (diagram: $x \to h^1 \to h^2 \to$ output, with an extra $W^{2T}$ pathway feeding $P(h^2|v)$ into the first hidden layer): add the input from $h^2$ to the first hidden layer, add an output layer, then fine-tune the RBM matrices by backpropagation.
31 Software: Deep Neural Autoencoders. All deep learning frameworks offer facilities to build (deep) AEs. Check out the classic Theano-based tutorials for denoising autoencoders and their stacked version; a variety of deep AEs in Keras and their counterparts in Lua-Torch; stacked autoencoders built with the official Matlab toolbox functions.
32 Matlab: Deep Generative Models. Matlab code for the DBN paper with a demo on MNIST data; Matlab code for Deep Boltzmann Machines with a demo on MNIST data; Deepmat, a Matlab library for deep generative models; DeeBNet, a Matlab/Octave toolbox for deep generative models with GPU support.
33 Python: Deep Generative Models. DBN and DBM implementations exist for all major deep learning libraries: a Deep Boltzmann Machine implementation (TensorFlow-based) with an image-processing application, pre-trained networks and notebooks; Deepnet, a Toronto-based implementation of deep autoencoders (neural and generative); the classic Theano-based tutorials for deep belief networks and RBMs.
34 AE Visualization: visualizing complex data in the learned latent space.
35 Visualizing Sound.
36 AE Image Restoration/Colorization: apply the autoencoder construction with advanced building blocks (e.g. CNN layers).
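A minimal convolutional autoencoder in Keras illustrates what "advanced building blocks" means here; the 28x28 grayscale input, filter counts and loss are illustrative assumptions, not taken from the slides:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)           # 7x7x8 bottleneck code

x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)
decoded = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
# For restoration/denoising, train to map corrupted images to clean ones:
# autoencoder.fit(noisy_images, clean_images, epochs=10, batch_size=128)
```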
37 DBM Learning Image Features (figure: CIFAR-10 images, level-1 filters, level-2 filters).
38 Multimodal DBM (diagram: modality-specific stacks for modality 1 through modality K joined by modality-fusion layers). N. Srivastava, R. Salakhutdinov, Multimodal Learning with Deep Boltzmann Machines, JMLR 2014.
39 Multimodal DBM, Image and Text (figure: generating text from images, $P(\text{txt}|\text{img})$, and images from text, $P(\text{img}|\text{txt})$). N. Srivastava, R. Salakhutdinov, Multimodal Learning with Deep Boltzmann Machines, JMLR 2014.
40 Multimodal DBM: Sampling. N. Srivastava, R. Salakhutdinov, Multimodal Learning with Deep Boltzmann Machines, JMLR 2014.
41 Multimodal DBM: Multimodal Querying. N. Srivastava, R. Salakhutdinov, Multimodal Learning with Deep Boltzmann Machines, JMLR 2014.
42 Multimodal DBM. Pang et al., Deep Multimodal Learning for Affective Analysis and Retrieval, IEEE Trans. on Multimedia, 2015.
43 Take-Home Messages. Regularized autoencoders optimize reconstruction quality while constraining the stored information. Autoencoder training is manifold learning: learn a latent-space manifold where the input data resides, storing only the variations that are useful to represent the training data. Autoencoders learn a (conditional) distribution of the input data, $P(x|\tilde{x})$. Deep AEs: pretraining, fine-tuning, supervised optimization. Use AEs to find new/useful data representations, or to learn their distribution.
44 Next Lecture. The next lecture will be on Monday 16th April 2018; stay tuned on Moodle for the topic. Dates to finalize: lecture recovery on 19th April 2018 (room TBD); Midterm 2 on 3rd May 2018 (room TBD).