Lecture 14: Deep Generative Learning


CSED703R: Deep Learning for Visual Recognition (2017F)
Bohyung Han, Computer Vision Lab.

Generative Modeling
- Density estimation: reconstructing a probability density function from samples.
- Sample generation: producing new samples that resemble the training examples.

Generative Modeling by Neural Networks
- Variational Autoencoder (VAE): a neural network with continuous latent variables; the encoder approximates a posterior distribution, and the decoder stochastically reconstructs the data from the latent variables.
- Generative Adversarial Networks (GANs): the generator produces samples from a simple noise distribution (e.g., uniform), and the discriminator discriminates between real and generated images.
- Deep Boltzmann Machines (DBMs)
- Deep Belief Networks (DBNs)

Why Generative Modeling?
- Limitations of discriminative neural networks: they require large-scale labeled training data, make unreasonable mistakes for trivial data variations, and can get stuck in poor local optima.
- Generative modeling: adjust the model parameters to maximize the probability that the data would have been produced.

Autoencoder
- Traditional autoencoders perform feature learning by minimizing the reconstruction error.
- Basic pipeline: an input image x is mapped by the encoder to a latent variable z, and the decoder maps z back to a reconstructed input x̂; training minimizes the reconstruction loss L = ||x - x̂||².
- Supervised feature learning: reuse the encoder as a feature extractor, attach a classifier that predicts the class label y, and fine-tune the whole network by backpropagation.

Variational Autoencoder
Characteristics
- A Bayesian approach on an autoencoder, used for sample generation.
- Estimates the model parameters θ without access to the latent variable z.

Problem scenario
- A latent value z^(i) (classes, attributes) is generated from a true prior p_θ(z), and an observation x^(i) (image) is generated from a true conditional p_θ(x|z).

Limitation
- It is difficult to estimate p_θ(z) and p_θ(x|z) in practice, because the true parameters θ* and the values of the latent variables z^(i) are hidden; we need to approximate these distributions.
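To make the basic pipeline concrete, the following is a minimal sketch assuming PyTorch as the framework; the layer sizes, the 784-dimensional flattened input, and the dummy minibatch are illustrative assumptions, not part of the lecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Autoencoder(nn.Module):
    """Encoder maps image x to latent z; decoder reconstructs x from z."""
    def __init__(self, x_dim=784, z_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                     nn.Linear(256, z_dim))
        self.decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                     nn.Linear(256, x_dim))

    def forward(self, x):
        z = self.encoder(x)          # latent variable z
        return self.decoder(z)       # reconstructed input x_hat

model = Autoencoder()
x = torch.rand(16, 784)              # a dummy minibatch of flattened images
loss = F.mse_loss(model(x), x)       # reconstruction error L = ||x - x_hat||^2
loss.backward()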

Variational Autoencoder
Main idea
- Decoder: a latent variable z is decoded into an image x through p_θ(x|z); the decoder estimates the mean and variance of p_θ(x|z).
- Encoder: the image x is encoded into z through the approximate posterior q_φ(z|x); the encoder estimates the mean and variance of q_φ(z|x).
- Assume the prior p_θ(z) is a unit Gaussian and the likelihood p_θ(x|z) is a diagonal Gaussian.

Optimization
- Variational lower bound of the marginal likelihood:
  log p_θ(x^(i)) = D_KL(q_φ(z|x^(i)) || p_θ(z|x^(i))) + L(θ, φ; x^(i)),
  where the first term is the KL divergence of the approximate posterior from the true posterior and the second term is the variational lower bound on the marginal likelihood, so log p_θ(x^(i)) ≥ L(θ, φ; x^(i)).
- Maximization of the variational lower bound:
  L(θ, φ; x) = E_{q_φ(z|x)}[-log q_φ(z|x) + log p_θ(x, z)]
             = E_{q_φ(z|x)}[-log q_φ(z|x) + log p_θ(z) + log p_θ(x|z)]
             = -D_KL(q_φ(z|x) || p_θ(z)) + E_{q_φ(z|x)}[log p_θ(x|z)],
  i.e., a regularization term plus a reconstruction term.
- Training: the reconstruction term reduces to the MSE between the input and the generated image (for a Gaussian likelihood); the regularization term is optimized via the reparametrization trick using the predicted mean and standard deviation.
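A minimal sketch of both ELBO terms and the reparametrization trick, assuming PyTorch; the MLP encoder/decoder, the layer sizes, and the summed reduction are illustrative assumptions rather than the lecture's exact setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=400, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)        # mean of q_phi(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)    # log-variance of q_phi(z|x)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparametrization trick: z = mu + sigma * eps with eps ~ N(0, I),
        # so gradients flow through mu and sigma.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def elbo_loss(x_hat, x, mu, logvar):
    # Reconstruction term (Gaussian likelihood -> MSE between input and output) ...
    recon = F.mse_loss(x_hat, x, reduction='sum')
    # ... plus the regularization term KL(q_phi(z|x) || N(0, I)) in closed form.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl    # minimizing this maximizes the variational lower bound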

Variational Autoencoder
Inference
- To generate a sample, draw z from the prior p_θ(z); the decoder outputs the mean and variance of p_θ(x|z), and the image x is sampled from that Gaussian.

Learned Manifold
- Visualizations of the learned manifold for generative models with a 2D latent space.

Generative Adversarial Networks
- Two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G.
- Objective: train G to maximize the probability of D making a mistake; both models are trained with backpropagation.
- Pipeline: the generator maps random noise z to a generated image, the discriminator classifies images as real or fake, and the generator and discriminator are trained jointly.

Objective
- Value function:
  V(D, G) = E_{x~p_data(x)}[log D(x; θ_d)] + E_{z~p_z(z)}[log(1 - D(G(z; θ_g); θ_d))]
- D and G play the two-player minimax game min_G max_D V(D, G): while D is trained to classify correctly, G is simultaneously trained to minimize log(1 - D(G(z; θ_g); θ_d)).
- p_z(z): prior on the input noise variables.
- G(z; θ_g): a mapping from z to data space, a differentiable function represented by a multilayer perceptron with parameters θ_g.
- D(x; θ_d): the probability that x came from the data rather than from the generator distribution p_g.

[Goodfellow14] I. J. Goodfellow, et al.: Generative Adversarial Nets, NIPS 2014
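As a concrete reading of the value function, here is a small sketch assuming PyTorch; D and G are placeholder networks whose outputs are taken to be probabilities in (0, 1) and generated images, respectively.

import torch

def discriminator_loss(D, G, x_real, z):
    # Ascend E[log D(x)] + E[log(1 - D(G(z)))], i.e. minimize its negative.
    d_real = D(x_real)
    d_fake = D(G(z).detach())           # do not backpropagate into G here
    return -(torch.log(d_real) + torch.log(1.0 - d_fake)).mean()

def generator_loss(D, G, z):
    # Minimax objective for G: minimize E[log(1 - D(G(z)))].
    # (In practice the non-saturating form -E[log D(G(z))] is often preferred.)
    return torch.log(1.0 - D(G(z))).mean()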

Optimization
- GAN training is a saddle-point estimation problem: solve min_G max_D V(D, G).
- Algorithm 1: minibatch stochastic gradient descent training of generative adversarial nets. The number of steps to apply to the discriminator, k, is a hyperparameter; the paper used k = 1, the least expensive option.
  for number of training iterations do
    for k steps do
      Sample a minibatch of m noise samples {z^(1), ..., z^(m)} from the noise prior p_g(z).
      Sample a minibatch of m examples {x^(1), ..., x^(m)} from the data-generating distribution p_data(x).
      Update the discriminator by ascending its stochastic gradient:
        ∇_{θ_d} (1/m) Σ_{i=1}^{m} [ log D(x^(i)) + log(1 - D(G(z^(i)))) ]
    end for
    Sample a minibatch of m noise samples {z^(1), ..., z^(m)} from the noise prior p_g(z).
    Update the generator by descending its stochastic gradient:
      ∇_{θ_g} (1/m) Σ_{i=1}^{m} log(1 - D(G(z^(i))))
  end for
  The gradient-based updates can use any standard gradient-based learning rule; the paper used momentum.

Generated Examples
- Generated samples are shown next to their nearest neighbors from the training set. [Goodfellow14]

Laplacian Generative Adversarial Networks (LAPGAN)
- Generates high-resolution images progressively, from coarse to fine, using a Laplacian pyramid of generators G_0, ..., G_K with noise inputs z_0, ..., z_K. At each level,
  Ĩ_k = u(Ĩ_{k+1}) + h̃_k = u(Ĩ_{k+1}) + G_k(z_k, u(Ĩ_{k+1})),
  where u(·) upsamples the coarser image and the generative model G_k produces the residual image h̃_k conditioned on the upsampled image and the noise z_k.

[Goodfellow14] I. J. Goodfellow, et al.: Generative Adversarial Nets, NIPS 2014
[Denton15] E. Denton, S. Chintala, A. Szlam, R. Fergus: Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, NIPS 2015
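A compact sketch of Algorithm 1, assuming PyTorch; G, D, data_loader, z_dim, and the optimizer settings are placeholders, with the paper's momentum update approximated by SGD with momentum.

import torch

def train_gan(G, D, data_loader, z_dim, epochs=1, k=1, lr=2e-4):
    opt_d = torch.optim.SGD(D.parameters(), lr=lr, momentum=0.9)
    opt_g = torch.optim.SGD(G.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x_real, _ in data_loader:
            m = x_real.size(0)
            for _ in range(k):                      # k discriminator steps
                z = torch.randn(m, z_dim)           # noise minibatch
                d_loss = -(torch.log(D(x_real)) +
                           torch.log(1 - D(G(z).detach()))).mean()
                opt_d.zero_grad(); d_loss.backward(); opt_d.step()
            z = torch.randn(m, z_dim)               # one generator step
            g_loss = torch.log(1 - D(G(z))).mean()
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()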

Laplacian Generative Adversarial Networks
Training
- Training runs in the opposite direction of sampling: the image I = I_0 is repeatedly blurred and downsampled to form the pyramid levels I_1, ..., I_K, and the model at each level is trained independently; the discriminator D_k at each level decides whether a residual image h_k is real or generated, given the upsampled coarser image as conditioning.

Results
- Generated samples (e.g., the class-conditional CC-LAPGAN on the Truck class) compared against LAPGAN, a standard GAN, and nearest neighbors from the training set.

[Denton15] E. Denton, S. Chintala, A. Szlam, R. Fergus: Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, NIPS 2015

DCGAN
Representation learning by GAN
- Unsupervised representation learning, learned from scratch.
- The first CNN-based GAN model with good performance.
- Application to classification: combine the discriminator features with any classifier for prediction; good performance even without fine-tuning.

Architecture guidelines for stable DCGANs
- Replace pooling layers with strided convolutions (in the discriminator) and fractional-strided convolutions (in the generator).
- Use batch normalization in both the generator and the discriminator.
- Remove fully connected hidden layers for deeper architectures.
- Use ReLU in the generator for all layers except for the output, which uses tanh.
- Use LeakyReLU activation in the discriminator for all layers.

[Radford16] A. Radford, L. Metz, S. Chintala: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR 2016
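These guidelines translate into a generator along the lines of the following sketch, assuming PyTorch; the 64x64 output resolution and the channel widths (ngf) are illustrative choices, not prescribed by the lecture.

import torch.nn as nn

def dcgan_generator(z_dim=100, ngf=64, out_channels=3):
    # Input noise is expected with shape (N, z_dim, 1, 1); fractional-strided
    # (transposed) convolutions upsample it, with batch norm and ReLU throughout
    # and a tanh output, and no fully connected hidden layers.
    return nn.Sequential(
        nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),    # 1x1  -> 4x4
        nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),  # 4x4  -> 8x8
        nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 8x8  -> 16x16
        nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 16x16 -> 32x32
        nn.BatchNorm2d(ngf), nn.ReLU(True),
        nn.ConvTranspose2d(ngf, out_channels, 4, 2, 1, bias=False), # 32x32 -> 64x64
        nn.Tanh(),
    )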

Results
Interpolation in Latent Space
- Walking between points in the latent space produces smooth, semantically meaningful transitions in the generated images.

Classification on CIFAR-10 using a model trained on ImageNet-1K:

  Model                       | Accuracy | Accuracy (400 per class)
  1 Layer K-means             | 80.6%    | 63.7% (±0.7%)
  3 Layer K-means Learned RF  | 82.0%    | 70.7% (±0.7%)
  View Invariant K-means      | 81.9%    | 72.6% (±0.7%)
  Exemplar CNN                | 84.3%    | 77.4% (±0.2%)
  DCGAN (ours) + L2-SVM       | 82.8%    | 73.8% (±0.4%)

Classification error on SVHN with 1000 labels:

  Model                                     | Error rate
  KNN                                       | 77.93%
  TSVM                                      | 66.55%
  M1+KNN                                    | 65.63%
  M1+TSVM                                   | 54.33%
  M1+M2                                     | 36.02%
  SWWAE without dropout                     | 27.83%
  SWWAE with dropout                        | 23.56%
  DCGAN (ours) + L2-SVM                     | 22.48%
  Supervised CNN with the same architecture | 28.87% (validation)

Vector Arithmetic for Visual Concepts
- Arithmetic on latent vectors carries over to visual concepts in the generated images (e.g., "man with glasses" - "man without glasses" + "woman without glasses" yields a woman with glasses).

[Radford16] A. Radford, L. Metz, S. Chintala: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR 2016

CycleGAN
Image-to-image translation
- Converting an image from one representation to another, e.g., gray to color, image to semantic labels, edge map to photograph.
- Built on a generative adversarial network with cycle consistency: using transitivity as a way to regularize structured data.
- Example translations: Monet ↔ photo, zebra ↔ horse, summer ↔ winter.

[Zhu17] J.-Y. Zhu, T. Park, P. Isola, A. A. Efros: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV 2017

Unpaired Translation
- There is some underlying relationship between the two domains; CycleGAN seeks this relationship without an explicit paired mapping.
- Learning to translate between domains without input-output pairs, using supervision only at the level of sets.
- Paired data: {(x_i, y_i)}; unpaired data: a set of images {x_i} from domain X and a set of images {y_j} from domain Y.

Formulation by GAN
- Use a bidirectional pair of GANs, G: X → Y with discriminator D_Y and F: Y → X with discriminator D_X, and introduce a cycle-consistency loss:
  min_{G,F} max_{D_X,D_Y} L(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ L_cyc(G, F)
  L_GAN(G, D_Y, X, Y) = E_{y~p_data(y)}[log D_Y(y)] + E_{x~p_data(x)}[log(1 - D_Y(G(x)))]
  L_GAN(F, D_X, Y, X) = E_{x~p_data(x)}[log D_X(x)] + E_{y~p_data(y)}[log(1 - D_X(F(y)))]
  L_cyc(G, F) = E_{x~p_data(x)}[||F(G(x)) - x||_1] + E_{y~p_data(y)}[||G(F(y)) - y||_1]

Results of CycleGAN
- Ablation comparing Input, Cycle alone, GAN alone, GAN+forward, GAN+backward, CycleGAN (ours), and Ground truth, e.g., on winter Yosemite ↔ summer Yosemite.

[Zhu17] J.-Y. Zhu, T. Park, P. Isola, A. A. Efros: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. ICCV 2017

Wasserstein GAN
- Original GAN: with an optimal discriminator, the generator effectively minimizes the Jensen-Shannon (JS) divergence between the real data distribution and the generated distribution,
  JS(P_r, P_g) = (1/2) KL(P_r || (P_r + P_g)/2) + (1/2) KL(P_g || (P_r + P_g)/2),
  where P_r is the real data distribution and P_g is the generated distribution. This is not well suited for optimization: the gradients are not well defined in all situations.
- Wasserstein GAN: replace the discriminator with a critic so that the generator minimizes the Earth Mover's distance (EMD),
  W(P_r, P_g) = inf_{γ ∈ Π(P_r, P_g)} E_{(x,y)~γ}[||x - y||],
  where Π(P_r, P_g) is the set of joint distributions whose marginals are P_r and P_g. The gradients are much better defined.

[Arjovsky17] M. Arjovsky, S. Chintala, L. Bottou: Wasserstein Generative Adversarial Networks. ICML 2017
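A sketch of these losses for a single minibatch, assuming PyTorch; G, F, D_X, D_Y are placeholder networks, lambda_cyc is the cycle-consistency weight, and the adversarial terms are written in the original log form even though the paper uses a least-squares variant in practice.

import torch
import torch.nn.functional as F_nn

def cyclegan_objective(G, F, D_X, D_Y, x, y, lambda_cyc=10.0):
    fake_y, fake_x = G(x), F(y)
    # Adversarial terms: D_Y and D_X try to maximize these,
    # while G and F try to minimize them.
    loss_gan_G = torch.log(D_Y(y)).mean() + torch.log(1 - D_Y(fake_y)).mean()
    loss_gan_F = torch.log(D_X(x)).mean() + torch.log(1 - D_X(fake_x)).mean()
    # Cycle consistency: F(G(x)) should recover x and G(F(y)) should recover y.
    loss_cyc = F_nn.l1_loss(F(fake_y), x) + F_nn.l1_loss(G(fake_x), y)
    # Full objective L(G, F, D_X, D_Y): min over {G, F}, max over {D_X, D_Y}.
    return loss_gan_G + loss_gan_F + lambda_cyc * loss_cyc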

Wasserstein GAN
EMD
- The Earth Mover's distance is the optimal transport cost from one distribution to another:
  W(P_r, P_g) = inf_{γ ∈ Π(P_r, P_g)} E_{(x,y)~γ}[||x - y||]
- Jensen-Shannon divergence vs. Earth Mover's distance: on distributions with little or no overlap, the JS-based discriminator saturates, while the EMD-based critic still provides useful gradients (figure comparing the two).

Kantorovich-Rubinstein Duality
- Under a 1-Lipschitz constraint,
  W(P_r, P_g) = sup_{||f||_L ≤ 1} E_{x~P_r}[f(x)] - E_{x~P_g}[f(x)]
- With a parameterized family of K-Lipschitz critic functions f_w, the WGAN objective becomes
  min_θ max_{w ∈ W} E_{x~P_r}[f_w(x)] - E_{z~p(z)}[f_w(g_θ(z))]
- Pipeline: the generator g_θ maps random noise z to a fake image; the critic network f_w scores real images f_w(x) and fake images f_w(g_θ(z)).

[Arjovsky17] M. Arjovsky, S. Chintala, L. Bottou: Wasserstein Generative Adversarial Networks. ICML 2017

WGAN vs. GAN
- Sample quality comparison when the generator is an MLP with 4 hidden layers and the discriminator/critic is a DCGAN: WGAN still produces reasonable samples where the standard GAN degrades.

VAE vs. GAN
- Side-by-side samples from a Variational Autoencoder (VAE) and a Generative Adversarial Network (GAN).
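A sketch of WGAN training with weight clipping to (roughly) enforce the Lipschitz constraint, assuming PyTorch; G, critic, data_loader, and z_dim are placeholders, while n_critic, the clipping range, and RMSprop with lr = 5e-5 follow the paper's defaults.

import torch

def train_wgan(G, critic, data_loader, z_dim, n_critic=5, clip=0.01, lr=5e-5):
    opt_c = torch.optim.RMSprop(critic.parameters(), lr=lr)
    opt_g = torch.optim.RMSprop(G.parameters(), lr=lr)
    for i, (x_real, _) in enumerate(data_loader):
        m = x_real.size(0)
        z = torch.randn(m, z_dim)
        # Critic step: maximize E[f_w(x)] - E[f_w(g_theta(z))].
        loss_c = -(critic(x_real).mean() - critic(G(z).detach()).mean())
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        for p in critic.parameters():            # weight clipping -> Lipschitz
            p.data.clamp_(-clip, clip)
        if (i + 1) % n_critic == 0:              # generator step every n_critic batches
            z = torch.randn(m, z_dim)
            loss_g = -critic(G(z)).mean()        # maximize E[f_w(g_theta(z))]
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()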

VAE vs. GAN
- VAE pros: Stable training; Clear evaluation metrics
- VAE cons: Crude generation quality; Overfitting; Slow training
- GAN pros: High quality generation; Less overfitting problem
- GAN cons: Mode collapsing; Unstable training; No good evaluation metrics
