Understanding CNNs using visualisation and transformation analysis
|
|
- Logan Cannon
- 5 years ago
- Views:
Transcription
1 Understanding CNNs using visualisation and transformation analysis Andrea Vedaldi Medical Imaging Summer School August 2016
2 Image representations 2 encoder Φ representation An encoder maps the data into a vectorial representation Facilitate labelling of images, text, sound, videos,
3 Modern convolutional nets 3? [AlexNet by Krizhevsky et al. 2012] Excellent performance in image understanding tasks Learn a sequence of general-purpose representations Millions of parameters learned from data The meaning of the representation is unclear
4 Understanding visual representations 4 Visualizing representations Backpropagation networks and deconvolution Representations: equivalence & transformations
5 Visualization: Pre-Image 5 Image Space Φ Representation Space Φ -1 Φ(x) Φ -1 The representation is not injective
6 Visualization: Pre-Image 6 Image Space Representation Space Φ -1 Φ(x) The representation is not injective The reconstruction ambiguity provides useful information about the representation
7 Finding a Pre-Image 7 A simple yet general and effective method Image Representation Pre-Image Reconstruction Start from random noise Optimize using stochastic gradient descent
8 Finding a Pre-Image 8 A simple yet general and effective method No prior TV-norm β = 1 TV-norm β = 2
9 Related Work 9 Analysis tools Visualizing higher-layer features of a deep network Ethan et al [intermediate features] Deep inside convolutional networks Simonyan et al [deepest features, aka deep dreams ] DeConvNets Zeiler et al. In ECCV, 2014 [intermediate features] Understanding neural networks through deep visualisation Yosinksi et al [intermediate features] Artistic tools Google s inceptionsm Mordvintsev et al Style synthesis and transfer Gatys et al. 2015
10 Inversion 10 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Conv 1 ReLU 1 Max pool 1 LRN 1 Conv 2 ReLU 2 Max pool 2 LRN 2 Conv 4 ReLU 4 Conv 5 ReLU 5 Max pool 5 FC 6 ReLU 6 Conv 3 ReLU 3 FC 7 ReLU 7 AlexNet [Krizhevsky et al. 2012] FC 8
11 Inversion 11 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
12 Inversion 12 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
13 Inversion 13 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
14 Inversion 14 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
15 Inversion 15 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
16 Inversion 16 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
17 Inversion 17 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
18 Inversion 18 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
19 Inversion 19 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
20 Inversion 20 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
21 Inversion 21 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
22 Inversion 22 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
23 Inversion 23 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
24 Inversion 24 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
25 Inversion 25 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
26 Inversion 26 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
27 Inversion 27 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
28 Inversion 28 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
29 Inversion 29 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
30 Inversion 30 conv 1 conv 2 conv 3 conv 4 conv 5 fc 6 fc 7 fc 8 Original Image
31 Inverting a Deep CNN Conv 1 31 Conv 2 Original Image Conv 3 Conv 4 Conv 5 ReLU 6 FC 6 fc 8 FC 7 FC 8
32 CNNs = visual codes? 32 conv1 conv5 fc8
33 Activation Maximization 33 Look for an image that maximally activates a specific feature component
34 34
35 35
36 36
37 37
38 38
39 39
40 Remember: the starting point is white noise 40 Not an image!
41 41
42 conv1 42
43 conv2 43
44 conv3 44
45 conv4 45
46 conv5 46
47 Network comparison 47 conv5 features AlexNet VGG-M VGG-VD
48 Caricaturization 48 [Google Inceptionism 2015, Mahendran et al. 2016] Emphasise patterns that are detected by a certain representation Key differences: the starting point is the image x0 particular configurations of features are emphasized, not individual features
49 Caricaturization (VGG-M) input conv2 conv3 49 conv4
50 Caricaturization (VGG-M) conv5 fc6 fc7 50 fc8
51 51
52 Interlude: neural art 52 Surprisingly, the filters learned by discriminative neural networks capture well the style of an image. This can be used to transfer the style of an image (e.g. a painting) to any other. Optimisation based L. A. Gatys, A. S. Ecker, and M. Bethge. Texture synthesis and the controlled generation of natural stimuli using convolutional neural networks. In Proc. NIPS, Feed-forward neural network equivalents D. Ulyanov, V. Lebedev, A. Vedaldi, and V. Lempitsky. Texture networks: Feedforward synthesis of textures and stylized images. Proc. ICML, J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In Proc. ECCV, 2016.
53 Generation by moment matching 53 sum loss loss loss loss loss Moment matching Content statistics: same as inversion Style statistics: cross-channel correlations
54 54
55 55
56 56
57 57
58 58
59 59
60 60
61 61
62 62
63 63
64 64
65 65
66 66
67 67
68 68
69 Understanding visual representations 69 Visualizing representations Backpropagation networks and deconvolution Representations: equivalence & transformations
70 Backpropagation 70 Compute derivatives using the chain rule bike c1 c2 c3 c4 c5 f6 f7 f8 loss error w1 w2 w3 w4 w5 w6 w7 w8 x forward R backward derror derror derror derror derror derror derror derror dw1 dw2 dw3 dw4 dw5 dw6 dw7 dw8
71 Chain rule: scalar version 71 x0 x1 xn-1 f1 f2 fn-1 fn xn
72 Chain rule: scalar version 72 xn xn-1 x1 x0 A composition of n functions Derivative chain rule
73 Tensor-valued functions 73 E.g. linear convolution = bank of 3D filters height width channels instances input x H W C 1 or N filters F Hf Wf C K output y H - Hf + 1 W - Wf + 1 K 1 or N
74 Vector representation 74 3D tensors vec vectors
75 Derivative of tensor-valued functions 75 Derivative (Jacobian): every output element w.r.t. every input element! The vec operator allows us to use a familiar matrix notation for the derivatives
76 Chain rule: tensor version 76 Using vec() and matrix notation xn xn-1 x1 x0
77 The (unbearable) size of tensor derivatives 77 The size of these Jacobian matrices is huge. Example: 275 B elements TB of memory required!!
78 Unless the output is a scalar 78 Scalar This is always the case if the last layer is the loss function Now the Jacobian has the same size as x. Example: 524K elements Just 2MB of memory 1 1 1
79 Backpropagation 79 Assume that xn is a scalar (e.g. loss) xn xn-1 x1 x0 compute this first! small explicitly compute uber matrices do not explicitly compute
80 Backpropagation 80 Assume that xn is a scalar (e.g. loss) xn xn-1 x1 x0 small explicitly compute uber matrices do not explicitly compute
81 Backpropagation 81 Assume that xn is a scalar (e.g. loss) xn xn-1 x1 x0 small explicitly compute uber matrix do not explicitly compute
82 Backpropagation 82 Assume that xn is a scalar (e.g. loss) xn xn-1 x1 x0 small explicitly compute
83 Projected function derivative 83 The BP-reversed layer z y x function projected onto p projected function derivative = An equivalent circuit is obtained by introducing a transposed function f T
84 Anatomy of a building block 84 forward (eval) x vl_nnconv y W, b y = vl_nnconv(x, W, b)
85 Anatomy of a building block 85 forward (eval) x vl_nnconv y z() z ε R W, b y = vl_nnconv(x, W, b)
86 Anatomy of a building block 86 forward (eval) x vl_nnconv y z ε R W, b y = vl_nnconv(x, W, b)
87 Anatomy of a building block 87 forward (eval) x vl_nnconv y z ε R W, b y = vl_nnconv(x, W, b) backward (backprop) dz dx vl_nnconv dz dy dz dw dz db dzdx = vl_nnconv(x, W, b, dzdy)
88 Backpropagation network 88 BP induces a reversed network xn xn-1 x1 x0 dxn dxn-1 dx1 dx0 where Note: the BP network is linear in dx1,, dxn-1,dxn. Why?
89 Backpropagation network 89 BP induces a transposed network forward x0 x1 xn-1 xn dx0 dx1 dxn-1 dxn backward
90 Backpropagation network 90 Conv, ReLU, MP and their transposed blocks forward x0 x1 x2 conv ReLU MP x3 dx0 dx1 dx2 conv BP ReLU BP MP BP dx3 backward
91 Sufficient statistics and bottlenecks 91 Usually much less information is needed forward x0 x1 x2 conv ReLU MP x3 nothing! on/off mask pooling switches dx0 dx1 dx2 conv BP ReLU BP MP BP dx3 backward
92 Three visualisation techniques 92 Modified backpropagation networks x0 x1 x2 conv ReLU MP x3 nothing! on/off mask pooling switches SaliNet Equiv. to [Simonyian et al. 14] dx0 dx1 dx2 conv BP ReLU BP MP BP dx3 DeConvNet Same as [Zeiler Fergus 14] dx0 dx1 dx2 dx3 conv BP ReLU MP BP DeSaliNet [Mahendran Vedaldi 16, Springenberg et al. 15] dx0 dx1 dx2 dx3 conv BP ReLU + ReLU BP MP BP
93 Results 93 - VGG-VD Pool VGG-VD FC8 - Image DeConvNet SaliNet DeSaliNet DeConvNet SaliNet DeSaliNet
94 ECCV ECCV #1160 ECCV-16 submission ID 1160 They are largely not neuron selective pool fc8 pool5 fc Rnd. neuron 363 pool Rnd. noise 362 fc8 9 Max neuron # Key limitation of DeConvNets and similar DeConvNet DeSaliNet SaliNet Fig. 4: Lack of neuron selectivity. The bottleneck information r is fixed to the one computed during the forward pass pxq through AlexNet and the output of : pe, rq is comgood for saliency puted by choosing e as: the most active neuron (top row), a second neuron at random (middle), or as a positive random mixture of all neurons (bottom row). Results barely Bad for studying individual neurons differ, particularly for the deeper layers. See figure 1 for the original house input image (use inversion, act. max., etc. instead) x. Best viewed on screen
95 Understanding visual representations 95 Visualizing representations Backpropagation networks and deconvolution Representations: equivalence & transformations
96 When are two representations the same? 96 Learning representations means that there is an endless number of them Variants obtained by learning on different datasets, or different local optima CNN-A x CNN-B E Equivalence ΦB(x) = E ΦA(x)
97 Equivalence 97 AlexNet, same training data, different parametrization: CNN-A CNN-B FC ΦA ΦB Are ΦA and ΦB equivalent?
98 Equivalence 98 AlexNet, same training data, different parametrization: CNN-A CNN-B 5 FC label E 5 6 Classif. loss Train with SGD
99 Equivalence with different random seeds 99 AlexNet A E AlexNet B 100 Baseline 75 x FC Before training gx FC Top-5 Error [%] 50 After training 25 gx E 5 FC 0 E conv1 E conv2 E conv3 E conv4 E conv
100 Equivalence of similar architecture 100 Train on two different datasets ILSVRC12 dataset Places dataset CNN-IMNET CNN-PLACES FC
101 Equivalence with different training data 101 Places E Image Net 100 Baseline 75 x FC Naive stitching gx FC Top-5 Error [%] 50 Learned stitches 25 gx E 5 FC 0 E conv1 E conv2 E conv3 E conv4 E conv
102 Meaningful representations 102 representation Semantic similarity Vector similarity (distance) x embedding space R d congruous pair y near far incongruous pair z Desiderata: invariance and distinctiveness
103 Invariance is task dependent 103 Are x and y the same category? near Are x and y the same colour? far
104 Equivariance 104 image space feature space x g gx Mg Φ(gx) x g gx Φ Φ(x) Φ(x ) Mg Φ(gx ) g is an image transformation g is the corresponding feature transformation Φ(gx) = Mg Φ(x)
105 Invariance 105 image space feature space x g gx Φ(x) = Φ(gx) Mg x g gx Φ Mg Φ(x ) = Φ(gx ) g is an image transformation g is the corresponding feature transformation the map Mg is the identity
106 No equivariance 106 image space feature space x g gx Φ(gx) x g gx Φ Φ(x) Φ(x ) Φ(gx ) g is an image transformation g is the corresponding feature transformation the map Mg does not exists (or is intractable)
107 Representations and transformations 107 image space feature space Equivariance There is a (simple) Mg g is an image transformation Invariance Mg is the identity Neither There is no (simple) Mg Φ(gx) = Mg Φ(x)
108 An empirical test of equivariance 108 Regularized linear regression g Φ Φ(x) Mg Mg Φ(x) Φ(gx) Φ Ag Φ(x) + bg (learned empirically)
109 An empirical test of equivariance 109 Regularized linear regression g Φ Mg Φ Ag Φ(x) + bg (learned empirically) permutation convolution by1 1 filter bank Ag
110 Φ M Φ An empirical test of equivariance 110 HOG features rotation 45 deg Φ Mg Φ Φ Mg Φ Φ Mg Φ
111 Φ M Φ An empirical test of equivariance 111 HOG features rotation 45 deg Φ Mg Φ Φ Mg Φ Φ Mg Φ
112 CNN: a sequence of representations 112 x FC dog Φ Ψ label x FC Φ Ψ Classif. error We run the same analysis on a typical CNN architecture AlexNet [Krizevsky et al. 12] 5 convolutional layers + fully-connected layers Trained on ImageNet ILSVRC
113 A discriminative goal to learn equivariance 113 x FC dog Φ Ψ label x FC Ψ Classif. error g Mg Mg is learned empirically Φ All the other layers (representation and classifier) are frozen
114 Vertical flips 114 g 60 Uncompensated, no TF x FC Uncompensated, TF gx FC Top-5 Error [%] Compensated TF w perm. 15 gx FC Compensated TF, learned gx FC 0 Mg conv Mg conv Mg conv Mg conv Mg conv
115 Summary 115 Modern CNNs are still new technology Expect bigger and meaner CNNs to further improve performance CNNs are more than big balls of parameters We do not really understand what they do Understanding deep nets What do deep networks do? Are there computational / statistical principles to be learned? How much do deep nets learn about the visual world? Possible angles Visualisations Probing statistical properties
Tasks ADAS. Self Driving. Non-machine Learning. Traditional MLP. Machine-Learning based method. Supervised CNN. Methods. Deep-Learning based
UNDERSTANDING CNN ADAS Tasks Self Driving Localizati on Perception Planning/ Control Driver state Vehicle Diagnosis Smart factory Methods Traditional Deep-Learning based Non-machine Learning Machine-Learning
More informationMachine Learning for Computer Vision 8. Neural Networks and Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group
Machine Learning for Computer Vision 8. Neural Networks and Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group INTRODUCTION Nonlinear Coordinate Transformation http://cs.stanford.edu/people/karpathy/convnetjs/
More informationNeural Network Back-propagation HYUNG IL KOO
Neural Network Back-propagation HYUNG IL KOO Hidden Layer Representations Backpropagation has an ability to discover useful intermediate representations at the hidden unit layers inside the networks which
More informationIntroduction to Convolutional Neural Networks (CNNs)
Introduction to Convolutional Neural Networks (CNNs) nojunk@snu.ac.kr http://mipal.snu.ac.kr Department of Transdisciplinary Studies Seoul National University, Korea Jan. 2016 Many slides are from Fei-Fei
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More information<Special Topics in VLSI> Learning for Deep Neural Networks (Back-propagation)
Learning for Deep Neural Networks (Back-propagation) Outline Summary of Previous Standford Lecture Universal Approximation Theorem Inference vs Training Gradient Descent Back-Propagation
More informationDeep Learning (CNNs)
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Deep Learning (CNNs) Deep Learning Readings: Murphy 28 Bishop - - HTF - - Mitchell
More informationCS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS
CS 179: LECTURE 16 MODEL COMPLEXITY, REGULARIZATION, AND CONVOLUTIONAL NETS LAST TIME Intro to cudnn Deep neural nets using cublas and cudnn TODAY Building a better model for image classification Overfitting
More informationConvolutional Neural Network Architecture
Convolutional Neural Network Architecture Zhisheng Zhong Feburary 2nd, 2018 Zhisheng Zhong Convolutional Neural Network Architecture Feburary 2nd, 2018 1 / 55 Outline 1 Introduction of Convolution Motivation
More informationClassification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses about the label (Top-5 error) No Bounding Box
ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton Motivation Classification goals: Make 1 guess about the label (Top-1 error) Make 5 guesses
More informationUnderstanding How ConvNets See
Understanding How ConvNets See Slides from Andrej Karpathy Springerberg et al, Striving for Simplicity: The All Convolutional Net (ICLR 2015 workshops) CSC321: Intro to Machine Learning and Neural Networks,
More informationLecture 7 Convolutional Neural Networks
Lecture 7 Convolutional Neural Networks CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago April 17, 2017 We saw before: ŷ x 1 x 2 x 3 x 4 A series of matrix multiplications:
More informationLecture 17: Neural Networks and Deep Learning
UVA CS 6316 / CS 4501-004 Machine Learning Fall 2016 Lecture 17: Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions
More informationNeural Networks 2. 2 Receptive fields and dealing with image inputs
CS 446 Machine Learning Fall 2016 Oct 04, 2016 Neural Networks 2 Professor: Dan Roth Scribe: C. Cheng, C. Cervantes Overview Convolutional Neural Networks Recurrent Neural Networks 1 Introduction There
More informationSpatial Transformer Networks
BIL722 - Deep Learning for Computer Vision Spatial Transformer Networks Max Jaderberg Andrew Zisserman Karen Simonyan Koray Kavukcuoglu Contents Introduction to Spatial Transformers Related Works Spatial
More informationTwo at Once: Enhancing Learning and Generalization Capacities via IBN-Net
Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net Supplementary Material Xingang Pan 1, Ping Luo 1, Jianping Shi 2, and Xiaoou Tang 1 1 CUHK-SenseTime Joint Lab, The Chinese University
More informationECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference
ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Neural Networks: A brief touch Yuejie Chi Department of Electrical and Computer Engineering Spring 2018 1/41 Outline
More informationLecture 35: Optimization and Neural Nets
Lecture 35: Optimization and Neural Nets CS 4670/5670 Sean Bell DeepDream [Google, Inceptionism: Going Deeper into Neural Networks, blog 2015] Aside: CNN vs ConvNet Note: There are many papers that use
More informationArtificial Neural Networks D B M G. Data Base and Data Mining Group of Politecnico di Torino. Elena Baralis. Politecnico di Torino
Artificial Neural Networks Data Base and Data Mining Group of Politecnico di Torino Elena Baralis Politecnico di Torino Artificial Neural Networks Inspired to the structure of the human brain Neurons as
More informationConvolutional neural networks
11-1: Convolutional neural networks Prof. J.C. Kao, UCLA Convolutional neural networks Motivation Biological inspiration Convolution operation Convolutional layer Padding and stride CNN architecture 11-2:
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationJakub Hajic Artificial Intelligence Seminar I
Jakub Hajic Artificial Intelligence Seminar I. 11. 11. 2014 Outline Key concepts Deep Belief Networks Convolutional Neural Networks A couple of questions Convolution Perceptron Feedforward Neural Network
More informationTutorial on Methods for Interpreting and Understanding Deep Neural Networks. Part 3: Applications & Discussion
Tutorial on Methods for Interpreting and Understanding Deep Neural Networks W. Samek, G. Montavon, K.-R. Müller Part 3: Applications & Discussion ICASSP 2017 Tutorial W. Samek, G. Montavon & K.-R. Müller
More informationarxiv: v1 [cs.cv] 11 May 2015 Abstract
Training Deeper Convolutional Networks with Deep Supervision Liwei Wang Computer Science Dept UIUC lwang97@illinois.edu Chen-Yu Lee ECE Dept UCSD chl260@ucsd.edu Zhuowen Tu CogSci Dept UCSD ztu0@ucsd.edu
More informationSGD and Deep Learning
SGD and Deep Learning Subgradients Lets make the gradient cheating more formal. Recall that the gradient is the slope of the tangent. f(w 1 )+rf(w 1 ) (w w 1 ) Non differentiable case? w 1 Subgradients
More informationNeural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann
Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable
More informationIntroduction to Convolutional Neural Networks 2018 / 02 / 23
Introduction to Convolutional Neural Networks 2018 / 02 / 23 Buzzword: CNN Convolutional neural networks (CNN, ConvNet) is a class of deep, feed-forward (not recurrent) artificial neural networks that
More informationNeural Networks Learning the network: Backprop , Fall 2018 Lecture 4
Neural Networks Learning the network: Backprop 11-785, Fall 2018 Lecture 4 1 Recap: The MLP can represent any function The MLP can be constructed to represent anything But how do we construct it? 2 Recap:
More informationConvolutional Neural Networks II. Slides from Dr. Vlad Morariu
Convolutional Neural Networks II Slides from Dr. Vlad Morariu 1 Optimization Example of optimization progress while training a neural network. (Loss over mini-batches goes down over time.) 2 Learning rate
More informationAsaf Bar Zvi Adi Hayat. Semantic Segmentation
Asaf Bar Zvi Adi Hayat Semantic Segmentation Today s Topics Fully Convolutional Networks (FCN) (CVPR 2015) Conditional Random Fields as Recurrent Neural Networks (ICCV 2015) Gaussian Conditional random
More informationRAGAV VENKATESAN VIJETHA GATUPALLI BAOXIN LI NEURAL DATASET GENERALITY
RAGAV VENKATESAN VIJETHA GATUPALLI BAOXIN LI NEURAL DATASET GENERALITY SIFT HOG ALL ABOUT THE FEATURES DAISY GABOR AlexNet GoogleNet CONVOLUTIONAL NEURAL NETWORKS VGG-19 ResNet FEATURES COMES FROM DATA
More informationNeural Architectures for Image, Language, and Speech Processing
Neural Architectures for Image, Language, and Speech Processing Karl Stratos June 26, 2018 1 / 31 Overview Feedforward Networks Need for Specialized Architectures Convolutional Neural Networks (CNNs) Recurrent
More informationBinary Convolutional Neural Network on RRAM
Binary Convolutional Neural Network on RRAM Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, Huazhong Yang Dept. of E.E, Tsinghua National Laboratory for Information Science and Technology (TNList) Tsinghua
More informationMachine Learning for Signal Processing Neural Networks Continue. Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016
Machine Learning for Signal Processing Neural Networks Continue Instructor: Bhiksha Raj Slides by Najim Dehak 1 Dec 2016 1 So what are neural networks?? Voice signal N.Net Transcription Image N.Net Text
More informationNeural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /9/17
3/9/7 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/9/7 Perceptron as a neural
More informationA summary of Deep Learning without Poor Local Minima
A summary of Deep Learning without Poor Local Minima by Kenji Kawaguchi MIT oral presentation at NIPS 2016 Learning Supervised (or Predictive) learning Learn a mapping from inputs x to outputs y, given
More informationHow to do backpropagation in a brain
How to do backpropagation in a brain Geoffrey Hinton Canadian Institute for Advanced Research & University of Toronto & Google Inc. Prelude I will start with three slides explaining a popular type of deep
More informationConvolutional Neural Networks
Convolutional Neural Networks Books» http://www.deeplearningbook.org/ Books http://neuralnetworksanddeeplearning.com/.org/ reviews» http://www.deeplearningbook.org/contents/linear_algebra.html» http://www.deeplearningbook.org/contents/prob.html»
More information(Feed-Forward) Neural Networks Dr. Hajira Jabeen, Prof. Jens Lehmann
(Feed-Forward) Neural Networks 2016-12-06 Dr. Hajira Jabeen, Prof. Jens Lehmann Outline In the previous lectures we have learned about tensors and factorization methods. RESCAL is a bilinear model for
More informationarxiv: v1 [cs.lg] 7 Dec 2017
Take it in your stride: Do we need striding in CNNs? Chen Kong imon Lucey Carnegie Mellon University {chenk, slucey}@andrew.cmu.edu arxiv:1712.02502v1 [cs.lg] 7 Dec 2017 Abstract ince their inception,
More informationTheories of Deep Learning
Theories of Deep Learning Lecture 02 Donoho, Monajemi, Papyan Department of Statistics Stanford Oct. 4, 2017 1 / 50 Stats 385 Fall 2017 2 / 50 Stats 285 Fall 2017 3 / 50 Course info Wed 3:00-4:20 PM in
More informationDeep Learning Recurrent Networks 2/28/2018
Deep Learning Recurrent Networks /8/8 Recap: Recurrent networks can be incredibly effective Story so far Y(t+) Stock vector X(t) X(t+) X(t+) X(t+) X(t+) X(t+5) X(t+) X(t+7) Iterated structures are good
More informationConvolutional Neural Network. Hung-yi Lee
al Neural Network Hung-yi Lee Why CNN for Image? [Zeiler, M. D., ECCV 2014] x 1 x 2 Represented as pixels x N The most basic classifiers Use 1 st layer as module to build classifiers Use 2 nd layer as
More informationCS 229 Project Final Report: Reinforcement Learning for Neural Network Architecture Category : Theory & Reinforcement Learning
CS 229 Project Final Report: Reinforcement Learning for Neural Network Architecture Category : Theory & Reinforcement Learning Lei Lei Ruoxuan Xiong December 16, 2017 1 Introduction Deep Neural Network
More informationCSC321 Lecture 16: ResNets and Attention
CSC321 Lecture 16: ResNets and Attention Roger Grosse Roger Grosse CSC321 Lecture 16: ResNets and Attention 1 / 24 Overview Two topics for today: Topic 1: Deep Residual Networks (ResNets) This is the state-of-the
More informationIntroduction to (Convolutional) Neural Networks
Introduction to (Convolutional) Neural Networks Philipp Grohs Summer School DL and Vis, Sept 2018 Syllabus 1 Motivation and Definition 2 Universal Approximation 3 Backpropagation 4 Stochastic Gradient
More informationEncoder Based Lifelong Learning - Supplementary materials
Encoder Based Lifelong Learning - Supplementary materials Amal Rannen Rahaf Aljundi Mathew B. Blaschko Tinne Tuytelaars KU Leuven KU Leuven, ESAT-PSI, IMEC, Belgium firstname.lastname@esat.kuleuven.be
More informationBayesian Networks (Part I)
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Bayesian Networks (Part I) Graphical Model Readings: Murphy 10 10.2.1 Bishop 8.1,
More informationWHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY,
WHY ARE DEEP NETS REVERSIBLE: A SIMPLE THEORY, WITH IMPLICATIONS FOR TRAINING Sanjeev Arora, Yingyu Liang & Tengyu Ma Department of Computer Science Princeton University Princeton, NJ 08540, USA {arora,yingyul,tengyu}@cs.princeton.edu
More informationCSCI567 Machine Learning (Fall 2018)
CSCI567 Machine Learning (Fall 2018) Prof. Haipeng Luo U of Southern California Sep 12, 2018 September 12, 2018 1 / 49 Administration GitHub repos are setup (ask TA Chi Zhang for any issues) HW 1 is due
More informationDeep Learning: a gentle introduction
Deep Learning: a gentle introduction Jamal Atif jamal.atif@dauphine.fr PSL, Université Paris-Dauphine, LAMSADE February 8, 206 Jamal Atif (Université Paris-Dauphine) Deep Learning February 8, 206 / Why
More informationApprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning
Apprentissage, réseaux de neurones et modèles graphiques (RCP209) Neural Networks and Deep Learning Nicolas Thome Prenom.Nom@cnam.fr http://cedric.cnam.fr/vertigo/cours/ml2/ Département Informatique Conservatoire
More informationIntroduction to Deep Neural Networks
Introduction to Deep Neural Networks Presenter: Chunyuan Li Pattern Classification and Recognition (ECE 681.01) Duke University April, 2016 Outline 1 Background and Preliminaries Why DNNs? Model: Logistic
More informationBACKPROPAGATION. Neural network training optimization problem. Deriving backpropagation
BACKPROPAGATION Neural network training optimization problem min J(w) w The application of gradient descent to this problem is called backpropagation. Backpropagation is gradient descent applied to J(w)
More informationHomework 3 COMS 4705 Fall 2017 Prof. Kathleen McKeown
Homework 3 COMS 4705 Fall 017 Prof. Kathleen McKeown The assignment consists of a programming part and a written part. For the programming part, make sure you have set up the development environment as
More informationLayer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers
Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers Alexander Binder 1, Grégoire Montavon 2, Sebastian Lapuschkin 3, Klaus-Robert Müller 2,4, and Wojciech Samek 3 1 ISTD
More informationMaxout Networks. Hien Quoc Dang
Maxout Networks Hien Quoc Dang Outline Introduction Maxout Networks Description A Universal Approximator & Proof Experiments with Maxout Why does Maxout work? Conclusion 10/12/13 Hien Quoc Dang Machine
More informationFine-grained Classification
Fine-grained Classification Marcel Simon Department of Mathematics and Computer Science, Germany marcel.simon@uni-jena.de http://www.inf-cv.uni-jena.de/ Seminar Talk 23.06.2015 Outline 1 Motivation 2 3
More informationGlobal Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond
Global Optimality in Matrix and Tensor Factorization, Deep Learning & Beyond Ben Haeffele and René Vidal Center for Imaging Science Mathematical Institute for Data Science Johns Hopkins University This
More informationDeep Learning Lab Course 2017 (Deep Learning Practical)
Deep Learning Lab Course 207 (Deep Learning Practical) Labs: (Computer Vision) Thomas Brox, (Robotics) Wolfram Burgard, (Machine Learning) Frank Hutter, (Neurorobotics) Joschka Boedecker University of
More informationAgenda. Digit Classification using CNN Digit Classification using SAE Visualization: Class models, filters and saliency 2 DCT
versus 1 Agenda Deep Learning: Motivation Learning: Backpropagation Deep architectures I: Convolutional Neural Networks (CNN) Deep architectures II: Stacked Auto Encoders (SAE) Caffe Deep Learning Toolbox:
More informationMore on Neural Networks
More on Neural Networks Yujia Yan Fall 2018 Outline Linear Regression y = Wx + b (1) Linear Regression y = Wx + b (1) Polynomial Regression y = Wφ(x) + b (2) where φ(x) gives the polynomial basis, e.g.,
More informationAnticipating Visual Representations from Unlabeled Data. Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
Anticipating Visual Representations from Unlabeled Data Carl Vondrick, Hamed Pirsiavash, Antonio Torralba Overview Problem Key Insight Methods Experiments Problem: Predict future actions and objects Image
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks Delivered by Mark Ebden With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable
More informationCompressed Fisher vectors for LSVR
XRCE@ILSVRC2011 Compressed Fisher vectors for LSVR Florent Perronnin and Jorge Sánchez* Xerox Research Centre Europe (XRCE) *Now with CIII, Cordoba University, Argentina Our system in a nutshell High-dimensional
More informationCSC 411 Lecture 10: Neural Networks
CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11
More informationNeural Networks, Computation Graphs. CMSC 470 Marine Carpuat
Neural Networks, Computation Graphs CMSC 470 Marine Carpuat Binary Classification with a Multi-layer Perceptron φ A = 1 φ site = 1 φ located = 1 φ Maizuru = 1 φ, = 2 φ in = 1 φ Kyoto = 1 φ priest = 0 φ
More informationDeep Learning Year in Review 2016: Computer Vision Perspective
Deep Learning Year in Review 2016: Computer Vision Perspective Alex Kalinin, PhD Candidate Bioinformatics @ UMich alxndrkalinin@gmail.com @alxndrkalinin Architectures Summary of CNN architecture development
More informationRegML 2018 Class 8 Deep learning
RegML 2018 Class 8 Deep learning Lorenzo Rosasco UNIGE-MIT-IIT June 18, 2018 Supervised vs unsupervised learning? So far we have been thinking of learning schemes made in two steps f(x) = w, Φ(x) F, x
More informationECE521 Lecture 7/8. Logistic Regression
ECE521 Lecture 7/8 Logistic Regression Outline Logistic regression (Continue) A single neuron Learning neural networks Multi-class classification 2 Logistic regression The output of a logistic regression
More informationAxiomatic Attribution of Neural Networks
Axiomatic Attribution of Neural Networks Mukund Sundararajan* 1, Ankur Taly* 1, Qiqi Yan* 1 1 Google Inc. Presenter: Arshdeep Sekhon Mukund Sundararajan*, Ankur Taly*, Qiqi Yan*Axiomatic Attribution of
More informationNeural Networks in Structured Prediction. November 17, 2015
Neural Networks in Structured Prediction November 17, 2015 HWs and Paper Last homework is going to be posted soon Neural net NER tagging model This is a new structured model Paper - Thursday after Thanksgiving
More informationNeural Networks: Backpropagation
Neural Networks: Backpropagation Machine Learning Fall 2017 Based on slides and material from Geoffrey Hinton, Richard Socher, Dan Roth, Yoav Goldberg, Shai Shalev-Shwartz and Shai Ben-David, and others
More informationNonparametric regression using deep neural networks with ReLU activation function
Nonparametric regression using deep neural networks with ReLU activation function Johannes Schmidt-Hieber February 2018 Caltech 1 / 20 Many impressive results in applications... Lack of theoretical understanding...
More informationFeedforward Neural Networks. Michael Collins, Columbia University
Feedforward Neural Networks Michael Collins, Columbia University Recap: Log-linear Models A log-linear model takes the following form: p(y x; v) = exp (v f(x, y)) y Y exp (v f(x, y )) f(x, y) is the representation
More informationStatistical Machine Learning
Statistical Machine Learning Lecture 9 Numerical optimization and deep learning Niklas Wahlström Division of Systems and Control Department of Information Technology Uppsala University niklas.wahlstrom@it.uu.se
More informationMachine Learning Basics III
Machine Learning Basics III Benjamin Roth CIS LMU München Benjamin Roth (CIS LMU München) Machine Learning Basics III 1 / 62 Outline 1 Classification Logistic Regression 2 Gradient Based Optimization Gradient
More informationIntroduction to Machine Learning (67577)
Introduction to Machine Learning (67577) Shai Shalev-Shwartz School of CS and Engineering, The Hebrew University of Jerusalem Deep Learning Shai Shalev-Shwartz (Hebrew U) IML Deep Learning Neural Networks
More informationDEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY
DEEP LEARNING AND NEURAL NETWORKS: BACKGROUND AND HISTORY 1 On-line Resources http://neuralnetworksanddeeplearning.com/index.html Online book by Michael Nielsen http://matlabtricks.com/post-5/3x3-convolution-kernelswith-online-demo
More informationDeep Learning for NLP
Deep Learning for NLP Instructor: Wei Xu Ohio State University CSE 5525 Many slides from Greg Durrett Outline Motivation for neural networks Feedforward neural networks Applying feedforward neural networks
More informationNeural Network Approximation. Low rank, Sparsity, and Quantization Oct. 2017
Neural Network Approximation Low rank, Sparsity, and Quantization zsc@megvii.com Oct. 2017 Motivation Faster Inference Faster Training Latency critical scenarios VR/AR, UGV/UAV Saves time and energy Higher
More informationON THE STABILITY OF DEEP NETWORKS
ON THE STABILITY OF DEEP NETWORKS AND THEIR RELATIONSHIP TO COMPRESSED SENSING AND METRIC LEARNING RAJA GIRYES AND GUILLERMO SAPIRO DUKE UNIVERSITY Mathematics of Deep Learning International Conference
More informationCENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 8. Sinan Kalkan
CENG 783 Special topics in Deep Learning AlchemyAPI Week 8 Sinan Kalkan Loss functions Many correct labels case: Binary prediction for each label, independently: L i = σ j max 0, 1 y ij f j y ij = +1 if
More informationIndex. Santanu Pattanayak 2017 S. Pattanayak, Pro Deep Learning with TensorFlow,
Index A Activation functions, neuron/perceptron binary threshold activation function, 102 103 linear activation function, 102 rectified linear unit, 106 sigmoid activation function, 103 104 SoftMax activation
More informationMultimodal context analysis and prediction
Multimodal context analysis and prediction Valeria Tomaselli (valeria.tomaselli@st.com) Sebastiano Battiato Giovanni Maria Farinella Tiziana Rotondo (PhD student) Outline 2 Context analysis vs prediction
More informationEVERYTHING YOU NEED TO KNOW TO BUILD YOUR FIRST CONVOLUTIONAL NEURAL NETWORK (CNN)
EVERYTHING YOU NEED TO KNOW TO BUILD YOUR FIRST CONVOLUTIONAL NEURAL NETWORK (CNN) TARGETED PIECES OF KNOWLEDGE Linear regression Activation function Multi-Layers Perceptron (MLP) Stochastic Gradient Descent
More informationBackpropagation and Neural Networks part 1. Lecture 4-1
Lecture 4: Backpropagation and Neural Networks part 1 Lecture 4-1 Administrative A1 is due Jan 20 (Wednesday). ~150 hours left Warning: Jan 18 (Monday) is Holiday (no class/office hours) Also note: Lectures
More informationDeep Feedforward Networks. Sargur N. Srihari
Deep Feedforward Networks Sargur N. srihari@cedar.buffalo.edu 1 Topics Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units 4. Architecture Design 5. Backpropagation and Other Differentiation
More informationArtificial Neural Networks. Introduction to Computational Neuroscience Tambet Matiisen
Artificial Neural Networks Introduction to Computational Neuroscience Tambet Matiisen 2.04.2018 Artificial neural network NB! Inspired by biology, not based on biology! Applications Automatic speech recognition
More information8-1: Backpropagation Prof. J.C. Kao, UCLA. Backpropagation. Chain rule for the derivatives Backpropagation graphs Examples
8-1: Backpropagation Prof. J.C. Kao, UCLA Backpropagation Chain rule for the derivatives Backpropagation graphs Examples 8-2: Backpropagation Prof. J.C. Kao, UCLA Motivation for backpropagation To do gradient
More informationTopic 3: Neural Networks
CS 4850/6850: Introduction to Machine Learning Fall 2018 Topic 3: Neural Networks Instructor: Daniel L. Pimentel-Alarcón c Copyright 2018 3.1 Introduction Neural networks are arguably the main reason why
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationBackpropagation Introduction to Machine Learning. Matt Gormley Lecture 12 Feb 23, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation Matt Gormley Lecture 12 Feb 23, 2018 1 Neural Networks Outline
More informationIntelligent Systems Discriminative Learning, Neural Networks
Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm
More informationSTA 414/2104: Lecture 8
STA 414/2104: Lecture 8 6-7 March 2017: Continuous Latent Variable Models, Neural networks With thanks to Russ Salakhutdinov, Jimmy Ba and others Outline Continuous latent variable models Background PCA
More informationAn overview of deep learning methods for genomics
An overview of deep learning methods for genomics Matthew Ploenzke STAT115/215/BIO/BIST282 Harvard University April 19, 218 1 Snapshot 1. Brief introduction to convolutional neural networks What is deep
More informationIntro to Neural Networks and Deep Learning
Intro to Neural Networks and Deep Learning Jack Lanchantin Dr. Yanjun Qi UVA CS 6316 1 Neurons 1-Layer Neural Network Multi-layer Neural Network Loss Functions Backpropagation Nonlinearity Functions NNs
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More informationDeep Learning. Convolutional Neural Network (CNNs) Ali Ghodsi. October 30, Slides are partially based on Book in preparation, Deep Learning
Convolutional Neural Network (CNNs) University of Waterloo October 30, 2015 Slides are partially based on Book in preparation, by Bengio, Goodfellow, and Aaron Courville, 2015 Convolutional Networks Convolutional
More informationThe XOR problem. Machine learning for vision. The XOR problem. The XOR problem. x 1 x 2. x 2. x 1. Fall Roland Memisevic
The XOR problem Fall 2013 x 2 Lecture 9, February 25, 2015 x 1 The XOR problem The XOR problem x 1 x 2 x 2 x 1 (picture adapted from Bishop 2006) It s the features, stupid It s the features, stupid The
More information