Convolutional Neural Network Architecture


1 Convolutional Neural Network Architecture. Zhisheng Zhong. February 2nd, 2018.

2 Outline: 1. Introduction of Convolution (Motivation; Operation; Properties; From FC Layer to CONV Layer). 2. CNN Architecture (The Benchmark Dataset; Go Deeper; Go Wider; Information Flow). 3. Summary.

3 Outline (roadmap): NN → CONV layer → CNN. Go deeper: AlexNet, VGGNet, GoogLeNet, ResNet. Go wider: ResNeXt, Multi-ResNet, FractalNet. Information flow: DenseNet, CliqueNet (our work).

4 Outline (repeated from slide 2).

5 Introduction of Convolution: Motivation. In the 1960s, scientists studying the cat's visual cortex found that each visual neuron processes only a small area of the visual image, its receptive field.

6 Outline (repeated from slide 2).

7 Introduction of Convolution: Operation. We first take an input with a single channel as an example. (Figure: input with padding P, convolution kernel, and output, S = 1.) The parameters are: input size H × W, padding P, stride S, kernel size h × w, and output size H' × W'. The relations between these parameters are H' = (H + 2P - h)/S + 1 and W' = (W + 2P - w)/S + 1.
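As a quick sanity check of these relations, a minimal Python sketch (the function name and the example values are illustrative, not from the slides):

```python
def conv_output_size(H, W, h, w, P, S):
    """Spatial size of the convolution output, per H' = (H + 2P - h)/S + 1."""
    H_out = (H + 2 * P - h) // S + 1
    W_out = (W + 2 * P - w) // S + 1
    return H_out, W_out

# Example: 32x32 input, 3x3 kernel, padding 1, stride 1 -> 32x32 output.
print(conv_output_size(32, 32, 3, 3, P=1, S=1))  # (32, 32)
```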

8 Introduction of Convolution: Operation. (Figure: input, convolution kernel, and output, S = 1.) Suppose we use X to represent the input matrix, W the convolution kernel, and X' the output matrix. The convolution operation is computed by the following formula (∗ represents the convolution operation): X'(m, n) = (X ∗ W)(m, n) = ∑_{i=1}^{h} ∑_{j=1}^{w} X(m + i, n + j) W(i, j). (1)
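A minimal NumPy sketch of Eq. (1), assuming stride 1, no padding, and 0-based indexing instead of the 1-based indices above:

```python
import numpy as np

def conv2d_single_channel(X, K):
    """X'(m, n) = sum_{i,j} X(m+i, n+j) K(i, j), as in Eq. (1) but 0-indexed.

    K plays the role of the kernel W; stride 1, no padding ("valid" output).
    """
    H, W = X.shape
    h, w = K.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            out[m, n] = np.sum(X[m:m + h, n:n + w] * K)
    return out

X = np.arange(16.0).reshape(4, 4)
print(conv2d_single_channel(X, np.ones((3, 3))))  # 2x2 map of local 3x3 sums
```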

9 Introduction of Convolution: Operation. In general, the input is not a matrix but a 3-D tensor X ∈ R^{H × W × C_i}, the kernel becomes a 4-D tensor W ∈ R^{h × w × C_i × C_o}, and the output is also a 3-D tensor X' ∈ R^{H' × W' × C_o}.
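A quick shape check of the multi-channel case, assuming PyTorch and its channel-first layout (there the kernel is stored as a C_o × C_i × h × w tensor):

```python
import torch
import torch.nn as nn

# One image with C_i = 3 channels, 32x32 spatial size; 3x3 kernel; C_o = 16 output maps.
x = torch.randn(1, 3, 32, 32)                     # N x C_i x H x W
conv = nn.Conv2d(in_channels=3, out_channels=16,
                 kernel_size=3, stride=1, padding=1)
print(conv.weight.shape)   # torch.Size([16, 3, 3, 3])   -> 4-D kernel tensor
print(conv(x).shape)       # torch.Size([1, 16, 32, 32])  -> 3-D output per image
```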

10 Outline (repeated from slide 2).

11 Introduction of Convolution: Properties. In image processing, the convolution kernel is also called a filter. It is important to note that filters act as feature detectors on the original input image.

12 Introduction of Convolution: Properties. In signal processing, convolution has a strong connection with the Fourier transform. Theorem (Convolution Theorem): X ∗ Y = F^{-1}(F(X) ⊙ F(Y)), (2) where F represents the Fourier transform, F^{-1} the inverse Fourier transform, and ⊙ the element-wise product. Convolution in the spatial domain is equivalent to element-wise multiplication in the Fourier domain.
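The theorem in this form holds exactly for circular (periodic) convolution; a small NumPy sketch checking Eq. (2) numerically under that assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8))
Y = rng.standard_normal((8, 8))

# Left-hand side: circular (periodic) convolution computed directly.
direct = np.zeros_like(X)
for m in range(8):
    for n in range(8):
        for i in range(8):
            for j in range(8):
                direct[m, n] += X[i, j] * Y[(m - i) % 8, (n - j) % 8]

# Right-hand side: inverse FFT of the element-wise product of the FFTs.
via_fft = np.fft.ifft2(np.fft.fft2(X) * np.fft.fft2(Y)).real

print(np.allclose(direct, via_fft))  # True
```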

13 Introduction of Convolution: Properties. Because the computational complexities of the FFT and inverse FFT are both O(HW log(HW)), while direct computation of the convolution requires O(HW·h·w), we can use the FFT and inverse FFT to reduce computation and accelerate the forward propagation.

14 Outline (repeated from slide 2).

15 Introduction of Convolution: From FC Layer to CONV Layer.
Items  | FC layer | CONV layer
Input  | vector   | 3-D tensor
Weight | matrix   | 4-D tensor
Output | vector   | 3-D tensor

16 Introduction of Convolution: From FC Layer to CONV Layer. (Figure: converting a CONV layer to an FC layer.) Compared with the FC layer, the CONV layer has two properties: local connectivity and weight sharing. Both the FC layer and the CONV layer are linear transformations. Due to the convolution operator, the CONV layer can preserve spatial information.

17 Introduction of Convolution: From FC Layer to CONV Layer. Strong representation and predictive power: the toy models are run on the MNIST dataset, which contains 60,000 training examples; each example is a 28 × 28 pixel image of a digit from 0 to 9.
Items                       | Setting 1       | Setting 2
Hidden layer 1              | CONV (5,5,1,10) | FC (784,1440)
Number of parameters of H_1 | 250             | 1,128,960
Hidden layer 2              | FC (1440,10)    | FC (1440,10)
Number of parameters of H_2 | 14,400          | 14,400
Minimum training loss       | …               | …
Maximum test accuracy       | …               | 97.25%
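The parameter counts in the table follow directly from the layer shapes (biases ignored); a short Python check:

```python
# Parameter counts behind the table above (biases ignored).
conv_h1 = 5 * 5 * 1 * 10        # CONV kernel of shape (5, 5, 1, 10)
fc_h1   = 784 * 1440            # FC layer mapping 784 -> 1440
fc_h2   = 1440 * 10             # FC layer mapping 1440 -> 10
print(conv_h1, fc_h1, fc_h2)    # 250 1128960 14400
```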

18 Introduction of Convolution: From FC Layer to CONV Layer. Strong representation and predictive power (figure).

19 Outline (repeated from slide 2).

20 The Benchmark Dataset: ImageNet.

21 The Benchmark Dataset: ImageNet difficulty.

22 Outline (repeated from slide 2).

23 Go Deeper: Overview. (Figure: Revolution of Depth. ImageNet classification top-5 error (%): ILSVRC'10 and ILSVRC'11 (shallow), ILSVRC'12 AlexNet (8 layers), ILSVRC'13 ZFNet (8 layers), ILSVRC'14 VGG (19 layers), ILSVRC'14 GoogLeNet (22 layers), ILSVRC'15 ResNet (152 layers).)

24 Go Deeper: Vanishing gradient problem. The vanishing gradient problem is a difficulty found in training neural networks with gradient-based learning methods and backpropagation. We take backpropagation as an example. Denote an L-layer feedforward neural network by X = Σ_0 → Σ_1 → Σ_2 → ... → Σ_L = Y with weights W_1, ..., W_L, where Σ_i = σ(Σ_{i-1} W_i) for i = 1, ..., L-1 and Σ_L = g(Σ_{L-1} W_L). σ(·): activation function (sigmoid, ReLU, ...). g(·): transform function of the last layer, e.g., identity for regression, softmax for classification.

25 Go Deeper: Vanishing gradient problem. Let f(W) denote the loss as a function of the weights. After choosing a proper loss function l, backpropagation gives the gradients for all weights:
∇_{W_L} f = Σ_{L-1}^T Φ_1, with Φ_1 = g'(Σ_{L-1} W_L) ⊙ l'(Σ_L),
∇_{W_{L-1}} f = Σ_{L-2}^T Φ_2, with Φ_2 = σ'(Σ_{L-2} W_{L-1}) ⊙ (Φ_1 W_L^T),
∇_{W_{L-2}} f = Σ_{L-3}^T Φ_3, with Φ_3 = σ'(Σ_{L-3} W_{L-2}) ⊙ (Φ_2 W_{L-1}^T),
where ⊙ denotes the element-wise product.

26 Go Deeper: Vanishing gradient problem. When we use the sigmoid function as the activation function, σ(x) = 1 / (1 + e^{-x}), so σ(x) ∈ (0, 1). Consider the first derivative of the sigmoid function: σ'(x) = σ(x)(1 - σ(x)), so σ'(x) ≤ 1/4. When we use the ReLU function as the activation function, σ'(x) = 1 if x ≥ 0, and 0 otherwise.
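A toy NumPy sketch (the depth and the fixed pre-activation value are illustrative assumptions) of how these bounds play out when many such factors are multiplied during backpropagation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

depth = 30
x = 0.5                         # a fixed pre-activation, for illustration
grad_sigmoid, grad_relu = 1.0, 1.0
for _ in range(depth):
    grad_sigmoid *= sigmoid(x) * (1 - sigmoid(x))   # each factor <= 1/4
    grad_relu    *= 1.0                             # ReLU derivative for x > 0

print(grad_sigmoid)   # ~ 0.235^30, roughly 1e-19: the signal vanishes
print(grad_relu)      # 1.0: the signal is preserved
```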

27 Go Deeper: AlexNet. (Figure: feature extraction stage followed by classification stage.)

28 Go Deeper: VGGNet.

29 Go Deeper: VGGNet. Main contribution: the use of only 3 × 3 filters, quite different from AlexNet's 11 × 11 filters in the first layer and ZFNet's 7 × 7 filters. This idea is widely applied by later CNN architectures such as ResNet, DenseNet, and so on.
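One standard argument behind this choice (not spelled out on the slide, so treat the numbers as an illustration): two stacked 3 × 3 layers cover the same 5 × 5 receptive field as a single 5 × 5 layer, with fewer parameters.

```python
# Parameter comparison for C input channels and C output channels per layer.
C = 64
two_3x3 = 2 * (3 * 3 * C * C)   # two stacked 3x3 layers, 5x5 receptive field
one_5x5 = 5 * 5 * C * C         # a single 5x5 layer, same receptive field
print(two_3x3, one_5x5)         # 73728 102400 -> roughly 28% fewer parameters
```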

30 Go Deeper: GoogLeNet.

31 Go Deeper: GoogLeNet: Inception. Main contributions: 1. GoogLeNet was one of the first models to show that CNN layers do not always have to be stacked up sequentially; the Inception block contains parallel paths. 2. Auxiliary loss functions at intermediate layers help the network learn more efficiently.

32 Go Deeper: ResNet: Overview.

33 Go Deeper: ResNet: Motivation. (Figure: training error (left) and test error (right) on CIFAR-10 with 20-layer and 56-layer plain networks.) The degradation of training accuracy indicates that not all systems are similarly easy to optimize.

34 Go Deeper: ResNet: Conclusion & Solution. Conclusion: the identity map may be difficult to learn when networks become deeper. Solution: design a residual block with a skip identity map. (Figure: residual block: x passes through weight layer, ReLU, weight layer to give F(x), which is added to the identity x and followed by ReLU.) X_{i+1} = σ(F(X_i, W_i) + X_i). (3)
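A minimal PyTorch sketch of Eq. (3); the conv-BN-ReLU-conv-BN form of F and the channel count are assumptions, not the exact ResNet configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """X_{i+1} = relu(F(X_i) + X_i), with F modeled as conv-BN-ReLU-conv-BN."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)      # skip identity connection

x = torch.randn(2, 64, 32, 32)
print(ResidualBlock(64)(x).shape)   # torch.Size([2, 64, 32, 32])
```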

35 Go Deeper: ResNet: Improvement. (Figure: left: plain networks of 18 and 34 layers; right: ResNets of 18 and 34 layers; error (%) vs. iterations.) In this plot, the residual networks have no extra parameters compared to their plain counterparts.

36 Go Deeper: ResNet: Main contribution. Importance of the identity skip connection. For simplicity, we do not consider the effect of the activation function:
X_{i+1} = X_i + F(X_i, W_i), (4)
X_{i+2} = X_{i+1} + F(X_{i+1}, W_{i+1}) = X_i + F(X_i, W_i) + F(X_{i+1}, W_{i+1}). (5)
From the two equations, we get the recursive form:
X_L = X_k + ∑_{i=k}^{L-1} F(X_i, W_i). (6)

37 Go Deeper: ResNet: Main contribution. Importance of the identity skip connection. From X_L = X_k + ∑_{i=k}^{L-1} F(X_i, W_i), we compute the gradient with respect to the kth layer:
∂l/∂X_k = (∂l/∂X_L) (1 + ∂/∂X_k ∑_{i=k}^{L-1} F(X_i, W_i)). (7)
The additive term ∂l/∂X_L ensures that information is directly propagated back to any shallower layer k.

38 Go Deeper: ResNet: Main contribution. Importance of the identity skip connection. Let us consider a simple modification that breaks the identity shortcut:
X_{i+1} = λ_i X_i + F(X_i, W_i), (8)
X_L = (∏_{i=k}^{L-1} λ_i) X_k + ∑_{i=k}^{L-1} F(X_i, W_i), (9)
and the gradient of the kth layer becomes
∂l/∂X_k = (∂l/∂X_L) (∏_{i=k}^{L-1} λ_i + ∂/∂X_k ∑_{i=k}^{L-1} F(X_i, W_i)). (10)
For an extremely deep network (L large), if λ_i > 1 for all i, this factor can be exponentially large; if λ_i < 1 for all i, this factor can be exponentially small and vanish (the information flow is broken).

39 Go Deeper: ResNet: Another interpretation: ensemble.
X_3 = X_2 + f_3(X_2), (11)
X_3 = X_1 + f_2(X_1) + f_3(X_1 + f_2(X_1)), (12)
X_3 = X_0 + f_1(X_0) + f_2(X_0 + f_1(X_0)) + f_3(X_0 + f_1(X_0) + f_2(X_0 + f_1(X_0))).
If f_1, f_2, f_3 are linear operators:
X_3 = X_0 + f_1(X_0) + f_2(X_0) + f_2 f_1(X_0) + f_3(X_0) + f_3 f_1(X_0) + f_3 f_2(X_0) + f_3 f_2 f_1(X_0).
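A small numeric check (an illustration, not from the slides) that three residual blocks with linear f expand into the 2^3 = 8 additive paths listed above:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
F1, F2, F3 = (rng.standard_normal((d, d)) for _ in range(3))
x0 = rng.standard_normal(d)

# Three residual updates: x_{i+1} = x_i + F_{i+1} x_i  (linear f_i).
x = x0.copy()
for F in (F1, F2, F3):
    x = x + F @ x

# Expansion into all 2^3 = 8 paths: (I + F3)(I + F2)(I + F1) x0.
I = np.eye(d)
expanded = (I + F3) @ (I + F2) @ (I + F1) @ x0
print(np.allclose(x, expanded))  # True
```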

40 Go Deeper: ResNet: Another interpretation: ensemble. The distribution of path lengths follows a binomial distribution. With 54 blocks in total, more than 95% of paths go through 19 to 35 modules (ResNet-110).

41 Go Deeper: ResNet: Another interpretation: ensemble. The effective paths in residual networks are relatively shallow; the gradient at the input layer mostly passes through the shallow paths.

42 Outline (repeated from slide 2).

43 Go Wider. Motivated by the observation that the effective paths in residual networks are relatively shallow, researchers tried to make networks shallower and wider, and achieved better results: ResNeXt, Multi-ResNet, FractalNet.

44 Outline (repeated from slide 2).

45 Information Flow: DenseNet. Motivated by ResNet, where the identity skip connections are very important: because of identity skip connections, the signal can be directly propagated from any unit to any other, both forward and backward.
X_l = H_l([X_0, X_1, ..., X_{l-1}]). (13)
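A minimal PyTorch sketch of Eq. (13); H_l is simplified to BN-ReLU-conv with a fixed growth rate, which is an assumption rather than the full DenseNet-BC design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseLayer(nn.Module):
    """X_l = H_l([X_0, ..., X_{l-1}]): operate on the concatenation of all earlier features."""
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate, 3, padding=1, bias=False)

    def forward(self, features):               # features: list of earlier feature maps
        x = torch.cat(features, dim=1)
        return self.conv(F.relu(self.bn(x)))

growth, layers = 12, 4
features = [torch.randn(2, 16, 32, 32)]        # X_0: the block input
for l in range(layers):
    layer = DenseLayer(16 + l * growth, growth)
    features.append(layer(features))
print(torch.cat(features, dim=1).shape)        # torch.Size([2, 64, 32, 32])
```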

46 Information Flow: DenseNet. This introduces L(L-1)/2 connections in an L-layer network. Problems: 1. the number of connections grows quadratically with depth; 2. the number of input feature maps grows linearly with depth. Solution: divide the network into dense blocks connected by transition layers. (Figure: dense blocks separated by transition layers.)

47 Information Flow: CliqueNet (our work). Motivated by the success of ResNet and DenseNet, we add more identity skip connections to maximize the information flow of the network.

48 Information Flow: CliqueNet (our work).
Stage-I: X_i^{(1)} = σ(∑_{0 ≤ l < i} W_{li} X_l^{(1)}). (14)
Stage-II (k ≥ 2): X_i^{(k)} = σ(∑_{0 < l < i} W_{li} X_l^{(k)} + ∑_{i < m ≤ L} W_{mi} X_m^{(k-1)}). (15)
(Figure: the layers in a block are updated in a loop.)
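A toy Python sketch of the Stage-I and Stage-II updates in Eqs. (14) and (15), using dense matrices instead of convolutions; the weight sharing across stages and the toy sizes are assumptions for illustration, not the actual CliqueNet implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

L, d = 4, 8                       # L layers in the block, feature dimension d (toy setting)
rng = np.random.default_rng(0)
# W[l][i]: weight used when layer l's feature feeds layer i (shared across stages).
W = [[rng.standard_normal((d, d)) * 0.1 for _ in range(L + 1)] for _ in range(L + 1)]
X = [rng.standard_normal(d)] + [np.zeros(d) for _ in range(L)]   # X[0] is the block input

# Stage-I (Eq. 14): each layer is initialized from all already-computed layers.
for i in range(1, L + 1):
    X[i] = relu(sum(X[l] @ W[l][i] for l in range(i)))

# Stage-II (Eq. 15), one loop: updating in increasing i means X[m] for m > i still
# holds the previous stage's value, so the second sum matches X_m^{(k-1)}.
for i in range(1, L + 1):
    lower = sum(X[l] @ W[l][i] for l in range(1, i))
    upper = sum(X[m] @ W[m][i] for m in range(i + 1, L + 1))
    X[i] = relu(lower + upper)

print([x.shape for x in X[1:]])   # four refreshed feature vectors of dimension d
```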

49 Information Flow: CliqueNet (our work). (Figure: block features and transition features.)

50 Information Flow: CliqueNet (our work). Comparison of the number of parameters (DenseNet vs. CliqueNet). We set the number of blocks to 5, each block contains 5 layers, and each layer produces 64 feature maps. (Figure: the number of input channels per layer for DenseNet and for CliqueNet.) Under the same conditions, the ratio of the number of parameters of DenseNet to that of CliqueNet is about …. When the number of blocks becomes larger, the number of parameters of CliqueNet becomes less than that of DenseNet.
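A hypothetical back-of-the-envelope sketch of why the per-layer input width differs under the stated setting; the block-input channel count and the exact Stage-II input composition are assumptions made only for illustration:

```python
layers, maps = 5, 64
block_input = 64                        # assumed number of channels entering the block

# DenseNet: layer l takes the block input plus all l earlier layers' outputs.
dense_in = [block_input + l * maps for l in range(layers)]

# CliqueNet (Stage-II): each layer takes the other (layers - 1) layers' outputs.
clique_in = [(layers - 1) * maps for _ in range(layers)]

print(dense_in)    # [64, 128, 192, 256, 320]  -> grows linearly with depth
print(clique_in)   # [256, 256, 256, 256, 256] -> constant within the block
```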

51 Information Flow: CliqueNet (our work). Figure: visualization of the weights in the first block of a pretrained DenseNet (left) and CliqueNet (right), obtained by computing the average absolute value of W_{ij}. From the visualization, CliqueNet's parameter efficiency is better than DenseNet's.

52 Information Flow: CliqueNet (our work). Results on CIFAR-10, CIFAR-100, and SVHN.

53 Information Flow: CliqueNet (our work). Results on ImageNet.

54 Information Flow: CliqueNet (our work). Our contributions: 1. To maximize the information flow, we use a fully connected graph. 2. We are the first to propose an unfolded loop block structure. 3. We use multi-scale features as the input to the loss function, and two kinds of features are propagated in our network.

55 Summary. CONV layer: local connectivity, weight sharing, more powerful for image feature extraction. NN → CNN. Go deeper: AlexNet, VGGNet, GoogLeNet, ResNet; the most important part is the skip identity connection. Go wider: ResNeXt, Multi-ResNet, FractalNet. Information flow: DenseNet, CliqueNet (our work); design criterion: a directed computation graph without loops.
