Convolutional Networks 2: Training, deep convolutional networks

1 Convolutional Networks 2: Training, deep convolutional networks. Hakan Bilen, Machine Learning Practical (MLP), Lecture 8, 30 October / 6 November 2018

2 Question. Q1. How can we increase the receptive field area of a conv layer?

3 Question. Q1. How can we increase the receptive field area of a conv layer? Q2. Can we do it without increasing the kernel size?

4 Input arguments for the convolution function: in channels, out channels, kernel size, stride, padding, bias

5 Input arguments for the convolution function: in channels, out channels, kernel size, stride, padding, bias... dilation? groups?
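
These argument names match, for example, PyTorch's torch.nn.Conv2d, which exposes dilation and groups alongside the others. A minimal sketch (assuming PyTorch; the slides do not prescribe a particular framework):

import torch
import torch.nn as nn

# 2D convolution: 16 input channels, 32 output channels, 3x3 kernel,
# stride 1, padding 1, dilation 2, 4 filter groups, with a bias term.
conv = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3,
                 stride=1, padding=1, dilation=2, groups=4, bias=True)

x = torch.randn(8, 16, 64, 64)   # (minibatch, channels, height, width)
y = conv(x)
print(y.shape)                   # (8, 32, 62, 62): dilation 2 with padding 1 shrinks 64 -> 62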

6 Dilated convolutions

7 Dilated convolutions: increase the receptive field by inflating the kernel, inserting D-1 spaces between the kernel elements

9 Dilated convolutions: increase the receptive field by inflating the kernel, inserting D-1 spaces between the kernel elements. Why increase the receptive field size?

10 Dilated convolutions: increase the receptive field by inflating the kernel, inserting D-1 spaces between the kernel elements. Yu & Koltun, Multi-scale context aggregation by dilated convolutions, ICLR 2016.
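
Inflating a k x k kernel with dilation D gives an effective extent of D(k-1)+1 per side, so the receptive field grows with no extra parameters. A tiny illustrative calculation (the helper function is ours, not from the lecture):

def effective_kernel_size(k, d):
    """Spatial extent of a k x k kernel inflated with dilation d
    (d-1 zeros inserted between kernel elements)."""
    return d * (k - 1) + 1

for d in (1, 2, 4):
    print(d, effective_kernel_size(3, d))   # 3, 5, 9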

11 (Convolutional) filter groups

12 (Convolutional) filter groups: G is the number of groups. Reduces the number of convolutional filters (or parameters). Regularisation effect.
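
A quick way to see the parameter reduction is to compare weight counts with and without groups; a sketch using PyTorch layers purely for counting (an assumption for illustration, not course code):

import torch.nn as nn

def n_weights(conv):
    return conv.weight.numel()

full    = nn.Conv2d(64, 64, kernel_size=3, groups=1, bias=False)
grouped = nn.Conv2d(64, 64, kernel_size=3, groups=4, bias=False)

print(n_weights(full))     # 64*64*3*3 = 36864
print(n_weights(grouped))  # 4*(16*16*3*3) = 9216, i.e. G times fewer weights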

13 Convolution and cross-correlation. We can write the feature map hidden unit equation (index-0):
h_{i,j} = Σ_{m} Σ_{n} I(m+i, n+j) w(m, n)
h = W ⋆ I is a cross-correlation.

14 Convolution and cross-correlation. We can write the feature map hidden unit equation (index-0):
h_{i,j} = Σ_{m} Σ_{n} I(m+i, n+j) w(m, n)
h = W ⋆ I is a cross-correlation.
In signal processing a 2D convolution is written as
h_{i,j} = (V ∗ I)_{i,j} = Σ_{m} Σ_{n} I(m, n) V(i-m, j-n) = Σ_{m} Σ_{n} I(i-m, j-n) V(m, n)
If we flip (reflect horizontally and vertically) W (cross-correlation) then we obtain V (convolution).
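
A small numpy/scipy check of that flip relationship (illustrative only; scipy's correlate2d and convolve2d stand in for the layer's cross-correlation and the signal-processing convolution):

import numpy as np
from scipy.signal import correlate2d, convolve2d

rng = np.random.default_rng(0)
I = rng.standard_normal((5, 5))
W = rng.standard_normal((3, 3))

cross_corr = correlate2d(I, W, mode='valid')             # what a conv layer computes
conv       = convolve2d(I, W[::-1, ::-1], mode='valid')  # W flipped horizontally and vertically

print(np.allclose(cross_corr, conv))   # True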

15 Training Convolutional Networks: Forward pass

16 Training Convolutional Networks: Forward pass, Backward pass

17 Example (2x2 kernel W, 3x3 previous layer H^{l-1}, 2x2 feature map H^l):
h_{11} = w_{11} h^{l-1}_{11} + w_{12} h^{l-1}_{12} + w_{21} h^{l-1}_{21} + w_{22} h^{l-1}_{22} + b
h_{12} = w_{11} h^{l-1}_{12} + w_{12} h^{l-1}_{13} + w_{21} h^{l-1}_{22} + w_{22} h^{l-1}_{23} + b
h_{21} = w_{11} h^{l-1}_{21} + w_{12} h^{l-1}_{22} + w_{21} h^{l-1}_{31} + w_{22} h^{l-1}_{32} + b
h_{22} = w_{11} h^{l-1}_{22} + w_{12} h^{l-1}_{23} + w_{21} h^{l-1}_{32} + w_{22} h^{l-1}_{33} + b

18 Gradients of E w.r.t. W

19 Example: let's calculate the parameter updates (∂E/∂W), using the forward-pass equations from slide 17:
∂E/∂w_{11} = (∂E/∂h_{11})(∂h_{11}/∂w_{11}) + (∂E/∂h_{12})(∂h_{12}/∂w_{11}) + (∂E/∂h_{21})(∂h_{21}/∂w_{11}) + (∂E/∂h_{22})(∂h_{22}/∂w_{11})

20 Example (continued): substituting the partial derivatives from the forward-pass equations:
∂E/∂w_{11} = (∂E/∂h_{11}) h^{l-1}_{11} + (∂E/∂h_{12}) h^{l-1}_{12} + (∂E/∂h_{21}) h^{l-1}_{21} + (∂E/∂h_{22}) h^{l-1}_{22}

21 Example (continued): similarly for w_{12}:
∂E/∂w_{12} = (∂E/∂h_{11}) h^{l-1}_{12} + (∂E/∂h_{12}) h^{l-1}_{13} + (∂E/∂h_{21}) h^{l-1}_{22} + (∂E/∂h_{22}) h^{l-1}_{23}

22 Example (continued): for w_{21}:
∂E/∂w_{21} = (∂E/∂h_{11}) h^{l-1}_{21} + (∂E/∂h_{12}) h^{l-1}_{22} + (∂E/∂h_{21}) h^{l-1}_{31} + (∂E/∂h_{22}) h^{l-1}_{32}

23 Example (continued): for w_{22}:
∂E/∂w_{22} = (∂E/∂h_{11}) h^{l-1}_{22} + (∂E/∂h_{12}) h^{l-1}_{23} + (∂E/∂h_{21}) h^{l-1}_{32} + (∂E/∂h_{22}) h^{l-1}_{33}

24 Gradients of E w.r.t. W:
∂E/∂w_{11} = (∂E/∂h_{11}) h^{l-1}_{11} + (∂E/∂h_{12}) h^{l-1}_{12} + (∂E/∂h_{21}) h^{l-1}_{21} + (∂E/∂h_{22}) h^{l-1}_{22}
∂E/∂w_{12} = (∂E/∂h_{11}) h^{l-1}_{12} + (∂E/∂h_{12}) h^{l-1}_{13} + (∂E/∂h_{21}) h^{l-1}_{22} + (∂E/∂h_{22}) h^{l-1}_{23}
∂E/∂w_{21} = (∂E/∂h_{11}) h^{l-1}_{21} + (∂E/∂h_{12}) h^{l-1}_{22} + (∂E/∂h_{21}) h^{l-1}_{31} + (∂E/∂h_{22}) h^{l-1}_{32}
∂E/∂w_{22} = (∂E/∂h_{11}) h^{l-1}_{22} + (∂E/∂h_{12}) h^{l-1}_{23} + (∂E/∂h_{21}) h^{l-1}_{32} + (∂E/∂h_{22}) h^{l-1}_{33}
Given H^l ∈ R^{M×N}:
∂E/∂w_{r,s} = Σ_{m=1}^{M} Σ_{n=1}^{N} (∂E/∂h_{m,n}) h^{l-1}_{r+m-1, s+n-1}

25 Gradients of E w.r.t. W:
∂E/∂w_{r,s} = Σ_{m=1}^{M} Σ_{n=1}^{N} (∂E/∂h_{m,n}) h^{l-1}_{r+m-1, s+n-1}
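
A direct, loop-based transcription of this formula for a single-channel, stride-1, unpadded layer (a sketch, not the course framework's implementation; names are ours):

import numpy as np

def conv_grad_w(h_prev, grad_h, kernel_size):
    """Gradient of E w.r.t. the kernel W for a single-channel, stride-1,
    no-padding convolutional layer, following (index-0)
    dE/dw[r,s] = sum_{m,n} dE/dh[m,n] * h_prev[r+m, s+n]."""
    K = kernel_size
    M, N = grad_h.shape
    grad_w = np.zeros((K, K))
    for r in range(K):
        for s in range(K):
            grad_w[r, s] = np.sum(grad_h * h_prev[r:r+M, s:s+N])
    return grad_w

# Example matching the slides: 3x3 previous layer, 2x2 kernel, 2x2 feature map.
h_prev = np.arange(9, dtype=float).reshape(3, 3)
grad_h = np.ones((2, 2))               # stand-in for dE/dH from the layer above
print(conv_grad_w(h_prev, grad_h, 2))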

26 Gradients of E w.r.t. H^{l-1}: Forward pass, Backward pass

27 Gradients of E w.r.t. H^{l-1}

28 Gradients of E w.r.t. H^{l-1}: imagine inverting the receptive field!

29 Gradients of E w.r.t. h^{l-1}: let's calculate the gradients of the loss function (E) with respect to the previous layer (H^{l-1}), using the forward-pass equations from slide 17:
∂E/∂h^{l-1}_{11} = (∂E/∂h_{11})(∂h_{11}/∂h^{l-1}_{11}) + (∂E/∂h_{12})(∂h_{12}/∂h^{l-1}_{11}) + (∂E/∂h_{21})(∂h_{21}/∂h^{l-1}_{11}) + (∂E/∂h_{22})(∂h_{22}/∂h^{l-1}_{11})

30 Gradients of E w.r.t. h^{l-1} (continued): only h_{11} depends on h^{l-1}_{11} (through w_{11}), so
∂E/∂h^{l-1}_{11} = (∂E/∂h_{11}) w_{11} + (∂E/∂h_{12}) · 0 + (∂E/∂h_{21}) · 0 + (∂E/∂h_{22}) · 0

31 Gradients of E w.r.t. h^{l-1} (continued): for the centre unit h^{l-1}_{22}:
∂E/∂h^{l-1}_{22} = (∂E/∂h_{11})(∂h_{11}/∂h^{l-1}_{22}) + (∂E/∂h_{12})(∂h_{12}/∂h^{l-1}_{22}) + (∂E/∂h_{21})(∂h_{21}/∂h^{l-1}_{22}) + (∂E/∂h_{22})(∂h_{22}/∂h^{l-1}_{22})

32 Gradients of E w.r.t. h^{l-1} (continued): every output unit uses h^{l-1}_{22}, each through a different weight:
∂E/∂h^{l-1}_{22} = (∂E/∂h_{11}) w_{22} + (∂E/∂h_{12}) w_{21} + (∂E/∂h_{21}) w_{12} + (∂E/∂h_{22}) w_{11}
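
The same pattern for the whole previous layer amounts to scattering each ∂E/∂h_{m,n} back through the kernel onto the inputs it touched, which is the "inverted receptive field" picture (equivalently, a full correlation with the rotated kernel). A hedged single-channel sketch:

import numpy as np

def conv_grad_input(weights, grad_h):
    """Gradient of E w.r.t. the previous layer H^{l-1} for a single-channel,
    stride-1, no-padding layer: scatter each dE/dh[m,n] back through the
    kernel onto the input positions it was computed from."""
    K = weights.shape[0]
    M, N = grad_h.shape
    grad_in = np.zeros((M + K - 1, N + K - 1))
    for m in range(M):
        for n in range(N):
            grad_in[m:m+K, n:n+K] += grad_h[m, n] * weights
    return grad_in

W = np.array([[1., 2.], [3., 4.]])   # w11 w12 / w21 w22
grad_h = np.ones((2, 2))
print(conv_grad_input(W, grad_h))
# Centre element (dE/dh^{l-1}_22) is w22 + w21 + w12 + w11 = 10, as in the example above.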

34 Backpropagation for pooling. Max function: m = max(a, b); ∂m/∂a = 1 if a > b, 0 otherwise; ∂m/∂b = 1 if b > a, 0 otherwise.
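
For max pooling, backprop therefore routes the incoming gradient to the position that won the max and sends zero everywhere else. A minimal single-window sketch (the function name is ours, for illustration):

import numpy as np

def maxpool_backward(x, grad_out):
    """Backprop through a single max-pool window: the incoming gradient is
    routed entirely to the position of the maximum; all other positions
    receive zero (the sub-gradient of max)."""
    grad_x = np.zeros_like(x)
    idx = np.unravel_index(np.argmax(x), x.shape)
    grad_x[idx] = grad_out
    return grad_x

x = np.array([[0.2, 1.5], [0.7, 0.1]])
print(maxpool_backward(x, grad_out=1.0))
# [[0. 1.]
#  [0. 0.]]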

37 Implementing fully-connected networks. Example at a time: a d-dimensional input vector, a k-dimensional output vector, and a d x k weight matrix.

38 Implementing fully-connected networks. Minibatch: input matrix (n x d), weight matrix (d x k), output matrix (n x k).

39 Implementing fully-connected networks. Minibatch: input matrix (n x d), weight matrix (d x k), output matrix (n x k). Represent each layer as a 2-dimensional matrix (minibatch size x input dimension), where each row corresponds to a training example, so the number of rows is the number of minibatch examples.
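
As a sketch (illustrative shapes only), the whole minibatch forward pass is then a single matrix multiply:

import numpy as np

n, d, k = 8, 100, 10               # minibatch size, input dim, output dim
X = np.random.randn(n, d)          # one training example per row
W = np.random.randn(d, k) * 0.01   # weight matrix
b = np.zeros(k)

H = X @ W + b                      # forward pass for the whole minibatch
print(H.shape)                     # (8, 10)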

40 Implementing Convolutional Networks. Example at a time, single input image, single feature map: input image x, weight matrix (kernel) m, feature map y.

41 Implementing Convolutional Networks. Example at a time, single input image, multiple feature maps: input image x, weight matrices (kernels) m, feature maps y.

42 Implementing Convolutional Networks. Example at a time, multiple input images, multiple feature maps: multiple input images x, weight matrices (kernels) m, feature maps y.

43 Implementing Convolutional Networks. Minibatch, multiple input images, multiple feature maps: minibatch of multiple input images (n), weight matrices (kernels), minibatch of feature maps (n).

44 Implementing Convolutional Networks.
Inputs / layer values: each input image (and each convolutional and pooling layer) is 2-dimensional (x, y); multiple feature maps add a third dimension; the minibatch adds a fourth. Thus we represent each input (layer values) using a 4-dimensional tensor (array): (minibatch-size, num-fmaps, x, y).
Weight matrices (kernels): each weight matrix used to scan across an image has 2 spatial dimensions (x, y); multiple output feature maps add a third dimension; multiple input feature maps add a fourth. Thus the weight matrices are also represented using a 4-dimensional tensor: (F_in, F_out, x, y).

45 4D tensors in numpy. Both forward and back prop thus involve multiplying 4D tensors. There are various ways to do this:
- Explicitly loop over the dimensions: this results in simpler code, but can be inefficient, although using Cython to compile the loops as C can speed things up.
- Serialisation: by replicating input patches and weight matrices, it is possible to convert the required 4D tensor multiplications into one large dot product. Requires careful manipulation of indices!
- Convolutions: use explicit convolution functions for forward and back prop, rotating the kernel for the backprop.
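
A sketch of the first option, explicit loops over the 4D tensors, using the (F_in, F_out, x, y) weight layout from slide 44 (illustrative, not the course framework code):

import numpy as np

def conv_forward_loops(inputs, kernels, biases):
    """Forward pass with explicit loops (simple but slow), assuming stride 1
    and no padding. inputs: (batch, f_in, H, W); kernels: (f_in, f_out, kH, kW);
    biases: (f_out,). Returns (batch, f_out, H-kH+1, W-kW+1)."""
    B, F_in, H, W = inputs.shape
    _, F_out, kH, kW = kernels.shape
    out = np.zeros((B, F_out, H - kH + 1, W - kW + 1))
    for b in range(B):
        for fo in range(F_out):
            for i in range(H - kH + 1):
                for j in range(W - kW + 1):
                    patch = inputs[b, :, i:i+kH, j:j+kW]      # (f_in, kH, kW)
                    out[b, fo, i, j] = np.sum(patch * kernels[:, fo]) + biases[fo]
    return out

x = np.random.randn(2, 3, 8, 8)
k = np.random.randn(3, 4, 3, 3)
print(conv_forward_loops(x, k, np.zeros(4)).shape)   # (2, 4, 6, 6)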

46 Deep convolutional networks

47 LeNet5 (LeCun et al, 1997): 2 convolutional layers {C1, C3} + non-linearity; 2 average pooling layers {S2, S4}; 2 fully connected hidden layers (no weight sharing) {C5, F6}; softmax classifier layer.

48 ImageNet Classification ("AlexNet"). Krizhevsky, Sutskever and Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012. 5 convolutional layers + non-linearity (ReLU); 3 max pooling layers; 2 fully connected hidden layers; softmax classifier layer.

49 ImageNet Classification ("VGGNet"). Simonyan and Zisserman, Very Deep Convolutional Networks for Large-Scale Visual Recognition, ILSVRC 2014.
Network design, key choices: 3x3 conv. kernels (very small); conv. stride 1 (no loss of information).
Other details: rectification (ReLU) non-linearity; 5 max-pool layers (x2 reduction); no normalisation; 3 fully-connected (FC) layers.
Architecture column (spatial resolution 224x224 -> 112x112 -> 56x56 -> 28x28 -> 14x14 -> 7x7): image, conv-64, conv-64, maxpool, conv-128, conv-128, maxpool, conv-256, conv-256, maxpool, conv-512, conv-512, maxpool, conv-512, conv-512, maxpool, FC-4096, FC-4096, FC-1000, softmax.
Weights (as listed on the slide): 0.1M, 0.2M, 0.3M, 0.6M, 1.2M, 2.4M, 2.4M, 2.4M for the convolutional stages; 49x512x4096 = 102.8M, 16.8M and 4.1M for the three FC layers; total 134M weights.
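
To see why the first FC layer dominates this count: the final pooling output is 7 x 7 x 512 = 25,088 values, and connecting it to 4096 hidden units needs 49 x 512 x 4096 ≈ 102.8M weights, which is well over half of the 134M total.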

50 Simply stacking more layers? He et al, Deep Residual Learning for Image Recognition, CVPR 2016. A 56-layer net has higher training error and test error than a 20-layer net!

51 Deep Residual Learning ("ResNets"). He et al, Deep Residual Learning for Image Recognition, CVPR 2016.

54 Deep Residual Learning ("ResNets"). He et al, Deep Residual Learning for Image Recognition, CVPR 2016.
Residual learning building block (Figure 2 of the paper): input x -> weight layer -> ReLU -> weight layer gives F(x); an identity shortcut carries x around the block; the two are summed to give F(x) + x, followed by ReLU.
A solution by construction: original layers copied from a learned shallower model, extra layers set as identity; at least the same training error.
From the paper: "... are comparably good or better than the constructed solution (or unable to do so in feasible time). In this paper, we address the degradation problem by introducing a deep residual learning framework. Instead of hoping each few stacked layers directly fit a desired underlying mapping, we explicitly let these layers fit a residual mapping. Formally, denoting the desired underlying mapping as H(x), we let the stacked nonlinear layers fit another mapping of F(x) := H(x) - x. The original mapping is recast into F(x) + x. We hypothesize that it is easier to optimize the residual mapping than to optimize the original, unreferenced mapping."
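
A minimal sketch of this building block in PyTorch-style code, assuming equal input and output channels so the identity shortcut needs no projection, and omitting the batch normalisation used in the paper:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """F(x) + x with two 3x3 conv 'weight layers', as in Figure 2."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        f = self.conv2(self.relu(self.conv1(x)))   # residual mapping F(x)
        return self.relu(f + x)                    # identity shortcut, then ReLU

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)     # shape unchanged: (1, 64, 56, 56)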

56 Hierarchical Representations: pixel -> edge -> texton -> motif -> part -> object. Zeiler & Fergus, Visualizing and Understanding Convolutional Networks, ECCV 2014. Slide credits: LeCun & Ranzato.

57 Summary. Convolutional networks include local receptive fields, weight sharing, and pooling. Backprop training can also be implemented as a reverse convolutional layer (with the weight matrix rotated). Implement using 4D tensors: inputs / layer values as (minibatch-size, number-fmaps, x, y), weights as (F_in, F_out, x, y). Arguments: stride, kernel size, dilation, filter groups. Reading: Goodfellow et al, Deep Learning (ch 9).
