Agenda: Digit Classification using CNN; Digit Classification using SAE; Visualization: Class models, filters and saliency


Augustine Rose
5 years ago

1 versus 1

2 Agenda
- Deep Learning: Motivation
- Learning: Backpropagation
- Deep architectures I: Convolutional Neural Networks (CNN)
- Deep architectures II: Stacked Auto Encoders (SAE)
- Caffe Deep Learning Toolbox: Basics
- Caffe Deep Learning Toolbox: Examples
- Digit Classification using CNN
- Digit Classification using SAE
- Visualization: Class models, filters and saliency


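9 Layer $i$ has input $x_{i-1}$, weights $w_i$ and output $x_i$; $y_i$ is the label.

Energy (cost): $E = \lVert y_i - x_i \rVert$

(1) $f_i(x_{i-1}, w_i) = w_i^T x_{i-1}$
(2) $\frac{\partial E}{\partial w_i} = \frac{\partial E}{\partial x_i} \frac{\partial f_i(x_{i-1}, w_i)}{\partial w_i}$ (gives $\frac{\partial E}{\partial w_i}$ from $\frac{\partial E}{\partial x_i}$)
(3) $\frac{\partial (w_i^T x_{i-1})}{\partial w_i} = x_{i-1}$
(4) $\frac{\partial E}{\partial w_{i-1}} = \frac{\partial E}{\partial x_{i-1}} \frac{\partial f_{i-1}(x_{i-2}, w_{i-1})}{\partial w_{i-1}}$
(5) $\frac{\partial E}{\partial x_{i-1}} = \frac{\partial E}{\partial x_i} \frac{\partial f_i(x_{i-1}, w_i)}{\partial x_{i-1}}$ (gives $\frac{\partial E}{\partial x_{i-1}}$ from $\frac{\partial E}{\partial x_i}$)
(6) $\frac{\partial f_i(x_{i-1}, w_i)}{\partial x_{i-1}} = \frac{\partial (w_i^T x_{i-1})}{\partial x_{i-1}} = w_i^T$
(7) Update: $w_i = w_i - \mu \frac{\partial E}{\partial w_i}$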
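The chain-rule steps on this slide can be traced numerically. The following is a minimal sketch in plain Python, not the slides' own code. It assumes scalar layers $x_i = w_i x_{i-1}$ and a squared-error energy $E = \frac{1}{2}(y - x_n)^2$ (chosen here so the gradient is simple); the function names `forward`, `backward`, `sgd_step` and `energy` are illustrative.

```python
def forward(x0, weights):
    """Return the activations [x_0, x_1, ..., x_n] with x_i = w_i * x_{i-1}."""
    xs = [x0]
    for w in weights:
        xs.append(w * xs[-1])
    return xs


def backward(xs, weights, y):
    """Return dE/dw for every layer, assuming E = 0.5 * (y - x_n) ** 2."""
    grad_x = xs[-1] - y  # dE/dx_n for the assumed squared-error energy
    grads_w = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        # weights[i] maps xs[i] to xs[i + 1]
        grads_w[i] = grad_x * xs[i]    # dE/dw = dE/d(output) * input
        grad_x = grad_x * weights[i]   # dE/d(input) = dE/d(output) * w
    return grads_w


def energy(x0, y, weights):
    """Assumed energy: half the squared error at the last layer."""
    return 0.5 * (y - forward(x0, weights)[-1]) ** 2


def sgd_step(x0, y, weights, mu=0.1):
    """One update w = w - mu * dE/dw for every layer."""
    grads = backward(forward(x0, weights), weights, y)
    return [w - mu * g for w, g in zip(weights, grads)]


weights = [0.5, 2.0]
grads = backward(forward(1.0, weights), weights, 3.0)
new_weights = sgd_step(1.0, 3.0, weights)
```

The backward loop reuses the gradient with respect to a layer's output to obtain both the weight gradient and the gradient passed to the layer below, which is why one sweep from the loss to the data suffices.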


11 Convolutional Neural Network
[Figure: network diagram with two 2x2 max pooling stages and two NN blocks, ending in Prediction and Labels]
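The 2x2 max pooling step in the diagram can be illustrated with a short sketch. It assumes a stride of 2 (non-overlapping windows) and an input with even height and width, neither of which is stated on the slide; `max_pool_2x2` is a hypothetical helper name.

```python
def max_pool_2x2(image):
    """2x2 max pooling with an assumed stride of 2 over a list-of-lists image.

    Assumes the height and width are both even.
    """
    rows, cols = len(image), len(image[0])
    return [
        [max(image[r][c], image[r][c + 1],
             image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)  # [[4, 2], [2, 8]]
```

Each pooling stage halves the height and width, so two stages reduce a feature map to a quarter of its original side length.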


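13 Autoencoder
- Input: $I_1$ is $N_{in} \times N_{batch}$
- Encoder: $W_1$ is $N_{out} \times N_{in}$, $b_1$ is $N_{out} \times N_{batch}$, and $Y_1 = W_1 I_1 + b_1$ is $N_{out} \times N_{batch}$, followed by a sigmoid
- Decoder: $W_2$ is $N_{in} \times N_{out}$, $b_2$ is $N_{in} \times N_{batch}$, and $Y_2 = W_2 Y_1 + b_2$ is $N_{in} \times N_{batch}$, followed by a sigmoid that yields the reconstructed input
- Sigmoid: $\sigma(x) = 1 / (1 + e^{-x})$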
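The encoder and decoder equations can be followed on a toy example. This is a minimal sketch for a single sample (one column of the batch), with hypothetical sizes $N_{in} = 3$ and $N_{out} = 2$ and made-up weights; `affine` and `autoencoder_forward` are illustrative names, not Caffe functions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Return W x + b for a matrix W (list of rows) and vectors x, b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def autoencoder_forward(x, W1, b1, W2, b2):
    """Encode x to a code of length N_out, then decode back to length N_in."""
    code = [sigmoid(v) for v in affine(W1, x, b1)]      # encoder + sigmoid
    recon = [sigmoid(v) for v in affine(W2, code, b2)]  # decoder + sigmoid
    return code, recon


# Hypothetical toy parameters: W1 is N_out x N_in, W2 is N_in x N_out.
W1 = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]]
b1 = [0.0, 0.0]
W2 = [[0.2, 0.1], [-0.3, 0.4], [0.5, -0.1]]
b2 = [0.0, 0.0, 0.0]
code, recon = autoencoder_forward([1.0, 0.0, 1.0], W1, b1, W2, b2)
```

The code has the encoder's output size and the reconstruction has the input size, which is the shape contract the slide's dimensions describe.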

14 Stacked Autoencoder
[Figure: four encoder/decoder pairs, each decoder producing a reconstructed input, followed by a classifier]
- $I_1$ = 784x100
- $I_2$ = 1000x100
- $I_3$ = 500x100
- $I_4$ = 250x100
- $I_5$ = 30x100
- Classifier: $Y$ = 10x100
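The blob sizes listed for the stacked autoencoder determine the weight shapes of each encoder. The sketch below derives them, assuming fully connected encoders with one bias per output unit and the batch size of 100 shown on the slide; the helper names are hypothetical and the final classifier is left out.

```python
layer_sizes = [784, 1000, 500, 250, 30]  # rows of I_1 .. I_5 on the slide
batch_size = 100                         # columns of every blob on the slide


def encoder_shapes(sizes, batch):
    """Return (weight shape, output blob shape) for each encoder."""
    return [((n_out, n_in), (n_out, batch))
            for n_in, n_out in zip(sizes, sizes[1:])]


def parameter_count(sizes):
    """Weights plus biases over all encoders (assumed fully connected)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


shapes = encoder_shapes(layer_sizes, batch_size)
total = parameter_count(layer_sizes)
```

Most of the parameters sit in the first encoder, since it maps the 784 input pixels to 1000 units.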


16 Caffe
Deep learning framework. Advantages:
- Expressive architecture
- Extensible code
- Speed (1 ms/image for inference and 4 ms/image for learning)
- Community (>1,000 developers)

17 Caffe Model Zoo

18 Others


19 Caffe: layer-by-layer design [from data to loss]
- Information storage: blobs (4D arrays: [batch size, number of channels, height, width])
- Blobs hold two chunks of memory: data and diffs
- Blobs: bottom -> input, top -> output
- Layers: setup, forward and backward
- Net: set of layers
- Solver: manages model optimization (updates)
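The 4D blob layout can be made concrete with the flat index of one element. This sketch assumes the row-major ordering [batch, channel, height, width], with width varying fastest; `blob_offset` is a hypothetical Python helper written for illustration, and the 100x1x28x28 shape is an assumed batch of single-channel 28x28 digits.

```python
def blob_offset(n, c, h, w, shape):
    """Flat index of element (n, c, h, w) in a row-major N x C x H x W blob."""
    num, channels, height, width = shape
    assert 0 <= n < num and 0 <= c < channels
    assert 0 <= h < height and 0 <= w < width
    return ((n * channels + c) * height + h) * width + w


shape = (100, 1, 28, 28)  # assumed: 100 single-channel 28x28 images
first_of_second_image = blob_offset(1, 0, 0, 0, shape)
last_element = blob_offset(99, 0, 27, 27, shape)
```

With this layout, all pixels of one image are contiguous in memory, and consecutive images in the batch follow one another.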

20 Caffe model definition: protocol buffer definition file (prototxt)
Protocol buffers: Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data

21 Layers [KEYWORD]

Vision Layers
- Convolution [CONVOLUTION]
- Pooling [POOLING]
- Local Response Normalization [LRN]

Common Layers
- Inner Product [INNER_PRODUCT]
- Splitting [SPLIT]: input blob -> multiple output blobs
- Flattening [FLATTEN]: blob to vector conversion
- Concatenation [CONCAT]
- Slicing [SLICE]: input layer -> multiple output layers
- Element-wise operations [ELTWISE]
- Argmax [ARGMAX]
- Softmax [SOFTMAX]
- Mean-Variance Normalization [MVN]

Activation / Neuron Layers
- ReLU / Rectified-Linear and Leaky-ReLU [RELU]
- Sigmoid [SIGMOID]
- TanH / Hyperbolic Tangent [TANH]
- Absolute Value [ABSVAL]
- Power [POWER]
- Binomial Normal Log Likelihood [BNLL]

Loss Layers
- Softmax [SOFTMAX_LOSS]
- Sum-of-Squares / Euclidean [EUCLIDEAN_LOSS]
- Hinge / Margin [HINGE_LOSS]
- Sigmoid Cross Entropy [SIGMOID_CROSS_ENTROPY_LOSS]
- Infogain [INFOGAIN_LOSS]
- Accuracy and Top-k [ACCURACY]: accuracy of the output with respect to the target; no backward step

Data Layers
- Database [DATA]
- Memory [In-Memory]: reads data directly from memory without copying it
- HDF5 Output [HDF5_OUTPUT]: writes input blobs to disk
- Images [IMAGE_DATA]
- Windows [WINDOWS_DATA]
- Dummy [DUMMY_DATA]

Note: keywords can change from version to version.


27-30 [Figure, built up over four slides: 2x2 max pooling, a second 2x2 max pooling, two NN blocks, Prediction]


38 $I_1 = N_{in} \times N_{batch}$

39 Steps 6 to 9: define the four encoders
[Figure: Input -> Encoder, repeated four times]

# 6 - Define the 1st Encoder
layer {
  name: "encode1"
  type: "InnerProduct"
  bottom: "data"
  top: "encode1"
  param { lr_mult: 1 decay_mult: 1 }  # weights
  param { lr_mult: 1 decay_mult: 0 }  # biases
  inner_product_param {
    num_output: 1000
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "encode1neuron"
  type: "Sigmoid"
  bottom: "encode1"
  top: "encode1neuron"
}

# 7 - Define the 2nd Encoder
layer {
  name: "encode2"
  type: "InnerProduct"
  bottom: "encode1neuron"
  top: "encode2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "encode2neuron"
  type: "Sigmoid"
  bottom: "encode2"
  top: "encode2neuron"
}

# 8 - Define the 3rd Encoder
layer {
  name: "encode3"
  type: "InnerProduct"
  bottom: "encode2neuron"
  top: "encode3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 250
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "encode3neuron"
  type: "Sigmoid"
  bottom: "encode3"
  top: "encode3neuron"
}

# 9 - Define the 4th Encoder
layer {
  name: "encode4"
  type: "InnerProduct"
  bottom: "encode3neuron"
  top: "encode4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 30
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}


40 Steps 10 to 13: define the four decoders
[Figure: Decoder -> Reconstructed Input, repeated four times]

# 10 - Define the 1st Decoder
layer {
  name: "decode4"
  type: "InnerProduct"
  bottom: "encode4"
  top: "decode4"
  param { lr_mult: 1 decay_mult: 1 }  # weights
  param { lr_mult: 1 decay_mult: 0 }  # biases
  inner_product_param {
    num_output: 250
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "decode4neuron"
  type: "Sigmoid"
  bottom: "decode4"
  top: "decode4neuron"
}

# 11 - Define the 2nd Decoder
layer {
  name: "decode3"
  type: "InnerProduct"
  bottom: "decode4neuron"
  top: "decode3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "decode3neuron"
  type: "Sigmoid"
  bottom: "decode3"
  top: "decode3neuron"
}

# 12 - Define the 3rd Decoder
layer {
  name: "decode2"
  type: "InnerProduct"
  bottom: "decode3neuron"
  top: "decode2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 1000
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "decode2neuron"
  type: "Sigmoid"
  bottom: "decode2"
  top: "decode2neuron"
}

# 13 - Define the 4th Decoder
layer {
  name: "decode1"
  type: "InnerProduct"
  bottom: "decode2neuron"
  top: "decode1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 1 decay_mult: 0 }
  inner_product_param {
    num_output: 784
    weight_filler { type: "gaussian" std: 1 sparse: 15 }
    bias_filler { type: "constant" value: 0 }
  }
}

41 Stacked autoencoder with classifier
[Figure: four encoder/decoder pairs, each decoder producing a reconstructed input, followed by a classifier]
- $I_1$ = 784x100
- $I_2$ = 1000x100
- $I_3$ = 500x100
- $I_4$ = 250x100
- $I_5$ = 30x100
- Classifier: $Y$ = 10x100
- Labels = 10x100
- Loss = $\lVert Y - \text{Labels} \rVert^2$
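The loss on this slide is the squared distance between the classifier output and the labels. A minimal sketch follows, assuming one-hot labels over the 10 digit classes and storing each sample as a row rather than as a column; `one_hot` and `squared_error_loss` are illustrative names. Caffe's own Euclidean loss layer additionally scales the sum by $1/(2N)$ for a batch of $N$ samples, which is not applied here.

```python
def one_hot(digit, num_classes=10):
    """Assumed label encoding: 1.0 at the digit's index, 0.0 elsewhere."""
    return [1.0 if k == digit else 0.0 for k in range(num_classes)]


def squared_error_loss(outputs, labels):
    """Sum over the batch of ||y - label||^2 (no 1/(2N) scaling)."""
    return sum((y - t) ** 2
               for out, lab in zip(outputs, labels)
               for y, t in zip(out, lab))


perfect = squared_error_loss([one_hot(3)], [one_hot(3)])
wrong = squared_error_loss([one_hot(3)], [one_hot(5)])
```

A confident prediction of the wrong digit differs from the label in two positions, so it contributes 2 to the unscaled loss.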


45 Visualization
How? See "Deep Inside Convolutional Networks: Visualizing Image Classification Models and Saliency Maps".

46-48 Visualizing class models

49 Visualizing filters

50 Saliency
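In the paper cited on the Visualization slide, a saliency map is the magnitude of the gradient of a class score with respect to the input pixels. The sketch below illustrates that idea only: it estimates the gradient by central differences rather than by backpropagation, and it uses a hypothetical linear score function with made-up weights in place of a trained network.

```python
def saliency(score_fn, image, eps=1e-5):
    """Central-difference estimate of |d score / d pixel| for each pixel."""
    result = []
    for k in range(len(image)):
        up = list(image)
        up[k] += eps
        down = list(image)
        down[k] -= eps
        result.append(abs(score_fn(up) - score_fn(down)) / (2 * eps))
    return result


# Hypothetical class score: a linear model over a 4-pixel "image".
weights = [0.5, -2.0, 0.0, 1.0]


def linear_score(pixels):
    return sum(w * p for w, p in zip(weights, pixels))


sal = saliency(linear_score, [0.2, 0.4, 0.6, 0.8])
```

For a linear score the saliency of each pixel is the absolute value of its weight, so the second pixel is the most salient here and the third has no influence on the score.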


53 Backup

54 Caffe Tutorial
- Nets, layers and blobs
- Forward/backward
- Loss
- Solver
- Layer catalogue
- Interfaces
- Data
