Classification of Hand-Written Digits Using Scattering Convolutional Network

Mid-year Progress Report
Classification of Hand-Written Digits Using Scattering Convolutional Network
Dongmian Zou
Advisor: Professor Radu Balan
Co-Advisor: Dr. Maneesh Singh (SRI)

Background

Overview
- Image classification: hand-written digits
- Feature extractor: convolutional neural network
- Machine learning techniques
(Figure: typical MNIST training data of 28-by-28 pixels.)

Scattering Convolutional Network
- Convolution stage: dilations of wavelets
- Detector stage
- Pooling stage: low-pass filter

Scattering Convolutional Network
Dilated wavelet: $\psi_\lambda(t) = \lambda^d \psi(\lambda t)$
Path of scales: $q = (\lambda_1, \lambda_2, \dots, \lambda_m)$
Scattering propagator: $U[q]x = \big|\cdots\big||x \ast \psi_{\lambda_1}| \ast \psi_{\lambda_2}\big| \cdots \ast \psi_{\lambda_m}\big|$
Scattering transform: $S[q]x = U[q]x \ast \phi_J$
(Diagram: scattering tree. Input $x$; root output $S[\emptyset]x = x \ast \phi_J$; first layer $U[\lambda_1^1]x$, $U[\lambda_1^2]x$ with outputs $S[\lambda_1^i]x = U[\lambda_1^i]x \ast \phi_J$; second layer $U[\lambda_1^1, \lambda_2^1]x$, $U[\lambda_1^1, \lambda_2^2]x$, $U[\lambda_1^1, \lambda_2^3]x$, $U[\lambda_1^2, \lambda_2^4]x$, $U[\lambda_1^2, \lambda_2^5]x$, $U[\lambda_1^2, \lambda_2^6]x$, each again averaged by $\phi_J$.)
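
For one path $q = (\lambda_1, \lambda_2)$, the propagator is just alternating convolution and modulus, followed by low-pass averaging. A minimal MATLAB sketch, with toy stand-in filters (the project's actual wavelets are not specified on this slide):

    x = rand(28);                        % stand-in for a 28-by-28 MNIST image
    psi1 = [1 -1]; psi2 = [1; -1];       % toy Haar-like wavelets (illustration only)
    phi = ones(4) / 16;                  % toy low-pass averaging filter phi_J
    U1 = abs(conv2(x, psi1, 'same'));    % U[lambda1]x = |x * psi_{lambda1}|
    U2 = abs(conv2(U1, psi2, 'same'));   % U[q]x = ||x * psi_{lambda1}| * psi_{lambda2}|
    Sq = conv2(U2, phi, 'same');         % S[q]x = U[q]x * phi_J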

Machine Learning Techniques
- Gradient descent
- Back-propagation
- Support Vector Machine (SVM)

Input: N pairs (x, t), where x is an image of 28-by-28 pixels and t is its label (which class x belongs to)
Unknown parameters: {λ, w, b}
(Diagram: x passes through first-layer filters $h_1^1, h_1^2, h_1^3$ and second-layer filters $h_2^1, \dots, h_2^9$; each node is followed by convolution with g and down-sampling by L; the resulting features feed an SVM with parameters {w, b}, which outputs y.)
$h_k^j(t_1, t_2) = \lambda_{k,1}^j \, \lambda_{k,2}^j \, \psi(\lambda_{k,1}^j t_1) \, \psi(\lambda_{k,2}^j t_2)$
L is the down-sampling factor; g is the low-pass filter; use cross-validation for both.
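
Since each filter is a separable product of two dilated copies of the mother wavelet ψ, it can be built on a grid as an outer product. A sketch, with a toy Gaussian-derivative ψ standing in for the project's actual wavelet:

    psi = @(t) -t .* exp(-t.^2 / 2);               % toy mother wavelet (assumption)
    l1 = 0.5; l2 = 0.8;                            % the two trainable parameters of one filter
    t = -4:0.5:4;                                  % sampling grid
    h = (l1 * psi(l1 * t))' * (l2 * psi(l2 * t));  % h(t1,t2) = l1*l2*psi(l1*t1)*psi(l2*t2)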

Approach and Implementation

The Optimization Problem

Two-Step Optimization
The problem above is non-convex and difficult to solve jointly with respect to {λ, w, b}, so we iterate between two sub-problems:
- First, fix the filters in the scattering network and train the SVM. This sub-problem is convex; we use the libsvm library, publicly available at https://www.csie.ntu.edu.tw/~cjlin/libsvm/
- Second, fix w and b and train the filters in the scattering network, which is easier.
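
The shape of this iteration can be seen on a toy two-variable problem (a stand-in objective, not the project's loss): fix one block, minimize exactly over the other, then take a gradient step on the first.

    f = @(lam, w) (lam - 2*w)^2 + (w - 1)^2;   % stand-in objective, convex in each block
    lam = 0; eta = 0.2;
    for it = 1:50
        w = fminbnd(@(v) f(lam, v), -10, 10);  % "SVM step": exact minimization over w
        g = 2 * (lam - 2*w);                   % gradient of f with respect to lam
        lam = lam - eta * g;                   % "filter step": one gradient-descent update
    end
    % converges to the joint minimizer w = 1, lam = 2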

Convolution Layer
Filters to train: $h_k^j$. Each filter contains two parameters $\lambda_{k,1}^j, \lambda_{k,2}^j$:
$h_k^j(t_1, t_2) = \lambda_{k,1}^j \, \lambda_{k,2}^j \, \psi(\lambda_{k,1}^j t_1) \, \psi(\lambda_{k,2}^j t_2)$
The parameters are updated by gradient descent.

Algorithm

Implementation
Personal laptop:
- CPU: 2 GHz Intel Core i7; memory: 8 GB 1600 MHz DDR3
- OS X El Capitan version 10.11; MATLAB R2015b, run from a terminal window (no GUI)
MNIST database, publicly available at http://yann.lecun.com/exdb/mnist/
- Images: 28 x 28 pixels (each pixel value between 0 and 255)
- 60,000 training examples, 10,000 testing examples

Implementation
- Use 5,400 training examples for each digit; cross-validation for model selection: 3,600 for training / 1,800 for testing
- Rescale pixel values from [0, 255] to [0, 1]
- Start with η = 1, C = 1
- Randomly generate λ > 0.1; during gradient descent, adjust η to make sure that λ stays above 0.1
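
A sketch of this setup in MATLAB (the halving rule below is one guess at "adjust η to make sure that λ > 0.1"; rawImages and gradL are stand-ins for the data and the computed gradient):

    rawImages = randi([0 255], 28);    % stand-in for one MNIST image
    X = double(rawImages) / 255;       % rescale pixel values from [0,255] to [0,1]
    eta = 1; C = 1;                    % initial learning rate and SVM cost
    lambda = 0.1 + rand(9, 2);         % random initialization with lambda > 0.1
    gradL = randn(9, 2);               % stand-in for the loss gradient in lambda
    step = eta;                        % shrink the step until lambda stays above 0.1
    while any(lambda(:) - step * gradL(:) <= 0.1)
        step = step / 2;
    end
    lambda = lambda - step * gradL;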

Forward Propagation: Convolution
(Diagram: 2-D convolution of the input X with a filter H; the 2-by-2 kernel [a b; c d] is flipped to [d c; b a] before being slid over X.)
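
In MATLAB, this flip is exactly the difference between conv2 (convolution, which flips the kernel) and filter2 (correlation, which does not); a quick check:

    X = magic(4);                            % toy input
    H = [1 2; 3 4];                          % toy kernel [a b; c d]
    Y1 = conv2(X, H, 'same');                % convolution: kernel is flipped
    Y2 = filter2(rot90(H, 2), X, 'same');    % correlation with the flipped kernel
    isequal(Y1, Y2)                          % returns true: the two agree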

LIBSVM
- Install libsvm: MATLAB generates a .mex file to call the .c code
- MATLAB R2015b does not detect Xcode 7; need to download xcode7_mexopts.zip
- Labels: rescale {0, 1} to {-1, 1}
- Update w and b from the output of libsvm
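
A sketch of the corresponding MATLAB calls, assuming libsvm's MATLAB interface is on the path; labels01 and feats below are stand-ins for the 0/1 labels and the scattering features. For a linear kernel, w and b can be recovered from the model in the standard libsvm way:

    labels01 = randi([0 1], 100, 1);               % stand-in 0/1 labels
    feats = randn(100, 12);                        % stand-in feature matrix
    labels = 2 * labels01 - 1;                     % rescale {0,1} to {-1,+1}
    model = svmtrain(labels, feats, '-t 0 -c 1');  % linear kernel, C = 1
    w = model.SVs' * model.sv_coef;                % weight vector of the linear SVM
    b = -model.rho;                                % bias term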

Back Propagation
The derivative of $F(t) = (t^2 + \epsilon^2)^{1/2}$ is
$F'(t) = t \, (t^2 + \epsilon^2)^{-1/2}$.
The partial derivatives of the loss function then follow by the chain rule.
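
A quick finite-difference check of this derivative (ε = 0.1 and t = 0.7 chosen arbitrarily):

    eps0 = 0.1; t0 = 0.7; h = 1e-6;
    F = @(t) sqrt(t.^2 + eps0^2);
    Fp = t0 / sqrt(t0^2 + eps0^2);          % closed-form F'(t0)
    fd = (F(t0 + h) - F(t0 - h)) / (2*h);   % central finite difference
    abs(Fp - fd)                            % about 1e-10: the two agree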

Back Propagation

Back Propagation
Propagate backward through the scattering tree:
- take the product of all marked values along each path;
- sum the contributions when branches merge.
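
A toy check of this product-and-sum rule on $y = f(g_1(x)) + f(g_2(x))$ with $f(t) = t^2$, $g_1(x) = 2x$, $g_2(x) = 3x$:

    x = 1.5;
    dy_dx = 2*(2*x)*2 + 2*(3*x)*3;      % product of local derivatives per branch, summed at the merge
    assert(abs(dy_dx - 26*x) < 1e-12);  % check against y(x) = 13x^2, so dy/dx = 26x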

Parameter Selection
Det = deterministic method for training w and b; Sto = stochastic method for training w and b
LP = take a low-pass filter for g; AV = take the local average value for convolution with g

L=3 / Sto / LP
  iteration   error %   sum of loss function
  0           1.67      399.3
  1           0.39      202.1
  Training time: 14.5 h

L=3 / Sto / AV
  iteration   error %   sum of loss function
  0           0.72      253.6
  1           0.11      135.7
  Training time: 5.5 h

Parameter Selection
Det = deterministic method for training w and b; Sto = stochastic method for training w and b
LP = take a low-pass filter for g; AV = take the local average value for convolution with g

L=4 / Sto / LP
  iteration   error %   sum of loss function
  0           7.44      555.1
  1           2.11      454.5
  Training time: 10.2 h

L=4 / Sto / AV
  iteration   error %   sum of loss function
  0           3.17      456.6
  1           0         122.9
  Training time: 5.2 h

Parameter Selection
Det = deterministic method for training w and b; Sto = stochastic method for training w and b
LP = take a low-pass filter for g; AV = take the local average value for convolution with g

L=5 / Sto / LP
  iteration   error %   sum of loss function
  0           13.5      976.8
  1           0         336.7
  Training time: 8.5 h

L=5 / Sto / AV
  iteration   error %   sum of loss function
  0           3.56      494.3
  1           0.11      63.8
  Training time: 5.3 h

Parameter Selection
Det = deterministic method for training w and b; Sto = stochastic method for training w and b
LP = take a low-pass filter for g; AV = take the local average value for convolution with g

L=4 / Det / LP
  iteration   error %   sum of loss function
  0           1.56      70.2
  1           1.17      54.4
  Training time: 10.1 h for one loop

Project Status

Timeline
2015
- Sept - Oct: define the project, design the algorithm
- Nov: write code for network training
- Dec: write code for 2-class classification (multi-class if time permits)
2016
- Jan - Feb: complete validation
- Mar: complete testing
- Apr: wrap up the project

Validation and Testing
Validation: MatConvNet toolbox, publicly available at http://www.vlfeat.org/matconvnet/
Testing: testing examples in the MNIST database
- Measure: percentage of errors
- Compare with libsvm results

Deliverables
- Datasets
- Toolboxes
- MATLAB code
- Trained network
- Results
- Proposal, reports, presentation slides, etc.

Bibliography
[1] Y. Bengio, I. J. Goodfellow and A. Courville, Deep Learning, book in preparation for MIT Press, 2015.
[2] C. M. Bishop, Pattern Recognition and Machine Learning, New York: Springer, 2006.
[3] J. Bruna and S. Mallat, Invariant Scattering Convolution Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(8) (2013), 1872-1886.
[4] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology 2 (2011), 27:1-27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[5] S. Mallat, Group Invariant Scattering, Comm. Pure Appl. Math. 65 (2012), 1331-1398.
[6] A. Krizhevsky, I. Sutskever and G. E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, Advances in Neural Information Processing Systems 25 (2012), 1097-1105.
[7] C. Szegedy et al., Going Deeper with Convolutions, open-access version available at http://www.cvfoundation.org/openaccess/content_cvpr_2015/papers/szegedy_going_deeper_with_2015_cvpr_paper.pdf
[8] A. Vedaldi and K. Lenc, MatConvNet - Convolutional Neural Networks for MATLAB, Proc. of the ACM Int. Conf. on Multimedia, 2015.
[9] T. Wiatowski and H. Bölcskei, Deep Convolutional Neural Networks Based on Semi-Discrete Frames, Proc. of IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, pp. 1212-1216, June 2015.

THANK YOU