On the road to Affine Scattering Transform


Slide 1: On the road to Affine Scattering Transform. Edouard Oyallon, with Stéphane Mallat, following the work of Laurent Sifre and Joan Bruna.

Slide 2: High-dimensional classification. Given a training set $(x_i, y_i) \in \mathbb{R}^{512^2} \times \{1, \ldots, 100\}$, $i = 1, \ldots, 10^4$, predict the label $\hat{y}(x)$. (Figure: the "rhino" class from Caltech 101, with "rhino" vs. "not a rhino" examples.)

Slide 3: High-dimensionality issues. Two images $x$ and $y$ that differ only by a translation or a rotation can be far apart in pixel space, e.g. $\|x - y\|_2 = 2$. Averaging is the key to getting invariants. (Figure: translated and rotated image pairs.)

Slide 4: Fighting the curse of dimensionality.
Geometric variability: groups acting on images (translation, rotation, scaling); other sources: luminosity, occlusion, and small deformations $x_\tau(u) = x(u - \tau(u))$, $\tau \in C^1$. It is not informative and can be carefully handled.
Class variability (intraclass and extraclass): it is informative and needs to be learned.
A representation $\Phi$ reduces the non-informative variability: $x \to \Phi x \to$ classifier.
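The deformation model $x_\tau(u) = x(u - \tau(u))$ can be made concrete with a short sketch. This is a toy 1D illustration, not from the slides: `deform` and the sinusoidal displacement `tau` are illustrative choices, using linear interpolation between grid points.

```python
import numpy as np

def deform(x, tau):
    """Warp a 1D signal as x_tau(u) = x(u - tau(u)): evaluate x at the
    displaced positions u - tau(u), with linear interpolation in between."""
    n = len(x)
    u = np.arange(n, dtype=float)
    return np.interp(u - tau, u, x)

u = np.linspace(0.0, 2 * np.pi, 128)
x = np.sin(3 * u)                                 # a smooth test signal
tau = 0.5 * np.sin(np.linspace(0, np.pi, 128))    # small C^1 displacement field
x_tau = deform(x, tau)
# The warp moves samples by at most half a grid step, so the deformed
# signal stays close to the original without being a global translation.
print(np.max(np.abs(x_tau - x)))
```

The displacement field varies smoothly, so the warp is a small diffeomorphism in the sense of the slide: close to the identity, but not a rigid motion.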

Slide 5: Existing approaches.
Unsupervised learning: Bag of Words, Fisher Vector; $\{x_1, \ldots, x_N\} \to \Phi$. Ref.: Discovering objects and their location in images, Sivic et al.; High-dimensional Signature Compression for Large-Scale Image Classification, Perronnin et al.
Supervised learning: Deep Learning; $\{(x_1, y_1), \ldots, (x_N, y_N)\} \to \Phi$.
Not learned: HMax, Scattering Transform; $\{G_1, \ldots\} \to \Phi$. Ref.: Robust Object Recognition with Cortex-Like Mechanisms, Serre et al.

Slide 6: What is a DeepNet? It is a cascade of many linear operators followed by non-linearities, where each operator is learned with supervision. A sort of "super SIFT": state of the art on ImageNet and most benchmarks. (Figure: % accuracy on ImageNet from 2010 to 2015, comparing SIFT+FV and DeepNet, with human-level accuracy marked. Source: image-net.org.) Ref.: Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al.; Convolutional networks and applications in vision, Y. LeCun et al.

Slide 7: Complexity of the architecture. It requires a huge amount of data; substantial engineering is needed to select the hyper-parameters and to optimise them; interpreting the learned operators is hard when the network is deep (i.e. more than 3 layers); few theoretical results, yet outstanding numerical results. Ref.: Intriguing properties of neural networks, C. Szegedy et al.

Slide 8: Operators of a deep architecture. The linear operators are often convolutional, with kernels that are small filters:
$x_j(u, \lambda_2) = f\Big(\sum_{\lambda_1} x_{j-1}(\cdot, \lambda_1) \star h_{j,\lambda_1}(u)\Big) \quad\Longrightarrow\quad x_j = f(F_j x_{j-1})$
The cascade is deep, and the representation contains a phase information. The non-linearity $f$ is often a ReLU: $x \mapsto \max(0, x)$. Sometimes a pooling follows, which leads to a downsampling.
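The update $x_j = f(F_j x_{j-1})$ can be sketched in a few lines. The shapes and random filters below are illustrative assumptions (1D signals, numpy only), not the architecture of any specific network:

```python
import numpy as np

def relu(x):
    # f: pointwise non-linearity, x -> max(0, x)
    return np.maximum(0.0, x)

def conv_layer(x_prev, kernels):
    """One layer x_j = f(F_j x_{j-1}): sum over input channels of
    convolutions with small filters, followed by a ReLU.
    x_prev: (C_in, N) 1D signals; kernels: (C_out, C_in, K)."""
    c_out, c_in, k = kernels.shape
    n = x_prev.shape[1]
    out = np.zeros((c_out, n))
    for o in range(c_out):
        acc = np.zeros(n)
        for i in range(c_in):
            acc += np.convolve(x_prev[i], kernels[o, i], mode="same")
        out[o] = relu(acc)
    return out

rng = np.random.default_rng(0)
x0 = rng.standard_normal((1, 64))          # input signal, one channel
F1 = rng.standard_normal((4, 1, 5)) * 0.1  # first layer: 4 small filters
F2 = rng.standard_normal((8, 4, 5)) * 0.1  # second layer
x1 = conv_layer(x0, F1)
x2 = conv_layer(x1, F2)
print(x2.shape)        # (8, 64)
print((x2 >= 0).all()) # True: the ReLU output is non-negative
```

Stacking `conv_layer` calls reproduces the cascade structure of the slide; in a real DeepNet the filters $F_j$ would be learned by backpropagation rather than drawn at random.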

Slide 9: A deep network. $x_0 \xrightarrow{f(F_1)} x_1 \xrightarrow{f(F_2)} x_2 \to \cdots \xrightarrow{f(F_J)} x_J \to$ Classifier. Ref.: ImageNet Classification with Deep Convolutional Neural Networks, A. Krizhevsky et al.

Slide 10: Does everything need to be learned? A Scattering Network is a deep architecture where all the filters are predefined (Ref.: Invariant Scattering Convolution Networks, J. Bruna and S. Mallat). We challenge the necessity of learning the weights of every filter of a deep architecture: scattering gets state-of-the-art results among unsupervised representations on some complex datasets (Ref.: Deep Roto-Translation Scattering for Complex Image Recognition, E. Oyallon and S. Mallat). We highlight the similarity of the Scattering Network with DeepNet architectures.

Slide 11: Desirable properties of a representation $\Phi$.
Invariance to a group $G$ of transformations: $\forall x,\ \forall g \in G,\ \Phi(g.x) = \Phi(x)$.
Stability to noise: $\forall x, y,\ \|\Phi(x) - \Phi(y)\|_2 \le \|x - y\|_2$.
Reconstruction properties: $y = \Phi(x) \iff x = \Phi^{-1}(y)$.
Linear separation of the different classes: $\forall i \ne j,\ \|E(\Phi(X_i)) - E(\Phi(X_j))\|_2 \gg 1$ and $\forall i,\ \sigma(\Phi(X_i)) \ll 1$.

Slide 12: Success story: Scattering for textures and digits. Non-learned representations have been successfully used on: digits (patterns), which exhibit small deformations (Ref.: Invariant Scattering Convolution Networks, J. Bruna and S. Mallat); and textures (stationary processes), which exhibit rotation + scale + translation variability (Ref.: Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination, L. Sifre and S. Mallat). However, all the variabilities (groups) here are perfectly understood.

<latexit sha1_base64="mrzq5xavndn46w+jqs8gobqt3zu=">aaacqnicbvbbaxnbfh7baptgw9n69di0camusjtd7auqmoshd1gajpinyxz2kgydnv1m3pagzae/zyt/wjt49+khivtmwdkkqkx8mpb933vfzjsvskqw6lpfnk3tr493dkt75sdp9w+evq6prkycasa7ljax7gfuccku76jayfuj5jqkjo8f1+2i37vh2ohyxeis4coitpqyc0brsqpkez8xgvhcet+iogvuzm/yd83anavb9s6sk7+kndd95leyeapcklsxjrlvyzwu19xsepieu6nk1w248ykbwfucaqv+8puoadqjymc/jfkacyvmummgnpvgmkmabzm8l/up4qll13tcbxyqgnezzoyz5oslvuiyjru9cslcxxvkndjmfgv2sljyrpck8x+9qyrjs2emvjiiv2zx0divbgnsbepcotldoboami3srornqaymbexlg4k3/uvn0g02thvuw6/auobfleafhemnphgflxgnhegcg4/wde7hh/pj+e78dh4trrecpec5/fpo7z/pllqc</latexit> sha1_base64="hbb0ca2ujpjcwtzdix9084r07wg=">aaacqnicbvc7tsmwfhv4u14frhalcqmvujuwaatsbqsdq0guippsoy4lvh0nsm8qvzsnp2jkyeeh2ba7cwmgxmaa04le60qwzjn3hvv6ejhggmz7zhoyhboegr0bz01mtk3p5gfndnqyk8pqnbshovsizojlvgmogh1giphae6zudbayfv2mkc1duq/didudcij5m1mcrmrlj9xic+xyid2awcklitljj1ekpez5yz4hy19kky25wm4hwut6oduuacxfj1ocxvems368ybfybbts9wr/bc4nkfrkb/dnl1cx1vb+1vvdggdmahve64zjr9bmiajobutzbqxzrgihnlcggzietdetxgypxjkkj9uhmkcc7qnfhqkjto4gnpnmfta/e5n4x68rq3u9mxazxcak7t/ujgwgegebyp8rrkf0dsbucbmrpqdeeqom9pwjwfn95b+gtljeldu7tqgyifo1hhbqiioib62hctpgvvrdff2jb/senq0b69f6sv77owpwp2ce/sjr/qnzflxq</latexit> <latexit sha1_base64="vzm3p0o sha1_base64="mkresqa <latexit sha1_base64="mduu63bohmqgld23v sha1_base64="koiyl79c8jdw2tvoq DATA Wavelets 13 Wavelets help to describe signal structures. wavelet iff 2 L 2 (R 2, C) and R R 2 (u)du =0 Learned in the first layers of a DeepNet. is a Wavelets can be dilated in order to be a multi-scale representation of signals, rotated to describe rotations. 
j, = 1 2 2j (r (u) 2 j ) <latexit sha1_base64="4ywyx8g1ftsjdjseziepbvzr7ik=">aaacjxicbvdlssnafj34rpvvdekmwiqwpcrf1e2h6mzlbwsltq2t6asddvjg5kyoiv/jxl9x46kk4mpfcdjkoa0hbg7nnmude5yqmwmg8awtrk6tb2wwtorbo7t7+6wdwwczrilqngl4iloolpqzn7abaafdufdsozx2nmln6neeqjas8o9hgtk+h4c+cxnbocs71lbcyex4fgbbiajogpyrminnjk4/xvvxkqr+jrofnyuqutw1x0nvlpwnmjghvkzmnjrrjpzdmlmdgeqe9yfwlgxpnelox1gai5wmrsusnmrkgoe0p6ippsr78fzmrd9vykb3a6ged/pc/t0ry0/kqeeopidhjbe9vpzp60xgxvvj5ocruj9ki9yi6xdoawf6galkge8vwuqw9vedjlcqbfszrvwcuxjymmnxaxc14+683lzo2yigy3sckshel6ij Design wavelets selective to an informative Ref.: ImageNet Classification with Deep Convolutional Neural Networks. j, <latexit sha1_base64="h0os0qpccwgody3iax5x9vow0te="> A Krizhevsky et al. Isotropic variability. ˆ VS ˆ Non-Isotropic

Slide 14: The Gabor wavelet.
$\psi(u) = \frac{1}{2\pi}\, e^{-\frac{\|u\|^2}{2}} \left(e^{i\xi \cdot u} - \kappa\right), \qquad \phi(u) = \frac{1}{2\pi}\, e^{-\frac{\|u\|^2}{2}}$
where $\kappa$ is the constant that makes $\psi$ zero-mean (for the sake of simplicity, the formulas are given in the isotropic case).
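As a sanity check, the Gabor wavelet can be sampled and its zero-mean condition verified numerically. `morlet_2d`, the grid size, and the frequency `xi` are illustrative choices, not values from the slides:

```python
import numpy as np

def morlet_2d(n, xi, sigma=1.0):
    """Sampled Gabor wavelet psi(u) ∝ e^{-|u|^2/(2σ²)} (e^{i ξ·u} - κ),
    with κ chosen so that psi has zero mean on the grid, together with
    the matching Gaussian low-pass phi."""
    half = n // 2
    u = np.arange(-half, half)
    U1, U2 = np.meshgrid(u, u, indexing="ij")
    gauss = np.exp(-(U1**2 + U2**2) / (2 * sigma**2))
    osc = np.exp(1j * (xi[0] * U1 + xi[1] * U2))
    # κ makes the discrete sum exactly zero on this grid
    kappa = (gauss * osc).sum() / gauss.sum()
    psi = gauss * (osc - kappa)
    phi = gauss / gauss.sum()
    return psi, phi

psi, phi = morlet_2d(32, xi=(2.0, 0.0), sigma=1.0)
print(abs(psi.sum()))   # ~0: the zero-mean wavelet condition holds
print(phi.sum())        # 1.0: normalised low-pass
```

Rotating the frequency vector `xi` and rescaling `sigma` by powers of 2 gives the dilated and rotated family $\psi_{j,\theta}$ of the previous slide.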

Slide 15: Wavelet transform.
$Wx = \{x \star \psi_{j,\theta},\ x \star \phi_J\}_{j \le J,\ \theta}$
It is an isometric and linear operator:
$\|Wx\|^2 = \sum_{\theta,\ j \le J} \int |x \star \psi_{j,\theta}|^2 + \int |x \star \phi_J|^2$
It is covariant with translation, $W(x_c) = (Wx)_c$ where $x_c(u) = x(u - c)$, and it nearly commutes with the action of diffeomorphisms: $\|[W, L_\tau]\| \le C\, \|\nabla \tau\|_\infty$. Ref.: Group Invariant Scattering, S. Mallat. Why are wavelets not enough? They are only covariant: they do not provide invariance.

Slide 16: Filter-bank implementation of a fast WT. Ref.: Fast WT, S. Mallat, 1989. Assume it is possible to find filters $h$ and $g_\theta$ such that
$\hat\psi_\theta(\omega) = \frac{1}{\sqrt 2}\, \hat g_\theta\!\left(\frac{\omega}{2}\right) \hat\phi\!\left(\frac{\omega}{2}\right) \quad\text{and}\quad \hat\phi(\omega) = \frac{1}{\sqrt 2}\, \hat h\!\left(\frac{\omega}{2}\right) \hat\phi\!\left(\frac{\omega}{2}\right)$
Set
$x_j(u, 0) = x \star \phi_j(u) = h \star (x \star \phi_{j-1})(2u) \quad\text{and}\quad x_j(u, \theta) = x \star \psi_{j,\theta}(u) = g_\theta \star (x \star \phi_{j-1})(2u)$
The WT is then given by $Wx = \{x_j(\cdot, \theta),\ x_J(\cdot, 0)\}_{j \le J,\ \theta}$. A WT can thus be interpreted as a deep cascade of linear operators, which is approximately verified for the Gabor wavelets.
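The cascade can be illustrated with the simplest conjugate-mirror pair, the Haar filters, in 1D. This is a toy sketch (not the 2D Gabor case of the slides) that also checks the isometry $\|Wx\| = \|x\|$ numerically:

```python
import numpy as np

# Haar conjugate-mirror filters: h (low-pass) and g (high-pass)
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
g = np.array([1.0, -1.0]) / np.sqrt(2.0)

def fwt(x, J):
    """Fast WT as a deep cascade: at each depth j, filter the previous
    low-pass x_{j-1} with h and g, then subsample by 2."""
    approx, details = x, []
    for _ in range(J):
        lo = np.convolve(approx, h)[1::2]   # x_j(u,0)  = (h * x_{j-1})(2u)
        hi = np.convolve(approx, g)[1::2]   # x_j(u,θ)  = (g * x_{j-1})(2u)
        details.append(hi)
        approx = lo
    return approx, details

rng = np.random.default_rng(1)
x = rng.standard_normal(64)
aJ, ds = fwt(x, J=3)
energy = sum(np.sum(d**2) for d in ds) + np.sum(aJ**2)
print(np.allclose(energy, np.sum(x**2)))   # True: W is an isometry
```

Each loop iteration is one layer of the cascade: the wavelet coefficients are the branch outputs, exactly as in the diagram on the next slide.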

Slide 17: Deep implementation of a WT.
$\hat\phi_j = \frac{1}{\sqrt 2}\, \hat h\!\left(\frac{\cdot}{2}\right) \hat\phi_{j-1}, \qquad \hat\psi_{j,\theta} = \frac{1}{\sqrt 2}\, \hat g_\theta\!\left(\frac{\cdot}{2}\right) \hat\phi_{j-1}$
(Diagram: the cascade $x_0 \xrightarrow{h} x_1 \xrightarrow{h} x_2 \xrightarrow{h} x_3$, with a $g_\theta$ branch at each depth; there is an oversampling.)

Slide 18: Scattering transform. The scattering transform at scale $J$ is a cascade of complex WTs with a modulus non-linearity, followed by a low-pass filtering:
$S_J x = \{x \star \phi_J,\ |x \star \psi_{\lambda_1}| \star \phi_J,\ ||x \star \psi_{\lambda_1}| \star \psi_{\lambda_2}| \star \phi_J,\ \ldots\} \quad\text{with}\quad \lambda_i = (j_i, \theta_i),\ j_i \le J$
It is mathematically well defined for a large class of wavelets. Ref.: Group Invariant Scattering, S. Mallat.
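A minimal 1D sketch of the cascade $S_J x$ up to order 2. This is not the presented implementation: the Gaussian band-pass filters below merely stand in for proper wavelets, and only the increasing-scale paths $j_2 > j_1$ are kept, as is standard:

```python
import numpy as np

def gabor_bank(n, J, sigma=0.8):
    """Fourier-domain Gaussian band-pass filters psi_j centred at π/2^j
    and a low-pass phi_J at 0 (a loose stand-in for proper wavelets)."""
    omega = np.fft.fftfreq(n) * 2 * np.pi
    psis = [np.exp(-((omega - np.pi / 2**j) ** 2) / (2 * (sigma / 2**j) ** 2))
            for j in range(J)]
    phi = np.exp(-(omega**2) / (2 * (sigma / 2**J) ** 2))
    return psis, phi

def scattering(x, J=3):
    """Order-0/1/2 scattering: cascade |x * psi| with a final low-pass."""
    n = len(x)
    psis, phi = gabor_bank(n, J)
    conv = lambda s, f: np.fft.ifft(np.fft.fft(s) * f)
    S = [np.abs(conv(x, phi))]                      # order 0: x * phi_J
    for j1, p1 in enumerate(psis):
        u1 = np.abs(conv(x, p1))                    # |x * psi_{j1}|
        S.append(np.abs(conv(u1, phi)))             # order 1
        for j2 in range(j1 + 1, J):                 # only j2 > j1 carries energy
            u2 = np.abs(conv(u1, psis[j2]))
            S.append(np.abs(conv(u2, phi)))         # order 2
    return np.array(S)

x = np.cos(np.linspace(0, 16 * np.pi, 256))
Sx = scattering(x, J=3)
print(Sx.shape)   # (7, 256): 1 order-0 + 3 order-1 + 3 order-2 paths for J = 3
```

The nested loops mirror the formula: each path is a sequence of band-pass filters interleaved with modulus, closed by the low-pass $\phi_J$.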

Slide 19: Relations with SIFT. For people into computer vision: SIFT computes a histogram of gradient orientations,
$h(\theta) = \sum_{g} \|g\|\ 1_{\angle g \in [\theta,\ \theta + \Delta]}$
a quantification of the gradient directions, comparable to the scattering coefficients $|x \star \psi_\theta| \star \phi$. In both cases the averaging leads to a loss of information.

Slide 20: Example of scattering coefficients. (Figure: feature maps of an image $x$: $x \star \psi$, then the modulus $|x \star \psi|$, giving the first-order coefficients.)

Slide 21: Properties (Ref.: Group Invariant Scattering, S. Mallat).
Non-linear. Isometric: $\|S_J x\| = \|x\|$. Stable to noise: $\|S_J x - S_J y\| \le \|x - y\|$. Covariant with translation: $S_J(x_c) = (S_J x)_c$. Invariant to small translations: $c \ll 2^J \Rightarrow S_J(x_c) \approx S_J(x)$. Sensitive to the action of rotations: $S_J(r_\theta x) \ne S_J(x)$. Linearizes the action of small deformations: $\|S_J x - S_J x_\tau\| \le C\, \|\nabla \tau\|_\infty$. Reconstruction properties (Ref.: Reconstruction of images from their scattering coefficients, J. Bruna). These properties allow a linear classifier to build class invariants over the informative variability.
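Two of these properties, covariance with translation and near-invariance to small shifts, are easy to verify numerically on a toy 1D path $|x \star \psi| \star \phi$. The Gaussian filters below are illustrative, not the slides' wavelets:

```python
import numpy as np

n = 256
omega = np.fft.fftfreq(n) * 2 * np.pi
psi_hat = np.exp(-((omega - np.pi / 4) ** 2) * 8)   # one band-pass filter
phi_hat = np.exp(-(omega ** 2) * 512)               # wide low-pass (scale 2^J)

def S(x):
    # One scattering path: |x * psi| * phi, via circular convolution
    u = np.abs(np.fft.ifft(np.fft.fft(x) * psi_hat))
    return np.fft.ifft(np.fft.fft(u) * phi_hat).real

rng = np.random.default_rng(4)
x = rng.standard_normal(n)
c = 2
# Covariance: S of the shifted signal equals the shifted S (exact, circular)
print(np.allclose(S(np.roll(x, c)), np.roll(S(x), c)))                  # True
# Near-invariance: for c much smaller than the averaging scale,
# the un-shifted outputs are already close
print(np.linalg.norm(S(np.roll(x, c)) - S(x)) / np.linalg.norm(S(x)))   # small
```

Making `phi_hat` wider (larger $2^J$) absorbs larger shifts, at the price of more averaging: exactly the invariance/information trade-off of the slide.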

Slide 22: Parameters and dimensionality of the scattering. For 256×256 images, $\Phi x \in \mathbb{R}^{10^5}$, yet it is possible to reduce it to $L\Phi x \in \mathbb{R}^{2000}$. It depends on only a few parameters: $L$, $J$, and the shape of the mother wavelet. Identical parameters can be chosen for natural images on different datasets: it is a generic representation.

Slide 23: Translation scattering. $J = 3$, $\theta \in \{0, \frac{\pi}{4}, \frac{\pi}{2}, \frac{3\pi}{4}\}$. (Diagram: the cascade $x_0 \to x_1 \to x_2 \to x_3$ with a low-pass branch $h$, band-pass branches $g$, and a modulus at each depth; the order-0, order-1, and order-2 scattering coefficients are read off only at the output.)

Slide 24: Affine scattering transform. Observe that $W$ is covariant with the affine group $\mathrm{Aff}(E)$:
$W[\psi](g.x) = (g.x) \star \psi = W[g.\psi]\,x$
(Ref.: PhD thesis, L. Sifre.) See $Wx$ as a signal parametrised by elements of the affine group: $x \star \psi_{j,\theta}(u) = \tilde x(g)$, $g = (u, r_\theta, j)$. We can then define a WT on any compact Lie group (even a non-commutative one) via
$x \star_G \psi(g) = \int_G \psi(g')\, x(g'^{-1}.g)\, dg'$
(Ref.: Topics in Harmonic Analysis Related to the Littlewood-Paley Theory, E. M. Stein.) The same previous properties hold for this WT/scattering (Ref.: Group Invariant Scattering, S. Mallat).
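The group convolution formula can be tried out on the simplest compact group, a finite cyclic rotation group $C_n$, where $g'^{-1}.g$ reduces to subtraction mod $n$. This toy sketch (not from the slides) also checks covariance with the group action:

```python
import numpy as np

def group_conv_cyclic(x, psi):
    """(x *_G psi)(g) = Σ_{g'} psi(g') x(g'^{-1} g) on the cyclic
    rotation group C_n, where g'^{-1} g = (g - g') mod n."""
    n = len(x)
    return np.array([sum(psi[gp] * x[(g - gp) % n] for gp in range(n))
                     for g in range(n)])

def rotate(x, k):
    # Action of the group element k on a signal defined over C_n
    return np.roll(x, k)

rng = np.random.default_rng(2)
x = rng.standard_normal(8)    # a signal over the 8 rotations of C_8
psi = rng.standard_normal(8)  # a "wavelet" over the same group
y = group_conv_cyclic(x, psi)
# Covariance: convolving the rotated signal = rotating the convolution
print(np.allclose(group_conv_cyclic(rotate(x, 3), psi), rotate(y, 3)))  # True
```

On $C_n$ the Haar integral is a plain sum, so the group convolution is an ordinary circular convolution; the same covariance argument is what the slide invokes for the full affine group.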

Slide 25: Separable roto-translation scattering (Ref.: Deep Roto-Translation Scattering for Object Classification, E. Oyallon and S. Mallat). The roto-translation group is not separable, yet we use a separable wavelet transform on it: no averaging along the angle (increased sensitivity); separable convolutions (simple to implement and fast); equal or slightly better results than with the non-separable version.

Slide 26: Separable roto-translation scattering. $J = 3$, $\theta \in \{0, \frac{\pi}{4}, \frac{\pi}{2}, \frac{3\pi}{4}\}$. (Diagram: the same cascade as the translation scattering, with a separable convolution that recombines the angles at the second order.) Ref.: Deep Roto-Translation Scattering for Object Classification, E. Oyallon and S. Mallat.

Slide 27: Linearization of the rotation (work in progress). For a deformation,
$u - \tau(u) \approx v - \tau(v) + (I - \nabla\tau)(v)(u - v)$
In fact the affine group also acts on the deformation diffeomorphism:
$g.(I - \tau)(u) \approx g.(I - \tau)(v) + g.(I - \nabla\tau)(v)(u - v)$
Decompose it on the affine group: $(I - \nabla\tau)(v) = r_{\theta(v)}\, k(v)$. A way to linearize the action of the rotation?
$\|Sx - Sx_\tau\| \le C(\|\nabla\tau\| + \ldots)$

Slide 28: Classification pipeline.
$x \ \to\ \Phi x \ \to\ L\Phi x \ \to\ \langle L\Phi x,\, w\rangle \gtrless 0$
Scattering transform on YUV channels + log, then a supervised dimensionality reduction $L$, then a Gaussian SVM. We learn $L$ (which selects features) and $w$ (which selects samples) from the data. Getting $L$ is the most costly part of the algorithm (Orthogonal Least Squares).

Slide 29: OLS? Supervised forward selection of features, similar to OMP; the selection is done class per class. Ref.: On the difference between orthogonal matching pursuit and orthogonal least squares, T. Blumensath and M. E. Davies. Principle: given a dictionary of features $\{\phi_k\}_k$,
1. set $y_i = 1$ if $i$ is in the class, 0 otherwise;
2. find the feature $\phi_k$ most correlated with $y$, and select it;
3. pull it from the dictionary, then orthogonalize and normalize the remaining dictionary;
4. iterate.
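The steps above can be sketched in plain numpy. This is an illustrative implementation, not the thesis code: `ols_select`, the random dictionary, and the toy class vector are all made up for the example.

```python
import numpy as np

def ols_select(Phi, y, n_select):
    """Forward feature selection by Orthogonal Least Squares: pick the
    feature most correlated with y, orthogonalise the remaining
    dictionary against it, renormalise, and iterate."""
    D = Phi / np.linalg.norm(Phi, axis=0)   # normalised dictionary columns
    residual = y.astype(float).copy()
    selected = []
    for _ in range(n_select):
        scores = np.abs(D.T @ residual)
        scores[selected] = -np.inf          # never re-pick a feature
        k = int(np.argmax(scores))
        selected.append(k)
        q = D[:, k].copy()                  # unit vector of the new feature
        residual -= (q @ residual) * q      # project the target off q
        # orthogonalise and renormalise the remaining columns against q
        D -= np.outer(q, q @ D)
        norms = np.linalg.norm(D, axis=0)
        norms[norms < 1e-12] = 1.0          # guard the (now zero) picked column
        D /= norms
    return selected

rng = np.random.default_rng(3)
Phi = rng.standard_normal((100, 20))        # 100 samples, 20 candidate features
y = Phi[:, 4] + 0.5 * Phi[:, 11]            # class indicator driven by 2 features
sel = ols_select(Phi, y, n_select=2)
print(sorted(sel))  # the informative features (here 4 and 11) are typically recovered
```

The orthogonalisation step is what distinguishes OLS from plain matching pursuit: later correlations are measured against the residual dictionary, not the original one.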

Slide 30: Numerical results (identical representation across datasets).

Dataset    | Type         | Paper          | Accuracy
-----------|--------------|----------------|---------
Caltech101 | Scattering   | Scattering     | 79.9
Caltech101 | Unsupervised | Ask the locals | 77.3
Caltech101 | Unsupervised | RFL            | 75.3
Caltech101 | Unsupervised | M-HMP          | 82.5
Caltech101 | Supervised   | DeepNet        | 91.4
Caltech256 | Scattering   | Scattering     | 43.6
Caltech256 | Unsupervised | Ask the locals | 41.7
Caltech256 | Unsupervised | M-HMP          | 50.7
Caltech256 | Supervised   | DeepNet        | 70.6
CIFAR10    | Scattering   | Scattering     | 82.3
CIFAR10    | Unsupervised | RFL            | 83.1
CIFAR10    | Supervised   | DeepNet        | 91.8
CIFAR100   | Scattering   | Scattering     | 56.8
CIFAR100   | Unsupervised | RFL            | 54.2
CIFAR100   | Supervised   | DeepNet        | 65.4

Slide 31: Improvement, layer-wise.

Method                                      | Caltech101 | CIFAR10
--------------------------------------------|------------|--------
Translation scattering, order 1             | 59.8       | 72.6
Translation scattering, order 2             | 70.0       | 80.3
Translation scattering, order 2 + OLS       | 75.4       | 81.6
Roto-translation scattering, order 2        | 74.5       | 81.5
Roto-translation scattering, order 2 + OLS  | 79.9       | 82.3

Slide 32: How to fill in the gap? Adding more supervision to the pipeline: building a DeepNet on top of it? Initialising a DeepNet with wavelet filters? The question is open. A more complex classifier could help to handle the class variabilities: Fisher vectors on scattering? Adding a layer means identifying the next complex source of variability: is it geometric? Is it the classes?

Slide 33: Other applications of the ST. Audio (Vincent Lostanlen); quantum chemistry (Matthew Hirn); temporal data and video; reconstruction of the WT (Irène Waldspurger); unstructured data (Mia Chen, Xu Cheng).

Slide 34: Conclusion. We provide: a mathematical analysis and algorithms; competitive numerical results; software: http://www.di.ens.fr/~oyallon/ (or send me an email to get the latest!).