1 On the Road to the Affine Scattering Transform. Edouard Oyallon, with Stéphane Mallat, following the work of Laurent Sifre and Joan Bruna.
2 High-dimensional classification. A training set $(x_i, y_i) \in \mathbb{R}^{512^2} \times \{1, \dots, 100\}$, $i = 1 \dots 10^4$, is used to predict labels $\hat{y}(x)$. Example: the rhino class from Caltech 101 (rhino vs. not a rhino).
3 High-dimensionality issues. A translation or rotation of $x$ produces an image $y$ at a large Euclidean distance $\|x - y\|_2$, even though the content is unchanged. Averaging is the key to obtaining invariants.
4 Fighting the curse of dimensionality. Geometric variability: groups acting on images (translation, rotation, scaling); not informative, and can be carefully handled. Other sources: luminosity, occlusion, small deformations $x_\tau(u) = x(u - \tau(u))$, $\tau \in C^1$. Class variability: intraclass variability is not informative, extraclass variability needs to be learned. A representation $\Phi x$ followed by a supervised operator $L$ reduces class variability before the classifier: $x \to \Phi x \to L \Phi x \to$ Classifier.
5 Existing approaches. Unsupervised learning: Bag of Words, Fisher Vector, with $\Phi$ learned from $\{x_1, \dots, x_N\}$. Supervised learning: Deep Learning, with $\Phi$ and $L$ learned from $\{(x_1, y_1), \dots, (x_N, y_N)\}$. Non-learned: HMax, Scattering Transform, built from predefined groups $\{G_1, \dots\}$. Refs: Discovering objects and their location in images, Sivic et al.; High-dimensional Signature Compression for Large-Scale Image Classification, Perronnin et al.; Robust Object Recognition with Cortex-Like Mechanisms, Serre et al.
6 DeepNet? It is a cascade of many linear operators, each followed by a non-linearity; each operator is learned with supervision. A sort of super SIFT. State of the art on ImageNet and most benchmarks: DeepNets overtook SIFT+FV around 2012 and approach human accuracy (ImageNet accuracy over 2010-2015). Refs: Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al.; Convolutional networks and applications in vision, Y. LeCun et al.; image-net.org.
7 Complexity of the architecture. Requires a huge amount of data. Much engineering is needed to select the hyperparameters and to optimize the network. Interpreting the learned operators is hard when the network is deep (i.e., more than 3 layers). Few theoretical results, yet outstanding numerical results. Ref.: Intriguing properties of neural networks, C. Szegedy et al.
8 Operators of a deep architecture. The linear operators are often convolutional, with small filter kernels: $x_j(u, \lambda_2) = f\big(\sum_{\lambda_1} x_{j-1}(\cdot, \lambda_1) \star h_{j, \lambda_1, \lambda_2}(u)\big)$, i.e. $x_j = f(F_j x_{j-1})$. The cascade is deep and contains phase information. $f$ is often a ReLU: $x \mapsto \max(0, x)$. Sometimes pooling is applied, which leads to downsampling.
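The generic layer $x_j = f(F_j x_{j-1})$ above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the naive convolution and the kernel shapes (`C_out, C_in, kh, kw`) are illustrative assumptions.

```python
import numpy as np

def relu(x):
    # the non-linearity f: x -> max(0, x)
    return np.maximum(0.0, x)

def conv2d_valid(x, h):
    # naive 2-D 'valid' convolution of one channel with a small kernel h
    H, W = x.shape
    kh, kw = h.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * h[::-1, ::-1])
    return out

def deep_layer(x_prev, kernels):
    # x_prev: (C_in, H, W); kernels: (C_out, C_in, kh, kw)
    # computes x_j(u, c2) = f( sum_{c1} x_{j-1}(., c1) * h_{c2, c1}(u) )
    outs = []
    for c2 in range(kernels.shape[0]):
        acc = sum(conv2d_valid(x_prev[c1], kernels[c2, c1])
                  for c1 in range(x_prev.shape[0]))
        outs.append(relu(acc))
    return np.stack(outs)
```

Stacking such layers, each supervisedly learned, gives the DeepNet cascade of the next slide.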
9 Deep network: $x_0 \xrightarrow{f(F_1)} x_1 \xrightarrow{f(F_2)} x_2 \to \dots \xrightarrow{f(F_J)} x_J \to$ Classifier. Ref.: ImageNet Classification with Deep Convolutional Neural Networks, A. Krizhevsky et al.
10 Does everything need to be learned? A Scattering Network is a deep architecture where all the filters are predefined. We challenge the necessity of learning the weights of every filter of a deep architecture. Scattering gets state-of-the-art results among unsupervised representations on some complex datasets. We highlight the similarity of the Scattering Network with DeepNet architectures. Refs: Invariant Scattering Convolution Networks, J. Bruna and S. Mallat; Deep Roto-Translation Scattering for Object Classification, E. Oyallon and S. Mallat.
11 Desirable properties of a representation $\Phi$: invariance to a group $G$ of transformations, $\forall x, \forall g \in G, \Phi(g.x) = \Phi(x)$; stability to noise, $\forall x, y, \|\Phi(x) - \Phi(y)\|_2 \le \|x - y\|_2$; reconstruction properties, $y = \Phi(x) \iff x = \Phi^{-1}(y)$; linear separation of the different classes, $\forall i \ne j, \|E(\Phi(X_i)) - E(\Phi(X_j))\|_2 \gg 1$ and $\forall i, \sigma(\Phi(X_i)) \ll 1$.
12 Success story: Scattering for textures and digits. Non-learned representations have been successfully used on: digits (patterns), handling small deformations (Ref.: Invariant Scattering Convolution Networks, J. Bruna and S. Mallat); textures (stationary processes), handling rotation, scale and translation (Ref.: Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination, L. Sifre and S. Mallat). However, all the variabilities (groups) here are perfectly understood.
13 Wavelets. Wavelets help to describe signal structures. $\psi$ is a wavelet iff $\psi \in L^2(\mathbb{R}^2, \mathbb{C})$ and $\int_{\mathbb{R}^2} \psi(u)\,du = 0$. Wavelet-like filters are learned in the first layers of a DeepNet. Wavelets can be dilated, to obtain a multi-scale representation of signals, and rotated, to describe rotations.
$\psi_{j,\theta}(u) = \frac{1}{2^{2j}} \psi(2^{-j} r_{-\theta} u)$. Design wavelets selective to an informative variability: $\hat{\psi}$ isotropic vs. $\hat{\psi}$ non-isotropic. Ref.: ImageNet Classification with Deep Convolutional Neural Networks, A. Krizhevsky et al.
14 The Gabor wavelet. $\psi(u) = \frac{1}{2\pi} e^{-\frac{\|u\|^2}{2}} (e^{i\xi.u} - \kappa)$ and $\phi(u) = \frac{1}{2\pi} e^{-\frac{\|u\|^2}{2}}$, where $\kappa$ enforces the zero-mean condition (for the sake of simplicity, the formulas are given in the isotropic case).
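A small numpy sketch of this Gabor wavelet, with $\kappa$ computed so that the filter integrates to zero. The width `sigma`, frequency `xi` and grid size are illustrative defaults, not values from the slides.

```python
import numpy as np

def gabor_wavelet(N, sigma=0.8, xi=3 * np.pi / 4, theta=0.0):
    # psi(u) = g(u) * (exp(i xi . u) - kappa), with g a Gaussian envelope;
    # kappa is chosen so that psi has zero mean (wavelet admissibility).
    x = np.arange(N) - N // 2
    X, Y = np.meshgrid(x, x, indexing='ij')
    xr = X * np.cos(theta) + Y * np.sin(theta)   # rotate oscillation by theta
    g = np.exp(-(X**2 + Y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    osc = np.exp(1j * xi * xr)
    kappa = np.sum(g * osc) / np.sum(g)          # enforce zero mean
    return g * (osc - kappa)
```

Rotating `theta` and dilating the grid gives the family $\psi_{j,\theta}$ of the previous slide.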
15 Wavelet Transform. $Wx = \{x \star \psi_{j,\theta},\ x \star \phi_J\}_{j \le J, \theta}$. It is a linear, isometric operator: $\|Wx\|^2 = \sum_{\theta, j \le J} \int |x \star \psi_{j,\theta}|^2 + \int |x \star \phi_J|^2$. It is covariant with translation: $W(x(\cdot - c)) = (Wx)(\cdot - c)$. It nearly commutes with the action of diffeomorphisms: $\|[W, L_\tau]\| \le C \|\nabla \tau\|$. Ref.: Group Invariant Scattering, S. Mallat. Why are wavelets not enough? Invariance.
16 Filter bank implementation of a fast WT (Ref.: Fast WT, S. Mallat, 89). Assume it is possible to find $h$ and $g_\theta$ such that $\hat{\psi}_\theta(\omega) = \frac{1}{\sqrt{2}} \hat{g}_\theta(\frac{\omega}{2}) \hat{\phi}(\frac{\omega}{2})$ and $\hat{\phi}(\omega) = \frac{1}{\sqrt{2}} \hat{h}(\frac{\omega}{2}) \hat{\phi}(\frac{\omega}{2})$. Set: $x_j(u, 0) = x \star \phi_j(u) = h \star (x \star \phi_{j-1})(2u)$ and $x_j(u, \theta) = x \star \psi_{j,\theta}(u) = g_\theta \star (x \star \phi_{j-1})(2u)$. The WT is then given by $Wx = \{x_j(\cdot, \theta), x_J(\cdot, 0)\}_{j \le J, \theta}$. A WT can thus be interpreted as a deep cascade of linear operators, which is approximately verified for Gabor wavelets.
17 Deep implementation of a WT. $\hat{\phi}_j = \frac{1}{\sqrt{2}} \hat{h}(\frac{\cdot}{2}) \hat{\phi}_{j-1}$ and $\hat{\psi}_{j,\theta} = \frac{1}{\sqrt{2}} \hat{g}_\theta(\frac{\cdot}{2}) \hat{\phi}_{j-1}$. Cascade: $x_0 \xrightarrow{h} x_1 \xrightarrow{h} x_2 \xrightarrow{h} x_3$, with a band-pass branch $g$ at each level; there is an oversampling.
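The cascade $x_j = h \star x_{j-1}(2\cdot)$, $d_j = g \star x_{j-1}(2\cdot)$ can be sketched in 1-D. This is an illustrative stand-in: circular convolutions via the FFT, and a Haar-like pair in place of the slide's $(h, g_\theta)$.

```python
import numpy as np

def subsampled_conv(x, h):
    # circular convolution followed by dyadic subsampling: (h * x)(2u)
    N = len(x)
    y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, N)))
    return y[::2]

def fast_wt(x, h, g, J):
    # filter-bank cascade: at each level, a low-pass branch h (kept for the
    # next level) and a band-pass branch g (stored as detail coefficients)
    approx, details = x, []
    for _ in range(J):
        details.append(subsampled_conv(approx, g))
        approx = subsampled_conv(approx, h)
    return details, approx

# Haar-like conjugate pair, a simple stand-in for the (h, g) of the slide
h = np.array([1.0, 1.0]) / np.sqrt(2)
g = np.array([1.0, -1.0]) / np.sqrt(2)
```

On a constant signal the band-pass branches vanish and only the low-pass average survives, which illustrates why the cascade separates scales.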
18 Scattering Transform. The scattering transform at scale $J$ is the cascading of complex WTs with a modulus non-linearity, followed by a low-pass filtering: $S_J x = \{x \star \phi_J,\ |x \star \psi_{\lambda_1}| \star \phi_J,\ ||x \star \psi_{\lambda_1}| \star \psi_{\lambda_2}| \star \phi_J\}$, with $\lambda_i = \{j_i, \theta_i\}$, $j_i \le J$. The depth is of order $J$. It is mathematically well defined for a large class of wavelets. Ref.: Group Invariant Scattering, S. Mallat.
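A compact numpy sketch of an order-2 scattering transform. The Fourier-domain Gaussian band-pass filters here are a hypothetical stand-in for the Gabor wavelets of the slides; only the structure (wavelet, modulus, wavelet, modulus, low-pass) follows the definition above.

```python
import numpy as np

def conv_fft(x, f_hat):
    # periodic convolution in the Fourier domain; f_hat is the filter's FFT
    return np.fft.ifft2(np.fft.fft2(x) * f_hat)

def filter_bank(N, J, L):
    # illustrative bank: Gaussians in Fourier centred at frequency
    # 2^{-j} xi rotated by theta_l, plus a low pass at scale 2^J
    w = np.fft.fftfreq(N) * 2 * np.pi
    WX, WY = np.meshgrid(w, w, indexing='ij')
    psis, xi0, sigma = {}, 3 * np.pi / 4, 0.6
    for j in range(J):
        for l in range(L):
            th = np.pi * l / L
            cx, cy = 2.0**-j * xi0 * np.cos(th), 2.0**-j * xi0 * np.sin(th)
            psis[(j, l)] = np.exp(-((WX - cx)**2 + (WY - cy)**2)
                                  / (2 * (sigma * 2.0**-j)**2))
    phi = np.exp(-(WX**2 + WY**2) / (2 * (sigma * 2.0**-J)**2))
    return psis, phi

def scattering(x, J=2, L=4):
    # S_J x = { x*phi_J, |x*psi_l1|*phi_J, ||x*psi_l1|*psi_l2|*phi_J }
    psis, phi = filter_bank(x.shape[0], J, L)
    S = [np.real(conv_fft(x, phi))]
    for l1, p1 in psis.items():
        u1 = np.abs(conv_fft(x, p1))
        S.append(np.real(conv_fft(u1, phi)))
        for l2, p2 in psis.items():
            if l2[0] > l1[0]:        # only increasing scales carry energy
                S.append(np.real(conv_fft(np.abs(conv_fft(u1, p2)), phi)))
    return np.stack(S)
```

With $J = 2$, $L = 4$ this yields 1 order-0, 8 order-1 and 16 order-2 coefficient maps.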
19 Relations with SIFT. For people into computer vision: SIFT performs a histogram of gradient orientations, $h(\alpha) = \sum_g \|g\|\, 1_{\angle g \in [\alpha, \alpha + \epsilon]}$, a quantization of the gradient directions analogous to $|x \star \psi| \star \phi$. The averaging leads to a loss of information.
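The orientation histogram $h(\alpha)$ above can be written directly in numpy. A minimal sketch; the bin count and the use of `np.gradient` are illustrative choices.

```python
import numpy as np

def orientation_histogram(x, n_bins=8):
    # h(alpha) = sum of |g| over pixels whose gradient angle falls in bin alpha
    gy, gx = np.gradient(x.astype(float))        # gy: rows, gx: columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)  # angles in [0, 2*pi)
    bins = np.minimum((ang / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    h = np.zeros(n_bins)
    np.add.at(h, bins.ravel(), mag.ravel())      # accumulate |g| per bin
    return h
```

A horizontal intensity ramp puts all of its gradient mass in the zero-angle bin, mirroring how $|x \star \psi_\theta|$ responds to a single orientation.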
20 Example of scattering coefficients: the feature maps $x \star \phi$, $|x \star \psi_{\lambda_1}|$ and $|x \star \psi_{\lambda_1}| \star \phi$ (1st-order coefficients).
21 Properties (Ref.: Group Invariant Scattering, S. Mallat). Non-linear. Isometric: $\|S_J x\| = \|x\|$. Stable to noise: $\|S_J x - S_J y\| \le \|x - y\|$. Covariant with translation: $S_J(x(\cdot - c)) = S_J(x)(\cdot - c)$. Invariant to small translations: $c \ll 2^J \Rightarrow S_J(x(\cdot - c)) \approx S_J(x)$. Sensitive to the action of rotation: $S_J(r_\theta x) \ne S_J(x)$. Linearizes the action of small deformations: $\|S_J x - S_J x_\tau\| \le C \|\nabla \tau\|$. Reconstruction properties (Ref.: Reconstruction of images from their scattering coefficients, J. Bruna). These properties allow a linear classifier to build class invariants on informative variability.
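The invariance-to-small-translation property can be checked numerically in 1-D: the wavelet modulus $|x \star \psi|$ is only covariant with a shift, while averaging it by $\phi$ makes it nearly invariant. The Gaussian filters below are illustrative, not the slides' wavelets.

```python
import numpy as np

def shift_sensitivity(c=3, N=256):
    # compares sensitivity to a c-sample shift for u = |x * psi| (covariant)
    # versus the averaged coefficient S = u * phi (nearly invariant)
    t = np.arange(N)
    x = np.exp(-((t - 100.0) ** 2) / 50.0)       # a localized bump
    w = np.fft.fftfreq(N)
    psi_hat = np.exp(-((w - 0.1) ** 2) / (2 * 0.03 ** 2))   # band pass
    phi_hat = np.exp(-(w ** 2) / (2 * 0.005 ** 2))          # wide low pass
    u = lambda s: np.abs(np.fft.ifft(np.fft.fft(s) * psi_hat))
    S = lambda s: np.real(np.fft.ifft(np.fft.fft(u(s)) * phi_hat))
    xs = np.roll(x, c)
    d_wav = np.linalg.norm(u(xs) - u(x)) / np.linalg.norm(u(x))
    d_scat = np.linalg.norm(S(xs) - S(x)) / np.linalg.norm(S(x))
    return d_scat, d_wav
```

The averaged coefficients move far less under the shift, which is exactly the $c \ll 2^J$ invariance of the slide.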
22 Parameters and dimensionality of the Scattering. For 256x256 images, $Sx \in \mathbb{R}^{10^5}$, yet it is possible to reduce it to $LSx \in \mathbb{R}^{2000}$. It depends on a few parameters of the mother wavelet: $L$, $J$, its shape. Identical parameters can be chosen for natural images on different datasets: it is a generic representation.
23 Translation Scattering, 2nd order. $J = 3$, $\theta \in \{0, \frac{\pi}{4}, \frac{\pi}{2}, \frac{3\pi}{4}\}$. The cascade $x_0 \to x_1 \to x_2 \to x_3$ applies at each level a low-pass $h$ and band-pass filters $g$ followed by a modulus; the scattering coefficients (orders 0, 1 and 2) are only read at the output.
24 Affine Scattering Transform. Observe that $W$ is covariant with the affine group Aff(E): $W[\psi](g.x) = (g.x) \star \psi = W[g.\psi] x$ (Ref.: PhD thesis, L. Sifre). See $Wx$ as a signal parametrized by elements of the affine group: $x \star \psi_{j,\theta}(u) = \tilde{x}(g)$, $g = (u, r_\theta, j)$. We can define a WT on any compact Lie group (even a non-commutative one) via $x \star_G \psi(g) = \int_G \psi(g')\, x(g'^{-1}.g)\, dg'$ (Ref.: Topics in Harmonic Analysis Related to the Littlewood-Paley Theory, E. M. Stein). The same properties as before hold for this WT/Scattering (Ref.: Group Invariant Scattering, S. Mallat).
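A discrete instance of the group convolution $x \star_G \psi$, restricted to the rotation subgroup with $L$ sampled angles: a circular convolution along the angle axis of the wavelet coefficients. A minimal sketch; the array layout `(L, H, W)` is an illustrative convention.

```python
import numpy as np

def angular_conv(x, psi_ang):
    # x: (L, H, W) coefficients indexed by angle theta_l and position u.
    # out(l, u) = sum_{l'} psi_ang[(l - l') mod L] * x(l', u)
    # i.e. convolution along the rotation subgroup only.
    L = x.shape[0]
    out = np.zeros_like(x)
    for l in range(L):
        for lp in range(L):
            out[l] += psi_ang[(l - lp) % L] * x[lp]
    return out
```

By construction this operator is covariant with rotations of the input: cyclically shifting the angle axis of `x` cyclically shifts the output, the discrete analogue of $W$ being covariant with the group action.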
25 Separable Roto-Translation Scattering (Ref.: Deep Roto-Translation Scattering for Object Classification, E. Oyallon and S. Mallat). The roto-translation group is not separable, yet we use a separable wavelet transform on it: no averaging along the angle (increased sensitivity); separable, hence simple to implement and fast; equal or slightly better results than with the non-separable transform.
26 Separable Roto-Translation Scattering. $J = 3$, $\theta \in \{0, \frac{\pi}{4}, \frac{\pi}{2}, \frac{3\pi}{4}\}$. Same cascade $x_0 \to x_1 \to x_2 \to x_3$ as before, with a separable convolution that recombines angles at the 2nd order. Ref.: Deep Roto-Translation Scattering for Object Classification, E. Oyallon and S. Mallat.
27 Linearization of the rotation (work in progress). For a deformation: $u - \tau(u) \approx v - \tau(v) + (I - \nabla\tau)(v)(u - v)$. In fact the affine group also acts on the deformation diffeomorphism: $g.(I - \tau)(u) \approx g.(I - \tau)(v) + g.(I - \nabla\tau)(v)(u - v)$. Decompose it on the affine group: $(I - \nabla\tau)(v) = r_{\theta(v)} k(v)$. A way to linearize the action of the rotation? $\|Sx - Sx_\tau\| \le C(\|\nabla\tau\| + \dots)$.
28 Classification pipeline: $x \to Sx \to LSx \to \langle LSx, w \rangle \gtrless 0$, i.e. a Scattering Transform on YUV channels + log, a supervised dimensionality reduction, then a Gaussian SVM. We learn $L$ (selects features) and $w$ (selects samples) from the data. Computing $L$ is the most costly part of the algorithm (Orthogonal Least Squares).
29 OLS? Supervised forward selection of features, done class per class (similar to OMP). Principle: given a dictionary of features $\{\phi_k\}_k$, set $y_i = 1$ if sample $i$ is in the class, 0 otherwise. Find the feature $\phi_k$ most correlated with $y$. Select it, pull it from the dictionary, then orthogonalize and normalize the remaining dictionary. Iterate. Ref.: On the difference between orthogonal matching pursuit and orthogonal least squares, T. Blumensath and M. E. Davies.
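The iteration above can be sketched directly. A minimal greedy OLS in numpy, not the authors' implementation; `Phi` holds one feature per column and `y` is the class indicator.

```python
import numpy as np

def ols_select(Phi, y, n_select):
    # greedy forward selection in the spirit of orthogonal least squares:
    # pick the column most correlated with y, then orthogonalize and
    # renormalize the remaining columns against the selected atom
    D = Phi / np.linalg.norm(Phi, axis=0)   # normalized working dictionary
    selected = []
    for _ in range(n_select):
        scores = np.abs(D.T @ y)
        scores[selected] = -np.inf          # never reselect a feature
        k = int(np.argmax(scores))
        selected.append(k)
        d = D[:, k].copy()
        D = D - np.outer(d, d @ D)          # project out the chosen atom
        norms = np.linalg.norm(D, axis=0)
        norms[norms < 1e-12] = 1.0          # guard near-zero columns
        D = D / norms
    return selected
```

Running this once per class and taking the union of selected features gives a supervised reduction $L$ of the scattering vector.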
30 Numerical results (identical representation across datasets):

Dataset    | Type         | Method         | Accuracy
Caltech101 | Unsupervised | Scattering     | 79.9
Caltech101 | Unsupervised | Ask the locals | 77.3
Caltech101 | Unsupervised | RFL            | 75.3
Caltech101 | Unsupervised | M-HMP          | 82.5
Caltech101 | Supervised   | DeepNet        | 91.4
Caltech256 | Unsupervised | Scattering     | 43.6
Caltech256 | Unsupervised | Ask the locals | 41.7
Caltech256 | Unsupervised | M-HMP          | 50.7
Caltech256 | Supervised   | DeepNet        | 70.6
CIFAR10    | Unsupervised | Scattering     | 82.3
CIFAR10    | Unsupervised | RFL            | 83.1
CIFAR10    | Supervised   | DeepNet        | 91.8
CIFAR100   | Unsupervised | Scattering     | 56.8
CIFAR100   | Unsupervised | RFL            | 54.2
CIFAR100   | Supervised   | DeepNet        | 65.4
31 Improvement layer-wise:

Method                                     | Caltech101 | CIFAR10
Translation Scattering, order 1            | 59.8       | 72.6
Translation Scattering, order 2            | 70.0       | 80.3
Translation Scattering, order 2 + OLS      | 75.4       | 81.6
Roto-translation Scattering, order 2       | 74.5       | 81.5
Roto-translation Scattering, order 2 + OLS | 79.9       | 82.3
32 How to fill in the gap? Adding more supervision in the pipeline: building a DeepNet on top of it? Initializing a DeepNet with wavelet filters? The question is open. A more complex classifier could help to handle class variabilities: Fisher vectors on scattering? Adding a layer means identifying the next complex source of variability. Is it geometric? Class-specific?
33 Other applications of the ST: audio (Vincent Lostanlen), quantum chemistry (Matthew Hirn), temporal data and video, reconstruction from the WT (Irène Waldspurger), unstructured data (Mia Chen, Xu Cheng).
34 Conclusion. We provide: a mathematical analysis and algorithms; competitive numerical results; software: http://www.di.ens.fr/~oyallon/ (or send me an email to get the latest version!).