Image classification. Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?



Classifiers. Learn a decision rule assigning bag-of-features representations of images to different classes. [Figure: a decision boundary separating the "zebra" class from the "non-zebra" class]

Classification. Assign an input vector to one of two or more classes. Any decision rule divides the input space into decision regions separated by decision boundaries.

Nearest Neighbor Classifier. Assign the label of the nearest training data point to each test data point. [Figure from Duda et al.: Voronoi partitioning of the feature space for two-category 2D and 3D data] Source: D. Lowe

K-Nearest Neighbors. For a new point, find the k closest points from the training data. The labels of the k points vote to classify. Works well provided there is lots of data and the distance function is good. [Figure: example with k = 5] Source: D. Lowe
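As a concrete illustration (not part of the original slides), here is a minimal k-NN sketch in Python/NumPy; the function name knn_classify, the Euclidean distance, and the toy data are assumptions for illustration only.

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_test, k=5):
    # Euclidean distance from the test point to every training point
    dists = np.linalg.norm(X_train - x_test, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(dists)[:k]
    # The labels of the k nearest neighbors vote; the majority wins
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two 2-D classes
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(X_train, y_train, np.array([0.95, 0.9]), k=3))  # -> 1
```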

Functions for comparing histograms (h1 and h2 are N-bin histograms):
L1 distance: D(h1, h2) = Σ_{i=1..N} |h1(i) − h2(i)|
χ² distance: D(h1, h2) = Σ_{i=1..N} (h1(i) − h2(i))² / (h1(i) + h2(i))
Quadratic distance (cross-bin distance): D(h1, h2) = Σ_{i,j} A_{ij} (h1(i) − h2(j))²
Histogram intersection (a similarity function, not a distance): I(h1, h2) = Σ_{i=1..N} min(h1(i), h2(i))
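A minimal NumPy sketch of these four comparison functions (added for illustration; it assumes same-length nonnegative histograms, and the bin-similarity matrix A is an arbitrary placeholder):

```python
import numpy as np

def l1_distance(h1, h2):
    return np.sum(np.abs(h1 - h2))

def chi2_distance(h1, h2, eps=1e-10):
    # eps guards against division by zero in empty bins
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def quadratic_distance(h1, h2, A):
    # cross-bin distance: sum over i, j of A[i, j] * (h1[i] - h2[j])**2
    diff = h1[:, None] - h2[None, :]
    return np.sum(A * diff ** 2)

def histogram_intersection(h1, h2):
    # a similarity, not a distance: larger means more similar
    return np.sum(np.minimum(h1, h2))
```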

Linear classifiers. Find a linear function (hyperplane) to separate the positive and negative examples:
positive examples: x_i · w + b ≥ 0
negative examples: x_i · w + b < 0
Which hyperplane is best?

Support vector machines. Find the hyperplane that maximizes the margin between the positive and negative examples. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Support vector machines. Find the hyperplane that maximizes the margin between the positive and negative examples:
positive (y_i = 1): x_i · w + b ≥ 1
negative (y_i = −1): x_i · w + b ≤ −1
For support vectors, x_i · w + b = ±1.
Distance between a point and the hyperplane: |x_i · w + b| / ||w||
Therefore, the margin is 2 / ||w||.
[Figure: support vectors lying on the margin]
C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
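A one-step check of that margin value (added here, not on the original slide): take a point x_+ on the hyperplane x · w + b = +1 and a point x_− on x · w + b = −1, and project their difference onto the unit normal w/||w||.

```latex
\frac{w}{\|w\|}\cdot(x_+ - x_-)
  \;=\; \frac{(x_+\cdot w + b) - (x_-\cdot w + b)}{\|w\|}
  \;=\; \frac{(+1) - (-1)}{\|w\|}
  \;=\; \frac{2}{\|w\|}
```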

Finding the maximum-margin hyperplane.
1. Maximize the margin 2 / ||w||.
2. Correctly classify all training data: positive (y_i = 1): x_i · w + b ≥ 1; negative (y_i = −1): x_i · w + b ≤ −1.
Quadratic optimization problem: minimize (1/2) wᵀw subject to y_i (x_i · w + b) ≥ 1.
C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
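A minimal sketch of this quadratic program in code, assuming the cvxpy solver and a made-up, linearly separable toy dataset (neither is mentioned on the slides):

```python
import numpy as np
import cvxpy as cp

# Made-up, linearly separable 2-D toy data
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()

# Minimize (1/2) w^T w subject to y_i (x_i . w + b) >= 1 for all i
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print("w =", w.value, "b =", b.value)
```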

Finding the maximum-margin hyperplane. Solution: w = Σ_i α_i y_i x_i, where each α_i is a learned weight and the x_i with nonzero α_i are the support vectors. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Finding the maximum-margin hyperplane. Solution: w = Σ_i α_i y_i x_i, and b = y_i − w · x_i for any support vector. Classification function (decision boundary): w · x + b = Σ_i α_i y_i (x_i · x) + b. Notice that it relies on an inner product between the test point x and the support vectors x_i. Solving the optimization problem also involves computing the inner products x_i · x_j between all pairs of training points. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998
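To make the "weighted sum of inner products with support vectors" concrete, here is a small sketch using scikit-learn's SVC (a library choice not made on the slides); the toy data are made up. SVC stores the products α_i y_i in dual_coef_ and the support vectors in support_vectors_:

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data
X = np.array([[2.0, 2.0], [2.5, 1.5], [0.0, 0.5], [-0.5, 1.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C approximates a hard margin

# w = sum_i alpha_i y_i x_i, recovered from the dual coefficients
w = clf.dual_coef_ @ clf.support_vectors_
b = clf.intercept_

x_test = np.array([1.0, 1.5])
# Decision value computed from inner products with the support vectors
f_manual = clf.dual_coef_ @ (clf.support_vectors_ @ x_test) + b
f_sklearn = clf.decision_function(x_test.reshape(1, -1))
print(f_manual, f_sklearn)  # the two values should match
```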

Nonlinear SVMs. Datasets that are linearly separable work out great. But what if the dataset is just too hard? We can map it to a higher-dimensional space. [Figure: a 1-D dataset that is not linearly separable becomes separable after mapping to a higher-dimensional space] Slide credit: Andrew Moore

Nonlinear SVMs. General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x). Slide credit: Andrew Moore
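A minimal sketch of that idea, assuming the illustrative lifting φ(x) = (x, x²) and made-up 1-D data (neither is taken from the slides):

```python
import numpy as np
from sklearn.svm import SVC

# 1-D data: class 1 sits in the middle, class -1 on both sides (not linearly separable in 1-D)
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([-1, -1, 1, 1, 1, -1, -1])

# Lift each point to 2-D with the illustrative mapping phi(x) = (x, x^2)
phi = np.column_stack([x, x ** 2])

clf = SVC(kernel="linear", C=10).fit(phi, y)
print(clf.score(phi, y))  # expect 1.0: the lifted data are linearly separable
```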

Nonlinear SVMs. The kernel trick: instead of explicitly computing the lifting transformation φ(x), define a kernel function K such that K(x_i, x_j) = φ(x_i) · φ(x_j). (To be valid, the kernel function must satisfy Mercer's condition.) This gives a nonlinear decision boundary in the original feature space: Σ_i α_i y_i φ(x_i) · φ(x) + b = Σ_i α_i y_i K(x_i, x) + b. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 1998

Nonlinear kernel: Example. Consider the mapping φ(x) = (x, x²). Then φ(x) · φ(y) = xy + x²y², so the kernel can be computed directly in the original space as K(x, y) = xy + x²y².
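A quick numeric check of that identity (added for illustration; the input values are arbitrary):

```python
import numpy as np

def phi(x):
    # explicit lifting: x -> (x, x^2)
    return np.array([x, x ** 2])

def K(x, y):
    # kernel computed directly in the original 1-D space
    return x * y + (x ** 2) * (y ** 2)

x, y = 1.5, -2.0
print(np.dot(phi(x), phi(y)), K(x, y))  # both print 6.0
```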

Kernels for bags of features.
Histogram intersection kernel: I(h1, h2) = Σ_{i=1..N} min(h1(i), h2(i))
Generalized Gaussian kernel: K(h1, h2) = exp(−(1/A) D(h1, h2)²), where D can be the L1 distance, Euclidean distance, χ² distance, etc.
J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid, "Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study," IJCV 2007
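A minimal sketch of these two kernels over sets of histograms, assuming NumPy arrays with one histogram per row; the choice of χ² as the distance D and the function names are illustrative assumptions:

```python
import numpy as np

def intersection_kernel(H1, H2):
    # H1: (n1, N), H2: (n2, N) histograms -> (n1, n2) matrix of histogram intersections
    return np.array([[np.minimum(h1, h2).sum() for h2 in H2] for h1 in H1])

def chi2_distance(h1, h2, eps=1e-10):
    # eps guards against empty bins
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def generalized_gaussian_kernel(H1, H2, A=1.0):
    # K(h1, h2) = exp(-(1/A) * D(h1, h2)^2), with D taken here as the chi-squared distance
    return np.array([[np.exp(-chi2_distance(h1, h2) ** 2 / A) for h2 in H2] for h1 in H1])
```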

Summary: SVMs for image classification
1. Pick an image representation (in our case, bag of features)
2. Pick a kernel function for that representation
3. Compute the matrix of kernel values between every pair of training examples
4. Feed the kernel matrix into your favorite SVM solver to obtain support vectors and weights
5. At test time: compute kernel values for your test example and each support vector, and combine them with the learned weights to get the value of the decision function
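A minimal end-to-end sketch of steps 3-5, assuming scikit-learn's SVC with a precomputed kernel and made-up bag-of-features histograms (the slides do not prescribe a particular solver):

```python
import numpy as np
from sklearn.svm import SVC

# Made-up bag-of-features histograms: one image per row, one visual word per column
rng = np.random.default_rng(0)
train_hists = rng.random((20, 50))
train_labels = np.array([0] * 10 + [1] * 10)
test_hists = rng.random((5, 50))

def intersection_kernel(H1, H2):
    # histogram intersection kernel matrix (same sketch as above)
    return np.array([[np.minimum(h1, h2).sum() for h2 in H2] for h1 in H1])

# Steps 3-4: kernel values between every pair of training examples, fed to the SVM solver
K_train = intersection_kernel(train_hists, train_hists)
clf = SVC(kernel="precomputed").fit(K_train, train_labels)

# Step 5: kernel values between each test example and the training examples
K_test = intersection_kernel(test_hists, train_hists)
print(clf.predict(K_test))
```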

What about multi-class SVMs? Unfortunately, there is no definitive multi-class SVM formulation. In practice, we have to obtain a multi-class SVM by combining multiple two-class SVMs.
One vs. others:
Training: learn an SVM for each class vs. the others
Testing: apply each SVM to the test example and assign to it the class of the SVM that returns the highest decision value
One vs. one:
Training: learn an SVM for each pair of classes
Testing: each learned SVM votes for a class to assign to the test example
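Both strategies are available off the shelf; here is a minimal sketch using scikit-learn's wrappers (the library choice and toy data are assumptions, not from the slides):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Made-up 3-class toy data for illustration
rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = np.repeat([0, 1, 2], 10)  # ten examples per class

# One vs. others: one binary SVM per class; the highest decision value wins
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

# One vs. one: one binary SVM per pair of classes; the per-pair winners vote
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

print(ovr.predict(X[:3]), ovo.predict(X[:3]))
```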

SVMs: Pros and cons
Pros:
Many publicly available SVM packages: http://www.kernel-machines.org/software
The kernel-based framework is very powerful and flexible
SVMs work very well in practice, even with very small training sample sizes
Cons:
No direct multi-class SVM; must combine two-class SVMs
Computation and memory: during training, must compute the matrix of kernel values for every pair of examples; learning can take a very long time for large-scale problems

Summary: Classifiers
Nearest-neighbor and k-nearest-neighbor classifiers: L1 distance, χ² distance, quadratic distance, histogram intersection
Support vector machines: linear classifiers, margin maximization, the kernel trick, kernel functions (histogram intersection, generalized Gaussian, pyramid match), multi-class
Of course, there are many other classifiers out there: neural networks, boosting, decision trees, ...