Nonlinear Classifiers II

Nonlinear Classifiers: Introduction
Classifiers
  Supervised Classifiers
    Linear Classifiers
      Perceptron
      Least Squares Methods
      Linear Support Vector Machine
    Nonlinear Classifiers
      Part I: Multi-Layer Neural Networks
      Part II: Polynomial Classifier, RBF, Nonlinear SVM
      Decision Trees
  Unsupervised Classifiers
Nonlinear Classifiers: Agenda
Part II: Nonlinear Classifiers
  Polynomial Classifier
    Special case of a two-layer perceptron
    Activation function with nonlinear input
  Radial Basis Function Network
    Special case of a two-layer network
    Radial basis activation function
    Training is simpler and faster
  Nonlinear Support Vector Machine

Polynomial Classifier: XOR problem
XOR problem with a polynomial function.
With nonlinear polynomial functions, the classes can be classified.
Example XOR problem: not linearly separable in the $(x_1, x_2)$ plane!
[Figure: the four XOR points in the $x_1$-$x_2$ plane]
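Polynomial Classifier: XOR problem
XOR problem with a polynomial function.
With nonlinear polynomial functions, the classes can be classified.
Example XOR problem: not linearly separable in the $(x_1, x_2)$ plane, but separable with a polynomial function!
[Figure: mapping $x \to z$ from the $x_1$-$x_2$ plane to the $(z_1, z_2, z_3)$ space]

Polynomial Classifier: XOR problem
With $z = \Phi(x)$, $x = [x_1, x_2]^T$, $z = [x_1, x_2, x_1 x_2]^T$ we obtain:
  (0,0) → (0,0,0)
  (1,1) → (1,1,1)
  (1,0) → (1,0,0)
  (0,1) → (0,1,0)
which is separable in $R^3$ by the hyperplane:
  $g(z) = z_1 + z_2 - 2 z_3 - \frac{1}{4} = 0$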
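Polynomial Classifier: XOR problem
$z = \Phi(x)$
Hyperplane: $g(y) = w^T y + w_0 = 0$
  $g(z) = z_1 + z_2 - 2 z_3 - \frac{1}{4} = 0$ is a hyperplane in $R^3$
  $g(x) = x_1 + x_2 - 2 x_1 x_2 - \frac{1}{4} = 0$ is a polynomial in $R^2$
Sign of $g(x)$ at the four XOR points:
  (1,0): $g(x) = 3/4 > 0$ (true)
  (0,1): $g(x) = 3/4 > 0$ (true)
  (0,0): $g(x) = -1/4 < 0$ (false)
  (1,1): $g(x) = -1/4 < 0$ (false)

Polynomial Classifier: XOR problem
Decision surface in $R^2$:
  $g(x) = x_1 + x_2 - 2 x_1 x_2 - \frac{1}{4} = 0$
  $x_2 = (x_1 - 0.25) / (2 x_1 - 1)$
MatLab:
>> x1 = [-0.5:0.1:1.5];   % step size assumed; it is illegible in the source
>> x2 = (x1 - 0.25) ./ (2*x1 - 1);
>> plot(x1, x2);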
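The XOR example can be checked numerically. The following is a minimal Python sketch, assuming the mapping $z = [x_1, x_2, x_1 x_2]^T$ and the hyperplane $g(z) = z_1 + z_2 - 2 z_3 - 1/4$ from the slides; the names `phi`, `g_z` and `xor_points` are illustrative and not part of the lecture.

```python
def phi(x1, x2):
    """Map a 2-D input to z = (x1, x2, x1*x2)."""
    return (x1, x2, x1 * x2)


def g_z(z):
    """Hyperplane g(z) = z1 + z2 - 2*z3 - 1/4 in the mapped space."""
    z1, z2, z3 = z
    return z1 + z2 - 2.0 * z3 - 0.25


# XOR truth table: the label is True when exactly one input is 1.
xor_points = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): False}

# Classify each point by the sign of g(z).
predictions = {p: g_z(phi(*p)) > 0 for p in xor_points}

for p, label in xor_points.items():
    print(p, phi(*p), g_z(phi(*p)), predictions[p] == label)
```

All four points land on the correct side of the hyperplane, which is the sense in which XOR becomes linearly separable after the mapping.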
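Polynomial Classifier: XOR problem
With nonlinear polynomial functions, the classes can be classified in the original space $R^2$.
Example: the XOR problem
  was not linearly separable in $R^2$,
  but is linearly separable in $R^3$ after the mapping $z = [x_1, x_2, x_1 x_2]^T$,
  and is separable in $R^2$ with a polynomial function!

Polynomial Classifier, more general
The decision function is approximated by a polynomial function $g(x)$ of order $p$, e.g. $p = 2$:
  $g(x) = w_0 + \sum_{i=1}^{l} w_i x_i + \sum_{i=1}^{l-1} \sum_{m=i+1}^{l} w_{im} x_i x_m + \sum_{i=1}^{l} w_{ii} x_i^2$
For $x = [x_1, x_2]^T$ this can be written as
  $g(x) = w^T z + w_0$
with $w = [w_1, w_2, w_{12}, w_{11}, w_{22}]^T$ and $z = [x_1, x_2, x_1 x_2, x_1^2, x_2^2]^T$.
Special case of a two-layer perceptron
Activation function with polynomial input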
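The second-order expansion can be sketched for an input of any dimension $l$. This is an illustrative Python sketch, not the lecture's implementation: the function names `quadratic_features` and `g` are assumptions, and the feature ordering (linear terms, then cross products, then squares) is chosen to match $z = [x_1, x_2, x_1 x_2, x_1^2, x_2^2]^T$ for $l = 2$.

```python
from itertools import combinations


def quadratic_features(x):
    """Return z = [x_i ..., x_i*x_m for i < m ..., x_i^2 ...] (order p = 2)."""
    linear = list(x)
    cross = [x[i] * x[m] for i, m in combinations(range(len(x)), 2)]
    squares = [xi * xi for xi in x]
    return linear + cross + squares


def g(x, w, w0):
    """Polynomial decision function g(x) = w^T z + w0, linear in z."""
    z = quadratic_features(x)
    return w0 + sum(wi * zi for wi, zi in zip(w, z))


# Weights reproducing the XOR surface x1 + x2 - 2*x1*x2 - 1/4 = 0:
# the squared terms are simply given zero weight.
w_xor = [1.0, 1.0, -2.0, 0.0, 0.0]
w0_xor = -0.25

print(quadratic_features([2, 3]))
print(g([1, 0], w_xor, w0_xor), g([1, 1], w_xor, w0_xor))
```

The classifier stays linear in the weights, so any linear design method can be applied once the inputs have been expanded.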
Nonlinear Classifiers: Agenda
Part II: Nonlinear Classifiers
  Polynomial Classifier
  Radial Basis Function Network
    Special case of a two-layer network
    Radial basis activation function
    Training is simpler and faster
  Nonlinear Support Vector Machine
  Application: ZIP Code, OCR, FD (W-RVM)
  Demo: libsvm, DHS or Hlavac

Radial Basis Function
Radial Basis Function Networks (RBF)
Choose
  $g(x) = w_0 + \sum_{i=1}^{k} w_i g_i(x)$
with
  $g_i(x) = \exp\left(-\frac{\|x - c_i\|^2}{2\sigma_i^2}\right)$
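Radial Basis Function
  $g(x) = w_0 + \sum_{i=1}^{k} w_i g_i(x)$ with $g_i(x) = \exp\left(-\frac{\|x - c_i\|^2}{2\sigma_i^2}\right)$
[Figure: two example plots of $g(x)$ for $k = 5$ basis functions with different centers $c_i$ and widths $\sigma_i$]
How to choose $c_i$, $\sigma_i$, $k$?

Radial Basis Function
Radial Basis Function Networks (RBF)
Equivalent to a single-layer network with RBF activations and a linear output node.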
Radial Basis Function: XOR problem
  $z = z(x) = \left[\exp(-\|x - c_1\|^2),\ \exp(-\|x - c_2\|^2)\right]^T$, $c_1 = [1, 1]^T$, $c_2 = [0, 0]^T$
Points in x-space and their images in z-space:
  (0,0) → (0.135, 1)
  (1,1) → (1, 0.135)
  (1,0) → (0.368, 0.368)
  (0,1) → (0.368, 0.368)
Decision hyperplane:
  $g(z) = z_1 + z_2 - 1 = 0$
  $g(x) = \exp(-\|x - c_1\|^2) + \exp(-\|x - c_2\|^2) - 1 = 0$
The pattern set is not linearly separable in x-space.
It is separable in x-space using a nonlinear function (RBF), which separates the set in z-space with a linear decision hyperplane!

Radial Basis Function
Decision function as a summation of $k$ RBFs:
  $g(x) = w_0 + \sum_{i=1}^{k} w_i \exp\left(-\frac{(x - c_i)^T (x - c_i)}{2\sigma_i^2}\right)$
Training of the RBF networks
1. Fixed centers: Choose centers randomly among the data points. Also fix the $\sigma_i$'s. Then $g(x) = w_0 + w^T z$ is a typical linear classifier design.
2. Training of the centers: This is a nonlinear optimization task.
3. Combine supervised and unsupervised learning procedures.
4. The unsupervised part reveals clustering tendencies of the data and assigns the centers at the cluster representatives.
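The RBF mapping of the XOR points can be reproduced in a few lines. This is a minimal Python sketch, assuming the two fixed centers $c_1 = [1, 1]^T$ and $c_2 = [0, 0]^T$ and the exponent $-\|x - c_i\|^2$ from the XOR slide; the names `rbf_map`, `centers` and `g` are illustrative.

```python
import math


def rbf_map(x, centers):
    """Return z with z_i = exp(-||x - c_i||^2) for every center c_i."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)))
        for c in centers
    ]


centers = [(1, 1), (0, 0)]  # fixed centers c_1 and c_2


def g(x):
    """Linear decision function in z-space: g = z1 + z2 - 1."""
    z1, z2 = rbf_map(x, centers)
    return z1 + z2 - 1.0


for x in [(0, 0), (1, 1), (1, 0), (0, 1)]:
    z = [round(v, 3) for v in rbf_map(x, centers)]
    print(x, z, "g > 0" if g(x) > 0 else "g < 0")
```

With the centers fixed in this way, only the linear weights remain to be determined, which corresponds to option 1 (fixed centers) of the training list.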
Nonlinear Classifiers: Agenda
Part II: Nonlinear Classifiers
  Polynomial Classifier
  Radial Basis Function Network
  Nonlinear Support Vector Machine
  Application: ZIP Code, OCR, FD (W-RVM)
  Demo: libsvm, DHS or Hlavac

Nonlinear Classifiers: SVM
XOR problem: linear separation in a high-dimensional space via nonlinear functions (polynomials and RBFs) of the original space.
For this we found nonlinear mappings $\Phi: x \to z = \Phi(x)$, followed by a linear classifier in z-space.
Can we go from $x$ to the linear separation directly?
Is that possible without knowing the mapping function?
Non-linear Support Vector Machines
Recall that the probability of having linearly separable classes increases as the dimensionality of the feature vectors increases.
Assume the mapping:
  $x \in R^l \to z \in R^k$, $k > l$
Then use a linear SVM in $R^k$.

Non-linear SVM
Support Vector Machines with $x \to z \in R^k$
Recall that in this case the dual problem formulation will be
  $\max_{\lambda} \left( \sum_{i} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j z_i^T z_j \right)$
where $z_i \in R^k$ and $y_i \in \{-1, +1\}$ (class labels).
The classifier will be
  $g(z) = w^T z + w_0 = \sum_{i=1}^{N_s} \lambda_i y_i z_i^T z + w_0$
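Non-linear SVM
Thus, only inner products in a high-dimensional space are needed!
=> Something clever (kernel trick): compute the inner products in the high-dimensional space as functions of inner products performed in the low-dimensional space!

Non-linear SVM
Is this POSSIBLE? Yes. Here is an example.
Let $x = [x_1, x_2]^T \in R^2$.
Let $x \to z = [x_1^2, \sqrt{2}\, x_1 x_2, x_2^2]^T \in R^3$.
It is easy to show that $z_i^T z_j = (x_i^T x_j)^2$:
  $(x_i^T x_j)^2 = (x_{i1} x_{j1} + x_{i2} x_{j2})^2$
  $= x_{i1}^2 x_{j1}^2 + 2 x_{i1} x_{i2} x_{j1} x_{j2} + x_{i2}^2 x_{j2}^2$
  $= [x_{i1}^2, \sqrt{2}\, x_{i1} x_{i2}, x_{i2}^2]\, [x_{j1}^2, \sqrt{2}\, x_{j1} x_{j2}, x_{j2}^2]^T$
  $= z_i^T z_j$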
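The identity $z_i^T z_j = (x_i^T x_j)^2$ can also be confirmed numerically. The sketch below is illustrative only: the sample vectors `xi` and `xj` are arbitrary test values, and the helper names `phi` and `dot` are assumptions.

```python
import math


def phi(x):
    """Explicit mapping x -> z = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


xi, xj = (1.0, 2.0), (3.0, -1.0)  # arbitrary sample vectors

lhs = dot(phi(xi), phi(xj))  # inner product computed in R^3
rhs = dot(xi, xj) ** 2       # kernel evaluated in R^2

print(lhs, rhs, math.isclose(lhs, rhs))
```

The right-hand side never constructs $z$, which is the point of the kernel trick: the cost depends on the input dimension, not on the dimension of the mapped space.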
Non-linear SVM
Mercer's Theorem
Let $x \to \Phi(x) \in H$.
To guarantee that the symmetric function $K(x_i, x_j)$ (kernel) can be represented as
  $\sum_{r} \Phi_r(x_i) \Phi_r(x_j) = K(x_i, x_j)$,
that is, as an inner product in $H$, it is necessary and sufficient that
  $\int\!\!\int K(x_i, x_j)\, g(x_i)\, g(x_j)\, dx_i\, dx_j \ge 0$   (1)
for any $g(x)$ with
  $\int g^2(x)\, dx < +\infty$   (2)

Non-linear SVM
Kernel Function
So, any kernel $K(x, y)$ satisfying (1) and (2) corresponds to an inner product in SOME space!
Kernel trick: We do not have to know the mapping function. For suitable kernel functions we linearly separate pattern sets in a high-dimensional space using only a function of the inner product in the original space.
Non-linear SVM
Kernel Functions: Examples
Polynomial:
  $K(x_i, x_j) = (x_i^T x_j + 1)^q$, $q > 0$
Radial Basis Functions:
  $K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{\sigma^2}\right)$
Hyperbolic Tangent:
  $K(x_i, x_j) = \tanh(\beta\, x_i^T x_j + \gamma)$
for appropriate values of $\beta$, $\gamma$ (e.g. $\beta = 2$ and $\gamma = 1$).

Non-linear SVM
Support Vector Machines Formulation
Step 1: Choose an appropriate kernel. This implicitly assumes a mapping to a higher-dimensional (yet unknown) space.
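The three example kernels can be written as plain functions of two input vectors. This is a hedged Python sketch: the function names and the default parameter values (`q=2`, `sigma=1.0`, `beta=2.0`, `gamma=1.0`) are assumptions for illustration, and the RBF width convention (division by $\sigma^2$ rather than $2\sigma^2$) follows the formula as written here.

```python
import math


def dot(a, b):
    """Inner product in the original (low-dimensional) space."""
    return sum(p * q for p, q in zip(a, b))


def polynomial_kernel(xi, xj, q=2):
    """K(xi, xj) = (xi^T xj + 1)^q."""
    return (dot(xi, xj) + 1.0) ** q


def rbf_kernel(xi, xj, sigma=1.0):
    """K(xi, xj) = exp(-||xi - xj||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / sigma ** 2)


def tanh_kernel(xi, xj, beta=2.0, gamma=1.0):
    """K(xi, xj) = tanh(beta * xi^T xj + gamma)."""
    return math.tanh(beta * dot(xi, xj) + gamma)


a, b = (1.0, 1.0), (1.0, 0.0)
print(polynomial_kernel(a, b), rbf_kernel(a, b), tanh_kernel(a, b))
```

Each function takes only vectors from the original space, so choosing a kernel in Step 1 amounts to choosing which of these functions replaces the inner product.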
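Non-linear SVM
SVM Formulation
Step 2:
  $\arg\max_{\lambda} \left( \sum_{i} \lambda_i - \frac{1}{2} \sum_{i,j} \lambda_i \lambda_j y_i y_j K(x_i, x_j) \right)$
subject to:
  $0 \le \lambda_i \le C$, $i = 1, 2, \ldots, N$
  $\sum_{i} \lambda_i y_i = 0$
This results in an implicit combination
  $w = \sum_{i=1}^{N_s} \lambda_i y_i \Phi(x_i)$

Non-linear SVM
SVM Formulation
Step 3: Assign $x$ to
  $\omega_1$ if $g(x) = \sum_{i=1}^{N_s} \lambda_i y_i K(x_i, x) + w_0 > 0$
  $\omega_2$ if $g(x) = \sum_{i=1}^{N_s} \lambda_i y_i K(x_i, x) + w_0 < 0$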
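Step 3 can be illustrated on the XOR problem. The sketch below is not taken from the slides: it assumes the inputs are recoded to $\pm 1$ (instead of 0/1), uses the polynomial kernel $(x_i^T x_j + 1)^2$, and plugs in the multipliers $\lambda_i = 1/8$ and $w_0 = 0$, which satisfy $\sum_i \lambda_i y_i = 0$ and put every training point exactly on the margin. No optimizer is run; the values are supplied by hand and all names are illustrative.

```python
def poly_kernel(a, b):
    """Polynomial kernel K(a, b) = (a^T b + 1)^2."""
    return (sum(p * q for p, q in zip(a, b)) + 1.0) ** 2


# Assumed setup: XOR with inputs coded as -1/+1, label +1 when the inputs differ.
support_vectors = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [-1, 1, 1, -1]
lambdas = [0.125, 0.125, 0.125, 0.125]  # hand-supplied multipliers
w0 = 0.0


def g(x):
    """g(x) = sum_i lambda_i * y_i * K(x_i, x) + w0."""
    return sum(
        lam * y * poly_kernel(sv, x)
        for lam, y, sv in zip(lambdas, labels, support_vectors)
    ) + w0


def classify(x):
    """Assign x to omega_1 if g(x) > 0, otherwise to omega_2."""
    return "omega_1" if g(x) > 0 else "omega_2"


for x in support_vectors:
    print(x, g(x), classify(x))
```

Expanding the sum gives $g(x) = -x_1 x_2$ for this setup, so the decision rule depends only on kernel evaluations against the support vectors and never on an explicit $\Phi$.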
Non-linear SVM
SVM: The non-linear case
The SVM Architecture
An SVM is a special case of a two-layer neural network with a special activation function and a different learning method.
The attractiveness of SVMs comes from their good generalization properties and simple learning.

Non-linear SVM
[Figure: decision boundaries of a linear SVM and a polynomial SVM in the input space]
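Non-linear SVM
[Figure: decision boundaries of a polynomial SVM and an RBF SVM in the input space]

Nonlinear Classifiers: SVM
[Figure: decision boundaries of a polynomial SVM and an RBF SVM in the input space]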
Nonlinear Classifiers: SVM
Software