CS 2750 Machine Learning
Lecture
Support vector machines

Milos Hauskrecht
milos@cs.pitt.edu
5329 Sennott Square

Outline

Algorithms for linear decision boundary
Support vector machines:
- Maximum margin hyperplane
- Support vectors
- Support vector machines
- Extensions to the non-separable case
- Kernel functions
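Linearly separable classes

There is a hyperplane that separates training instances with no error.
Hyperplane: $\mathbf{w}^T \mathbf{x} + w_0 = 0$
Class (+1): $\mathbf{w}^T \mathbf{x} + w_0 > 0$
Class (-1): $\mathbf{w}^T \mathbf{x} + w_0 < 0$

Logistic regression

Separating hyperplane: $\mathbf{w}^T \mathbf{x} + w_0 = 0$
[Figure: logistic regression unit over a $d$-dimensional input, with the decision $y > 0.5$?]
We can use gradient methods or Newton-Raphson for sigmoidal switching functions and learn the weights.
Recall that we learn the linear decision boundary.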
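Perceptron algorithm

Simple iterative procedure for modifying the weights of the linear model.
Initialize weights $\mathbf{w}$
Loop through examples $(\mathbf{x}, y)$ in the dataset $D$:
  1. Compute $\hat{y} = \operatorname{sign}(\mathbf{w}^T \mathbf{x})$
  2. If $y \neq \hat{y} = -1$ then $\mathbf{w} \leftarrow \mathbf{w} + \mathbf{x}$
  3. If $y \neq \hat{y} = +1$ then $\mathbf{w} \leftarrow \mathbf{w} - \mathbf{x}$
Until all examples are classified correctly
Properties: guaranteed convergence when the classes are linearly separable

Solving via LP

Linear program solution: finds weights that satisfy the following constraints:
  $\mathbf{w}^T \mathbf{x}_i + w_0 \geq 0$ for all $i$ such that $y_i = +1$
  $\mathbf{w}^T \mathbf{x}_i + w_0 \leq 0$ for all $i$ such that $y_i = -1$
Together: $y_i(\mathbf{w}^T \mathbf{x}_i + w_0) \geq 0$
Property: if there is a hyperplane separating the examples, the linear program finds the solution.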
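The perceptron loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's reference implementation: the explicit bias term `w0`, the tie-breaking rule that treats an activation of exactly 0 as +1, the `max_epochs` cap, and the toy data set are all assumptions made for the example.

```python
def sign(v):
    """Return +1 for non-negative activations and -1 otherwise (assumed tie rule)."""
    return 1 if v >= 0 else -1


def train_perceptron(data, max_epochs=100):
    """Train on a list of (x, y) pairs, x a tuple of floats, y in {+1, -1}.

    Returns (w, w0) after the first full pass with no misclassified example.
    """
    dim = len(data[0][0])
    w = [0.0] * dim
    w0 = 0.0  # explicit bias; the slides may fold it into w instead
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            y_hat = sign(sum(wj * xj for wj, xj in zip(w, x)) + w0)
            if y_hat != y:
                # Misclassified y = +1: add x. Misclassified y = -1: subtract x.
                w = [wj + y * xj for wj, xj in zip(w, x)]
                w0 += y
                mistakes += 1
        if mistakes == 0:
            return w, w0
    raise RuntimeError("no separating hyperplane found within max_epochs")


# Hypothetical linearly separable toy data.
toy_data = [((2.0, 2.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-2.0, -1.0), -1)]
w, w0 = train_perceptron(toy_data)
```

The cap on epochs is there because the loop would never terminate on data that is not linearly separable.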
Optimal separating hyperplane

There are multiple hyperplanes that separate the data points. Which one to choose?
Maximum margin choice: maximizes the distance $d_+ + d_-$, where $d_+$ is the shortest distance of a positive example from the hyperplane (similarly $d_-$ for negative examples).
[Figure: separating hyperplane with the margin and the distances $d_-$ and $d_+$]

Maximum margin hyperplane

For the maximum margin hyperplane only examples on the margin matter (only these affect the distances).
These are called support vectors.
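Finding maximum margin hyperplanes

Assume that examples in the training set are $(\mathbf{x}_i, y_i)$ such that $y_i \in \{+1, -1\}$.
Assume that all data satisfy:
  $\mathbf{w}^T \mathbf{x}_i + w_0 \geq 1$ for $y_i = +1$
  $\mathbf{w}^T \mathbf{x}_i + w_0 \leq -1$ for $y_i = -1$
The inequalities can be combined as:
  $y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1 \geq 0$ for all $i$
Equalities define two hyperplanes:
  $\mathbf{w}^T \mathbf{x}_i + w_0 = 1$
  $\mathbf{w}^T \mathbf{x}_i + w_0 = -1$

Finding the maximum margin hyperplane

Distance of a point $\mathbf{x}$ with label 1 from the hyperplane:
  $d(\mathbf{x}) = (\mathbf{w}^T \mathbf{x} + w_0) / \|\mathbf{w}\|_{L_2}$
  $\mathbf{w}$ - normal to the hyperplane
  $\|\cdot\|_{L_2}$ - Euclidean norm
Distance of a point $\mathbf{x}'$ with label -1:
  $d(\mathbf{x}') = -(\mathbf{w}^T \mathbf{x}' + w_0) / \|\mathbf{w}\|_{L_2}$
Distance of a point with label $y$:
  $\rho_{\mathbf{w}, w_0}(\mathbf{x}, y) = y(\mathbf{w}^T \mathbf{x} + w_0) / \|\mathbf{w}\|_{L_2}$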
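As a small numerical illustration of the distance formula for a point with label $y$, the sketch below evaluates $y(\mathbf{w}^T \mathbf{x} + w_0) / \|\mathbf{w}\|_{L_2}$ directly. The hyperplane and the points are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
import math


def geometric_margin(w, w0, x, y):
    """Signed distance y * (w^T x + w0) / ||w||_2 of a labelled point.

    Positive when the point is on the side of the hyperplane that matches
    its label, negative when it is on the wrong side.
    """
    norm = math.sqrt(sum(wj * wj for wj in w))
    return y * (sum(wj * xj for wj, xj in zip(w, x)) + w0) / norm


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w||_2 = 5.
w, w0 = (3.0, 4.0), -5.0
d_pos = geometric_margin(w, w0, (3.0, 4.0), +1)    # (9 + 16 - 5) / 5
d_neg = geometric_margin(w, w0, (-1.0, -3.0), -1)  # -(-3 - 12 - 5) / 5
d_bad = geometric_margin(w, w0, (3.0, 4.0), -1)    # same point, wrong label
```

A negative value, as for `d_bad`, signals that the point lies on the wrong side of the hyperplane for its label.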
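Finding the maximum margin hyperplane

Geometrical margin: $\rho_{\mathbf{w}, w_0}(\mathbf{x}, y) = y(\mathbf{w}^T \mathbf{x} + w_0) / \|\mathbf{w}\|_{L_2}$
For points satisfying $y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1 = 0$ the distance is $1 / \|\mathbf{w}\|_{L_2}$.
Width of the margin: $d_+ + d_- = 2 / \|\mathbf{w}\|_{L_2}$

Maximum margin hyperplane

We want to maximize $d_+ + d_- = 2 / \|\mathbf{w}\|_{L_2}$.
We do it by minimizing $\|\mathbf{w}\|_{L_2}^2 / 2 = \mathbf{w}^T \mathbf{w} / 2$, where $\mathbf{w}, w_0$ are the variables.
But we also need to enforce the constraints on points:
  $[y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1] \geq 0$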
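The step from the geometrical margin to the quadratic objective can be spelled out as a short worked derivation. It uses only the definitions above; the numerical instance at the end, $\mathbf{w} = (3, 4)$, is a hypothetical value added for illustration.

```latex
\begin{align*}
\rho_{\mathbf{w},w_0}(\mathbf{x}_i, y_i)
  &= \frac{y_i(\mathbf{w}^T\mathbf{x}_i + w_0)}{\|\mathbf{w}\|_{L_2}}
   = \frac{1}{\|\mathbf{w}\|_{L_2}}
  && \text{for points with } y_i(\mathbf{w}^T\mathbf{x}_i + w_0) = 1, \\
d_+ + d_-
  &= \frac{1}{\|\mathbf{w}\|_{L_2}} + \frac{1}{\|\mathbf{w}\|_{L_2}}
   = \frac{2}{\|\mathbf{w}\|_{L_2}}, \\
\arg\max_{\mathbf{w},w_0} \frac{2}{\|\mathbf{w}\|_{L_2}}
  &= \arg\min_{\mathbf{w},w_0} \|\mathbf{w}\|_{L_2}
   = \arg\min_{\mathbf{w},w_0} \tfrac{1}{2}\,\mathbf{w}^T\mathbf{w}, \\
\mathbf{w} = (3, 4):\quad
  \|\mathbf{w}\|_{L_2} &= 5, \qquad d_+ + d_- = \tfrac{2}{5} = 0.4 .
\end{align*}
```

Squaring the norm does not change the minimizer because the norm is non-negative, and it makes the objective differentiable everywhere.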
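Maximum margin hyperplane

Solution: incorporate the constraints into the optimization.
Optimization problem (Lagrangian):
  $J(\mathbf{w}, w_0, \alpha) = \|\mathbf{w}\|^2 / 2 - \sum_{i=1}^{n} \alpha_i [y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1]$
  $\alpha_i \geq 0$ - Lagrange multipliers
Minimize with respect to $\mathbf{w}, w_0$ (primal variables).
Maximize with respect to $\alpha$ (dual variables).
Lagrange multipliers enforce the satisfaction of constraints:
  If $[y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1] > 0$ then $\alpha_i \rightarrow 0$
  Else $\alpha_i > 0$ (active constraint)

Max margin hyperplane solution

Set derivatives to 0 (Karush-Kuhn-Tucker (KKT) conditions):
  $\nabla_{\mathbf{w}} J(\mathbf{w}, w_0, \alpha) = \mathbf{w} - \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i = \mathbf{0}$
  $\partial J(\mathbf{w}, w_0, \alpha) / \partial w_0 = -\sum_{i=1}^{n} \alpha_i y_i = 0$
Now we need to solve for the Lagrange parameters (Wolfe dual); maximize:
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T \mathbf{x}_j)$
Subject to constraints: $\alpha_i \geq 0$ for all $i$, and $\sum_{i=1}^{n} \alpha_i y_i = 0$
Quadratic optimization problem: solution $\hat{\alpha}_i$ for all $i$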
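The Wolfe dual follows from the Lagrangian by substituting the two stationarity conditions back in. The derivation below is a sketch of that missing step, using the same symbols as the slides.

```latex
\begin{align*}
J(\mathbf{w}, w_0, \alpha)
  &= \tfrac{1}{2}\mathbf{w}^T\mathbf{w}
     - \sum_{i=1}^{n} \alpha_i y_i \mathbf{w}^T\mathbf{x}_i
     - w_0 \sum_{i=1}^{n} \alpha_i y_i
     + \sum_{i=1}^{n} \alpha_i \\
\intertext{With $\mathbf{w} = \sum_{i} \alpha_i y_i \mathbf{x}_i$ and $\sum_{i} \alpha_i y_i = 0$:}
\mathbf{w}^T\mathbf{w}
  &= \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T\mathbf{x}_j),
  \qquad
  \sum_{i=1}^{n} \alpha_i y_i \mathbf{w}^T\mathbf{x}_i
   = \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T\mathbf{x}_j), \\
J(\alpha)
  &= \sum_{i=1}^{n} \alpha_i
     - \tfrac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T\mathbf{x}_j).
\end{align*}
```

The $w_0$ term vanishes because of the constraint $\sum_i \alpha_i y_i = 0$, which is why that constraint reappears explicitly in the dual problem.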
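Maximum hyperplane solution

The resulting parameter vector $\hat{\mathbf{w}}$ can be expressed as:
  $\hat{\mathbf{w}} = \sum_{i=1}^{n} \hat{\alpha}_i y_i \mathbf{x}_i$, where $\hat{\alpha}_i$ is the solution of the dual problem
The parameter $w_0$ is obtained through the Karush-Kuhn-Tucker conditions:
  $\hat{\alpha}_i [y_i(\hat{\mathbf{w}}^T \mathbf{x}_i + w_0) - 1] = 0$
Solution properties:
- $\hat{\alpha}_i = 0$ for all points that are not on the margin
- $\hat{\mathbf{w}}$ is a linear combination of support vectors only
The decision boundary:
  $\hat{\mathbf{w}}^T \mathbf{x} + w_0 = \sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0 = 0$

Support vector machines

The decision boundary:
  $\hat{\mathbf{w}}^T \mathbf{x} + w_0 = \sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0 = 0$
The decision:
  $\hat{y} = \operatorname{sign}\left[\sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0\right]$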
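To make the recovery of $\hat{\mathbf{w}}$, $w_0$ and the decision rule concrete, here is a minimal sketch on a hypothetical two-point training set. For this data the dual reduces to $J(\alpha) = 2\alpha - 4\alpha^2$ with $\alpha_1 = \alpha_2 = \alpha$, so $\hat{\alpha} = 1/4$ was worked out by hand; no quadratic programming solver is involved, and the names `decide`, `sv` and the tolerance `1e-12` are illustrative choices.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical training set and its hand-derived dual solution.
X = [(1.0, 1.0), (-1.0, -1.0)]
y = [1, -1]
alpha = [0.25, 0.25]

# Support vectors are the examples with alpha_i > 0.
sv = [i for i, a in enumerate(alpha) if a > 1e-12]

# w_hat = sum_i alpha_i * y_i * x_i, summed over support vectors only.
w_hat = [sum(alpha[i] * y[i] * X[i][j] for i in sv) for j in range(2)]

# KKT: a support vector satisfies y_i * (w^T x_i + w0) = 1.
# Since y_i is +1 or -1, this gives w0 = y_i - w^T x_i.
i0 = sv[0]
w0 = y[i0] - dot(w_hat, X[i0])


def decide(x):
    """y_hat = sign(sum over SV of alpha_i * y_i * (x_i^T x) + w0)."""
    score = sum(alpha[i] * y[i] * dot(X[i], x) for i in sv) + w0
    return 1 if score >= 0 else -1
```

Here `w_hat` comes out as `[0.5, 0.5]` with `w0 = 0`, so the margin width $2 / \|\hat{\mathbf{w}}\|$ equals the distance between the two training points, as expected when both are support vectors.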
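Support vector machines

The decision boundary:
  $\hat{\mathbf{w}}^T \mathbf{x} + w_0 = \sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0 = 0$
The decision:
  $\hat{y} = \operatorname{sign}\left[\sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0\right]$
(!!!): The decision on a new $\mathbf{x}$ requires computing the inner product between the examples, $(\mathbf{x}_i^T \mathbf{x})$.
Similarly, the optimization depends on $(\mathbf{x}_i^T \mathbf{x}_j)$:
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T \mathbf{x}_j)$

Extension to a linearly non-separable case

Idea: allow some flexibility on crossing the separating hyperplane.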
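The remark on this page that the optimization depends on the data only through inner products can be illustrated by evaluating the dual objective $J(\alpha)$ from a precomputed matrix of inner products. This is a sketch under assumptions: the helper names `gram_matrix` and `dual_objective` and the two-point data set are hypothetical, and the code only evaluates $J(\alpha)$, it does not solve the quadratic program.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def gram_matrix(X):
    """Matrix of all pairwise inner products G[i][j] = x_i^T x_j."""
    return [[dot(xi, xj) for xj in X] for xi in X]


def dual_objective(alpha, y, G):
    """J(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j G[i][j]."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * G[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


# Hypothetical two-point data set; the inputs enter only through G.
X = [(1.0, 1.0), (-1.0, -1.0)]
y = [1, -1]
G = gram_matrix(X)

j_opt = dual_objective([0.25, 0.25], y, G)   # at the hand-derived optimum
j_low = dual_objective([0.20, 0.20], y, G)   # a nearby feasible point
j_high = dual_objective([0.30, 0.30], y, G)  # another nearby feasible point
```

Once `G` is computed, the raw input vectors are no longer needed, which is what later allows the inner product to be swapped for another similarity function.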
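Extension to the linearly non-separable case

Relax constraints with variables $\xi_i \geq 0$:
  $\mathbf{w}^T \mathbf{x}_i + w_0 \geq 1 - \xi_i$ for $y_i = +1$
  $\mathbf{w}^T \mathbf{x}_i + w_0 \leq -1 + \xi_i$ for $y_i = -1$
An error occurs if $\xi_i \geq 1$; $\sum_{i=1}^{n} \xi_i$ is the upper bound on the number of errors.
Introduce a penalty for the errors:
  minimize $\|\mathbf{w}\|^2 / 2 + C \sum_{i=1}^{n} \xi_i$
  subject to the constraints above
$C$ is set by a user; a larger $C$ leads to a larger penalty for an error.

Extension to linearly non-separable case

Lagrange multiplier form (primal problem):
  $J(\mathbf{w}, w_0, \alpha) = \|\mathbf{w}\|^2 / 2 + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i [y_i(\mathbf{w}^T \mathbf{x}_i + w_0) - 1 + \xi_i] - \sum_{i=1}^{n} \mu_i \xi_i$
Dual form after $\mathbf{w}, w_0$ are expressed (the $\xi_i$ cancel out):
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T \mathbf{x}_j)$
Subject to: $0 \leq \alpha_i \leq C$ for all $i$, and $\sum_{i=1}^{n} \alpha_i y_i = 0$
Solution: $\hat{\mathbf{w}} = \sum_{i=1}^{n} \hat{\alpha}_i y_i \mathbf{x}_i$
The difference from the separable case: $0 \leq \alpha_i \leq C$
The parameter $w_0$ is obtained through the KKT conditions.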
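For a fixed hyperplane, the smallest slack that satisfies the relaxed constraints is $\xi_i = \max(0,\ 1 - y_i(\mathbf{w}^T \mathbf{x}_i + w_0))$, so the penalized objective can be evaluated directly. The sketch below does that for a hypothetical hyperplane, data set and value of `C`; it evaluates the objective only and does not optimize it.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def soft_margin_objective(w, w0, data, C):
    """Return (objective, slacks) for ||w||^2 / 2 + C * sum_i xi_i.

    Each slack is the smallest xi_i >= 0 satisfying
    y_i * (w^T x_i + w0) >= 1 - xi_i.
    """
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + w0)) for x, y in data]
    objective = 0.5 * dot(w, w) + C * sum(slacks)
    return objective, slacks


# Hypothetical hyperplane x1 = 0 and a small data set with one error.
w, w0 = (1.0, 0.0), 0.0
data = [
    ((2.0, 0.0), 1),    # outside the margin: slack 0
    ((0.5, 0.0), 1),    # inside the margin, correct side: slack 0.5
    ((0.5, 0.0), -1),   # wrong side: slack 1.5 (an error)
    ((-3.0, 1.0), -1),  # outside the margin: slack 0
]
objective, slacks = soft_margin_objective(w, w0, data, C=10.0)
n_errors = sum(1 for s in slacks if s >= 1.0)
```

In this example there is one error while the slacks sum to 2.0, consistent with $\sum_i \xi_i$ being an upper bound on the number of errors.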
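Support vector machines

The decision boundary:
  $\hat{\mathbf{w}}^T \mathbf{x} + w_0 = \sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0 = 0$
The decision:
  $\hat{y} = \operatorname{sign}\left[\sum_{i \in SV} \hat{\alpha}_i y_i (\mathbf{x}_i^T \mathbf{x}) + w_0\right]$
(!!!): The decision on a new $\mathbf{x}$ requires computing the inner product between the examples, $(\mathbf{x}_i^T \mathbf{x})$.
Similarly, the optimization depends on $(\mathbf{x}_i^T \mathbf{x}_j)$:
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i^T \mathbf{x}_j)$

Nonlinear case

The linear case requires computing $(\mathbf{x}_i^T \mathbf{x})$.
The non-linear case can be handled by using a set of features. Essentially we map input vectors to (larger) feature vectors:
  $\mathbf{x} \rightarrow \varphi(\mathbf{x})$
It is possible to use the SVM formalism on feature vectors:
  $\varphi(\mathbf{x})^T \varphi(\mathbf{x}')$
Kernel function:
  $K(\mathbf{x}, \mathbf{x}') = \varphi(\mathbf{x})^T \varphi(\mathbf{x}')$
Crucial idea: if we choose the kernel function wisely, we can compute linear separation in the feature space implicitly, such that we keep working in the original input space!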
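Kernel function example

Assume $\mathbf{x} = [x_1, x_2]^T$ and a feature mapping that maps the input into a quadratic feature set:
  $\mathbf{x} \rightarrow \varphi(\mathbf{x}) = [x_1^2,\ x_2^2,\ \sqrt{2} x_1 x_2,\ \sqrt{2} x_1,\ \sqrt{2} x_2,\ 1]^T$
Kernel function for the feature space:
  $K(\mathbf{x}', \mathbf{x}) = \varphi(\mathbf{x}')^T \varphi(\mathbf{x})$
  $= x_1^2 x_1'^2 + x_2^2 x_2'^2 + 2 x_1 x_2 x_1' x_2' + 2 x_1 x_1' + 2 x_2 x_2' + 1$
  $= (x_1 x_1' + x_2 x_2' + 1)^2$
  $= (1 + \mathbf{x}^T \mathbf{x}')^2$
The computation of the linear separation in the higher dimensional space is performed implicitly in the original input space.

Nonlinear extension

Kernel trick: replace the inner product with a kernel.
A well chosen kernel leads to efficient computation.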
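The identity between the explicit quadratic feature map and the kernel $(1 + \mathbf{x}^T \mathbf{x}')^2$ can be checked numerically. The sketch below assumes the six-component feature ordering $[x_1^2, x_2^2, \sqrt{2} x_1 x_2, \sqrt{2} x_1, \sqrt{2} x_2, 1]$; the test points are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit quadratic feature map for a 2-dimensional input."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0)


def quadratic_kernel(x, xp):
    """K(x, x') = (1 + x^T x')^2, computed in the 2-dimensional input space."""
    return (1.0 + dot(x, xp)) ** 2


# Hypothetical pair of inputs.
x, xp = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(xp))     # inner product of 6-dimensional features
implicit = quadratic_kernel(x, xp)  # one 2-dimensional inner product, squared
```

Both routes give the same number, but the kernel route never builds the six-dimensional vectors, which is the saving the kernel trick relies on.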
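Kernel function example

[Figure: linear separator in the feature space; non-linear separator in the input space]

Kernel functions

Linear kernel: $K(\mathbf{x}, \mathbf{x}') = \mathbf{x}^T \mathbf{x}'$
Polynomial kernel: $K(\mathbf{x}, \mathbf{x}') = [1 + \mathbf{x}^T \mathbf{x}']^k$
Radial basis kernel: $K(\mathbf{x}, \mathbf{x}') = \exp\left[-\frac{1}{2} \|\mathbf{x} - \mathbf{x}'\|^2\right]$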
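The three kernels listed above, plugged into the SVM decision rule in place of the inner product, can be sketched as follows. The function names, the `(alpha_i, y_i, x_i)` triple layout, and the multiplier values are assumptions made for illustration; in particular the multipliers are not the result of solving the dual for the radial basis kernel.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_kernel(x, xp):
    return dot(x, xp)


def polynomial_kernel(x, xp, k=2):
    return (1.0 + dot(x, xp)) ** k


def rbf_kernel(x, xp):
    """exp(-1/2 * ||x - x'||^2), with the width fixed as on the slide."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, xp)))


def kernel_decision(x, support, w0, kernel):
    """sign(sum_i alpha_i * y_i * K(x_i, x) + w0) over (alpha_i, y_i, x_i) triples."""
    score = sum(a * yi * kernel(xi, x) for a, yi, xi in support) + w0
    return 1 if score >= 0 else -1


# Hypothetical support vectors with illustrative multipliers.
support = [(0.25, 1, (1.0, 1.0)), (0.25, -1, (-1.0, -1.0))]
label_linear = kernel_decision((2.0, 0.0), support, 0.0, linear_kernel)
label_rbf = kernel_decision((2.0, 0.0), support, 0.0, rbf_kernel)
label_far = kernel_decision((-2.0, -1.5), support, 0.0, rbf_kernel)
```

Passing the kernel as an argument keeps the decision rule itself unchanged, which mirrors the point that only the inner product needs to be replaced.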
Kernels

SVM researchers have proposed kernels for comparison of a variety of objects:
- Strings
- Trees
- Graphs
Cool thing: the SVM algorithm can now be applied to classify a variety of objects.