An Introduction to Support Vector Machines



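What is a good Decision Boundary?
Consider a two-class, linearly separable classification problem. How do we find the line (or hyperplane in n dimensions, n > 2) that separates the two classes? Any idea?
[Figure: scatter plot of the points of Class 1 and Class 2]
Pier Luigi Martelli - Systems and In Silico Biology 2015-2016 - University of Bologna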

Perceptron
Perceptrons can be used to find the separating hyperplane. They are feed-forward architectures, where the neurons are organized into hierarchical layers and the signal flows in just 1 direction. Perceptrons have 2 layers: input and output. With weights $w_{ij}$ connecting input $j$ to output $i$ and a transfer function $g$, the output is $z_i = g\left(\sum_j w_{ij} x_j\right)$.

Response of an output neuron
Given a step-like transfer function, the output neuron of a perceptron is activated if the activation is positive: $\sum_{i=1}^{d} w_i x_i > 0$. The input space is then divided into two regions by a hyperplane with equation $\sum_{i=1}^{d} w_i x_i = 0$. In vectorial notation: $W \cdot X = 0$, where $X$ is the vector of input values and $W$ is the vector of weights connecting the input neurons with the output.

What is a good Decision Boundary?
Consider a two-class, linearly separable classification problem. Many decision boundaries are possible! The Perceptron algorithm can be used to find such a boundary. Are all decision boundaries equally good?
[Figure: Class 1 and Class 2 with several candidate boundaries]
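The perceptron learning rule mentioned on this slide can be sketched as follows. This is a minimal pure-Python illustration, not the lecture's own code: the function name `perceptron_train`, the toy data and the epoch limit are assumptions made for the example. It returns some separating hyperplane, with no guarantee about how far it lies from the data.

```python
def perceptron_train(points, labels, epochs=100):
    """Return (w, b) for a separating hyperplane, assuming separable data."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                errors += 1
        if errors == 0:  # every point is on the correct side: stop
            break
    return w, b


# Hypothetical toy data: two points per class, labels in {+1, -1}.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0), (4.0, 3.0)]
labels = [-1, -1, 1, 1]

w, b = perceptron_train(points, labels)
predictions = [
    1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in points
]
print(w, b, predictions)
```

Different orderings of the data generally give different boundaries, which is what motivates the question of which boundary is best.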

Examples of Bad Decision Boundaries
[Figure: two panels, each showing Class 1 and Class 2 with a boundary passing close to the data]

Large-margin Decision Boundary
The decision boundary should be as far away from the data of both classes as possible. We should maximize the margin, m.
[Figure: Class 1 and Class 2 separated by a margin of width m]

Vectors

Vectors: notations
A vector in an n-dimensional space is described by an n-uple of real numbers. Vector symbols can be written:
with an arrow on top of the vector name: $\vec{A}$, $\vec{B}$
with a bold character: $\mathbf{A}$, $\mathbf{B}$
as column matrices: $A = (A_1, A_2)^T$, $B = (B_1, B_2)^T$

Vectors: sum
The components of the sum vector are the sums of the components: $C = A + B$, with $C_1 = A_1 + B_1$ and $C_2 = A_2 + B_2$.

Vectors: difference
The components of the difference vector are the differences of the components: $C = B - A$, with $C_1 = B_1 - A_1$ and $C_2 = B_2 - A_2$, so that $A + C = B$.

Vectors: product by a scalar
The components of the product vector are the components multiplied by the scalar: $C = aA$, with $C_1 = a A_1$ and $C_2 = a A_2$ (in the figure, $C = 3A$).

Vectors: Norm
The simplest definition of a norm is the Euclidean modulus of the components: $\|A\| = \sqrt{\sum_i A_i^2}$. A norm satisfies:
1. $\|X + Y\| \le \|X\| + \|Y\|$
2. $\|aX\| = |a| \, \|X\|$
3. $\|X\| > 0$ if $X \ne 0$

Vectors: distance between two points
The distance between two points is the norm of the difference vector: $d(A, B) = \|A - B\| = \|B - A\| = \|C\| = \sqrt{\sum_i (B_i - A_i)^2}$, with $C = B - A$.

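Vectors: Scalar product
$c = \langle A, B \rangle = A^T B = \sum_i A_i B_i = \|A\| \|B\| \cos\theta$, where $\theta$ is the angle between $A$ and $B$. Properties:
1. $\langle X, Y \rangle = \langle Y, X \rangle$
2. $\langle X + Y, Z \rangle = \langle X, Z \rangle + \langle Y, Z \rangle$ and $\langle X, Y + Z \rangle = \langle X, Y \rangle + \langle X, Z \rangle$
3. $\langle aX, Y \rangle = a \langle X, Y \rangle$ and $\langle X, aY \rangle = a \langle X, Y \rangle$
4. $\langle X, X \rangle \ge 0$

Consider a reference frame where $B$ is collinear to the $x_1$ axis (you can ALWAYS find it). Then $A = (\|A\| \cos\theta, \|A\| \sin\theta)$, $B = (\|B\|, 0)$ and $c = A^T B = A_1 B_1 + A_2 B_2 = \|A\| \|B\| \cos\theta$.

The sign of the scalar product depends on the angle:
$\theta < 90^\circ$: $\langle A, B \rangle > 0$
$\theta = 90^\circ$: $\langle A, B \rangle = 0$
$\theta > 90^\circ$: $\langle A, B \rangle < 0$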

Vectors: Norm and scalar product
The norm is the square root of the scalar product of a vector with itself: $\|A\| = \sqrt{A^T A} = \sqrt{\langle A, A \rangle}$.
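The vector operations reviewed in these slides can be written in a few lines. The sketch below uses only the Python standard library; the helper names (`dot`, `norm`, `distance`, `angle_degrees`) and the two sample vectors are assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Scalar product <a, b> = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def norm(a):
    """Euclidean norm ||a|| = sqrt(<a, a>)."""
    return math.sqrt(dot(a, a))


def distance(a, b):
    """Distance between two points: the norm of the difference vector."""
    return norm([bi - ai for ai, bi in zip(a, b)])


def angle_degrees(a, b):
    """Angle between a and b, from <a, b> = ||a|| ||b|| cos(theta)."""
    return math.degrees(math.acos(dot(a, b) / (norm(a) * norm(b))))


A = (3.0, 4.0)
B = (4.0, 0.0)  # collinear with the x1 axis

print(dot(A, B))            # scalar product
print(norm(A), norm(B))     # norms
print(distance(A, B))       # ||B - A||
print(angle_degrees(A, B))  # below 90 degrees, so <A, B> is positive
```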

Definition of a hyperplane passing through the origin
In $R^2$, a hyperplane is a line. A line passing through the origin can be defined as the set of points whose position vectors $X$ are perpendicular to a given vector $W$: $\langle X, W \rangle = W^T X = 0$, that is, $W_1 X_1 + W_2 X_2 = 0$.
In $R^3$, a hyperplane is a plane. A plane passing through the origin can be defined as the set of the vectors that are perpendicular to a given vector $W$: $\langle X, W \rangle = W^T X = 0$, that is, $W_1 X_1 + W_2 X_2 + W_3 X_3 = 0$.

Definition of a hyperplane not passing through the origin
In $R^2$, a hyperplane is a line. Consider a vector $W$: it defines an infinity of straight lines perpendicular to it. A particular line is fixed when the projection of the points of the line on the vector $W$ is fixed to a value $p$:
$\|X\| \cos(\theta) = p$, that is, $\frac{\langle X, W \rangle}{\|W\|} = \frac{W^T X}{\|W\|} = p$, or $W^T X - p \|W\| = 0$.
Calling $b = -p \|W\|$, the line is $\langle X, W \rangle + b = W^T X + b = 0$, that is, $W_1 X_1 + W_2 X_2 + b = 0$, with $p = -b / \|W\|$.
[Figure: two panels showing the line at signed distance $-b/\|W\|$ from the origin along $W$, one for $b < 0$ and one for $b > 0$]

General definition of a hyperplane
In $R^n$, a hyperplane is defined by $\langle X, W \rangle + b = W^T X + b = 0$.

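A hyperplane divides the space
[Figure: points $A$ and $B$ on opposite sides of the hyperplane, with their projections $\langle A, W \rangle / \|W\|$ and $\langle B, W \rangle / \|W\|$ on $W$ compared with $-b/\|W\|$]
For a point $A$ on the side that $W$ points to: $\langle A, W \rangle = W^T A > -b$, so $\langle A, W \rangle + b > 0$.
For a point $B$ on the other side: $\langle B, W \rangle = W^T B < -b$, so $\langle B, W \rangle + b < 0$.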

Distance between a hyperplane and a point
For a hyperplane $r$ defined by $W^T X + b = 0$:
$d(A, r) = \frac{|\langle A, W \rangle + b|}{\|W\|}$, $d(B, r) = \frac{|\langle B, W \rangle + b|}{\|W\|}$

Distance between two parallel hyperplanes
For $r$: $W^T X + b = 0$ and $r'$: $W^T X + b' = 0$, which lie at signed distances $-b/\|W\|$ and $-b'/\|W\|$ from the origin along $W$:
$d(r, r') = \frac{|b - b'|}{\|W\|}$
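The two distance formulas can be checked numerically. This is a small sketch under assumed values: the function names and the hyperplane $3x_1 + 4x_2 - 10 = 0$ are made up for the example.

```python
import math


def signed_distance(w, b, x):
    """Signed distance of x from the hyperplane w.x + b = 0.

    Positive on the side that w points to, negative on the other side.
    """
    wx = sum(wi * xi for wi, xi in zip(w, x))
    return (wx + b) / math.sqrt(sum(wi * wi for wi in w))


def parallel_distance(w, b1, b2):
    """Distance between the parallel hyperplanes w.x + b1 = 0 and w.x + b2 = 0."""
    return abs(b1 - b2) / math.sqrt(sum(wi * wi for wi in w))


w, b = (3.0, 4.0), -10.0  # hypothetical hyperplane, ||w|| = 5

print(signed_distance(w, b, (5.0, 5.0)))  # point on the positive side
print(signed_distance(w, b, (0.0, 0.0)))  # the origin, on the negative side
print(signed_distance(w, b, (2.0, 1.0)))  # a point lying on the hyperplane
print(parallel_distance(w, -10.0, -20.0))
```

The unsigned distance of the slides is the absolute value of `signed_distance`; the sign tells which of the two half-spaces the point falls in.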


Large-margin Decision Boundary
The decision boundary should be as far away from the data of both classes as possible. We should maximize the margin, $m$. Maximizing $m$ is equivalent to minimizing $\|w\|$.
[Figure: Class 1 and Class 2 separated by a margin of width m]

Hyperplane Classifiers
$W \cdot X_i + b \ge +1$ for $y_i = +1$
$W \cdot X_i + b \le -1$ for $y_i = -1$

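Finding the Decision Boundary
Let $\{x_1, \ldots, x_n\}$ be our data set and let $y_i \in \{1, -1\}$ be the class label of $x_i$.
For $y_i = 1$: $w^T x_i + b \ge 1$
For $y_i = -1$: $w^T x_i + b \le -1$
So: $y_i (w^T x_i + b) \ge 1$ for all $(x_i, y_i)$
The two bounding hyperplanes $w^T x + b = \pm 1$ are parallel, so the margin between them is $m = 2 / \|w\|$.
[Figure: points of Class 1 and Class 2 labelled y = 1 or y = -1, separated by the margin m]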

Finding the Decision Boundary
The decision boundary should correctly classify all points. It can be found by solving the following constrained optimization problem:
minimize $\frac{1}{2} \|w\|^2$ subject to $y_i (w^T x_i + b) \ge 1$ for all $i$

Constrained optimization problems: Lagrange Multipliers

Aim
We want to maximise the function $z = f(x, y)$ subject to the constraint $g(x, y) = c$ (a curve in the x,y plane).

Simple solution
Solve the constraint $g(x, y) = c$ and express, for example, $y = h(x)$. Then substitute in the function $f$ and find the maximum in $x$ of $f(x, h(x))$. The analytical solution of the constraint can be very difficult.

Contour lines of a function
[Figure: contour lines of f]

Contour lines of f and constraint
[Figure: contour lines of f with the constraint curve g(x, y) = c]

Lagrange Multipliers
Suppose we walk along the constraint line $g(x, y) = c$. In general the contour lines of $f$ are distinct from the constraint $g(x, y) = c$. While moving along the constraint line, the value of $f$ varies (that is, different contour levels of $f$ are intersected). Only where the constraint line touches a contour line of $f$ tangentially do we neither increase nor decrease the value of $f$: the function $f$ is at its local max or min along the constraint.

Geometrical interpretation
Contour line and constraint are tangential: their local perpendiculars are parallel.

The local perpendicular to a curve: Gradient
Given a curve $g(x, y) = c$, the gradient of $g$ is $\nabla g = \left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right)$.
Consider 2 points of the curve, $(x, y)$ and $(x + \varepsilon_x, y + \varepsilon_y)$, for small $\varepsilon = (\varepsilon_x, \varepsilon_y)$:
$g(x + \varepsilon_x, y + \varepsilon_y) \approx g(x, y) + \varepsilon_x \frac{\partial g}{\partial x} + \varepsilon_y \frac{\partial g}{\partial y} = g(x, y) + \varepsilon^T \nabla g(x, y)$
Since both points satisfy the curve equation: $c = c + \varepsilon^T \nabla g(x, y)$, so $\varepsilon^T \nabla g(x, y) = 0$.
The gradient is perpendicular to $\varepsilon$. For small $\varepsilon$, $\varepsilon$ is parallel to the curve and, by consequence, the gradient is perpendicular to the curve.
[Figure: normal to a curve]

Lagrange Multipliers
At the point of $g(x, y) = c$ that maximizes or minimizes $f(x, y)$, the gradient of $f$ is perpendicular to the curve $g(x, y) = c$; otherwise we could increase or decrease $f$ by moving locally on the curve. So, the two gradients are parallel: $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some scalar $\lambda$ (where $\nabla$ is the gradient).

Lagrange Multipliers
Thus we want points $(x, y)$ where $g(x, y) = c$ and $\nabla f(x, y) = \lambda \nabla g(x, y)$. To incorporate these conditions into one equation, we introduce an auxiliary function (the Lagrangian)
$F(x, y, \lambda) = f(x, y) - \lambda \left( g(x, y) - c \right)$
and solve $\nabla_{x, y, \lambda} F(x, y, \lambda) = 0$.
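A short worked example may make the recipe concrete. The function $f(x, y) = xy$ and the constraint $x + y = c$ are not from the slides; they are assumed here for illustration, and the Lagrangian is written with the sign convention $F = f - \lambda (g - c)$, which gives $\nabla f = \lambda \nabla g$.

```latex
% Assumed example: maximize f(x, y) = x y subject to g(x, y) = x + y = c
\begin{align*}
F(x, y, \lambda) &= x y - \lambda (x + y - c) \\
\frac{\partial F}{\partial x} &= y - \lambda = 0 \\
\frac{\partial F}{\partial y} &= x - \lambda = 0 \\
\frac{\partial F}{\partial \lambda} &= -(x + y - c) = 0 \\
\Rightarrow\quad x &= y = \lambda = \frac{c}{2},
\qquad f\left(\tfrac{c}{2}, \tfrac{c}{2}\right) = \frac{c^2}{4}
\end{align*}
```

The same result follows from the "simple solution" of substituting $y = c - x$ and maximizing $x (c - x)$, which is possible here only because the constraint is easy to solve.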

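Recap of Constrained Optimization
Suppose we want to minimize/maximize $f(x)$ subject to $g(x) = 0$.
A necessary condition for $x_0$ to be a solution:
$\frac{\partial}{\partial x} \left( f(x) + \alpha g(x) \right) \Big|_{x = x_0} = 0$ and $g(x_0) = 0$
$\alpha$: the Lagrange multiplier.
For multiple constraints $g_i(x) = 0$, $i = 1, \ldots, m$, we need a Lagrange multiplier $\alpha_i$ for each of the constraints:
$\frac{\partial}{\partial x} \left( f(x) + \sum_{i=1}^{m} \alpha_i g_i(x) \right) \Big|_{x = x_0} = 0$ and $g_i(x_0) = 0$ for $i = 1, \ldots, m$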

Constrained Optimization: inequality
We want to maximize $f(x, y)$ with the inequality constraint $g(x, y) \le c$. The search must be confined to the red portion of the figure (the gradient of a function points towards the direction along which it increases). In each case below, $\lambda$ is the scalar in $\nabla f = \lambda \nabla g$ and the Lagrangian is $F(x, y, \lambda) = f(x, y) - \lambda \left( g(x, y) - c \right)$.

Maximize $f(x, y)$ with inequality constraint $g(x, y) \le c$. If the gradients are opposite ($\lambda < 0$), the function increases in the allowed portion: the maximum cannot be on the curve $g(x, y) = c$ (the constraint does not act). The maximum is on the constraint only if $\lambda > 0$.

Minimize $f(x, y)$ with inequality constraint $g(x, y) \le c$. If the gradients are parallel ($\lambda > 0$), the function decreases in the allowed portion: the minimum cannot be on the curve $g(x, y) = c$ (the constraint does not act). The minimum is on the constraint only if $\lambda < 0$.

Maximize $f(x, y)$ with inequality constraint $g(x, y) \ge c$. If the gradients are parallel ($\lambda > 0$), the function increases in the allowed portion: the maximum cannot be on the curve $g(x, y) = c$ (the constraint does not act). The maximum is on the constraint only if $\lambda < 0$.

Minimize $f(x, y)$ with inequality constraint $g(x, y) \ge c$. If the gradients are opposite ($\lambda < 0$), the function decreases in the allowed portion: the minimum cannot be on the curve $g(x, y) = c$ (the constraint does not act). The minimum is on the constraint only if $\lambda > 0$.

Karush-Kuhn-Tucker conditions
The function $f(x)$ subject to constraints $g_i(x) \le 0$ or $g_i(x) \ge 0$ is max-/minimized by optimizing the Lagrange function
$F(x, \alpha) = f(x) + \sum_i \alpha_i g_i(x)$
with $\alpha_i$ satisfying the following conditions:
        $g_i(x) \le 0$      $g_i(x) \ge 0$
MIN     $\alpha_i \ge 0$    $\alpha_i \le 0$
MAX     $\alpha_i \le 0$    $\alpha_i \ge 0$
and $\alpha_i g_i(x_0) = 0$ for all $i$: either the constraint acts ($x_0$ is on the curve, $g_i = 0$) or not ($\alpha_i = 0$). With this sign convention, $\nabla f = -\sum_i \alpha_i \nabla g_i$, so $\alpha_i$ plays the role of $-\lambda$.

Karush-Kuhn-Tucker complementarity condition
$\alpha_i g_i(x_0) = 0$ for all $i$ means that $\alpha_i \ne 0$ implies $g_i(x_0) = 0$, and $g_i(x_0) \ne 0$ implies $\alpha_i = 0$. The constraint is active only on the border, and cancels out in the internal regions.

Dual problem
The problem of minimizing a function $f(x)$ subject to the constraints $g_i(x) \le 0$ is solved by:
$\nabla_x L(x, \alpha) = 0$ and $\alpha_i g_i(x) = 0$, with $L(x, \alpha) = f(x) + \sum_i \alpha_i g_i(x)$
From the first equation we can find $x$ as a function of the $\alpha_i$. These relations can be substituted in the Lagrangian function, obtaining the dual Lagrangian function:
$L(\alpha) = \inf_x L(x, \alpha) = \inf_x \left[ f(x) + \sum_i \alpha_i g_i(x) \right]$
The KKT conditions impose that we search for $\alpha_i \ge 0$ for each $i$.

Concave-Convex functions
[Figure: a concave function and a convex function]

Dual problem: if f is convex, the dual L is concave
$L(\alpha) = \inf_x L(x, \alpha) = \inf_x \left[ f(x) + \sum_i \alpha_i g_i(x) \right]$
The dual Lagrangian is concave: maximising it with respect to $\alpha$, with $\alpha_i \ge 0$, solves the original constrained minimization problem. We compute $\alpha$ as:
$\max_\alpha L(\alpha) = \max_\alpha \inf_x L(x, \alpha) = \max_\alpha \inf_x \left[ f(x) + \sum_i \alpha_i g_i(x) \right]$
Then we can obtain $x$ by substituting $\alpha$ in the expression of $x$ as a function of $\alpha$.

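Dual problem: trivial example
Minimize the function $f(x) = x^2$ with the constraint $x \le -1$ (trivial: $x = -1$).
The Lagrangian is $L(x, \alpha) = x^2 + \alpha (x + 1)$.
Minimising with respect to $x$: $\frac{\partial L}{\partial x} = 0 \Rightarrow 2x + \alpha = 0 \Rightarrow x = -\frac{\alpha}{2}$
The dual Lagrangian is $L(\alpha) = \frac{\alpha^2}{4} - \frac{\alpha^2}{2} + \alpha = -\frac{\alpha^2}{4} + \alpha$
Maximising it gives $\alpha = 2$. Then, substituting, $x = -\frac{\alpha}{2} = -1$.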
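The trivial example can be verified numerically. The sketch below assumes the Lagrangian $L(x, \alpha) = x^2 + \alpha (x + 1)$ for the constraint $x + 1 \le 0$; the grid search over $\alpha$ is only a stand-in for a real optimiser, and the function names are made up for the example.

```python
def lagrangian(x, alpha):
    """L(x, alpha) = f(x) + alpha * g(x), with f(x) = x^2 and g(x) = x + 1 <= 0."""
    return x ** 2 + alpha * (x + 1)


def x_of_alpha(alpha):
    """Minimiser of L over x, from dL/dx = 2x + alpha = 0."""
    return -alpha / 2


def dual(alpha):
    """Dual Lagrangian L(alpha) = inf_x L(x, alpha) = alpha - alpha^2 / 4."""
    return lagrangian(x_of_alpha(alpha), alpha)


# Coarse grid search over alpha >= 0 (hypothetical stand-in for an optimiser).
alphas = [i / 100 for i in range(0, 501)]
best_alpha = max(alphas, key=dual)
best_x = x_of_alpha(best_alpha)

print(best_alpha, best_x, dual(best_alpha))
```

At the optimum the dual value equals the primal value $f(-1) = 1$, as expected for a convex problem.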


Finding the Decision Boundary
The decision boundary should classify all points correctly. It can be found by solving the following constrained optimization problem:
minimize $\frac{1}{2} \|w\|^2$ subject to $y_i (w^T x_i + b) \ge 1$ for all $i$
Solving it requires the use of Lagrange multipliers.

Finding the Decision Boundary
The Lagrangian is
$L = \frac{1}{2} w^T w + \sum_{i=1}^{n} \alpha_i \left( 1 - y_i (w^T x_i + b) \right)$, with $\alpha_i \ge 0$
Note that $\|w\|^2 = w^T w$.

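Gradient with respect to w and b
$L = \frac{1}{2} w^T w + \sum_{i=1}^{n} \alpha_i \left( 1 - y_i (w^T x_i + b) \right) = \frac{1}{2} \sum_{k=1}^{m} w_k w_k + \sum_{i=1}^{n} \alpha_i \left( 1 - y_i \left( \sum_{k=1}^{m} w_k x_i^k + b \right) \right)$
n: number of examples, m: dimension of the space, $x_i^k$: k-th component of $x_i$
Setting the gradient of $L$ w.r.t. $w$ and $b$ to zero, we have
$\frac{\partial L}{\partial w_k} = 0$ for all $k$: $w_k - \sum_{i=1}^{n} \alpha_i y_i x_i^k = 0$, that is, $w = \sum_{i=1}^{n} \alpha_i y_i x_i$
$\frac{\partial L}{\partial b} = 0$: $\sum_{i=1}^{n} \alpha_i y_i = 0$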

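The Dual Problem
If we substitute $w = \sum_{i=1}^{n} \alpha_i y_i x_i$ into $L$, we have
$L = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j + \sum_{i=1}^{n} \alpha_i \left( 1 - y_i \left( \sum_{j=1}^{n} \alpha_j y_j x_j^T x_i + b \right) \right)$
$= \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j - b \sum_{i=1}^{n} \alpha_i y_i$
Since $\sum_{i=1}^{n} \alpha_i y_i = 0$:
$L = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j$
This is a function of the $\alpha_i$ only.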

The Dual Problem
The new objective function is in terms of the $\alpha_i$ only. It is known as the dual problem: if we know $w$, we know all $\alpha_i$; if we know all $\alpha_i$, we know $w$. The original problem is known as the primal problem. The objective function of the dual problem needs to be maximized. The dual problem is therefore:
maximize $\sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j$
subject to $\alpha_i \ge 0$ (the property of the $\alpha_i$ when we introduce the Lagrange multipliers)
and $\sum_{i=1}^{n} \alpha_i y_i = 0$ (the result when we differentiate the original Lagrangian w.r.t. $b$)
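To make the dual objective concrete, the sketch below evaluates it for a candidate vector of multipliers and checks the two constraints. The function names and the two toy points are assumptions for illustration; a real solver would search over the multipliers rather than test hand-picked values.

```python
def dual_objective(alphas, points, labels):
    """sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j <x_i, x_j>."""
    n = len(points)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            k_ij = sum(a * b for a, b in zip(points[i], points[j]))
            quad += alphas[i] * alphas[j] * labels[i] * labels[j] * k_ij
    return sum(alphas) - 0.5 * quad


def is_feasible(alphas, labels, tol=1e-12):
    """Check alpha_i >= 0 and sum_i alpha_i y_i = 0."""
    balanced = abs(sum(a * y for a, y in zip(alphas, labels))) <= tol
    return balanced and all(a >= 0 for a in alphas)


# Hypothetical toy data: one point per class.
points = [(0.0, 0.0), (1.0, 1.0)]
labels = [1, -1]

for candidate in ([0.5, 0.5], [1.0, 1.0], [2.0, 2.0]):
    print(candidate, is_feasible(candidate, labels),
          dual_objective(candidate, points, labels))
```

Among these feasible candidates the objective is largest at equal multipliers of 1; there the dual value coincides with the primal value $\frac{1}{2} \|w\|^2$ for $w = \sum_i \alpha_i y_i x_i$.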

The Dual Problem
This is a quadratic programming (QP) problem. A global maximum over the $\alpha_i$ can always be found. $w$ can be recovered by $w = \sum_{i=1}^{n} \alpha_i y_i x_i$.

Characteristics of the Solution
Many of the $\alpha_i$ are zero: $w$ is a linear combination of a small number of data points. This sparse representation can be viewed as data compression, as in the construction of a kNN classifier.
The $x_i$ with non-zero $\alpha_i$ are called support vectors (SV). The decision boundary is determined only by the SV. Let $t_j$ ($j = 1, \ldots, s$) be the indices of the $s$ support vectors. We can write $w = \sum_{j=1}^{s} \alpha_{t_j} y_{t_j} x_{t_j}$.
Note: $w$ need not be formed explicitly.

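A Geometrical Interpretation
[Figure: points of Class 1 and Class 2 with their multipliers: $\alpha_1 = 0.8$, $\alpha_2 = 0$, $\alpha_3 = 0$, $\alpha_4 = 0$, $\alpha_5 = 0$, $\alpha_6 = 1.4$, $\alpha_7 = 0$, $\alpha_8 = 0.6$, $\alpha_9 = 0$, $\alpha_{10} = 0$; only the three points with non-zero $\alpha$ lie on the margin]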

Characteristics of the Solution
For testing with a new data point $z$, compute $w^T z + b = \sum_{j=1}^{s} \alpha_{t_j} y_{t_j} (x_{t_j}^T z) + b$ and classify $z$ as class 1 if the sum is positive, and class 2 otherwise.
Note: $w$ need not be formed explicitly.
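The test-time rule uses only the support vectors and their multipliers. The sketch below is a minimal illustration under assumed values: the support vectors, multipliers and bias are hypothetical numbers, and the function returns the labels +1 and -1 rather than "class 1" and "class 2".

```python
def decision_value(z, support_vectors, labels, alphas, b):
    """Compute sum_j alpha_j y_j <x_j, z> + b without forming w explicitly."""
    total = b
    for x, y, a in zip(support_vectors, labels, alphas):
        total += a * y * sum(xi * zi for xi, zi in zip(x, z))
    return total


def classify(z, support_vectors, labels, alphas, b):
    """Return +1 if the decision value is positive, -1 otherwise."""
    return 1 if decision_value(z, support_vectors, labels, alphas, b) > 0 else -1


# Hypothetical trained model: two support vectors, one per class.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
labels = [1, -1]
alphas = [1.0, 1.0]
b = 1.0

for z in [(-1.0, 0.0), (2.0, 2.0)]:
    print(z, decision_value(z, support_vectors, labels, alphas, b),
          classify(z, support_vectors, labels, alphas, b))
```

Because only scalar products between `z` and the support vectors appear, the cost of a prediction grows with the number of support vectors, not with the size of the training set.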

The Quadratic Programming Problem
Many approaches have been proposed: Loqo, cplex, etc. (see http://www.numerical.rl.ac.uk/qp/qp.html).
Most are interior-point methods: start with an initial solution that can violate the constraints, then improve this solution by optimizing the objective function and/or reducing the amount of constraint violation.
For SVM, sequential minimal optimization (SMO) seems to be the most popular. A QP with two variables is trivial to solve. Each iteration of SMO picks a pair $(\alpha_i, \alpha_j)$ and solves the QP with these two variables; this is repeated until convergence.
In practice, we can just regard the QP solver as a black box without bothering how it works.

Example 1
X1: (0;0), class +; X2: (1;1), class -
Scalar products:
         X1(+)   X2(-)
X1(+)    0       0
X2(-)    0       2
With $y_1 = +1$ and $y_2 = -1$, the dual function is $L(\alpha_1, \alpha_2) = \alpha_1 + \alpha_2 - \alpha_2^2$, with $\alpha_1 - \alpha_2 = 0$. Substituting $\alpha_1 = \alpha_2 = \alpha$: $L(\alpha) = 2\alpha - \alpha^2$, and $\frac{dL}{d\alpha} = 2 - 2\alpha = 0$ gives $\alpha_1 = \alpha_2 = 1$.
$W = \alpha_1 y_1 X_1 + \alpha_2 y_2 X_2 = (-1; -1)$. From $W \cdot X_1 + b = 1$: $b = 1$ (check: $W \cdot X_2 + b = -1$).
Hyperplane: $-x_1 - x_2 + 1 = 0$
Margin: $m = 2 / \|W\| = \sqrt{2}$
[Figure: the two points, the vector W and the separating hyperplane]
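For one point per class the dual can be solved in closed form: with $\alpha_1 = \alpha_2 = \alpha$ the objective is $2\alpha - \frac{1}{2} \alpha^2 \|x_+ - x_-\|^2$, so $\alpha = 2 / \|x_+ - x_-\|^2$. The sketch below applies this to the example, reading the two points as (0;0) for class + and (1;1) for class - (the second pair is inferred from the scalar-product table, so treat it as an assumption). The function name `two_point_svm` is made up for the illustration.

```python
import math


def two_point_svm(x_pos, x_neg):
    """Hard-margin SVM for exactly one positive and one negative point."""
    d = [p - q for p, q in zip(x_pos, x_neg)]
    alpha = 2.0 / sum(di * di for di in d)       # maximiser of the 1-D dual
    w = [alpha * di for di in d]                 # w = alpha * (x_pos - x_neg)
    b = 1.0 - sum(wi * xi for wi, xi in zip(w, x_pos))  # from w.x_pos + b = 1
    margin = 2.0 / math.sqrt(sum(wi * wi for wi in w))
    return alpha, w, b, margin


alpha, w, b, margin = two_point_svm((0.0, 0.0), (1.0, 1.0))
print(alpha, w, b, margin)
```

The margin returned equals the distance between the two points, as it must when each class has a single point.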

Example 2
X1: (0;0), class +; X2: (1;1), class -; X3: (-1;0), class +
Scalar products:
         X1(+)   X2(-)   X3(+)
X1(+)    0       0       0
X2(-)    0       2       -1
X3(+)    0       -1      1
$L(\alpha_1, \alpha_2, \alpha_3) = \alpha_1 + \alpha_2 + \alpha_3 - \alpha_2^2 - \frac{1}{2} \alpha_3^2 - \alpha_2 \alpha_3$, with $\alpha_1 - \alpha_2 + \alpha_3 = 0$. Substituting $\alpha_1 = \alpha_2 - \alpha_3$: $L(\alpha_2, \alpha_3) = 2\alpha_2 - \alpha_2^2 - \frac{1}{2} \alpha_3^2 - \alpha_2 \alpha_3$.
Let's try free maximization: $\frac{\partial L}{\partial \alpha_2} = 2 - 2\alpha_2 - \alpha_3 = 0$ and $\frac{\partial L}{\partial \alpha_3} = -\alpha_3 - \alpha_2 = 0$ give $\alpha_3 = -\alpha_2$. They cannot both be strictly > 0: either $\alpha_2$ or $\alpha_3$ is equal to 0 (at least one constraint acts).
If $\alpha_2 = 0$ and $\alpha_3 > 0$, then $\alpha_1 = -\alpha_3 < 0$: NO.
If $\alpha_3 = 0$ and $\alpha_2 > 0$, then $\alpha_1 = \alpha_2 > 0$: OK. In this case $L = 2\alpha_2 - \alpha_2^2$ is maximal at $\alpha_2 = 1$.
Solution: $\alpha_3 = 0$, $\alpha_1 = 1$, $\alpha_2 = 1$.
$W = (-1; -1)$, $b = 1$; hyperplane: $-x_1 - x_2 + 1 = 0$.
X3 ($\alpha_3 = 0$) is NOT a support vector: $W \cdot X_3 + b = 2 > 1$.
[Figure: the three points, the vector W and the separating hyperplane]
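The solution of the three-point example can be checked against the constraints and the KKT complementarity condition. The coordinates below, (0;0), (1;1) and (-1;0), are read from the scalar-product table and should be treated as an assumption, as should the variable names.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Assumed reading of the example: X1 (+), X2 (-), X3 (+).
points = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.0)]
labels = [1, -1, 1]
alphas = [1.0, 1.0, 0.0]

# w = sum_i alpha_i y_i x_i
w = [sum(a * y * x[k] for a, y, x in zip(alphas, labels, points)) for k in range(2)]

# b from any support vector s (alpha_s > 0): y_s (w.x_s + b) = 1, so b = y_s - w.x_s
s = next(i for i, a in enumerate(alphas) if a > 0)
b = labels[s] - dot(w, points[s])

# Functional margins y_i (w.x_i + b): all must be >= 1.
margins = [y * (dot(w, x) + b) for x, y in zip(points, labels)]
support = [i for i, a in enumerate(alphas) if a > 0]

print(w, b, margins, support)
```

Points whose functional margin is strictly larger than 1 must have a zero multiplier, which is why the third point does not contribute to `w`.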

Example 3
X1: (0;0), class +; X2: (;), class +; X3: (0;), class -; X4: (/;0), class -

Non-linearly Separable Problems
We allow an error $\xi_i$ in classification; it is based on the output of the discriminant function $w^T x + b$.
[Figure: Class 1 and Class 2 with some points on the wrong side of the margin]

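Soft Margin Hyperplane
The new conditions become:
$w^T x_i + b \ge 1 - \xi_i$ for $y_i = 1$
$w^T x_i + b \le -1 + \xi_i$ for $y_i = -1$
$\xi_i \ge 0$ for all $i$
The $\xi_i$ are slack variables in the optimization. Note that $\xi_i = 0$ if there is no error for $x_i$, and $\sum_i \xi_i$ is an upper bound on the number of errors.
We want to minimize $\frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i$
C: tradeoff parameter between error and margin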
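For a fixed hyperplane, the smallest slack that satisfies the soft-margin conditions is $\xi_i = \max(0, 1 - y_i (w^T x_i + b))$. The sketch below computes these slacks and the objective for two values of C; the hyperplane, the data and the function names are assumptions chosen for illustration.

```python
def slacks(w, b, points, labels):
    """Smallest slack per point: xi_i = max(0, 1 - y_i (w.x_i + b))."""
    result = []
    for x, y in zip(points, labels):
        f = sum(wi * xi for wi, xi in zip(w, x)) + b
        result.append(max(0.0, 1.0 - y * f))
    return result


def soft_margin_objective(w, b, points, labels, C):
    """1/2 ||w||^2 + C * sum_i xi_i."""
    return 0.5 * sum(wi * wi for wi in w) + C * sum(slacks(w, b, points, labels))


# Hypothetical hyperplane and data: the last two points violate the margin.
w, b = (-1.0, -1.0), 1.0
points = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.0), (2.0, 0.0)]
labels = [1, -1, 1, 1]

xi = slacks(w, b, points, labels)
print(xi)
print(soft_margin_objective(w, b, points, labels, C=1.0))
print(soft_margin_objective(w, b, points, labels, C=10.0))
```

A slack between 0 and 1 marks a point inside the margin but still on the correct side; a slack above 1 marks a misclassified point. Raising C makes the same violations weigh more in the objective.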

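The Optimization Problem
$L = \frac{1}{2} w^T w + C \sum_{i=1}^{n} \xi_i + \sum_{i=1}^{n} \alpha_i \left( 1 - \xi_i - y_i (w^T x_i + b) \right) - \sum_{i=1}^{n} \mu_i \xi_i$
with $\alpha_i$ and $\mu_i$ Lagrange multipliers, POSITIVE ($\alpha_i \ge 0$, $\mu_i \ge 0$).
$\frac{\partial L}{\partial w_j} = w_j - \sum_{i=1}^{n} \alpha_i y_i x_i^j = 0$, that is, $w = \sum_{i=1}^{n} \alpha_i y_i x_i$
$\frac{\partial L}{\partial \xi_j} = C - \alpha_j - \mu_j = 0$
$\frac{\partial L}{\partial b} = -\sum_{i=1}^{n} y_i \alpha_i = 0$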

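The Dual Problem
Substituting $w = \sum_{j=1}^{n} \alpha_j y_j x_j$:
$L = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j + C \sum_{i=1}^{n} \xi_i + \sum_{i=1}^{n} \alpha_i \left( 1 - \xi_i - y_i \left( \sum_{j=1}^{n} \alpha_j y_j x_j^T x_i + b \right) \right) - \sum_{i=1}^{n} \mu_i \xi_i$
With $\sum_{i=1}^{n} y_i \alpha_i = 0$ and $C = \alpha_j + \mu_j$:
$L = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j$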

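The Optimization Problem
The dual of this new constrained optimization problem is:
maximize $\sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j$
subject to $C \ge \alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$
The new constraints derive from $C = \alpha_j + \mu_j$, since $\mu_j$ and $\alpha_j$ are positive.
$w$ is recovered as $w = \sum_{j=1}^{n} \alpha_j y_j x_j$.
This is very similar to the optimization problem in the linearly separable case, except that there is now an upper bound $C$ on the $\alpha_i$. Once again, a QP solver can be used to find the $\alpha_i$.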

Minimizing $\frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i$
The algorithm tries to keep the $\xi_i$ null while maximising the margin. It does not minimise the number of errors. Instead, it minimises the sum of the distances from the hyperplane. When $C$ increases, the number of errors tends to decrease. In the limit of $C$ tending to infinity, the solution tends to that given by the hard margin formulation, with 0 errors.

Soft margin is more robust
[Figure: comparison of hard-margin and soft-margin boundaries]