Multisurface Proximal Support Vector Machine Classification via Generalized Eigenvalues


Multisurface Proximal Support Vector Machine Classification via Generalized Eigenvalues
O. L. Mangasarian and E. W. Wild
Presented by: Jun Fang

Outline

1. Conventional Support Vector Machine
2. Proximal Support Vector Machine
3. Multisurface Proximal Support Vector Machine
4. Simulation Results

1. Conventional SVM

We consider the problem of classifying m points {x_i} in the n-dimensional real space R^n, represented by the m × n matrix A. Let A+ denote the points belonging to class 1 and A− the points belonging to class 0. D is a diagonal matrix whose i-th diagonal entry is the label y_i ∈ {+1, −1} of the i-th point. The conventional SVM with a linear kernel is given by the following quadratic programming problem with parameter c > 0:

    min_{w,b,ρ}  (1/2)||w||^2 + c e^T ρ                      (1)
    s.t.  D(Aw − be) ≥ e − ρ
          ρ_i ≥ 0,  i = 1, ..., m
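Problem (1) is equivalent to unconstrained minimization of the hinge loss (1/2)||w||^2 + c Σ_i max(0, 1 − y_i(w^T x_i − b)). As a minimal sketch, not the solver used in the paper, plain subgradient descent on this form can be run on synthetic two-cluster data (the data and all names below are illustrative):

```python
import numpy as np

# Synthetic, linearly separable toy data (illustrative, not from the slides).
rng = np.random.default_rng(0)
A_pos = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(40, 2))    # class +1
A_neg = rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(40, 2))  # class -1
X = np.vstack([A_pos, A_neg])
y = np.hstack([np.ones(40), -np.ones(40)])

def train_linear_svm(X, y, c=1.0, epochs=200):
    """Subgradient descent on the unconstrained hinge-loss form of (1):
    (1/2)||w||^2 + c * sum_i max(0, 1 - y_i (w.x_i - b))."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for t in range(1, epochs + 1):
        eta = 1.0 / t                       # decaying step size
        viol = y * (X @ w - b) < 1          # margin violators
        grad_w = w - c * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = c * y[viol].sum()
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b

w, b = train_linear_svm(X, y)
pred = np.sign(X @ w - b)                   # separating plane x'w = b
```

On separable data like this, the learned plane classifies every training point correctly.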

1. Conventional SVM (Cont.)

Figure 1: The conventional SVM classifier. Two parallel bounding planes, x^T w − b = +1 bounding A+ and x^T w − b = −1 bounding A−, approximately separate A− from A+; the margin between them is 2/||w|| and the separating plane is x^T w = b.

2. Proximal SVM

A simple but fundamental change to optimization problem (1) yields the proximal SVM: the inequality constraint is replaced by an equality, and the term e^T ρ is replaced by ||ρ||^2. The proximal SVM is given by the following optimization problem:

    min_{w,b,ρ}  (1/2)||w||^2 + c ||ρ||^2                    (2)
    s.t.  D(Aw − be) = e − ρ

The planes are no longer bounding planes; they can instead be thought of as proximal planes, around which the points of each class cluster and which are pushed apart as far as possible. Because the constraint is an equality, the proximal SVM optimization problem admits an analytical solution.
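The analytical solution can be sketched as follows: eliminating ρ = e − D(Aw − be) turns (2) into an unconstrained least-squares problem, and setting its gradient to zero gives a single linear system. A numpy sketch under the slide's formulation, which regularizes w but not b (the toy data and names are illustrative):

```python
import numpy as np

# Illustrative toy data.
rng = np.random.default_rng(1)
A = np.vstack([rng.normal([2.0, 2.0], 0.3, (30, 2)),
               rng.normal([-2.0, -2.0], 0.3, (30, 2))])
y = np.hstack([np.ones(30), -np.ones(30)])

def proximal_svm(A, y, c=1.0):
    """Closed-form solution of (2): substitute rho = e - D(Aw - be) and set
    the gradient of (1/2)||w||^2 + c||rho||^2 to zero, one linear solve."""
    m, n = A.shape
    E = y[:, None] * np.hstack([A, -np.ones((m, 1))])  # E = D [A  -e]
    J = np.diag(np.r_[np.ones(n), 0.0])                # regularizes w only
    z = np.linalg.solve(J / (2.0 * c) + E.T @ E, E.T @ np.ones(m))
    return z[:n], z[n]                                 # w, b

w, b = proximal_svm(A, y)
pred = np.sign(A @ w - b)
```

The solve makes y_i(w^T x_i − b) ≈ 1 for every point, so the two proximal planes pass through the two clusters.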

2. Proximal SVM (Cont.)

Figure 2: The proximal SVM classifier. The two parallel planes x^T w − b = ±1 are placed so that the points of A− and A+ cluster around them; the margin 2/||w|| and the separating plane x^T w = b are as in Figure 1.

3. Multisurface Proximal SVM

The multisurface proximal SVM drops the parallelism condition on the proximal planes and instead requires that each plane be as close as possible to one of the data sets and as far as possible from the other. The problem is to seek two nonparallel planes

    x^T w_1 − b_1 = 0   and   x^T w_2 − b_2 = 0.

To obtain the first plane, we minimize the distances between the points of class 0 and the plane while simultaneously maximizing the distances between the points of class 1 and the plane. This leads to the following optimization problem:

    min_{w_1,b_1}  ( ||A− w_1 − b_1 e||^2 + c(||w_1||^2 + b_1^2) ) / ||A+ w_1 − b_1 e||^2     (3)

3. Multisurface Proximal SVM (Cont.)

Let G = [A−  −e]^T [A−  −e] + cI, let H = [A+  −e]^T [A+  −e], and let z = [w_1; b_1]. Optimization problem (3) then becomes

    min_{z ≠ 0}  (z^T G z) / (z^T H z)                       (4)

The objective function of (4) is known as a Rayleigh quotient. Under the condition that H is positive definite, the Rayleigh quotient ranges over the interval [λ_1, λ_{n+1}] for normalized z, where λ_1 and λ_{n+1} are the minimum and maximum eigenvalues of the generalized eigenvalue problem

    G z = λ H z                                              (5)

The minimizer of (4) is therefore the eigenvector corresponding to the smallest eigenvalue λ_1.
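Concretely, the first plane comes from one call to a generalized symmetric eigensolver. A sketch with scipy on illustrative data (it assumes H is positive definite, as required above):

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative data: A- clustered at (-2,-2), A+ at (2,2).
rng = np.random.default_rng(2)
A_neg = rng.normal([-2.0, -2.0], 0.3, (30, 2))   # class 0 points
A_pos = rng.normal([2.0, 2.0], 0.3, (30, 2))     # class 1 points
c = 1.0

M_neg = np.hstack([A_neg, -np.ones((30, 1))])    # [A-  -e]
M_pos = np.hstack([A_pos, -np.ones((30, 1))])    # [A+  -e]
G = M_neg.T @ M_neg + c * np.eye(3)
H = M_pos.T @ M_pos                              # positive definite here

vals, vecs = eigh(G, H)          # solves G z = lambda H z, ascending lambdas
z = vecs[:, 0]                   # eigenvector of the smallest eigenvalue
w1, b1 = z[:2], z[2]             # first plane: x'w1 - b1 = 0
```

By construction the plane hugs A− and stays away from A+: the mean point-to-plane distance is much smaller for A− than for A+.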

3. Multisurface Proximal SVM (Cont.)

The requirement that H be positive definite means that the columns of the matrix [A+  −e] are linearly independent. This linear-independence condition is not restrictive for many classification problems, in which m_0 ≫ n and m_1 ≫ n, where m_0 and m_1 denote the number of data points belonging to class 0 and class 1, respectively. The authors also note that linear independence is a sufficient but not a necessary condition. The second plane x^T w_2 − b_2 = 0 is obtained in the same way, by solving the following optimization problem:

    min_{w_2,b_2}  ( ||A+ w_2 − b_2 e||^2 + c(||w_2||^2 + b_2^2) ) / ||A− w_2 − b_2 e||^2     (6)
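Putting (3) and (6) together: solving the eigenproblem twice gives both planes, and a new point is assigned to the class of the nearer plane, which is the decision rule used by GEPSVM. A self-contained sketch on illustrative blob data:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
A_neg = rng.normal([-2.0, -2.0], 0.3, (30, 2))   # class 0 (A-)
A_pos = rng.normal([2.0, 2.0], 0.3, (30, 2))     # class 1 (A+)

def gepsvm_plane(P, Q, c=1.0):
    """Plane proximal to P and far from Q: minimum-eigenvalue eigenvector
    of G z = lambda H z, with G built from P and H from Q."""
    Mp = np.hstack([P, -np.ones((P.shape[0], 1))])
    Mq = np.hstack([Q, -np.ones((Q.shape[0], 1))])
    G = Mp.T @ Mp + c * np.eye(Mp.shape[1])
    H = Mq.T @ Mq
    vals, vecs = eigh(G, H)          # eigenvalues in ascending order
    z = vecs[:, 0]
    return z[:-1], z[-1]

w1, b1 = gepsvm_plane(A_neg, A_pos)  # plane for class 0, problem (3)
w2, b2 = gepsvm_plane(A_pos, A_neg)  # plane for class 1, problem (6)

def classify(X):
    """Assign each row of X to the class of the nearer plane."""
    d1 = np.abs(X @ w1 - b1) / np.linalg.norm(w1)
    d2 = np.abs(X @ w2 - b2) / np.linalg.norm(w2)
    return np.where(d1 < d2, 0, 1)

pred_neg = classify(A_neg)
pred_pos = classify(A_pos)
```

Each training point sits close to its own class's plane and far from the other, so the nearest-plane rule recovers the labels.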

3. Multisurface Proximal SVM (Cont.)

The above results extend to nonlinear multisurface classifiers by means of kernels. A kernel-based nonlinear surface is obtained by generalizing (3) to the following:

    min_{u_1,b_1}  ( ||K(A−, A^T) u_1 − b_1 e||^2 + c(||u_1||^2 + b_1^2) ) / ||K(A+, A^T) u_1 − b_1 e||^2     (7)

Optimization problem (7) can be solved in the same way as (3). The multisurface proximal SVM has two main advantages: first, it is much more effective on the cross-planes problem (Fig. 3); second, with a linear kernel it can handle very large data sets, provided the input-space dimension n is moderate in size.
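A sketch of the kernelized problem (7) with a Gaussian kernel. One caveat of this sketch: in the kernel case H has more columns than rows contributing to it and is therefore rank-deficient, so a small ridge is added to H purely for numerical solvability; that ridge is an assumption of this sketch, not part of the formulation on the slide. All data and names are illustrative:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(4)
A_neg = rng.normal([-2.0, -2.0], 0.3, (25, 2))   # class 0 (A-)
A_pos = rng.normal([2.0, 2.0], 0.3, (25, 2))     # class 1 (A+)
A = np.vstack([A_neg, A_pos])                    # all m training points

def rbf(X, Y, gamma=0.5):
    """Gaussian kernel matrix K(X, Y^T)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_plane(P, Q, A, c=1.0, ridge=1e-4):
    """Kernel surface K(x^T, A^T) u - b = 0 proximal to P, far from Q (7)."""
    Mp = np.hstack([rbf(P, A), -np.ones((P.shape[0], 1))])
    Mq = np.hstack([rbf(Q, A), -np.ones((Q.shape[0], 1))])
    G = Mp.T @ Mp + c * np.eye(Mp.shape[1])
    # H = Mq'Mq is singular (rank <= rows of Q < columns), so a small
    # ridge makes it positive definite; an assumption of this sketch.
    H = Mq.T @ Mq + ridge * np.eye(Mq.shape[1])
    vals, vecs = eigh(G, H)
    z = vecs[:, 0]                   # minimum-eigenvalue eigenvector
    return z[:-1], z[-1]             # u, b

u1, b1 = kernel_plane(A_neg, A_pos, A)
resid_neg = np.abs(rbf(A_neg, A) @ u1 - b1)   # small: A- lies near the surface
resid_pos = np.abs(rbf(A_pos, A) @ u1 - b1)   # large: A+ lies far from it
```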

3. Multisurface Proximal SVM (Cont.)

Figure 3: The cross-planes data set as learned by the multisurface proximal SVM and by the linear proximal SVM, respectively.

4. Simulation Results

(The result tables shown on slides 12 through 14 did not survive transcription.)