Incorporating detractors into SVM classification

Similar documents
Support Vector Machine (continued)

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Support Vector Machines

Support Vector Machines: Maximum Margin Classifiers

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Convex Optimization and Support Vector Machine

Machine Learning. Lecture 6: Support Vector Machine. Feng Li.

Machine Learning Support Vector Machines. Prof. Matteo Matteucci

Support Vector Machines for Classification and Regression. 1 Linearly Separable Data: Hard Margin SVMs

Jeff Howbert Introduction to Machine Learning Winter

CSC 411 Lecture 17: Support Vector Machine

Lecture 9: Large Margin Classifiers. Linear Support Vector Machines

Machine Learning. Support Vector Machines. Manfred Huber

Linear vs Non-linear classifier. CS789: Machine Learning and Neural Network. Introduction

Support Vector Machine (SVM) and Kernel Methods

Statistical Machine Learning from Data

ICS-E4030 Kernel Methods in Machine Learning

Support Vector Machines for Classification and Regression

Support Vector Machine. Industrial AI Lab.

Statistical Pattern Recognition

Support Vector Machine (SVM) and Kernel Methods

L5 Support Vector Classification

Support Vector Machines

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Support Vector Machine (SVM) and Kernel Methods

CS798: Selected topics in Machine Learning

SVM optimization and Kernel methods

Perceptron Revisited: Linear Separators. Support Vector Machines

SMO Algorithms for Support Vector Machines without Bias Term

Support Vector Machines.

Applied inductive learning - Lecture 7

Support Vector Machine. Industrial AI Lab. Prof. Seungchul Lee

Introduction to Support Vector Machines

CS6375: Machine Learning Gautam Kunapuli. Support Vector Machines

Support Vector Machines

Support Vector Machines. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Announcements - Homework

CS-E4830 Kernel Methods in Machine Learning

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Support Vector Machines

Neural Networks. Prof. Dr. Rudolf Kruse. Computational Intelligence Group Faculty for Computer Science

Machine Learning and Data Mining. Support Vector Machines. Kalev Kask

Support Vector Machines, Kernel SVM

Introduction to SVM and RVM

Introduction to Support Vector Machines

Lecture Notes on Support Vector Machine

Outline. Basic concepts: SVM and kernels SVM primal/dual problems. Chih-Jen Lin (National Taiwan Univ.) 1 / 22

Support Vector Machines. Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar

Linear & nonlinear classifiers

Support Vector Machines

Introduction to Support Vector Machines

Machine Learning A Geometric Approach

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Linear & nonlinear classifiers

Support Vector Machines Explained

Support Vector Machine

Chapter 9. Support Vector Machine. Yongdai Kim Seoul National University

Classification and Support Vector Machine

Support Vector Machines. Maximizing the Margin

The Perceptron Algorithm, Margins

Support Vector Machines and Speaker Verification

Linear Support Vector Machine. Classification. Linear SVM. Huiping Cao. Huiping Cao, Slide 1/26

Support Vector Machines

18.9 SUPPORT VECTOR MACHINES

COMP 652: Machine Learning. Lecture 12. COMP Lecture 12 1 / 37

Data Mining. Linear & nonlinear classifiers. Hamid Beigy. Sharif University of Technology. Fall 1396

Support Vector Machines. Machine Learning Fall 2017

Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 5: Single Layer Perceptrons & Estimating Linear Classifiers

Support Vector Machines

Support Vector Machines and Kernel Methods

Linear, threshold units. Linear Discriminant Functions and Support Vector Machines. Biometrics CSE 190 Lecture 11. X_i: inputs, W_i: weights

Constrained Optimization and Support Vector Machines

Applied Machine Learning Annalisa Marsico

Support Vector Machines

Support vector machines

Support Vector Machine for Classification and Regression

Support Vector Machines

LINEAR CLASSIFICATION, PERCEPTRON, LOGISTIC REGRESSION, SVC, NAÏVE BAYES. Supervised Learning

Max Margin-Classifier

Homework 3. Convex Optimization /36-725

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015

An introduction to Support Vector Machines

Machine Learning. Support Vector Machines. Fabio Vandin November 20, 2017

Modeling Dependence of Daily Stock Prices and Making Predictions of Future Movements

Brief Introduction to Machine Learning

Lecture 16: Modern Classification (I) - Separating Hyperplanes

Support Vector Machines. Machine Learning Spring 2018, March 5 2018. Kasthuri Kannan

MACHINE LEARNING. Support Vector Machines. Alessandro Moschitti

Support Vector and Kernel Methods

Support Vector Machines II. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Lecture 2: Linear SVM in the Dual

Support Vector Machines for Classification: A Statistical Portrait

Non-linear Support Vector Machines

LECTURE 7 Support vector machines

Multisurface Proximal Support Vector Machine Classification via Generalized Eigenvalues

Indirect Rule Learning: Support Vector Machines. Donglin Zeng, Department of Biostatistics, University of North Carolina

LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Kernel Methods and Support Vector Machines

Kernel Methods. Machine Learning A W VO

Transcription:

Incorporating detractors into SVM classification. AGH University of Science and Technology


Support Vector Machines (SVM). SVMs are a set of supervised learning methods used for classification and regression. The SVM maximal margin classifier separates the data with a hyperplane that has the largest distance to the closest training vectors. Figure: SVM maximal margin classifier
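
As a concrete illustration of the maximal margin classifier, the sketch below is my addition: it assumes scikit-learn and invented toy data, neither of which comes from the slides. A very large C approximates the hard-margin case; the support vectors and the margin width can then be read off the fitted model.

```python
# Minimal sketch of a (near) maximal margin classifier, assuming scikit-learn.
# The toy data is invented for illustration only.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],   # class +1
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

# A very large C approximates the hard-margin SVM on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]   # hyperplane g(x) = w.x + b
margin = 2.0 / np.linalg.norm(w)         # width of the separating margin
print("support vectors:", clf.support_vectors_)
print("margin width:", margin)
```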

Support Vector Machines (SVM), cont. The SVM soft margin classifier is able to classify nonseparable data. Figure: SVM soft margin classifier

Solution. The decision hyperplane is $H: g(x) = \sum_{i=1}^{m} w_i x_i + b$. Soft SVM primal optimization problem: minimization of $f(w, b, \xi)$, where

$$f(w, b, \xi) = \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i$$

with constraints

$$y_i\, g(A_i) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \quad \text{for } i \in \{1, \dots, n\}.$$

The parameter $C$ is the misclassification cost.
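
One hedged way to make the primal concrete is to hand it to a generic convex solver. The sketch below uses CVXPY and invented toy data, both my own choices rather than tools named on the slides.

```python
# Minimal sketch of the soft-margin primal with CVXPY (not the slides' solver).
import numpy as np
import cvxpy as cp

def soft_margin_primal(A, y, C=1.0):
    n, m = A.shape
    w, b, xi = cp.Variable(m), cp.Variable(), cp.Variable(n)
    # minimize 1/2 ||w||^2 + C * sum(xi)
    objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
    # y_i (w . A_i + b) >= 1 - xi_i,  xi_i >= 0
    constraints = [cp.multiply(y, A @ w + b) >= 1 - xi, xi >= 0]
    cp.Problem(objective, constraints).solve()
    return w.value, b.value, xi.value

# Toy data (invented): two slightly overlapping classes.
rng = np.random.default_rng(0)
A = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(2.5, 1, (20, 2))])
y = np.hstack([np.ones(20), -np.ones(20)])
w, b, xi = soft_margin_primal(A, y, C=1.0)
print("w =", w, "b =", b)
```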

Solution, cont. SVM dual optimization problem: maximization of $d(\alpha)$, where

$$d(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \alpha_k \alpha_i y_k y_i \, x_i \cdot x_k$$

with constraints

$$\sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C.$$

The inner product $x_i \cdot x_k$ can be replaced by a kernel function (the kernel trick). Figure: SVM dual optimization problem
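
The dual can be written out just as directly. The sketch below again uses CVXPY (my assumption), with an RBF kernel as one possible instance of the kernel trick; a small ridge keeps the kernel matrix numerically positive semidefinite for the quadratic form.

```python
# Minimal sketch of the SVM dual with CVXPY; the RBF kernel is one possible
# choice for the kernel trick, not prescribed by the slides.
import numpy as np
import cvxpy as cp

def rbf_kernel(X, gamma=1.0):
    sq = np.sum(X**2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

def svm_dual(X, y, C=1.0, gamma=1.0):
    n = len(y)
    K = rbf_kernel(X, gamma) + 1e-8 * np.eye(n)  # ridge keeps K numerically PSD
    alpha = cp.Variable(n)
    # d(alpha) = sum(alpha) - 1/2 * (alpha*y)' K (alpha*y)
    objective = cp.Maximize(cp.sum(alpha)
                            - 0.5 * cp.quad_form(cp.multiply(alpha, y), K))
    constraints = [cp.sum(cp.multiply(alpha, y)) == 0,
                   alpha >= 0, alpha <= C]
    cp.Problem(objective, constraints).solve()
    return alpha.value
```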

A priori knowledge refers to all information about the problem available in addition to the training data. Types:
- class invariance: transformation invariance with respect to the domain of the input space
- knowledge on the data: imbalance of the training set, quality of the data
Sample methods of incorporating it into SVM:
- kernel methods
- optimization methods, e.g. class invariance inside polyhedral regions

Weighting the samples expresses knowledge on the data, for example the quality of the data. In weighted-SVM a different misclassification cost $C_i$ is used for every sample, so the inequality constraints form a hyperrectangle: $0 \le \alpha_i \le C_i$.
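
A minimal sketch of weighted-SVM, assuming scikit-learn (not named on the slides): passing sample_weight to fit rescales C for each sample, which corresponds to the per-sample box constraints $0 \le \alpha_i \le C_i$.

```python
# Sketch of weighted-SVM: per-sample misclassification costs C_i.
# In scikit-learn, fit(..., sample_weight=s) effectively uses C_i = C * s_i.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.hstack([np.ones(20), -np.ones(20)])

# Example: trust the first ten samples three times more than the rest.
weights = np.ones(len(y))
weights[:10] = 3.0

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y, sample_weight=weights)  # dual box constraints become 0 <= alpha_i <= C * s_i
```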

Detractors. A detractor is a point from which we want to push the decision boundary away. The goal of SVM classification with detractors is to preserve the maximal margin classifier and simultaneously incorporate detractors.

Detractor examples. Figure: Comparison of the original SVM problem and the SVM problem with detractors, panels a)-d). The plot shows data from class 1, data from class -1, the decision boundary without detractors, the decision boundaries with detractors, and the marked detractor point.

Detractor examples, cont. Figure: Comparison of the original SVM problem and the SVM problem with detractors for the nonlinear case, panels a)-b). The plot shows data from class 1, data from class -1, the solution without detractors, the solutions with detractors, and the marked detractor point.

Primal optimization problem with detractors: minimization of $f(w, b, \xi)$, where

$$f(w, b, \xi) = \frac{1}{2}\lVert w \rVert^2 + \sum_{i=1}^{n} C_i \xi_i$$

with constraints

$$y_i\, g(A_i) \ge 1 - \xi_i + b_i, \qquad \xi_i \ge 0 \quad \text{for } i \in \{1, \dots, n\}.$$
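
Under the reading that $b_i$ is the detractor strength for sample $i$ ($b_i > 0$ demands a larger functional margin and so pushes the boundary away; $b_i = 0$ recovers the weighted soft-margin SVM), the primal can be sketched with a generic solver. CVXPY and the small helper below are my assumptions, not the slides' implementation.

```python
# Sketch of the primal with detractors (CVXPY, my reconstruction of the slides):
# b_i > 0 requires a larger margin for sample i, pushing the boundary away from it.
import numpy as np
import cvxpy as cp

def detractor_primal(A, y, C_i, b_i):
    n, m = A.shape
    w, b, xi = cp.Variable(m), cp.Variable(), cp.Variable(n)
    # minimize 1/2 ||w||^2 + sum_i C_i * xi_i
    objective = cp.Minimize(0.5 * cp.sum_squares(w) + cp.sum(cp.multiply(C_i, xi)))
    # y_i (w . A_i + b) >= 1 - xi_i + b_i,  xi_i >= 0
    constraints = [cp.multiply(y, A @ w + b) >= 1 - xi + b_i, xi >= 0]
    cp.Problem(objective, constraints).solve()
    return w.value, b.value
```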

Dual optimization problem with detractors: maximization of $d(\alpha)$, where

$$d(\alpha) = \sum_{i=1}^{n} \alpha_i (1 + b_i) - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \alpha_k \alpha_i y_k y_i K_{ik}$$

with constraints

$$\sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C_i \quad \text{for } i \in \{1, \dots, n\}.$$
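
A matching sketch of the dual (again CVXPY, my assumption): compared with the plain dual, only the $(1 + b_i)$ weights in the linear term and the per-sample upper bounds $C_i$ change.

```python
# Sketch of the dual with detractors (CVXPY, my reconstruction of the slides).
import numpy as np
import cvxpy as cp

def detractor_dual(K, y, C_i, b_i):
    n = len(y)
    alpha = cp.Variable(n)
    # d(alpha) = sum_i alpha_i (1 + b_i) - 1/2 * (alpha*y)' K (alpha*y)
    objective = cp.Maximize(cp.sum(cp.multiply(1.0 + b_i, alpha))
                            - 0.5 * cp.quad_form(cp.multiply(alpha, y),
                                                 K + 1e-8 * np.eye(n)))
    constraints = [cp.sum(cp.multiply(alpha, y)) == 0,
                   alpha >= 0, alpha <= C_i]
    cp.Problem(objective, constraints).solve()
    return alpha.value
```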

Detractors: efficient solver. Sequential Minimal Optimisation (SMO) with incorporated detractors. The Karush-Kuhn-Tucker complementarity condition is extended to incorporate detractors; based on this condition, a selection heuristic and stopping criteria were derived. The efficiency of the SVM solver with detractors is comparable to the original SVM solver.
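
The slides do not spell the extended condition out, but from the dual above the complementarity targets plausibly shift from 1 to $1 + b_i$. The sketch below is my reconstruction of such a KKT violation check, the kind of test an SMO-style selection heuristic and stopping criterion could be built on.

```python
# Sketch (my derivation, not verbatim from the slides) of a KKT
# complementarity check for the dual with detractors: the usual target
# margin of 1 becomes 1 + b_i.
import numpy as np

def kkt_violations(alpha, y, K, b, C_i, b_i, tol=1e-3):
    """Return indices of samples violating the detractor KKT conditions."""
    f = K @ (alpha * y) + b                  # decision values f(x_i)
    margins = y * f
    target = 1.0 + b_i
    lower = (alpha < tol) & (margins < target - tol)        # alpha_i = 0 needs margin >= target
    upper = (alpha > C_i - tol) & (margins > target + tol)  # alpha_i = C_i needs margin <= target
    middle = (alpha >= tol) & (alpha <= C_i - tol) & (np.abs(margins - target) > tol)
    return np.where(lower | upper | middle)[0]
```

An SMO-style solver could pick its working pairs among these violators and stop once the set is empty, matching the role of the heuristic and stopping criteria mentioned on the slide.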

Nonstationary time series classification. It is hard to determine one unique feature set describing the whole time series, so the decision boundary depends on the time period. The dynamics of the classification model can be expressed by adding or removing detractors, or by changing their strength for specific periods, which suits online classification algorithms.

Using detractors for stock price prediction. The market is a highly complex and competitive environment; about 25% of orders come from automated, algorithm-based systems. Every strategy should be able to react to unexpected events or to incorporate highly probable predictions. Detractors are a way to adjust the decision boundary to chosen events without changing the SVM feature set.

Testing detractors. Tests use NASDAQ daily data. The feature set is built from past price differences (see the sketch after the table). The class is 1 when the price is rising and -1 when the price is falling. Two detractors are chosen arbitrarily. The effectiveness of detractors depends on the ability to discover and react to highly probable events; choosing detractors is a highly domain-specific task.

algorithm           | misclassified training data | misclassified test data
without detractors  | 3                           | 182
with 2 detractors   | 5                           | 156

Table: Comparison of SVM and SVM with detractors
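
For reference, a hedged sketch of the kind of feature construction described above: past price differences as features, next-day direction as the class. The number of lags and the helper name make_dataset are my assumptions, not taken from the slides.

```python
# Hedged sketch of the feature set: lagged daily price differences as inputs,
# 1 for a rising price and -1 for a falling price as the label.
import numpy as np

def make_dataset(prices, lags=5):
    diffs = np.diff(prices)                  # daily price differences
    X, y = [], []
    for t in range(lags, len(diffs)):
        X.append(diffs[t - lags:t])          # past `lags` differences
        y.append(1 if diffs[t] > 0 else -1)  # 1 = rising, -1 = falling
    return np.array(X), np.array(y)
```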

Future work: define detractor regions, incorporate detractors into regression analysis, test on other time series data sets, and incorporate other types of a priori knowledge into SVM and other classifiers.