Incorporating detractors into SVM classification
AGH University of Science and Technology
Support Vector Machines (SVM)
SVMs are a set of supervised learning methods used for classification and regression.
The SVM maximal margin classifier separates the data with the hyperplane that has the largest distance to the closest training vectors.
Figure: SVM maximal margin classifier
Support Vector Machines (SVM)
The SVM soft margin classifier is able to classify nonseparable data.
Figure: SVM soft margin classifier
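Solution
Hyperplane $H$: $g(\vec{x}) = \sum_{i=1}^{m} w_i x_i + b$
Soft margin SVM primal optimization problem: minimisation of $f(\vec{w}, b, \vec{\xi})$, where
$f(\vec{w}, b, \vec{\xi}) = \frac{1}{2} \|\vec{w}\|^2 + C \sum_{i=1}^{n} \xi_i$
with constraints
$y_i g(\vec{A}_i) \ge 1 - \xi_i$, $\xi_i \ge 0$ for $i \in \{1..n\}$.
The parameter $C$ is the misclassification cost.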
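The primal objective above can be evaluated directly for a candidate hyperplane. The following is a minimal Python sketch, not the author's code: the function names, the hyperplane and the three data points are made up for illustration, and each slack is taken as the smallest value satisfying both constraints, $\xi_i = \max(0, 1 - y_i g(\vec{A}_i))$.

```python
def decision_value(w, b, x):
    """g(x) = sum_i w_i * x_i + b for one input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def primal_objective(w, b, points, labels, C):
    """Return (f, slacks) for the soft margin primal problem.

    Each slack is the smallest xi_i that satisfies
    y_i * g(A_i) >= 1 - xi_i and xi_i >= 0.
    """
    slacks = [max(0.0, 1.0 - y * decision_value(w, b, a))
              for a, y in zip(points, labels)]
    f = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return f, slacks


# Hypothetical toy data: the third point lies on the wrong side of the margin.
points = [(2.0, 0.0), (0.0, 0.0), (0.5, 0.0)]
labels = [1, -1, 1]
w, b = (1.0, 0.0), -1.0

value, slacks = primal_objective(w, b, points, labels, C=10.0)
print(value, slacks)
```

Only the third point contributes a nonzero slack, so the cost term $C \sum \xi_i$ dominates the objective for this choice of $C$.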
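Solution, cont.
SVM dual optimisation problem: maximisation of $d(\vec{\alpha})$, where
$d(\vec{\alpha}) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \alpha_k \alpha_i y_k y_i \, \vec{x}_i \cdot \vec{x}_k$
(kernel trick: the inner product $\vec{x}_i \cdot \vec{x}_k$ can be replaced by a kernel value $K_{ik}$)
with constraints
$\sum_{i=1}^{n} \alpha_i y_i = 0$, $0 \le \alpha_i \le C$
Figure: SVM dual optimisation problem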
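To make the dual problem concrete, here is a minimal Python sketch that evaluates $d(\vec{\alpha})$ and checks both constraints for a given $\vec{\alpha}$. The function names, the linear kernel and the two-point data set are illustrative assumptions; the sketch does not solve the optimisation problem.

```python
def linear_kernel(u, v):
    """Plain inner product; any other kernel function could be passed instead."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alpha, labels, points, kernel=linear_kernel):
    """d(alpha) = sum_i alpha_i - 1/2 sum_i sum_k alpha_k alpha_i y_k y_i K_ik."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[k] * labels[i] * labels[k]
               * kernel(points[i], points[k])
               for i in range(n) for k in range(n))
    return sum(alpha) - 0.5 * quad


def is_feasible(alpha, labels, C, tol=1e-9):
    """Check sum_i alpha_i y_i = 0 and 0 <= alpha_i <= C."""
    balanced = abs(sum(a * y for a, y in zip(alpha, labels))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alpha)
    return balanced and boxed


# Hypothetical two-point problem, one point per class.
points = [(1.0, 0.0), (-1.0, 0.0)]
labels = [1, -1]
alpha = [0.5, 0.5]

d_value = dual_objective(alpha, labels, points)
print(d_value, is_feasible(alpha, labels, C=10.0))
```

For this toy problem the chosen $\vec{\alpha}$ gives $\vec{w} = (1, 0)$, and the dual value equals the primal value $\frac{1}{2}\|\vec{w}\|^2$, as expected at the optimum of a separable problem.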
A priori knowledge refers to all information about the problem available in addition to the training data.
Types:
  class invariance: transformation invariance, invariance with respect to the domain of the input space
  knowledge on the data: imbalance of the training set, quality of the data
Incorporation into SVM:
  sample methods
  kernel methods
  optimization methods
Example: class invariance inside polyhedral regions
Weighting the samples
Sample weights express knowledge on the data, for example the quality of the data.
Weighted SVM: a different misclassification cost $C_i$ for every sample.
The inequality constraints form a hyperrectangle: $0 \le \alpha_i \le C_i$
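The hyperrectangle constraint can be illustrated with a short Python sketch. It assumes, purely for illustration, that each $C_i$ is obtained by scaling a base cost with a per-sample quality weight; the slides do not say how the $C_i$ are chosen, and the function names are hypothetical.

```python
def per_sample_costs(base_cost, quality_weights):
    """Assumed scheme: C_i = base_cost * weight_i, with weights in (0, 1]."""
    return [base_cost * w for w in quality_weights]


def in_hyperrectangle(alpha, costs, tol=1e-9):
    """Check the box constraints 0 <= alpha_i <= C_i for every sample."""
    return all(-tol <= a <= c + tol for a, c in zip(alpha, costs))


# Hypothetical weights: the third sample is considered low quality.
costs = per_sample_costs(10.0, [1.0, 0.5, 0.1])
print(costs)
print(in_hyperrectangle([3.0, 5.0, 0.0], costs))
print(in_hyperrectangle([3.0, 6.0, 0.0], costs))
```

A low-quality sample gets a small $C_i$, which caps its multiplier $\alpha_i$ and therefore its influence on the decision boundary.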
Detractors
A detractor is a point from which we want to move the decision boundary farther away.
In SVM classification, the goal is to preserve the maximal margin classifier and simultaneously incorporate detractors.
Detractor examples
Figure: Comparison of original SVM problem and SVM problem with detractors (legend: data from class 1, data from class -1, detractor, decision bound without detractors, decision bound with detractors a) to d))
Detractor examples
Figure: Comparison of original SVM problem and SVM problem with detractors for nonlinear case (legend: data from class 1, data from class -1, detractor, data without detractors, data with detractors a) and b))
Primal optimization problem with detractors: minimisation of $f(\vec{w}, b, \vec{\xi})$, where
$f(\vec{w}, b, \vec{\xi}) = \frac{1}{2} \|\vec{w}\|^2 + \sum_{i=1}^{n} C_i \xi_i$
with constraints
$y_i g(\vec{A}_i) \ge 1 - \xi_i + b_i$, $\xi_i \ge 0$ for $i \in \{1..n\}$.
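The effect of the extra term $b_i$ in the constraint can be seen by computing the slacks directly. The following Python sketch is an illustration under assumptions: the function names and the toy data are hypothetical, and $b_i$ is read as a per-sample detractor parameter that is zero for ordinary samples (the slides do not define it explicitly).

```python
def decision_value(w, b, x):
    """g(x) = sum_i w_i * x_i + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def primal_objective_with_detractors(w, b, points, labels, costs, detractor_params):
    """Return (f, slacks) where each slack is the smallest xi_i satisfying
    y_i * g(A_i) >= 1 - xi_i + b_i and xi_i >= 0."""
    slacks = [max(0.0, 1.0 + b_i - y * decision_value(w, b, a))
              for a, y, b_i in zip(points, labels, detractor_params)]
    f = 0.5 * sum(wi * wi for wi in w) + sum(c * s for c, s in zip(costs, slacks))
    return f, slacks


# Hypothetical data: both points are outside the ordinary margin.
points = [(2.5, 0.0), (0.0, 0.0)]
labels = [1, -1]
costs = [10.0, 10.0]
w, b = (1.0, 0.0), -1.0

plain, _ = primal_objective_with_detractors(w, b, points, labels, costs, [0.0, 0.0])
pushed, slacks = primal_objective_with_detractors(w, b, points, labels, costs, [2.0, 0.0])
print(plain, pushed, slacks)
```

With $b_1 = 0$ the first point costs nothing, while with $b_1 = 2$ the same hyperplane is penalised until the boundary moves at least $1 + b_1$ functional-margin units away from that point.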
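Dual optimization problem with detractors: maximisation of $d(\vec{\alpha})$, where
$d(\vec{\alpha}) = \sum_{i=1}^{n} \alpha_i (1 + b_i) - \frac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n} \alpha_k \alpha_i y_k y_i K_{ik}$
with constraints
$\sum_{i=1}^{n} \alpha_i y_i = 0$, $\alpha_i \ge 0$, $\alpha_i \le C_i$ for $i \in \{1..n\}$.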
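The step from the primal to the dual with detractors is not shown on the slides. A sketch of the standard Lagrangian derivation follows, assuming a linear kernel $K_{ik} = \vec{A}_i \cdot \vec{A}_k$ and $g(\vec{A}_i) = \vec{w} \cdot \vec{A}_i + b$; the multiplier names $r_i$ for the constraints $\xi_i \ge 0$ are introduced here for illustration only.

```latex
\begin{align*}
L &= \tfrac{1}{2}\|\vec{w}\|^2 + \sum_{i=1}^{n} C_i \xi_i
     - \sum_{i=1}^{n} \alpha_i \left[ y_i g(\vec{A}_i) - 1 + \xi_i - b_i \right]
     - \sum_{i=1}^{n} r_i \xi_i,
     \qquad \alpha_i \ge 0,\; r_i \ge 0 \\
\frac{\partial L}{\partial \vec{w}} = 0
  &\;\Rightarrow\; \vec{w} = \sum_{i=1}^{n} \alpha_i y_i \vec{A}_i \\
\frac{\partial L}{\partial b} = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0 \\
\frac{\partial L}{\partial \xi_i} = 0
  &\;\Rightarrow\; C_i - \alpha_i - r_i = 0
  \;\Rightarrow\; \alpha_i \le C_i \\
d(\vec{\alpha})
  &= \sum_{i=1}^{n} \alpha_i (1 + b_i)
     - \tfrac{1}{2} \sum_{i=1}^{n} \sum_{k=1}^{n}
       \alpha_k \alpha_i y_k y_i \, \vec{A}_i \cdot \vec{A}_k
\end{align*}
```

Compared with the standard dual, only the linear term changes: each $\alpha_i$ is weighted by $1 + b_i$, while the quadratic term and the equality constraint are unchanged.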
Detractors: efficient solver
Sequential Minimal Optimisation (SMO) with incorporated detractors
Karush-Kuhn-Tucker complementary condition with incorporated detractors
Based on this condition, the heuristic and stopping criteria were derived
The efficiency of the SVM solver with detractors is comparable with that of the original SVM solver
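The slides do not give the exact heuristic or stopping criteria. As a hedged illustration, the complementary conditions that follow from the primal with detractors are: $\alpha_i = 0$ implies $y_i g(\vec{A}_i) \ge 1 + b_i$, $0 < \alpha_i < C_i$ implies $y_i g(\vec{A}_i) = 1 + b_i$, and $\alpha_i = C_i$ implies $y_i g(\vec{A}_i) \le 1 + b_i$. A generic SMO-style check based on them could look like the Python sketch below; the function names and the tolerance are assumptions, not the author's solver.

```python
def kkt_violation(alpha_i, cost_i, margin_i, b_i, tol=1e-3):
    """Amount by which sample i violates the KKT conditions.

    margin_i is y_i * g(A_i); the required margin is 1 + b_i.
    """
    target = 1.0 + b_i
    if alpha_i <= tol:                 # alpha_i = 0: margin must be >= target
        return max(0.0, target - margin_i)
    if alpha_i >= cost_i - tol:        # alpha_i = C_i: margin must be <= target
        return max(0.0, margin_i - target)
    return abs(margin_i - target)      # 0 < alpha_i < C_i: margin must equal target


def should_stop(alphas, costs, margins, detractor_params, tol=1e-3):
    """Generic stopping rule: stop when no sample violates KKT by more than tol."""
    worst = max(kkt_violation(a, c, m, b_i, tol)
                for a, c, m, b_i in zip(alphas, costs, margins, detractor_params))
    return worst <= tol


# Hypothetical state: sample 0 is a detractor with b_0 = 2, sample 1 is ordinary.
print(kkt_violation(0.0, 10.0, 1.5, 2.0))   # margin 1.5 is below the required 3.0
print(should_stop([0.0, 5.0], [10.0, 10.0], [3.5, 1.0], [2.0, 0.0]))
```

In an SMO loop, the sample with the largest violation is a natural candidate for the next pair of multipliers to optimise.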
Nonstationary time series classification
It is hard to determine one unique feature set describing the whole time series, so the decision bound depends on the time period
The dynamics of the classification model can be expressed by adding or removing detractors, or by changing their strength for specific periods
Online classification algorithms
Use of detractors for stock price prediction
Highly complex and competitive environment
About 25% of orders come from automated systems based on algorithms
Every strategy should be able to react to unexpected events or incorporate highly probable predictions
Detractors are a way to adjust the decision bound to chosen events, without the need to change the SVM feature set
Testing detractors
Test with NASDAQ daily data
The feature set is built from past price differences
The class is 1 when the price is rising and -1 when the price is falling
2 detractors are chosen arbitrarily
The effectiveness of detractors depends on the ability to discover and react to highly probable events
Choosing detractors is a highly domain-specific task

algo                  misclassified training data   misclassified test data
without detractors    3                             182
with 2 detractors     5                             156

Table: Comparison of SVM with SVM with detractors
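The feature and label construction described above can be sketched in Python as follows. This is an assumption-laden illustration: the number of past differences, the handling of unchanged prices and the price values are all made up, and none of it is taken from the NASDAQ experiment.

```python
def build_samples(prices, n_lags):
    """Build (features, label) pairs from a daily price series.

    Features at day t: the n_lags most recent daily price differences.
    Label: 1 if the next day's price rises, -1 if it falls.
    Days followed by an unchanged price are skipped (an assumption).
    """
    samples, labels = [], []
    for t in range(n_lags, len(prices) - 1):
        features = [prices[t - k] - prices[t - k - 1] for k in range(n_lags)]
        change = prices[t + 1] - prices[t]
        if change == 0:
            continue
        samples.append(features)
        labels.append(1 if change > 0 else -1)
    return samples, labels


# Hypothetical prices, not real market data.
prices = [10, 11, 13, 12, 12, 15]
samples, labels = build_samples(prices, n_lags=2)
print(samples, labels)
```

Each resulting sample can then be fed to an SVM, with detractors added at chosen points without changing this feature set.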
Future work
define detractor regions
incorporate detractors into regression analysis
test on other time series data sets
incorporate other types of a priori knowledge into SVM and other classifiers