The Influences of Smooth Approximation Functions for SPTSVM

Xinxin Zhang
Liaocheng University, School of Mathematics Sciences
Liaocheng, 252059, P.R. China
ldzhangxin008@126.com

Liya Fan
Liaocheng University, School of Mathematics Sciences
Liaocheng, 252059, P.R. China
fanliya63@126.com

Abstract: The recently proposed smooth projection twin support vector machine (SPTSVM) gains a good generalization ability and is suitable for many binary classification problems. But we know that different smooth approximation functions may bring different classification accuracies. In order to study the influence of smooth approximation functions for SPTSVM, in this paper we first overview eight known smooth approximation functions and describe their differentiability and error ranges in five lemmas and one theorem. Then, we perform a series of comparative experiments on classification accuracy and running time by using SPTSVM with the Newton-Armijo method on 10 UCI datasets and 6 NDC datasets. From the experiment results, we can get a choice order of the eight approximation functions in general.

Key Words: Smooth projection TSVM; plus function; smooth approximation function; choice order

1 Introduction

Recently, nonparallel hyperplane support vector machine (NHSVM) classification methods, as extensions of the classical SVM, have become research hot spots in the field of machine learning. The study of NHSVM classification methods originates from the generalized eigenvalue proximal SVM (GEPSVM) [1], the twin support vector machine (TSVM) [2] and the projection twin support vector machine (PTSVM) [3]. For binary data classification problems, NHSVM methods aim to find a hyperplane for each class, such that each hyperplane is proximal to the data points of one class and far from the data points of the other class. GEPSVM, TSVM and PTSVM are three representative algorithms of NHSVM, and all the other NHSVM methods are improved versions based on them.
GEPSVM obtains each of the nonparallel hyperplanes by solving for the eigenvector corresponding to the smallest eigenvalue of a generalized eigenvalue problem, so that each hyperplane is as close as possible to the points of its own class and, at the same time, as far as possible from the points of the other class. TSVM constructs a pair of nonparallel hyperplanes by solving two smaller-sized QPPs rather than a single quadratic programming problem (QPP), such that each one is as close as possible to one class and as far as possible from the other class. A new input is assigned to one of the classes depending on which hyperplane it is closer to. Experiments show that TSVM is faster than SVM [2,4]. Different from GEPSVM and TSVM, the central idea of PTSVM is to find a projection axis for each class, such that the within-class variance of the projected samples of its own class is minimized while the projected samples of the other class scatter away as far as possible. PTSVM is an improvement and extension of the multi-weight vector projection SVM (MVSVM) [5]. In order to further enhance the performance of PTSVM, Shao et al. [6] proposed a least squares version of PTSVM, called least squares PTSVM (LSPTSVM). LSPTSVM works extremely fast compared with PTSVM because the solutions of LSPTSVM can be attained by solving two systems of linear equations, whereas PTSVM needs to solve two QPPs. Because of this, the least squares method has also received great attention in support tensor machines. Later, Shao et al. [7-9] proposed a simple and reasonable variant of PTSVM from a theoretical point
E-ISSN: 2224-2880, Volume 15, 2016
of view, called PTSVM with regularization term (RPTSVM), in which the regularized risk principle is implemented and the nonlinear classification ignored in PTSVM is also considered. Ding and Hua [10] formulated a nonlinear version of LSPTSVM for binary nonlinear classification by introducing a nonlinear kernel into LSPTSVM. This formulation leads to a novel nonlinear algorithm, called nonlinear LSPTSVM (NLSPTSVM). Ding et al. reviewed many known nonparallel hyperplane support vector machine algorithms in [11]. In addition, by means of the idea of smooth TSVM in [12], the authors of this paper introduced the smoothing technique into PTSVM and proposed smooth PTSVM (SPTSVM) in [13]. We know that by using smoothing techniques we can solve primal unconstrained differentiable optimization problems rather than dual QPPs, which means that many optimization methods can be used in smooth versions of the various variants of TSVM, such as the Newton method, quasi-Newton methods, the Newton-Armijo method and so on. But we discovered that different smooth approximation functions have different impacts on the classification results, even when using the same classifier. So, in this paper, we first overview eight smooth approximation functions proposed in [14-21] and then compare their influences for SPTSVM on 16 datasets taken from the UCI database and the NDC database. Taking into account the length of the paper, we only discuss the influences of smooth approximation functions for the linear version of SPTSVM. By means of the kernel trick, the influences of smooth approximation functions for nonlinear SPTSVM can be discussed in a similar way.

2 Linear PTSVM and SPTSVM

In this section, we recall linear PTSVM and linear SPTSVM briefly; for details see [3,13]. Let $T = \{(x_j^{(i)}, y_j^{(i)})\}_{j=1}^{m_i}$, $i = 1, 2$, be a set of data samples for a binary classification problem, where $i = 1$ denotes the positive class, $i = 2$ denotes the negative class, $m_i$ denotes the number of samples belonging to class $i$, and $x_j^{(i)} \in R^n$ and $y_j^{(i)} \in \{\pm 1\}$ are respectively the input and the class label of the $j$th sample in class $i$.
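This per-class notation translates directly into code. The following sketch (made-up data; NumPy assumed, variable names are ours) stores the samples of the two classes as the rows of two matrices, and computes the class means and the within-class scatter matrices used by PTSVM below, including the regularization applied when a scatter matrix may be singular:

```python
import numpy as np

# Illustrative sketch (made-up data): per-class input matrices A, B,
# class means, and within-class scatter matrices S1, S2 of PTSVM.
rng = np.random.default_rng(0)
A = rng.normal(loc=1.0, size=(20, 3))    # m1 x n inputs of the positive class
B = rng.normal(loc=-1.0, size=(25, 3))   # m2 x n inputs of the negative class

mu1 = A.mean(axis=0)                     # mu^(1) = (1/m1) * sum_j x_j^(1)
mu2 = B.mean(axis=0)                     # mu^(2)

# S_i = sum_j (x_j^(i) - mu^(i)) (x_j^(i) - mu^(i))^T
S1 = (A - mu1).T @ (A - mu1)
S2 = (B - mu2).T @ (B - mu2)

# Regularization S_i + eps*I_n used when S_i may be singular
eps = 1e-6
S1r = S1 + eps * np.eye(A.shape[1])
S2r = S2 + eps * np.eye(B.shape[1])
```

The scatter matrices are symmetric and nonnegative definite by construction; adding a small multiple of the identity makes them safely invertible.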
Let $\mu^{(i)} = \frac{1}{m_i}\sum_{j=1}^{m_i} x_j^{(i)}$ be the mean of class $i$ for $i = 1, 2$, and let $A = [x_1^{(1)}, \ldots, x_{m_1}^{(1)}]^T \in R^{m_1 \times n}$ and $B = [x_1^{(2)}, \ldots, x_{m_2}^{(2)}]^T \in R^{m_2 \times n}$ denote the input matrices of the positive and negative classes, respectively, with $m = m_1 + m_2$. Let $e_1 \in R^{m_1}$ and $e_2 \in R^{m_2}$ be vectors of ones.

2.1 Linear PTSVM

The central idea of linear PTSVM is to find a projection axis for each class such that the within-class variance of the projected samples of its own class is minimized while the projected samples of the other class scatter away as far as possible. This leads to the following two optimization problems:

\[
\begin{aligned}
\min_{w_1,\xi}\ & \frac{1}{2}\sum_{i=1}^{m_1}(w_1^T x_i^{(1)} - w_1^T\mu^{(1)})^2 + c_1\sum_{k=1}^{m_2}\xi_k \\
\text{s.t.}\ & w_1^T x_k^{(2)} - w_1^T\mu^{(1)} + \xi_k \ge 1,\ \xi_k \ge 0,\ k = 1, \ldots, m_2,
\end{aligned} \tag{1}
\]

\[
\begin{aligned}
\min_{w_2,\eta}\ & \frac{1}{2}\sum_{i=1}^{m_2}(w_2^T x_i^{(2)} - w_2^T\mu^{(2)})^2 + c_2\sum_{k=1}^{m_1}\eta_k \\
\text{s.t.}\ & -(w_2^T x_k^{(1)} - w_2^T\mu^{(2)}) + \eta_k \ge 1,\ \eta_k \ge 0,\ k = 1, \ldots, m_1,
\end{aligned} \tag{2}
\]

where $c_1, c_2 > 0$ are trade-off parameters and $\{\xi_k\}_{k=1}^{m_2}$, $\{\eta_k\}_{k=1}^{m_1}$ are slack variables. Put

\[
S_1 = \sum_{j=1}^{m_1}(x_j^{(1)} - \mu^{(1)})(x_j^{(1)} - \mu^{(1)})^T \in R^{n \times n},\qquad
S_2 = \sum_{j=1}^{m_2}(x_j^{(2)} - \mu^{(2)})(x_j^{(2)} - \mu^{(2)})^T \in R^{n \times n};
\]

then problems (1) and (2) can be written in the following matrix forms, respectively:

\[
\begin{aligned}
\min_{w_1,\xi}\ & \frac{1}{2}w_1^T S_1 w_1 + c_1 e_2^T\xi \\
\text{s.t.}\ & Bw_1 - \tfrac{1}{m_1}e_2e_1^T Aw_1 + \xi \ge e_2,\ \xi \ge 0,
\end{aligned} \tag{3}
\]

\[
\begin{aligned}
\min_{w_2,\eta}\ & \frac{1}{2}w_2^T S_2 w_2 + c_2 e_1^T\eta \\
\text{s.t.}\ & -(Aw_2 - \tfrac{1}{m_2}e_1e_2^T Bw_2) + \eta \ge e_1,\ \eta \ge 0.
\end{aligned} \tag{4}
\]

By solving the Wolfe dual problems of (3) and (4), respectively,

\[
\begin{aligned}
\min_{\alpha}\ & \frac{1}{2}\alpha^T(B - \tfrac{1}{m_1}e_2e_1^T A)S_1^{-1}(B^T - \tfrac{1}{m_1}A^T e_1e_2^T)\alpha - e_2^T\alpha \\
\text{s.t.}\ & 0 \le \alpha \le c_1 e_2,
\end{aligned}
\]

\[
\begin{aligned}
\min_{\beta}\ & \frac{1}{2}\beta^T(A - \tfrac{1}{m_2}e_1e_2^T B)S_2^{-1}(A^T - \tfrac{1}{m_2}B^T e_2e_1^T)\beta - e_1^T\beta \\
\text{s.t.}\ & 0 \le \beta \le c_2 e_1,
\end{aligned}
\]
we can obtain the optimal Lagrange multiplier vectors $\alpha^*$ and $\beta^*$. Without loss of generality, we can assume that $S_1$ and $S_2$ are nonsingular matrices. Otherwise, since they are symmetric nonnegative definite matrices, we can regularize them by using $S_1 + \varepsilon I_n$ and $S_2 + \varepsilon I_n$ in place of $S_1$ and $S_2$, respectively, where $\varepsilon > 0$ is a sufficiently small number and $I_n$ denotes the identity matrix of order $n$. Consequently, we can deduce that

\[
w_1^* = S_1^{-1}(B^T - \tfrac{1}{m_1}A^T e_1e_2^T)\alpha^*,\qquad
w_2^* = -S_2^{-1}(A^T - \tfrac{1}{m_2}B^T e_2e_1^T)\beta^*,
\]

and then the class label of a new input $x \in R^n$ can be assigned by

\[
\mathrm{class}(x) = \arg\min_{i=1,2}\ |(w_i^*)^T x - (w_i^*)^T\mu^{(i)}|.
\]

2.2 Linear SPTSVM

The main idea of linear SPTSVM is to introduce the smoothing technique into PTSVM, which results in solving a pair of primal unconstrained differentiable optimization problems rather than a pair of dual QPPs. By introducing the plus functions

\[
x_+ = \max\{x, 0\},\ x \in R,\qquad x_+ = ((x_1)_+, \ldots, (x_n)_+)^T,\ x \in R^n,
\]

the constraints of the primal problems (1) and (2) can be rewritten as follows, respectively:

\[
\xi_k = (1 - w_1^T x_k^{(2)} + w_1^T\mu^{(1)})_+,\ k = 1, \ldots, m_2,\qquad
\eta_k = (1 + w_2^T x_k^{(1)} - w_2^T\mu^{(2)})_+,\ k = 1, \ldots, m_1,
\]

that is,

\[
\xi = (\xi_1, \ldots, \xi_{m_2})^T = (e_2 + \tilde{A}w_1 - Bw_1)_+ \in R^{m_2},\qquad
\eta = (\eta_1, \ldots, \eta_{m_1})^T = (e_1 - \tilde{B}w_2 + Aw_2)_+ \in R^{m_1},
\]

where $\tilde{A} = e_2(\mu^{(1)})^T$ and $\tilde{B} = e_1(\mu^{(2)})^T$. In order to avoid the singularities of the matrices $S_1$ and $S_2$ involved in linear PTSVM, we add the regularization terms $\frac{c_3}{2}\|w_1\|^2$ and $\frac{c_4}{2}\|w_2\|^2$ to problems (1) and (2), respectively. In addition, in order to obtain differentiable optimization problems, we replace the one-norm penalty by the two-norm penalty for the slack vectors $\xi$ and $\eta$. Consequently, we get two improved unconstrained optimization problems:

\[
\min_{w_1}\ \frac{1}{2}\|Aw_1 - e_1(\mu^{(1)})^T w_1\|^2 + \frac{c_1}{2}\|(e_2 + \tilde{A}w_1 - Bw_1)_+\|^2 + \frac{c_3}{2}\|w_1\|^2, \tag{5}
\]

\[
\min_{w_2}\ \frac{1}{2}\|Bw_2 - e_2(\mu^{(2)})^T w_2\|^2 + \frac{c_2}{2}\|(e_1 - \tilde{B}w_2 + Aw_2)_+\|^2 + \frac{c_4}{2}\|w_2\|^2. \tag{6}
\]

Because the plus function $x_+$ for $x \in R$ is nondifferentiable, in order to solve problems (5) and (6) effectively and quickly by using known optimization methods, we need to introduce a smooth approximation function $\rho(x, c)$ for the plus function $x_+$, where $c$ is a smoothing parameter.
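The rewriting of the constraints via the plus function can be made concrete; the sketch below (made-up data; NumPy assumed) computes the slack vector $\xi = (e_2 + \tilde{A}w_1 - Bw_1)_+$ both componentwise and in matrix form and checks that the two agree:

```python
import numpy as np

# The componentwise plus function x_+ = max(x, 0).
def plus(x):
    return np.maximum(x, 0.0)

# Made-up negative-class inputs B, positive-class mean mu1 and axis w1.
B = np.array([[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]])
mu1 = np.array([1.0, 0.5])
w1 = np.array([0.7, -0.2])

# Componentwise: xi_k = (1 - w1^T x_k^(2) + w1^T mu^(1))_+
xi_comp = np.array([plus(1.0 - w1 @ xk + w1 @ mu1) for xk in B])

# Matrix form: xi = (e2 + A~ w1 - B w1)_+ with A~ = e2 (mu^(1))^T
e2 = np.ones(B.shape[0])
A_tilde = np.outer(e2, mu1)
xi_mat = plus(e2 + A_tilde @ w1 - B @ w1)
```

By construction both versions are identical; the matrix form is what enters the penalty terms of problems (5) and (6).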
Consequently, problems (5) and (6) can be further improved into the following two unconstrained differentiable optimization problems:

\[
\min_{w_1}\ f_1(w_1) = \frac{1}{2}\|Aw_1 - e_1(\mu^{(1)})^T w_1\|^2 + \frac{c_1}{2}\|\rho(e_2 + \tilde{A}w_1 - Bw_1, c)\|^2 + \frac{c_3}{2}\|w_1\|^2, \tag{7}
\]

\[
\min_{w_2}\ f_2(w_2) = \frac{1}{2}\|Bw_2 - e_2(\mu^{(2)})^T w_2\|^2 + \frac{c_2}{2}\|\rho(e_1 - \tilde{B}w_2 + Aw_2, c)\|^2 + \frac{c_4}{2}\|w_2\|^2. \tag{8}
\]

In this paper, we mainly use the Newton-Armijo method for solving problems (7) and (8).

2.3 Newton-Armijo method

The Newton-Armijo method is one of the most popular iterative algorithms for solving unconstrained smooth optimization problems and has been shown to be quadratically convergent (see [15]). In order to use the Newton-Armijo method, we first need to calculate the gradient vectors $\nabla f_i(w_i)$ and Hessian matrices $\nabla^2 f_i(w_i)$ of the objective functions of problems (7) and (8):

\[
\nabla f_1(w_1) = (A - e_1(\mu^{(1)})^T)^T(A - e_1(\mu^{(1)})^T)w_1 + c_1\sum_{i=1}^{m_2}\rho(z_i^1, c)\,\rho'(z_i^1, c)\,(\mu^{(1)} - x_i^{(2)}) + c_3 w_1,
\]

\[
\nabla^2 f_1(w_1) = (A - e_1(\mu^{(1)})^T)^T(A - e_1(\mu^{(1)})^T) + c_1\sum_{i=1}^{m_2}\big(\rho'(z_i^1, c)^2 + \rho(z_i^1, c)\,\rho''(z_i^1, c)\big)(\mu^{(1)} - x_i^{(2)})(\mu^{(1)} - x_i^{(2)})^T + c_3 I,
\]

\[
\nabla f_2(w_2) = (B - e_2(\mu^{(2)})^T)^T(B - e_2(\mu^{(2)})^T)w_2 + c_2\sum_{i=1}^{m_1}\rho(z_i^2, c)\,\rho'(z_i^2, c)\,(x_i^{(1)} - \mu^{(2)}) + c_4 w_2,
\]

\[
\nabla^2 f_2(w_2) = (B - e_2(\mu^{(2)})^T)^T(B - e_2(\mu^{(2)})^T) + c_2\sum_{i=1}^{m_1}\big(\rho'(z_i^2, c)^2 + \rho(z_i^2, c)\,\rho''(z_i^2, c)\big)(x_i^{(1)} - \mu^{(2)})(x_i^{(1)} - \mu^{(2)})^T + c_4 I,
\]
where $z_i^1 = 1 + (\mu^{(1)})^T w_1 - (x_i^{(2)})^T w_1$, $z_i^2 = 1 - (\mu^{(2)})^T w_2 + (x_i^{(1)})^T w_2$, and $I$ is the identity matrix of appropriate dimension. Then we calculate the search direction $d^t$ by the Newton method (called the Newton direction) and the search stepsize $\lambda_t$ by the Armijo method (called the Armijo stepsize) for the $t$-th iteration. The specific procedure is as follows, in which we only solve problem (7); problem (8) can be solved in a similar way.

Algorithm 1. The Newton-Armijo algorithm for solving linear SPTSVM

Step 1. Initialization. For given parameter values $c_1$, $c_3$, $c$ and the maximum number of iterations $T$, let $t = 0$, let $\varepsilon > 0$ be small enough and take an arbitrary nonzero vector $w_1^t \in R^n$.

Step 2. Calculate the Newton direction $d^t$ by solving the system of linear equations $\nabla^2 f_1(w_1^t)\,d^t = -\nabla f_1(w_1^t)$.

Step 3. Calculate the Armijo stepsize $\lambda_t$ by inexact line search, that is, choose $\lambda_t = \max\{1, \frac{1}{2}, \frac{1}{4}, \ldots\}$ satisfying

\[
f_1(w_1^t) - f_1(w_1^t + \lambda_t d^t) \ge -\frac{\lambda_t}{4}\nabla f_1(w_1^t)^T d^t.
\]

Step 4. Update $w_1^t$. Calculate the next iterate by the formula $w_1^{t+1} = w_1^t + \lambda_t d^t$.

Step 5. If $\|w_1^{t+1} - w_1^t\| < \varepsilon$ or the maximum number of iterations $T$ is reached, stop the iteration and take $w_1^* = w_1^{t+1}$; otherwise, put $t \leftarrow t + 1$ and return to Step 2.

Step 6. The class label of a new input $x \in R^n$ is assigned by $\mathrm{class}(x) = \arg\min_{i=1,2}|(w_i^*)^T x - (w_i^*)^T\frac{1}{m_i}\sum_{j=1}^{m_i}x_j^{(i)}|$.

3 Smooth approximation functions

In this section, we briefly overview eight smooth approximation functions for the plus function $x_+$, taken from [14-21], and describe the differentiability and error ranges of these approximation functions in five lemmas and one theorem. In all approximation functions, $c > 0$ denotes the smoothing parameter.

In 1980, Zhang [14] introduced a smooth approximation function, defined as the integral of the sigmoid function, for the plus function $x_+$:

\[
\rho_1(x, c) = x + \frac{1}{c}\ln(1 + e^{-cx}),\ x \in R. \tag{9}
\]

Later, this approximation function was used in many SVM models, such as [23-25]. It is evident that $\lim_{c \to +\infty}\rho_1(x, c) = x_+$ for every $x \in R$. This indicates that the approximation effect becomes better and better as the value of $c$ increases.
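The iteration of Algorithm 1 can be sketched generically. The code below (NumPy assumed; names and the toy objective are ours, standing in for $f_1$ of problem (7)) implements Steps 1-5 with a backtracking Armijo search:

```python
import numpy as np

# Newton-Armijo sketch (Algorithm 1, Steps 1-5) for a smooth convex
# objective f with gradient g and Hessian H.
def newton_armijo(f, g, H, w0, eps=1e-3, T=50):
    w = np.asarray(w0, dtype=float)
    for _ in range(T):
        d = np.linalg.solve(H(w), -g(w))          # Step 2: Newton direction
        lam, gd = 1.0, g(w) @ d
        # Step 3: Armijo stepsize lam in {1, 1/2, 1/4, ...} with
        # f(w) - f(w + lam*d) >= -(lam/4) * g(w)^T d
        while f(w) - f(w + lam * d) < -lam / 4.0 * gd:
            lam *= 0.5
        w_new = w + lam * d                        # Step 4: update
        if np.linalg.norm(w_new - w) < eps:        # Step 5: stopping rule
            return w_new
        w = w_new
    return w

# Toy objective f(w) = 1/2 w^T Q w - b^T w, minimized at w* = Q^{-1} b.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
w_star = newton_armijo(lambda w: 0.5 * w @ Q @ w - b @ w,
                       lambda w: Q @ w - b,
                       lambda w: Q,
                       np.zeros(2))
```

On this quadratic the full Newton step satisfies the Armijo condition immediately, so the method converges in essentially one iteration; for SPTSVM one would plug in $f_1$, $\nabla f_1$ and $\nabla^2 f_1$ as given above.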
The first- and second-order derivatives of $\rho_1(x, c)$ are respectively

\[
\rho_1'(x, c) = \frac{1}{1 + e^{-cx}},\qquad
\rho_1''(x, c) = \frac{c\,e^{-cx}}{(1 + e^{-cx})^2},
\]

where $\ln(\cdot)$ is the natural logarithm and $e$ is the base of natural logarithms.

Lemma 3.1. [22] Let $\rho_1(x, c)$ be defined by (9). Then
(1) $\rho_1(x, c)$ is smooth of arbitrary order with respect to $x$;
(2) $\rho_1(x, c) \ge x_+$ for all $x \in R$;
(3) for arbitrary $k > 0$ and $|x| < k$, one has

\[
\rho_1(x, c)^2 - x_+^2 \le \Big(\frac{\ln 2}{c}\Big)^2 + \frac{2k}{c}\ln 2.
\]

In 2005, Yuan et al. [15] proposed the following quadratic piecewise polynomial smooth approximation function for $x_+$ and obtained a quadratic polynomial smooth support vector machine (QPSSVM) model:

\[
\rho_2(x, c) = \begin{cases}
x, & x \ge \frac{1}{c}, \\[2pt]
\frac{c}{4}x^2 + \frac{1}{2}x + \frac{1}{4c}, & -\frac{1}{c} < x < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c}.
\end{cases} \tag{10}
\]

The first- and second-order derivatives of $\rho_2(x, c)$ are respectively

\[
\rho_2'(x, c) = \begin{cases}
1, & x \ge \frac{1}{c}, \\[2pt]
\frac{c}{2}x + \frac{1}{2}, & -\frac{1}{c} < x < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c},
\end{cases}
\qquad
\rho_2''(x, c) = \begin{cases}
\frac{c}{2}, & |x| < \frac{1}{c}, \\[2pt]
0, & |x| > \frac{1}{c}.
\end{cases}
\]

In the same year, Yuan et al. [16] introduced the following fourth-order piecewise polynomial smooth approximation function for $x_+$ and obtained
a fourth polynomial smooth support vector machine (FPSSVM) model:

\[
\rho_3(x, c) = \begin{cases}
x, & x \ge \frac{1}{c}, \\[2pt]
-\frac{c^3}{16}\big(x + \frac{1}{c}\big)^3\big(x - \frac{3}{c}\big), & -\frac{1}{c} < x < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c}.
\end{cases} \tag{11}
\]

The first- and second-order derivatives of $\rho_3(x, c)$ are respectively

\[
\rho_3'(x, c) = \begin{cases}
1, & x \ge \frac{1}{c}, \\[2pt]
-\frac{c^3}{4}\big(x + \frac{1}{c}\big)^2\big(x - \frac{2}{c}\big), & -\frac{1}{c} < x < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c},
\end{cases}
\qquad
\rho_3''(x, c) = \begin{cases}
-\frac{3c^3}{4}\big(x + \frac{1}{c}\big)\big(x - \frac{1}{c}\big), & |x| < \frac{1}{c}, \\[2pt]
0, & |x| > \frac{1}{c}.
\end{cases}
\]

Lemma 3.2. [16] Let $\rho_2(x, c)$ and $\rho_3(x, c)$ be defined by (10) and (11), respectively. Then
(1) $\rho_2(x, c)$ is first-order smooth and $\rho_3(x, c)$ is second-order smooth with respect to $x$;
(2) $\rho_2(x, c) \ge x_+$ and $\rho_3(x, c) \ge x_+$ for all $x \in R$;
(3) for any $x \in R$, one has $\rho_2(x, c)^2 - x_+^2 \le \frac{1}{11c^2}$ and $\rho_3(x, c)^2 - x_+^2 \le \frac{1}{19c^2}$.

In 2007, Xiong et al. [17] derived an important recursive equation (12) and proposed a class of smooth approximation functions by using the interpolation technique:

\[
\rho_4^d(x, c) = a\int I_d\,dx + \cdots,\qquad
I_d = \frac{x\big(x^2 - \frac{1}{c^2}\big)^{d-1} - 2(d-1)I_{d-1}}{2d - 1},\ d = 2, 3, \ldots, \tag{12}
\]

where $d$ is the number of iterations, $I_1 = \frac{x}{c^2}$, and the parameter $a \in R$ and the omitted linear and constant terms are determined by the interpolation conditions. For example, taking $d = 2$, we can calculate that

\[
I_2 = \frac{1}{3}x^3 - \frac{1}{c^2}x = \frac{1}{3}\Big(x^3 - \frac{3}{c^2}x\Big),
\]

and then

\[
\rho_4^2(x, c) = -\frac{c^3}{16}x^4 + \frac{3c}{8}x^2 + \frac{1}{2}x + \frac{3}{16c},\ x \in R.
\]

The first- and second-order derivatives of $\rho_4^2(x, c)$ are respectively

\[
(\rho_4^2)'(x, c) = -\frac{c^3}{4}x^3 + \frac{3c}{4}x + \frac{1}{2},\qquad
(\rho_4^2)''(x, c) = -\frac{3c^3}{4}x^2 + \frac{3c}{4}.
\]

Note that the larger the parameter $d$ is, the higher the approximation accuracy is, but this generates additional computation cost. So we only consider the case of $d = 2$, that is, $\rho_4^2(x, c)$.

In the same year, Yuan et al. [18] proposed a three-order spline interpolation polynomial approximation function and obtained a three-order spline smooth support vector machine (TSSVM) model:

\[
\rho_5(x, c) = \begin{cases}
x, & x > \frac{1}{c}, \\[2pt]
-\frac{c^2}{6}x^3 + \frac{c}{2}x^2 + \frac{1}{2}x + \frac{1}{6c}, & 0 < x \le \frac{1}{c}, \\[2pt]
\frac{c^2}{6}x^3 + \frac{c}{2}x^2 + \frac{1}{2}x + \frac{1}{6c}, & -\frac{1}{c} < x \le 0, \\[2pt]
0, & x \le -\frac{1}{c}.
\end{cases} \tag{13}
\]

The first- and second-order derivatives of $\rho_5(x, c)$ are respectively

\[
\rho_5'(x, c) = \begin{cases}
1, & x > \frac{1}{c}, \\[2pt]
-\frac{c^2}{2}x^2 + cx + \frac{1}{2}, & 0 < x \le \frac{1}{c}, \\[2pt]
\frac{c^2}{2}x^2 + cx + \frac{1}{2}, & -\frac{1}{c} < x \le 0, \\[2pt]
0, & x \le -\frac{1}{c},
\end{cases}
\qquad
\rho_5''(x, c) = \begin{cases}
c(1 - c|x|) \ge 0, & |x| < \frac{1}{c}, \\[2pt]
0, & |x| \ge \frac{1}{c}.
\end{cases}
\]

Lemma 3.3. [18] Let $\rho_5(x, c)$ be defined by (13). Then
(1) $\rho_5(x, c)$ is second-order smooth with respect to $x$;
(2) $\rho_5(x, c) \ge x_+$ for all $x \in R$;
(3) for any $x \in R$, one has $\rho_5(x, c)^2 - x_+^2 \le \frac{1}{24c^2}$.

In 2013, Wu et al. [19] introduced a three-order piecewise polynomial approximation function:

\[
\rho_6(x, c) = \begin{cases}
0, & x \le -\frac{1}{4c}, \\[2pt]
\frac{8c^2}{3}\big(x + \frac{1}{4c}\big)^3, & -\frac{1}{4c} < x \le 0, \\[2pt]
x + \frac{8c^2}{3}\big(\frac{1}{4c} - x\big)^3, & 0 < x \le \frac{1}{4c}, \\[2pt]
x, & x > \frac{1}{4c}.
\end{cases} \tag{14}
\]

The first- and second-order derivatives of $\rho_6(x, c)$ are respectively

\[
\rho_6'(x, c) = \begin{cases}
0, & x \le -\frac{1}{4c}, \\[2pt]
8c^2\big(x + \frac{1}{4c}\big)^2, & -\frac{1}{4c} < x \le 0, \\[2pt]
1 - 8c^2\big(\frac{1}{4c} - x\big)^2, & 0 < x \le \frac{1}{4c}, \\[2pt]
1, & x > \frac{1}{4c},
\end{cases}
\qquad
\rho_6''(x, c) = \begin{cases}
16c^2\big(\frac{1}{4c} - |x|\big) \ge 0, & |x| \le \frac{1}{4c}, \\[2pt]
0, & |x| > \frac{1}{4c}.
\end{cases}
\]

Lemma 3.4. [19] Let $\rho_6(x, c)$ be defined by (14). Then
(1) $\rho_6(x, c)$ is second-order smooth with respect to $x$;
(2) $\rho_6(x, c) \ge x_+$ for all $x \in R$;
(3) for any $x \in R$, one has $\rho_6(x, c)^2 - x_+^2 \le \frac{1}{385c^2}$.

In the same year, Ding et al. [20] introduced a cluster of polynomial approximation functions and obtained a polynomial smooth twin support vector regression:

\[
\rho_7^n(x, c) = \begin{cases}
x, & x \ge \frac{1}{c}, \\[2pt]
\frac{1}{2c}\Big(1 - \sum_{l=1}^{n}\frac{(2l-3)!!}{(2l)!!}(1 - c^2x^2)^l\Big) + \frac{1}{2}x, & |x| < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c},
\end{cases} \tag{15}
\]

where $n = 2, 3, \ldots$ and, by convention, $(-1)!! = 0!! = 1$. The first- and second-order derivatives of $\rho_7^n(x, c)$ are respectively

\[
(\rho_7^n)'(x, c) = \begin{cases}
1, & x \ge \frac{1}{c}, \\[2pt]
\frac{cx}{2}\Big(1 + \sum_{l=1}^{n-1}\frac{(2l-1)!!}{(2l)!!}(1 - c^2x^2)^l\Big) + \frac{1}{2}, & |x| < \frac{1}{c}, \\[2pt]
0, & x \le -\frac{1}{c},
\end{cases}
\]

\[
(\rho_7^n)''(x, c) = \begin{cases}
0, & |x| \ge \frac{1}{c}, \\[2pt]
\frac{c}{2}\Big(1 + \sum_{l=1}^{n-1}\frac{(2l-1)!!}{(2l)!!}(1 - c^2x^2)^l\Big) - \frac{c^3x^2}{2}\sum_{l=1}^{n-1}\frac{(2l-1)!!}{(2l-2)!!}(1 - c^2x^2)^{l-1}, & |x| < \frac{1}{c}.
\end{cases}
\]

Lemma 3.5. [15] Let $\rho_7^n(x, c)$ be defined by (15). Then
(1) $\rho_7^n(x, c)$ is $n$-order smooth with respect to $x$;
(2) $\lim_{n \to \infty}\max_{x}(\rho_7^n(x, c) - x_+) = 0$.

In 2014, a quadratic polynomial smooth approximation function was proposed in [21] as follows:

\[
\rho(x, \alpha) = \frac{\alpha}{4}x^2 + \frac{1}{2}x + \frac{1}{4\alpha},\ x \in R,
\]

where $\alpha \in R$, $\alpha \ne 0$, is a smoothing parameter. Letting $c = \alpha > 0$, one has

\[
\rho_8(x, c) = \frac{c}{4}x^2 + \frac{1}{2}x + \frac{1}{4c} = \frac{c}{4}\Big(x + \frac{1}{c}\Big)^2,\ x \in R. \tag{16}
\]

The first- and second-order derivatives of $\rho_8(x, c)$ are respectively

\[
\rho_8'(x, c) = \frac{c}{2}\Big(x + \frac{1}{c}\Big),\qquad \rho_8''(x, c) = \frac{c}{2}.
\]

Theorem 3.1. Let $\rho_8(x, c)$ be defined by (16). Then
(1) $\rho_8(x, c)$ is smooth of arbitrary order with respect to $x$;
(2) $\rho_8(x, c) \ge x_+$ for all $x \in R$;
(3) $\lim_{x \to \frac{1}{c}}(\rho_8(x, c) - x_+) = 0$ and $\lim_{x \to -\frac{1}{c}}(\rho_8(x, c) - x_+) = 0$.

Proof. The first conclusion is obvious. Since

\[
\rho_8(x, c) - x_+ = \begin{cases}
\frac{c}{4}\big(x - \frac{1}{c}\big)^2, & x > 0, \\[2pt]
\frac{c}{4}\big(x + \frac{1}{c}\big)^2, & x \le 0,
\end{cases} \tag{17}
\]

we can obtain the second conclusion. From (17), we get $\lim_{x \to \frac{1}{c}}(\rho_8(x, c) - x_+) = \lim_{x \to \frac{1}{c}}\frac{c}{4}(x - \frac{1}{c})^2 = 0$ for $x > 0$ and $\lim_{x \to -\frac{1}{c}}(\rho_8(x, c) - x_+) = \lim_{x \to -\frac{1}{c}}\frac{c}{4}(x + \frac{1}{c})^2 = 0$ for $x \le 0$, which indicates that the third conclusion is true.

4 The Influences for SPTSVM

In this section, in order to illustrate the influences of the eight smooth approximation functions for linear SPTSVM, we perform a series of comparative experiments on binary classification problems, comparing classification accuracy and running time on 10 datasets taken from the UCI database [26] and 6 datasets taken from the NDC database [27], the latter described in Table 1.
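Before turning to the experiments, the basic properties stated above can be spot-checked numerically. The sketch below (NumPy assumed; function names are ours) implements $\rho_1$ from (9), in an algebraically equivalent overflow-safe form, and $\rho_2$ from (10), and verifies that both dominate $x_+$ with worst-case gaps $\frac{\ln 2}{c}$ and $\frac{1}{4c}$ attained at $x = 0$:

```python
import numpy as np

def plus(x):
    return np.maximum(x, 0.0)

# rho1(x,c) = x + (1/c) ln(1 + exp(-c x)); rewritten as
# x_+ + (1/c) ln(1 + exp(-c|x|)), the same value but without
# overflow of exp(-c x) for large negative x.
def rho1(x, c):
    return plus(x) + np.log1p(np.exp(-c * np.abs(x))) / c

# rho2(x,c): quadratic piecewise polynomial of QPSSVM, eq. (10).
def rho2(x, c):
    return np.where(x >= 1.0 / c, x,
           np.where(x <= -1.0 / c, 0.0,
                    c / 4.0 * x**2 + x / 2.0 + 1.0 / (4.0 * c)))

x = np.linspace(-3.0, 3.0, 2001)   # grid containing x = 0
for c in (1.0, 10.0, 100.0):
    assert np.all(rho1(x, c) >= plus(x))   # both sit above x_+
    assert np.all(rho2(x, c) >= plus(x))
    # worst-case gaps shrink like 1/c, attained at x = 0
    assert np.isclose(np.max(rho1(x, c) - plus(x)), np.log(2.0) / c)
    assert np.isclose(np.max(rho2(x, c) - plus(x)), 1.0 / (4.0 * c))
```

The same pattern extends to the other piecewise functions; increasing $c$ tightens every approximation, which is the behavior the experiments below exploit.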
Among the 10 UCI datasets, the Iris, Vehicle, Waveform and Balance datasets all have 3 classes; we choose the latter two classes of each for the experiments.

Table 1: Description of NDC datasets

Dataset   | Training data | Test data | Features
NDC-200   | 200           | 40        | 32
NDC-500   | 500           | 100       | 32
NDC-700   | 700           | 140       | 32
NDC-1000  | 1000          | 200       | 32
NDC-2000  | 2000          | 400       | 32
NDC-3000  | 3000          | 600       | 32

All experiments are implemented in the Matlab 7.11.0 (R2010b) environment on a PC with an Intel P4 processor (2.30 GHz) and 4 GB RAM. SPTSVM is implemented by the Newton-Armijo algorithm, that is, Algorithm 1, with the five-fold cross-validation method. We know that the choice of parameters has a great impact on the performance of a classifier; in order to facilitate comparison, we take $\varepsilon = 10^{-3}$ and $T = 50$ in Algorithm 1 and $c_1 = c_2 = c_3 = c_4 = c$ after grid searching from $\{2^{-8}, \ldots, 2^{8}\}$. The classification accuracy is defined by

\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN},
\]

where $TP$, $TN$, $FP$ and $FN$ denote the numbers of true positives, true negatives, false positives and false negatives, respectively. The experiment results on the 10 UCI datasets are listed in Table 2 and those on the 6 NDC datasets in Table 3, in which the seventh smooth approximation function $\rho_7^n(x, c)$ is taken as $\rho_7^4(x, c)$. In addition, in order to explain the influence of the order $n$ on $\rho_7^n(x, c)$, we perform comparative experiments with $n = 2, 3, 4$, respectively; the results are listed in Table 4. It should also be pointed out that the fourth smooth approximation function $\rho_4^d(x, c)$ is not commonly used, so we only take $d = 2$, that is, $\rho_4^2(x, c)$.
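The accuracy measure just defined can be computed directly from the four confusion-matrix counts; a small sketch with made-up $\pm 1$ labels (as in Section 2):

```python
# Accuracy = (TP + TN) / (TP + FP + TN + FN), labels in {+1, -1}.
def accuracy(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == -1 and p == -1 for t, p in zip(y_true, y_pred))
    fp = sum(t == -1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == -1 for t, p in zip(y_true, y_pred))
    return (tp + tn) / (tp + fp + tn + fn)

# Made-up labels: 4 of 6 predictions are correct.
y_true = [1, 1, -1, -1, 1, -1]
y_pred = [1, -1, -1, 1, 1, -1]
acc = accuracy(y_true, y_pred)  # -> 4/6
```

Since every sample falls into exactly one of the four counts, the denominator equals the total number of test samples.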
From Table, we an see that on the lassifiation auray () ρ (x, ) is the best on Breast Waveform two datasets the next best on the rest datasets exept to Liver dataset; () ρ (x, ) ρ 3 (x, ) are ompletely the same, whih are the best on Breast, Pima Waveform three datasets, are slightly worse than ρ (x, ) on the rest datasets, are the worst on Liver dataset; (3) ρ 4(x, ) is the best on Balane, Liver, Iris Vehile four datasets is the worst on Breast, Pima Waveform three datasets; (4) ρ 5 (x, ) is almost the same as ρ (x, ) ρ 3 (x, ); (5) ρ 4 7(x, ) is the same as ρ (x, ) ρ 3 (x, ) exept to Liver dataset learly better than ρ (x, ), ρ 3 (x, ) ρ 5 (x, ) on Liver dataset; (6) although ρ 6 (x, ) ρ 8 (x, ) are the best on WBC Vote two datasets, respetively, but generally speaking, ρ 6 (x, ), ρ 8 (x, ) ρ 4 7(x, ) are omparable. On the running time, ρ 8 (x, ) osts the least time among these datasets exept for Vehile Waveform two datasets ρ 4(x, ) osts the most time exept for Liver dataset. From Table 3, we an see that () ρ (x, ) has the highest lassifiation auray on all datasets; () ρ (x, ), ρ 3 (x, ), ρ 5 (x, ) ρ 4 7(x, ) have the same the lassifiation auraies on all datasets, whih are slightly lower than the orresponding lassifiation auraies of ρ (x, ), respetively; (3) ρ 6 (x, ) ρ 8 (x, ) have the almost same the lassifiation auraies on all datasets, whih are omparable with the lassifiation auraies of ρ (x, ), ρ 3 (x, ), ρ 5 (x, ) ρ 4 7(x, ); (4) ρ 4(x, ) has the worst lassifiation auraies exept for NDC-700 dataset the longest running times exept for NDC-700 NDC-000 datasets; (5) ρ (x, ) ρ 3 (x, ) have the shortest running times on NDC-00, NDC-700 NDC- 3000 three datasets. From Table 4, we an see that ρ 7(x, ), ρ 3 7(x, ) ρ 4 7(x, ) have the same lassifiation auraies on 0 UCI datasets, just the running times are different, whih indiates that the lassifiation auray of SPTSVM may have only small hanges with inreasing of the order n. 
On the basis of the above analysis, we can conclude that, when choosing a smooth approximation function in order to improve the classification accuracy of SPTSVM, in general we can firstly consider $\rho_1(x, c)$, secondly consider one of $\rho_2(x, c)$, $\rho_3(x, c)$ and $\rho_5(x, c)$, thirdly consider
Table 2: Comparison results on the 10 UCI datasets. Columns in order: $\rho_1(x,c)$, $\rho_2(x,c)$, $\rho_3(x,c)$, $\rho_4^2(x,c)$, $\rho_5(x,c)$, $\rho_6(x,c)$, $\rho_8(x,c)$, $\rho_7^4(x,c)$; for each dataset, the first row gives the accuracy (%) and the second row the running time (s).

Balance (625 x 4)
  9.8 | 90.3509 | 90.3509 | 9.985 | 90.3509 | 90.877 | 9.8 | 90.3509
  .6683 | 0.9795 | .64 | 3.5855 | 0.877 | .469 | 0.703 | 0.803
Breast (277 x 9)
  7.3637 | 7.6464 | 7.6364 | 59.6364 | 7.0000 | 7.6364 | 70.5455 | 7.6364
  0.8860 | 0.687 | 0.7097 | .643 | 0.5080 | 0.694 | 0.453 | .087
Heart (303 x 13)
  8.5000 | 8.967 | 8.967 | 75.83333 | 8.967 | 8.6667 | 8.6667 | 8.967
  0.83 | .8574 | 0.5638 | .5075 | 0.5008 | 0.6890 | 0.46 | 0.9584
Pima (768 x 8)
  76.0784 | 76.09 | 76.09 | 7.94 | 76.09 | 75.94 | 75.448 | 76.09
  .96 | .074 | .668 | 3.495 | .0680 | .8637 | 0.70 | .6805
Vote (435 x 16)
  96.0000 | 94.0000 | 94.0000 | 94.0000 | 94.0000 | 96.0000 | 96.50000 | 94.0000
  0.6796 | 0.505 | 0.5667 | .3384 | 0.4773 | 0.5500 | 0.443 | .900
Liver (345 x 6)
  59.655 | 53.793 | 53.793 | 6.7586 | 54.379 | 60.0000 | 60.3448 | 60.0000
  .5990 | .038 | .403 | 0.976 | .973 | 0.8570 | 0.7598 | .370
WBC (600 x 9)
  97.49 | 96.4706 | 96.4706 | 95.6 | 96.4706 | 97.309 | 44.0336 | 96.4706
  4.595 | 0.749 | 0.74 | .956 | 0.6794 | .05 | 0.5944 | .7458
Iris (150 x 4)
  93.0000 | 93.0000 | 93.0000 | 96.0000 | 93.0000 | 9.0000 | 9.0000 | 93.0000
  0.968 | 0.665 | 0.788 | 0.54 | 0.774 | 0.70 | 0.55 | 0.750
Vehicle (50 8)
  86.0000 | 86.0000 | 86.0000 | 93.0000 | 86.0000 | 86.0000 | 86.0000 | 86.0000
  0.504 | 0.4405 | 0.3836 | .4558 | 0.4668 | 0.563 | 0.569 | 0.6887
Waveform (50 )
  93.0000 | 93.0000 | 93.0000 | 90.0000 | 93.0000 | 9.0000 | 9.0000 | 93.0000
  0.8 | 0.4066 | 0.7379 | 0.9567 | 0.7906 | 0.4984 | 0.4839 | 0.5858

Table 3: Comparison results on the 6 NDC datasets with the eight smooth approximation functions. Columns as in Table 2; for each dataset, the first row gives the accuracy (%) and the second row the running time (s).

NDC-200
  94.3590 | 9.3077 | 9.3077 | 8.053 | 9.3077 | 9.8 | 9.8 | 9.3077
  0.778 | 0.5599 | 0.554 | .0489 | 0.6099 | 0.865 | 0.597 | 0.679
NDC-500
  93.9394 | 9.33 | 9.33 | 9.77 | 9.33 | 93.33 | 9.773 | 9.33
  .987 | .045 | .3064 | 3.093 | .378 | .3987 | .93 | .7377
NDC-700
  95.857 | 94.743 | 94.743 | 95.743 | 94.743 | 95.0000 | 94.743 | 94.743
  .446 | .3848 | .93 | 3.9989 | 6.389 | 5.8373 | 4.338 | .963
NDC-1000
  96.484 | 96.84 | 96.84 | 93.9698 | 96.84 | 96.389 | 95.9799 | 96.84
  3.7086 | 4.3996 | 5.998 | 9.0389 | 6.449 | 7.9403 | .893 | .90
NDC-2000
  97.0500 | 96.60000 | 96.6000 | 8.053 | 96.6000 | 96.5000 | 96.5500 | 96.6000
  0.9988 | 3.9 | 3.64 | 3.084 | .0087 | 8.650 | 5.649 | 9.376
NDC-3000
  97.566 | 97.084 | 97.084 | 95.864 | 97.084 | 97.6 | 97.095 | 97.084
  7.7454 | 6.330 | 7.09 | 47.598 | 33.484 | 34.4930 | 9.605 | 37.665

Table 4: Comparison results with $\rho_7^2(x,c)$, $\rho_7^3(x,c)$ and $\rho_7^4(x,c)$. For each dataset, the first row gives the accuracy (%) and the second row the running time (s).

Balance (625 x 4)
  90.3509 | 90.3509 | 90.3509
  .4 | .9469 | 0.803
Breast (277 x 9)
  7.6364 | 7.6364 | 7.6364
  0.438 | 0.646 | .087
Heart (303 x 13)
  8.967 | 8.967 | 8.967
  0.760 | .4943 | 0.9585
Pima (768 x 8)
  76.09 | 76.09 | 76.09
  .08 | .044 | .6805
Vote (435 x 16)
  94.0000 | 94.0000 | 94.0000
  0.7866 | 0.76 | .900
Liver (345 x 6)
  60.0000 | 60.0000 | 60.0000
  .5035 | .339 | .370
WBC (600 x 9)
  96.4706 | 96.4706 | 96.4706
  .548 | .584 | .7459
Iris (150 x 4)
  93.0000 | 93.0000 | 93.0000
  0.557 | 0.673 | 0.750
Vehicle (50 8)
  86.0000 | 86.0000 | 86.0000
  0.579 | 0.655 | 0.6887
Waveform (50 )
  93.0000 | 93.0000 | 93.0000
  0.6079 | 0.6053 | 0.5858
one of $\rho_6(x, c)$, $\rho_8(x, c)$ and $\rho_7^n(x, c)$, and finally consider $\rho_4^2(x, c)$. Of course, different smooth approximation functions will bring different classification accuracies, so we should choose a suitable smooth approximation function for the underlying dataset.

5 Conclusions

In this paper, we study the influence of eight smooth approximation functions for SPTSVM on classification accuracy and running time by means of 10 UCI datasets and 6 NDC datasets. From the experiment results, we can get a choice order of the eight approximation functions in general. But we know that different approximation functions may bring different classification accuracies, so we should choose a suitable smooth approximation function for the underlying dataset. As stated in the Introduction, GEPSVM, TSVM and PTSVM are three representative methods of NHSVM and all the other NHSVM methods are improved versions based on them. In this paper, we only discuss the influence of eight known smooth approximation functions for the smooth version of PTSVM, and only for the linear version of SPTSVM. In the next step of our work, we will: firstly, investigate the influence of these approximation functions for smooth versions of GEPSVM and TSVM, respectively; secondly, consider the nonlinear version of SPTSVM; thirdly, commit to finding more smooth approximation functions and comparing the accuracies with which they approximate the plus function.

References:

[1] O.L. Mangasarian, E.W. Wild, Multisurface proximal support vector machine classification via generalized eigenvalues, IEEE Trans Pattern Anal Mach Intell. 28(1), 2006, pp. 69-74.
[2] Jayadeva, R. Khemchandani, S. Chandra, Twin support vector machines for pattern classification, IEEE Trans Pattern Anal Mach Intell. 29(5), 2007, pp. 905-910.
[3] X. Chen, J. Yang, Q. Ye, J. Liang, Recursive projection twin support vector machine via within-class variance minimization, Pattern Recognition. 44, 2011, pp. 2643-2655.
[4] Y.H. Shao, C.H. Zhang, X.B. Wang, N.Y. Deng, Improvements on twin support vector machines, IEEE Transactions on Neural Networks.
22(6), 2011, pp. 962-968.
[5] Q. Ye, C. Zhao, N. Ye, Y. Chen, Multi-weight vector projection support vector machines, Pattern Recognition Letters. 31(13), 2010, pp. 2006-2011.
[6] Y.H. Shao, N.Y. Deng, Z.M. Yang, Least squares recursive projection twin support vector machine for classification, Pattern Recogn. 45(6), 2012, pp. 2299-2307.
[7] Y.H. Shao, W.J. Chen, W.B. Huang, Z.M. Yang, N.Y. Deng, The best separating decision tree twin support vector machine for multi-class classification, Procedia Comput Sci. 17, 2013, pp. 1032-1038.
[8] Y.H. Shao, Z. Wang, W.J. Chen, N.Y. Deng, A regularization for the projection twin support vector machine, Knowl-Based Syst. 37, 2013, pp. 203-210.
[9] Y.H. Shao, C.H. Zhang, Z.M. Yang, L. Jing, N.Y. Deng, An e-twin support vector machine for regression, Neural Comput Applic. 23(1), 2013, pp. 175-185.
[10] S.F. Ding, X.P. Hua, Recursive least squares projection twin support vector machines for nonlinear classification, Neurocomputing. 130, 2014, pp. 3-9.
[11] S.F. Ding, X.P. Hua, J.Z. Yu, An overview on nonparallel hyperplane support vector machine algorithms, Neural Comput Applic. 25, 2014, pp. 975-982.
[12] M. Arun Kumar, M. Gopal, Application of smoothing technique on twin support vector machines, Pattern Recognition Letters. 29, 2008, pp. 1842-1848.
[13] X.X. Zhang, L.Y. Fan, Application of smoothing technique on projective TSVM, International Journal of Applied Mathematics and Machine Learning. 2(1), 2015, pp. 17-45.
[14] I. Zhang, A smoothing-out technique for min-max optimization, Math. Program. 19, 1980, pp. 61-77.
[15] Y.B. Yuan, J. Yan, C.X. Xu, Polynomial smooth support vector machine, Chinese Journal of Computers. 28(1), 2005, pp. 9-17.
[16] Y.B. Yuan, T.Z. Huang, A polynomial smooth support vector machine for classification, Proceedings of the 1st International Conference on Advanced Data Mining and Applications (ADMA 2005). 2005, pp. 157-164.
[17] J.Z. Xiong, J.L. Hu, H.Q. Yuan, Research on a new class of functions for smoothing support vector machines, Acta Electronica Sinica. 35(2), 2007, pp. 366-370.
[18] Y.B. Yuan, W.G. Fan, D.M. Pu, Spline function smooth support vector machine for classification, Journal of Industrial and Management Optimization. 3(3), 2007, pp. 529-542.
[19] Q. Wu, J.L. Fan, Smooth support vector machine based on piecewise function, ScienceDirect. 20(5), 2013, pp. 122-128.
[20] S.F. Ding, H.J. Huang, R. Nie, Forecasting method of stock price based on polynomial smooth twin support vector regression, Springer-Verlag Berlin Heidelberg. 7995, 2013, pp. 96-105.
[21] S. Balasundaram, Deepak Gupta, Kapil, Lagrangian support vector regression via unconstrained convex minimization, Neural Networks. 51, 2014, pp. 67-79.
[22] Y.J. Lee, O.L. Mangasarian, A smooth support vector machine for classification, Computational Optimization and Applications. 20(1), 2001, pp. 5-22.
[23] X. Chen, J. Yang, J. Liang, Q. Ye, Smooth twin support vector regression, Neural Computing and Applications. 21(3), 2012, pp. 505-513.
[24] Z. Wang, Y. Shao, T. Wu, A GA-based model selection for smooth twin parametric-margin support vector machine, Pattern Recognition. 46, 2013, pp. 2267-2277.
[25] Y.Q. Liu, S.Y. Liu, M.G. Gu, Self-training polynomial smooth semi-supervised support vector machines, Journal of System Simulation. 21(18), 2009, pp. 5740-5743.
[26] C.L. Blake, C.J. Merz, UCI Repository for Machine Learning Databases, 1998. http://www.ics.uci.edu/~mlearn/MLRepository.html.
[27] D.R. Musicant, NDC: Normally distributed clustered datasets, 1998. http://www.cs.wisc.edu/~musicant/data/ndc.