An Iterative Modified Kernel for Support Vector Regression


Fengqing Han, Zhengxia Wang, Ming Le and Zhixiang Zhou
School of Science, Chongqing Jiaotong University, Chongqing City, China

Abstract — In order to improve the performance of support vector regression, a new method for modifying the kernel function is proposed. In this method the information of the whole sample set is incorporated into the kernel function by a conformal mapping, so the kernel function is data-dependent. Starting from random initial kernel parameters, the kernel is modified iteratively until a satisfactory effect is achieved. Compared with the conventional model, the improved approach does not require careful selection of the kernel parameters. Simulation results show that the improved approach has better learning ability and forecasting precision than the traditional model.

Keywords — support vector regression, data-dependent, kernel, iteration

I. INTRODUCTION

In recent years a new pattern classification algorithm, the Support Vector Machine (SVM) proposed by Vapnik [1], has become a powerful methodology for solving a wide variety of problems in nonlinear classification, function estimation, and density estimation. Unlike traditional neural network models, which minimize the empirical training error, the SVM implements the structural risk minimization principle, which minimizes the training error together with a confidence-interval term. This results in good generalization error. Because of its good properties, such as a globally optimal solution and good learning ability on small samples, the SVM has received increasing attention in recent years [2-4]. Moreover, it has been successfully extended to support vector regression (SVR), especially for nonlinear time series [5-7].

The performance of the SVM largely depends on the kernel. Smola [8] elucidated the relation between the SVM kernel method and standard regularization theory. However, there is no theory concerning how to choose a good kernel function in a data-dependent way. In [9], the authors propose a method of modifying a kernel to improve the performance of an SVM classifier. It is based on the Riemannian geometric structure induced by the kernel function. In order to increase separability, the idea is to enlarge the spatial resolution around the separating boundary surface with a conformal mapping. In this method a primary kernel is first used to obtain the support vectors; the kernel is then modified conformally in a data-dependent way using the information of those support vectors, and the final classifier is trained with the modified kernel. Inspired by this idea, Liang Yan-chun [10] applied the modified kernel to SVR and forecast financial time series. These methods achieve better performance than the conventional SVM, but the parameters of the modified kernel must be chosen carefully.

The goal of this paper is to propose a new method to train the SVM which modifies the kernel function repeatedly. Unlike traditional methods, which select the kernel parameters carefully, the parameters of this algorithm are set randomly. Simulation experiments show that the new method is clearly superior to the traditional SVM in prediction precision.

The remainder of this paper is organized as follows. Section 2 briefly introduces the learning of SVR. Section 3 provides background on the data-dependent kernel. Our method of training SVR by iterative learning is introduced in Section 4. Section 5 presents computational results showing the effectiveness of our method. Section 6 concludes our work.

II. SUPPORT VECTOR REGRESSION

Let $\{(x_i, y_i)\},\ i = 1, 2, \ldots, m$ be a given set of training data, where $(x_i, y_i) \in R^n \times R$.
The output of the SVR is

$$f(x) = \langle w, \phi(x) \rangle + b, \quad (1)$$

where $w$ is the weight vector, $b$ the bias, and $\phi(x)$ the nonlinear mapping from the input space $S$ to the high-dimensional feature space $F$; $\langle \cdot, \cdot \rangle$ denotes the inner product. The commonly used $\varepsilon$-insensitive loss function introduced by Vapnik is

$$L_\varepsilon(x) = \begin{cases} |f(x) - y| - \varepsilon, & |f(x) - y| \ge \varepsilon, \\ 0, & |f(x) - y| < \varepsilon. \end{cases} \quad (2)$$

In order to train $w$ and $b$, the following function is minimized:
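As a quick illustration of Eq. (2), here is a minimal NumPy sketch of the $\varepsilon$-insensitive loss; the function and argument names are my own.

```python
import numpy as np

def eps_insensitive_loss(f_x, y, eps):
    # Eq. (2): zero inside the eps-tube around the target,
    # |f(x) - y| - eps outside it.
    return np.maximum(np.abs(f_x - y) - eps, 0.0)
```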

$$\min\ Z = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}(\xi_i + \xi_i^*), \quad (3)$$

$$\text{s.t.}\quad y_i - \langle w, \phi(x_i) \rangle - b \le \varepsilon + \xi_i,\qquad -y_i + \langle w, \phi(x_i) \rangle + b \le \varepsilon + \xi_i^*,\qquad \xi_i, \xi_i^* \ge 0,\ i = 1, \ldots, m,$$

where $C$ is the regularization constant determining the trade-off between the empirical error and the regularization term. After the introduction of positive slack variables and Lagrange multipliers, (3) is equivalent to a standard quadratic programming (QP) problem. After (3) is optimized, (1) can be rewritten as

$$f(x) = \sum_{x_i \in SV} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b, \quad (4)$$

where $\alpha_i$ and $\alpha_i^*$ are the optimized solutions of the QP problem and $k(x, x')$ is the kernel function

$$k(x, x') = \langle \phi(x), \phi(x') \rangle. \quad (5)$$

It should be pointed out that not every function can be taken as a kernel function in the SVR; it has been proved that the kernel function must satisfy the conditions of Mercer's theorem. The kernel function plays an important role in the SVR and has a great effect on prediction precision. In the next section the kernel function is modified by a conformal mapping, which makes it data-dependent.

III. DATA-DEPENDENT KERNEL

In the traditional SVM and SVR there is no theory concerning how to choose the kernel function in a data-dependent way, while some time series, such as financial time series and weather data, are inherently noisy, non-stationary, and deterministically chaotic. In order to improve forecasting precision, it is necessary to redefine the kernel function from the training data. In this section background on the data-dependent kernel is provided.

The kernel function is modified in a data-dependent way based on information geometry [9-12]. From the point of view of geometry, the nonlinear mapping defines an embedding of the input space $S$ into the feature space $F$ as a curved submanifold. Generally $F$ is a reproducing kernel Hilbert space (RKHS), which is a subspace of a Hilbert space. Thus a Riemannian metric [9] can be induced in the input space, and it can be expressed in closed form in terms of the kernel:

$$g_{ij}(x) = \Big\langle \frac{\partial \phi(x)}{\partial x_i}, \frac{\partial \phi(x)}{\partial x_j} \Big\rangle = \frac{\partial^2 k(x, x')}{\partial x_i\, \partial x'_j}\bigg|_{x' = x}. \quad (6)$$

The volume form in the feature space is defined as

$$dV = \sqrt{g(x)}\; dx_{(1)}\, dx_{(2)} \cdots dx_{(n)}, \quad (7)$$

where $g(x) = \det|g_{ij}(x)|$ and $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ are the components of $x$; the factor $\sqrt{g(x)}$ represents how a local area is magnified in $F$ under the mapping $\phi(x)$ [9].

After a conformal mapping is introduced, the new kernel function is defined as

$$\tilde{k}(x, x') = c(x)\, c(x')\, k(x, x'), \quad (8)$$

where $c(x)$ is a positive conformal mapping function. It is easy to see that the new kernel $\tilde{k}(x, x')$ still satisfies the conditions of Mercer's theorem. The conformal mapping is taken as [9]

$$c(x) = \sum_{i \in I} h \exp\big(-\|x - x_i\|^2/(2\tau^2)\big), \quad (9)$$

where $I$ is the set of support vectors [9]. For the new kernel function $\tilde{k}(x, x')$, the Riemannian metric becomes

$$\tilde{g}_{ij}(x) = \frac{\partial c(x)}{\partial x_i}\frac{\partial c(x)}{\partial x_j} + c(x)^2\, g_{ij}(x). \quad (10)$$

One of the typical kernel functions is the Gaussian RBF kernel

$$k(x, x') = \exp\big(-\|x - x'\|^2/(2\sigma^2)\big). \quad (11)$$

In this case we have

$$g_{ij}(x) = \delta_{ij}/\sigma^2, \quad (12)$$

where $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \ne j$. In the neighborhood of a support vector $x_i$ we have

$$\tilde{g}(x) = (h^2/\sigma^2)^n \exp(-nr^2/\tau^2)\big(1 + \sigma^2 r^2/\tau^4\big), \quad (13)$$

where $r = \|x - x_i\|$ is the Euclidean distance between $x$ and $x_i$.
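To make the construction concrete, the following NumPy sketch implements the primary RBF kernel (11), the conformal factor (9), and the modified kernel (8). The names are my own, and the factors of 2 in the exponents follow the usual Gaussian convention, since those constants are garbled in the source; Section 4 reuses the same construction with the center set taken to be the whole training set.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    # Primary Gaussian RBF kernel, Eq. (11): exp(-||x - x'||^2 / (2 sigma^2)).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def conformal_factor(X, centers, h, tau):
    # Conformal mapping, Eq. (9): c(x) = sum_i h * exp(-||x - x_i||^2 / (2 tau^2)),
    # summed over the center set (support vectors in [9]; all training data in Sec. 4).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return (h * np.exp(-d2 / (2.0 * tau ** 2))).sum(axis=1)

def modified_kernel(X, Y, centers, sigma, h, tau):
    # Data-dependent kernel, Eq. (8): k~(x, x') = c(x) c(x') k(x, x').
    return (conformal_factor(X, centers, h, tau)[:, None]
            * conformal_factor(Y, centers, h, tau)[None, :]
            * rbf_kernel(X, Y, sigma))
```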

IV. ITERATIVE MODIFIED ALGORITHM

To improve the performance of an SVM classifier, [9] modifies the kernel function so that the factor $\sqrt{\tilde{g}(x)}$ is enlarged around the support vectors. In SVR, by contrast, the kernel function should be modified so that $\sqrt{\tilde{g}(x)}$ is compressed near the support vectors. To ensure that the magnification factor is compressed, let

$$c(x) = \sum_{i \in I} h \exp\big(-\|x - x_i\|^2/(2\tau^2)\big), \quad (14)$$

$$k_s(x, x') = c^s(x)\, c^s(x')\, k(x, x'), \quad (15)$$

$$g^s_{ij}(x) = \frac{\partial^2 k_s(x, x')}{\partial x_i\, \partial x'_j}\bigg|_{x' = x}, \quad (16)$$

where $s$ is a positive integer and $I$ is now the whole training data set. The summation in (14) runs over all the training data, so the training procedure is simplified (the support vectors need not be identified first). In this paper the RBF kernel is adopted. For a point $x$ near a training point $x_i$ we have

$$g_s(x) = \det|g^s_{ij}(x)| = (h^{2s}/\sigma^2)^n \exp(-nsr^2/\tau^2)\big(1 + s^2\sigma^2 r^2/\tau^4\big). \quad (17)$$

It is easy to check the following conclusions:

1. When $h < \min\{\sigma, 1\}$, $g_s(x)$ is compressed near the training data $x_i$ for $s = 1, 2, \ldots$.
2. When $1 \le h < e$, $g_s(x)$ is compressed near the training data $x_i$ for $s > M$, for a sufficiently large integer $M$.
3. $g_s(x)$ is compressed for $x$ far away from every training point $x_i$.

In summary, the training process of the new method is as follows; a code sketch of this loop is given at the end of Section V-A below.

Step 1. Give random values to $\sigma, \varepsilon, C, h, \tau$ and the error tolerance;
Step 2. Let $k_0(x, x') = k(x, x')$ and $j = 0$;
Step 3. Train the SVR with $k_j(x, x')$ and compute the figure of merit $\delta = \sum_i (y_i - f(x_i))^2 \big/ \sum_i (y_i - \bar{y})^2$;
Step 4. If $\delta >$ error, let $k_{j+1}(x, x') = c(x)\, c(x')\, k_j(x, x')$ and $j = j + 1$, and return to Step 3; otherwise exit.

After the procedure finishes, the modified kernel is

$$k_s(x, x') = c^s(x)\, c^s(x')\, k(x, x'), \quad (18)$$

where $s$ is the final iteration number.

V. SIMULATION

In this section we run simulations to evaluate the performance of the method, comparing the iterative modified SVR with the traditional SVR on the same examples and with the same training parameters in the optimization problem (3). To assess the approximation results and the generalization ability, a figure of merit is defined as

$$\delta = \sum_{i=1}^{m} (y_i - f(x_i))^2 \Big/ \sum_{i=1}^{m} (y_i - \bar{y})^2, \quad (19)$$

where $\{(x_i, y_i),\ i = 1, \ldots, m\}$ is the test set and $\bar{y} = \sum_i y_i / m$.

A. Approximation of a One-dimensional Function

For comparison we show the results with the iterative modified SVR and with the traditional SVR. The solid lines represent the original function and the dashed lines show the approximations. Fig. 1 shows the results for the approximation of the function

$$f(x) = \sin x + \frac{\sin 3x}{3} - \sin\frac{x}{2}, \quad x \in [0, 2\pi].$$

Approximation results for three different values of $h$ and $\tau$ are given in Table I, where the parameters of the traditional SVR and its testing error are $\sigma = 0.1$, $\varepsilon = 0.1$, $C = 0.05$, and $\delta = 0.895$ respectively. The figure of merit for each approximation is computed on a uniformly sampled test set of 16 points. The input $x$ is drawn from the uniform distribution on the interval $[0, 2\pi]$; the training and test sets consist of 63 points and 16 points respectively.

Fig. 2 shows another approximation of the same function, where the parameters of the traditional SVR and its testing error are $\sigma = 0.1$, $\varepsilon = 0.1$, $C = 0.1$, and $\delta = 0.790$ respectively. This shows that the modified method possesses better generalization performance than the traditional SVR.
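The following sketch chains Steps 1-4 on the one-dimensional example above, reusing `rbf_kernel` and `conformal_factor` from the Section 3 sketch. It is a minimal illustration, not the authors' code: scikit-learn's precomputed-kernel SVR stands in for the unspecified QP solver, the figure of merit is the reconstructed Eq. (19), and the random seed and stopping tolerance `tol` are assumptions (the paper leaves the error threshold free).

```python
import numpy as np
from sklearn.svm import SVR

def figure_of_merit(y_true, y_pred):
    # Eq. (19): residual sum of squares over total sum of squares about the mean.
    return ((y_true - y_pred) ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum()

def iterative_modified_svr(X, y, sigma, eps, C, h, tau, tol, max_iter=10):
    # Steps 1-4 of Section 4: train on k_j, then rescale the Gram matrix
    # by the conformal factor until delta <= tol.
    c = conformal_factor(X, X, h, tau)   # c(x_i), centers = whole training set
    K = rbf_kernel(X, X, sigma)          # k_0: the primary RBF kernel
    for s in range(max_iter + 1):
        model = SVR(kernel="precomputed", C=C, epsilon=eps).fit(K, y)
        delta = figure_of_merit(y, model.predict(K))
        if delta <= tol:
            break
        K = np.outer(c, c) * K           # k_{j+1}(x, x') = c(x) c(x') k_j(x, x')
    return model, c, s

# One-dimensional example of Section V-A, as reconstructed above.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, size=(63, 1))
y = (np.sin(X) + np.sin(3.0 * X) / 3.0 - np.sin(X / 2.0)).ravel()
model, c, s = iterative_modified_svr(X, y, sigma=0.1, eps=0.1, C=0.05,
                                     h=0.1, tau=1.0, tol=0.15)
```

To forecast at new inputs one would evaluate the cross-kernel $c^s(x)\, c^s(x_i)\, k(x, x_i)$ against the training points; that step is omitted for brevity.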

TABLE I. APPROXIMATION RESULTS AND PARAMETERS OF THE SINGLE-VARIABLE FUNCTION

Parameters of the      Parameters of the    testing error of the   testing error of the    testing error of the
traditional SVR        modified SVR         first modified SVR     second modified SVR     tenth modified SVR
----------------------------------------------------------------------------------------------------------------
σ = 0.1, ε = 0.1,      h = 0.1,  τ = 1      0.5719                 0.1598                  0.1014
C = 0.05, δ = 0.895    h = 0.1,  τ = 0.8    0.681                  0.3491                  0.1014
                       h = 0.08, τ = 0.8    0.7841                 0.6099                  0.1439
σ = 0.1, ε = 0.1,      h = 0.8,  τ = 1      0.1044                 0.1038                  0.1038
C = 0.1,  δ = 0.790
----------------------------------------------------------------------------------------------------------------
(δ in the first column is the testing error of the traditional SVR.)

a. The approximation result with the traditional SVR.
b. The approximation results with the first and second modified SVR.
c. The approximation result with the tenth modified SVR.
Figure 1. Approximation results for the one-dimensional function.

Figure 2. Approximation results with the traditional and first modified SVR.

B. Approximation of a Multi-dimensional Function

Table II shows the approximation results for an earthquake wave, which is regarded as a time series. The training data set is $\{(x_i, y_i) \in R^3 \times R,\ i = 1, 2, \ldots, 50\}$, built with a historical lag of order 3 (a sketch of this embedding follows Table II). The figure of merit for each approximation is computed on a uniformly sampled test set of 50 points. Fig. 3 illustrates part of the original data together with its approximations by the traditional SVR and the modified SVR. From the comparison we can see that the modified method possesses better generalization performance than the traditional SVR.

TABLE II. APPROXIMATION RESULTS AND PARAMETERS OF THE MULTI-DIMENSIONAL FUNCTION

Parameters of the      Parameters of the    testing error of the   testing error of the    testing error of the
traditional SVR        modified SVR         traditional SVR        first modified SVR      second modified SVR
----------------------------------------------------------------------------------------------------------------
σ = 0.04, ε = 0.01,    h = 4, τ = 0.06      0.9753                 0.1931                  0.19
C = 2
----------------------------------------------------------------------------------------------------------------
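For reference, a lag embedding of order 3 can be built as below. This is a hypothetical sketch: the earthquake-wave series is not available here, so the file name and the 53-sample slice (yielding the paper's 50 training pairs) are placeholders.

```python
import numpy as np

def lag_embed(z, lag=3):
    # Turn a scalar series z into pairs (x_i, y_i) with
    # x_i = (z_i, z_{i+1}, z_{i+2}) in R^3 and y_i = z_{i+3}.
    X = np.stack([z[i:len(z) - lag + i] for i in range(lag)], axis=1)
    y = z[lag:]
    return X, y

z = np.loadtxt("earthquake_wave.txt")        # hypothetical data file
X_train, y_train = lag_embed(z[:53], lag=3)  # 50 training pairs, as in Table II
```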

Figure 3. Approximation results for the multi-dimensional function.

VI. CONCLUSION

In this paper we present an iterative modified kernel to improve the performance of SVR. It is based on the conformal mapping of information geometry, which results in a data-dependent kernel. The whole training data set, rather than only the support vectors, is used in the construction of the conformal mapping. The idea is to compress the spatial resolution so that forecasting performance is improved. Importantly, the kernel is modified by iteration, so the kernel parameters can be set to random values. Simulation results show the effectiveness and generalization ability of this method.

ACKNOWLEDGMENT

This work is supported by NSF Grant #50578168, Natural Science Foundation Project of CQ CSTC 2007BB396 and KJ070403.

REFERENCES

[1] Vapnik V, The Nature of Statistical Learning Theory, translated by Zhang Xue-gong. Beijing: Tsinghua University Press, 2000, pp. 9-136.
[2] Schölkopf B, Sung K, Burges C, "Comparing support vector machines with Gaussian kernels to radial basis function classifiers," IEEE Trans. Signal Processing, 1997, vol. 45, pp. 2758-2765.
[3] Perez-Cruz F, Navia-Vazquez A, Figueiras-Vidal A R, "Empirical risk minimization for support vector classifiers," IEEE Trans. on Neural Networks, 2003, vol. 14, pp. 296-303.
[4] Belhumeur P N, Hespanha J P, Kriegman D J, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1997, vol. 19, pp. 711-720.
[5] Cao Li-juan, Tay Francis E H, "Financial forecasting using support vector machines," Neural Computing & Applications, 2001, vol. 10, pp. 184-192.
[6] Tay Francis E H, Cao Li-juan, "ε-Descending support vector machines for financial time series forecasting," Neural Processing Letters, 2002, vol. 15, pp. 179-195.
[7] Yang Hai-qin, Chan Lai-wan, King Irwin, "Support vector machine regression for volatile stock market prediction," Proceedings of Intelligent Data Engineering and Automated Learning. Berlin: Springer-Verlag, 2002, pp. 391-396.
[8] Smola A J, Schölkopf B, Müller K R, "The connection between regularization operators and support vector kernels," Neural Networks, 1998, vol. 11, pp. 637-649.
[9] Amari S, Wu S, "Improving support vector machine classifiers by modifying kernel functions," Neural Networks, 1999, vol. 12, pp. 783-789.
[10] Liang Yan-chun, Sun Yan-feng, "An improved method of support vector machine and its application to financial time series forecasting," Progress in Natural Science, 2003, vol. 13, pp. 696-700.
[11] Campbell C, "Kernel methods: a survey of current techniques," Neurocomputing, 2002, vol. 48, pp. 63-84.
[12] Amari S, Wu S, "An information-geometrical method for improving the performance of support vector machine classifiers," Proceedings of the International Conference on Artificial Neural Networks '99. London: IEE Conference Publication No. 470, 1999, pp. 85-90.