FORECASTING EXCHANGE RATE USING SUPPORT VECTOR MACHINES


Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 18-21 August 2005

DING-ZHOU CAO, SU-LIN PANG, YUAN-HUAI BAI
Department of Mathematics, Jinan University, Guangzhou, Guangdong 510632, China
E-mail: qucbrd895@163.com, pangsulin@163.com

Abstract:
Support vector machines (SVMs) have recently become a focus of research worldwide. Support vector regression (SVR), which is used to solve regression problems, has the advantages of globally optimal solutions, solutions that avoid overtraining, and so on. This paper establishes an exchange rate prediction model based on support vector machines, collects daily USD/GBP exchange rate data, uses these data to train the model, and checks the model's predictive power. The results show that the SVM model has some predictive power and can be used to forecast financial time series. In addition, this article also discusses the issue of finding the optimal parameters of the SVM and carries out many experiments to find them.

Keywords:
Exchange rate forecasting; SVM; SVR; Time series prediction

1. Introduction

In recent years, various kinds of neural network algorithms, especially the BP algorithm, have become the most popular tools for exchange rate modeling and forecasting. There have been many studies in this field: Jingtao Yao and Chew Lim Tan [3] used BP networks to predict several kinds of long-term exchange rates, and Francesco Lisi and Rosa Schiavo [6] proposed a comparison between neural networks and chaotic models for exchange rate prediction. However, some of these studies showed that ANNs have limitations of their own: they suffer from the overtraining problem because they implement the empirical risk minimization principle, they can fall into locally optimal solutions, and they require the selection of a large number of controlling parameters, which is very difficult to do. All of these issues have greatly limited their further application. Recently, support vector machines (SVMs), developed by Vapnik [1] and his colleagues in the mid-1990s, have become a hot research field worldwide.
It is a novel algorithm based on statistical learning theory and the structural risk minimization principle; compared with an ANN, its solution may be the global optimum, it can select the model automatically, and it does not have the overfitting problem. After several years of development, SVM has been successfully applied in fields such as pattern recognition and function regression. Nowadays some scholars have begun to apply it to financial time series forecasting: Kyoung-jae Kim [2] applied SVM to predicting the stock price index of South Korea, and the hit rate reached 58%, while other authors [8] used SVM to predict five kinds of exchange rates including GBP/USD, but their purpose was just to compare SVM with BP neural networks.

This paper establishes an exchange rate prediction model based on SVM learning theory, uses actual data to train the model, and makes one-step predictions. The results show that the tendencies of the predicted value curve are basically identical to those of the actual value curve, though it deviates to the right of the actual curve noticeably. In addition, since there is no structured way to choose the optimal parameters of an SVM, the variability of performance with respect to the parameters is also investigated in this article.

2. The theory of support vector machines

2.1. Support vector machines

Briefly, the task of SVM is just to map the input space into a high-dimensional feature space by implementing nonlinear transformations, and then find the optimal separating hyperplane in the feature space.

[Figure 1. The optimal separating hyperplane]

© 2005 IEEE

The so-called optimal

separating hyperplane is a hyperplane that not only accurately separates all the training samples, but also maximizes the distance from the separating hyperplane to the training samples closest to it; this distance is called the margin. For example, consider the two-dimensional case shown in Fig. 1 [11]. The solid dots and hollow dots represent the two kinds of training samples respectively; H is the decision boundary, and H1 and H2 are two parallel lines; the distance between these two lines is the margin. According to a result from learning theory, of all possible decision functions, the one that maximizes the margin of the training set will minimize the generalization error; here that decision function is the decision boundary H.

We assume the function of H is (w·x) + b = 0. After normalization, the training set (x_i, y_i), i = 1, ..., n, x_i ∈ R^d, y_i ∈ {+1, -1}, satisfies the condition

    y_i[(w·x_i) + b] - 1 ≥ 0,  i = 1, ..., n.    (1)

At this point the margin is 2/||w||, so in order to maximize the margin one has to minimize ||w||. The w and b that minimize (1/2)||w||² while satisfying condition (1) define the optimal hyperplane, while the sample points lying on H1 and H2 are called support vectors.

2.2. Support vector machines in regression approximation

Given a set of data points {(x_i, y_i)}, i = 1, ..., l, x_i ∈ R^d, y_i ∈ R, the approximating function is:

    f(x) = (w·φ(x)) + b    (2)

where w is the weight vector, b is the threshold, and φ(·) is the nonlinear mapping function. By using φ(·), SVM maps the input space into a high-dimensional feature space; in the new space it constructs an optimal separating hyperplane that makes the data linearly separable. φ(·) need not be known explicitly: it can be replaced by a kernel function K(x_i, x_j), which is defined through the inner product in feature space and satisfies Mercer's condition.
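To make the margin concrete, here is a small numerical check of condition (1) and the margin 2/||w||. The 2-D points and the hyperplane below are invented for illustration; they are not data from the paper.

```python
import numpy as np

# Invented 2-D separable training set (the solid and hollow dots of Fig. 1)
# and an invented separating hyperplane w.x + b = 0 in canonical form.
X = np.array([[1.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([0.5, 0.5])
b = 0.0

# Condition (1): y_i * ((w . x_i) + b) - 1 >= 0 for every training sample;
# samples meeting it with equality lie on H1 and H2 (the support vectors).
assert np.all(y * (X @ w + b) - 1 >= -1e-12)

# The margin, i.e. the distance between H1 and H2, is 2 / ||w||.
margin = 2.0 / np.linalg.norm(w)
print(margin)  # 2*sqrt(2), roughly 2.83
```

Shrinking ||w|| while keeping condition (1) satisfied widens the margin, which is exactly the optimization problem stated above.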
The coefficients w and b are estimated by minimizing the regularized risk function:

    Minimize:  R = (1/2)||w||² + C·(1/l)·Σ_{i=1}^{l} L_ε(y_i, f(x_i))    (3)

where the ε-insensitive loss function is

    L_ε(y, f(x)) = |y - f(x)| - ε,  if |y - f(x)| ≥ ε;  0, otherwise.

In order to get the estimates of w and b, Eq. (3) is transformed into the following problem by introducing the slack variables ξ_i ≥ 0, ξ_i* ≥ 0:

    Minimize:  (1/2)||w||² + C·Σ_{i=1}^{l} (ξ_i + ξ_i*)    (4)
    s.t.  y_i - f(x_i) ≤ ε + ξ_i,
          f(x_i) - y_i ≤ ε + ξ_i*,
          ξ_i, ξ_i* ≥ 0.

Introducing Lagrange multipliers gives

    L(w, b, ξ, ξ*, a, a*, γ, γ*) = (1/2)(w·w) + C·Σ_{i=1}^{l} (ξ_i + ξ_i*)
        - Σ_{i=1}^{l} a_i[ξ_i + ε - y_i + f(x_i)] - Σ_{i=1}^{l} a_i*[ξ_i* + ε + y_i - f(x_i)]
        - Σ_{i=1}^{l} (γ_i·ξ_i + γ_i*·ξ_i*)    (5)

In Eq. (5), a_i and a_i* are Lagrange multipliers; they satisfy a_i, a_i* ≥ 0 and γ_i, γ_i* ≥ 0, i = 1, ..., l, and they are obtained by maximizing the dual function of Eq. (4), which has the following explicit form:

    W(a, a*) = Σ_{i=1}^{l} y_i(a_i - a_i*) - ε·Σ_{i=1}^{l} (a_i + a_i*)
               - (1/2)·Σ_{i=1}^{l} Σ_{j=1}^{l} (a_i - a_i*)(a_j - a_j*)(x_i·x_j)    (6)
    s.t.  Σ_{i=1}^{l} (a_i - a_i*) = 0,  0 ≤ a_i, a_i* ≤ C.

Then we can get the approximating function

f(x) as follows:

    f(x) = Σ_{i=1}^{l} (a_i - a_i*)·K(x_i, x) + b    (7)

2.3. Kernel function

K(x_i, x_j) is called the kernel function. The value of the kernel function equals the inner product of the two vectors x_i and x_j in the feature space, φ(x_i) and φ(x_j); that is, K(x_i, x_j) = φ(x_i)·φ(x_j). In the form of Eq. (7), an SVM resembles an ANN: the output is a linear combination of hidden neurons, and every hidden neuron corresponds to a support vector. There are many possible kernel functions, because any function satisfying Mercer's condition can be used as a kernel function, but only three are widely used.

The first is the polynomial kernel:

    K(x, x_i) = [(x·x_i) + 1]^q    (8)

The second is the Gaussian RBF kernel:

    K(x, x_i) = exp(-||x - x_i||² / σ²)    (9)

The third is the sigmoid (hyperbolic tangent) kernel:

    K(x, x_i) = tanh(v·(x·x_i) + c)    (10)

3. Data experiments

3.1. Research data

Given a time series {x_1, x_2, ..., x_n}, in order to make predictions on it using SVM, it must be transformed into an autocorrelated dataset; that is, if {x_t} is the goal value of the prediction, the previous values {x_{t-1}, x_{t-2}, ..., x_{t-p}} should be the correlated input variables. We then map the input variables (x_{t-1}, x_{t-2}, ..., x_{t-p}) to the goal variable y_t = x_t, which can be denoted as f: R^p → R; here p is called the embedding dimension. The choice of the embedding dimension must accord with the practical problem. After transforming the data in this way, we get samples suitable for SVM learning, denoted in matrix form:

    X = [ x_1      x_2        ...  x_p
          x_2      x_3        ...  x_{p+1}
          ...
          x_{n-p}  x_{n-p+1}  ...  x_{n-1} ],      Y = [ x_{p+1}
                                                         x_{p+2}
                                                         ...
                                                         x_n ]

Before using these samples to train the SVM, the original data are scaled into the range [0, 1]:

    x'(t) = (x(t) - X_min) / (X_max - X_min),  t = 1, 2, ..., n    (11)

The goal of linear scaling is to independently normalize each feature component to the specified range. It ensures that input attributes with larger values do not overwhelm those with smaller values, and thus helps to reduce prediction errors.
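As a concrete illustration of Eqs. (8)-(11) and the embedding step, the sketch below implements the three kernels, the lag-matrix construction, and the [0, 1] scaling. The sample rates are invented, and the helper names (embed, scale01, etc.) are ours, not from the paper or from LIBSVM.

```python
import numpy as np

def poly_kernel(x, xi, q=2):
    """Eq. (8): polynomial kernel [(x . xi) + 1]^q."""
    return (np.dot(x, xi) + 1.0) ** q

def rbf_kernel(x, xi, sigma2=100.0):
    """Eq. (9): Gaussian RBF kernel exp(-||x - xi||^2 / sigma^2)."""
    d = np.asarray(x, float) - np.asarray(xi, float)
    return np.exp(-np.dot(d, d) / sigma2)

def tanh_kernel(x, xi, v=1.0, c=0.0):
    """Eq. (10): sigmoid kernel tanh(v * (x . xi) + c)."""
    return np.tanh(v * np.dot(x, xi) + c)

def embed(series, p):
    """Lag-matrix form of a series: each row holds p consecutive values
    and the target is the value that immediately follows them."""
    s = np.asarray(series, float)
    X = np.array([s[t:t + p] for t in range(len(s) - p)])
    Y = s[p:]
    return X, Y

def scale01(series):
    """Eq. (11): linear scaling of the whole series into [0, 1]."""
    s = np.asarray(series, float)
    return (s - s.min()) / (s.max() - s.min())

# Invented sample rates, just to show the shapes; p = 4 as chosen later.
rates = [1.58, 1.60, 1.59, 1.62, 1.61, 1.63, 1.65]
X, Y = embed(scale01(rates), p=4)
print(X.shape, Y.shape)  # (3, 4) (3,)
```

With n = 7 observations and embedding dimension p = 4, the lag matrix has n - p = 3 rows, matching the X and Y layout above.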
The prediction performance is evaluated using the MSE (mean squared error):

    MSE = (1/N)·Σ_{i=1}^{N} (x_i - y_i)²    (12)

where x_i and y_i are the actual value and the predicted value of the time series, respectively. As for the kernel function, here we choose the Gaussian RBF kernel, because previous work [2] shows that when using SVR to make predictions, the Gaussian RBF kernel performs better than other kernel functions. In this paper, we use the LIBSVM software system [4] to perform our experiments.

3.2. Experiments and results

[Figure 2. The embedding dimension]

The data considered in this study were daily exchange rates of the British Pound against the American Dollar, from

January 2003 to January 2005, 521 daily exchange rates in total. Because we do not know the optimal embedding dimension p, after scaling the original data the first thing to do is to find this p. Fixing the other conditions, we ran preliminary experiments with p = {3, 4, 5, 6, 7, 8, 9}; the results are shown in Fig. 2. From Fig. 2 we know that the MSE is best when p = 4, so we consider 4-day-lagged daily exchange rates the most suitable for forecasting the next day's exchange rate.

After writing the data into the matrix form mentioned above, we get 517 sub time series. We divided these 517 data into 3 parts. The first part includes 350 sub time series, called the training set, which are used to train the SVM; the second part includes 100, called the validation set, used for finding the optimal parameters of the SVM; while the test set, composed of the remaining 67 data, is used to check the predictive power of the SVM. In our experiments, the kernel parameter σ², ε and C are selected based on the validation set. In the following, the MSE and the number of support vectors with respect to the three free parameters are investigated. Only the results for σ² are illustrated; the same procedure can be applied to the other two parameters.

[Figure 3. The MSE]

Fig. 3 gives the MSE of the SVM at various σ² in (0.1, 10000), with C and ε fixed at 100 and 0.001. The figure shows that for σ² in (0.1, 100) the MSE decreases as σ² increases, while for σ² in (100, 10000) it increases as σ² increases. This indicates that too small a value of σ² (in (0.1, 100)) or too large a value (in (100, 10000)) can cause the SVM to under-fit. An appropriate value for σ² would be between 10 and 100.

[Figure 4. The number of support vectors]

Figure 4 shows that the number of support vectors decreases first and then increases with σ², as most of the data points are converged to support vectors in the under-fitting cases.

In the end, summing up the analysis above and after several tests, we choose σ² = 100, C = 100, ε = 0.001 as the best choice for our experiment, use these parameters to train the model again, and then predict the test set. The final results are MSE = 0.00300396, and the number of support vectors is 9. Fig. 5 compares the predicted values with the actual values: the solid line presents the actual data and the dashed line presents the predicted data.

[Figure 5. The results]

From Fig. 5 we know that: (1) the tendencies of the predicted value curve are basically identical to those of the actual value curve; (2) though the predicted curve fits the actual curve quite well, it deviates to the right of the actual one noticeably.
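The validation-set parameter search described above can be sketched as follows. The paper uses LIBSVM directly; as a convenience we use scikit-learn's SVR here, which wraps the same libsvm, noting that its gamma parameter corresponds to 1/σ² in Eq. (9). The synthetic series, split sizes, and grid values are illustrative stand-ins, not the paper's data or exact grid.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic stand-in for the GBP/USD series (the real data are not included).
series = 1.6 + np.cumsum(rng.normal(0.0, 0.005, 300))

p = 4                                   # embedding dimension found above
X = np.array([series[t - p:t] for t in range(p, len(series))])
y = series[p:]

# Chronological split, mirroring the paper's train/validation/test scheme.
n_tr, n_va = 180, 60
X_tr, y_tr = X[:n_tr], y[:n_tr]
X_va, y_va = X[n_tr:n_tr + n_va], y[n_tr:n_tr + n_va]

best = None
for sigma2 in (0.1, 1.0, 10.0, 100.0, 1000.0):
    for C in (1.0, 10.0, 100.0):
        # sklearn's RBF kernel is exp(-gamma * ||x - xi||^2), so gamma = 1/sigma^2.
        model = SVR(kernel="rbf", gamma=1.0 / sigma2, C=C, epsilon=0.001)
        model.fit(X_tr, y_tr)
        val_mse = float(np.mean((model.predict(X_va) - y_va) ** 2))
        if best is None or val_mse < best["mse"]:
            best = {"mse": val_mse, "sigma2": sigma2, "C": C}

print(best)  # the (sigma^2, C) pair with the lowest validation MSE
```

With LIBSVM itself, the same grid would be run through its epsilon-SVR mode using the -g (gamma), -c (cost) and -p (epsilon) options.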

4. Conclusions

This paper mainly introduces the theory of support vector machines, uses the closing price of the GBP/USD exchange rate, establishes a prediction model based on SVM, and uses this model to make predictions on the exchange rate. The time span of the research data is from January 2003 to January 2005 (521 data in total); after transforming them into the matrix form suitable for SVM learning, 517 sub time series are left. Among them, the first 350 sub time series serve as the training set, used to train the SVM; the next 100 are the validation set, for finding the optimal parameters of the SVM; while the remaining 67 data are the test set, used to check the predictive power of the SVM.

The research results show that the SVM model has some predictive power; it can be used to forecast the tendencies of the exchange rate very well. On the other hand, though the predicted curve fits the actual curve quite well, it deviates to the right of the actual one noticeably, and resolving this is the main task for the authors' further research work. In this article we also investigate the setting of the parameters of the SVM and find that these parameters play an important role in the performance of the SVM; improper settings can produce greatly different outputs.

Acknowledgements

The research is supported by the Natural Science Foundation of Guangdong Province (3906), the Key Programs of the Science and Technology Bureau of Guangzhou (2004Z3-D03) and the Key Programs of the Science and Technology Department of Guangdong Province (2004B00033).

References

[1] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[2] Kyoung-jae Kim, Financial time series forecasting using support vector machines, Neurocomputing 55 (2003) 307-319.
[3] Jingtao Yao, Chew Lim Tan, A case study on using neural networks to perform technical forecasting of forex, Neurocomputing 34 (2000) 79-98.
[4] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, Technical Report, Department of Computer Science and Information Engineering, National Taiwan University, 2001.
[5] Nello Cristianini, John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Publishing House of Electronics Industry.
[6] Francesco Lisi, Rosa A. Schiavo, A comparison between neural networks and chaotic models for exchange rate prediction, Computational Statistics & Data Analysis 30 (1999) 87-102.
[7] V.M. Rivas, J.J. Merelo, P.A. Castillo, M.G. Arenas, J.G. Castellano, Evolving RBF neural networks for time-series forecasting with EvRBF, Information Sciences 165 (2004) 207-220.
[8] Francis E.H. Tay, Lijuan Cao, Application of support vector machines in financial time series forecasting, Omega 29 (2001) 309-317.
[9] Francis E.H. Tay, Lijuan Cao, Improved financial time series forecasting by combining support vector machines with self-organizing feature map, Intelligent Data Analysis 5 (2001) 339-354, IOS Press.
[10] Mona R. El Shazly, Hassan E. El Shazly, Comparing the forecasting performance of neural networks and forward exchange rates, Journal of Multinational Financial Management 7 (1997) 345-356.
[11] Zhang Xuegong, Introduction to statistical learning theory and support vector machines, Acta Automatica Sinica 26 (2000).