FORECASTING EXCHANGE RATE USING SUPPORT VECTOR MACHINES


Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 18-21 August 2005

DING-ZHOU CAO, SU-LIN PANG, YUAN-HUAI BAI
Department of Mathematics, Jinan University, Guangzhou, Guangdong 510632, China
E-mail: qucbrd895@163.com, pangsulin@163.com

Abstract:
Support vector machines (SVMs) have recently become a focus of research worldwide. Support vector regression (SVR), which is used to solve regression problems, has the advantages of globally optimal solutions, solutions that avoid overtraining, and so on. This paper establishes an exchange rate prediction model based on support vector machines, collects daily USD/GBP exchange rate data, uses these data to train the model, and checks the model's predictive power. The results show that the SVM model has some predictive power and can be used to forecast financial time series. In addition, this article also discusses the issue of finding the optimal parameters of the SVM and carries out many experiments to find them.

Keywords:
Exchange rate forecasting; SVM; SVR; Time series prediction

1. Introduction

In recent years, various kinds of neural network algorithms, especially the BP algorithm, have become the most popular tools for exchange rate modeling and forecasting. There have been many studies in this field: Jingtao Yao and Chew Lim Tan [3] used BP networks to predict several kinds of long-term exchange rates, and Francesco Lisi and Rosa Schiavo [6] proposed a comparison between neural networks and chaotic models for exchange rate prediction. However, some of these studies showed that ANNs have limitations of their own: they suffer from the overtraining problem because they implement the empirical risk minimization principle, they can fall into locally optimal solutions, and they require the selection of a large number of controlling parameters, which is very difficult to do. All of these issues have greatly limited their further application. Recently, support vector machines (SVMs), developed by Vapnik [1] and his colleagues in the mid-1990s, have become a hot research field worldwide.
It is a novel algorithm based on statistical learning theory and the structural risk minimization principle; compared with an ANN, its solution may be the global optimum, it can select the model automatically, and it does not have the overfitting problem. After several years of development, SVM has been successfully applied in fields such as pattern recognition and function regression. Nowadays some scholars have begun to apply it to financial time series forecasting: Kyoung-jae Kim [2] applied SVM to predicting the stock price index of South Korea, and the hit rate reached 58%, while other authors [8] used SVM to predict five kinds of exchange rates including GBP/USD, but their purpose was just to compare SVM with BP neural networks.

This paper establishes an exchange rate prediction model based on SVM learning theory, uses actual data to train the model, and makes one-step predictions. The results show that the tendencies of the predicted value curve are basically identical to those of the actual value curve, though it deviates to the right of the actual curve noticeably. In addition, since there is no structured way to choose the optimal parameters of an SVM, the variability of performance with respect to the parameters is also investigated in this article.

2. The theory of support vector machines

2.1. Support vector machines

Briefly, the task of SVM is just to map the input space into a high-dimensional feature space by implementing nonlinear transformations, and then find the optimal separating hyperplane in the feature space.

[Figure 1. The optimal separating hyperplane]

© 2005 IEEE

The so-called optimal

separating hyperplane is a hyperplane that not only accurately separates all the training samples, but also maximizes the distance from the separating hyperplane to the training samples closest to it; this distance is called the margin. For example, consider the two-dimensional case shown in Fig. 1 [11]. The solid dots and hollow dots represent the two kinds of training samples respectively; H is the decision boundary, and H1 and H2 are two parallel lines; the distance between these two lines is the margin. According to a result from learning theory, of all possible decision functions, the one that maximizes the margin of the training set will minimize the generalization error; here that decision function is the decision boundary H.

We assume the function of H is (w·x) + b = 0. After normalization, the training set (x_i, y_i), i = 1, ..., n, x_i ∈ R^d, y_i ∈ {+1, -1}, satisfies the condition

    y_i[(w·x_i) + b] - 1 ≥ 0,  i = 1, ..., n.    (1)

At this point the margin is 2/||w||, so in order to maximize the margin one has to minimize ||w||. The w and b that minimize (1/2)||w||² while satisfying condition (1) define the optimal hyperplane, while the sample points lying on H1 and H2 are called support vectors.

2.2. Support vector machines in regression approximation

Given a set of data points {(x_i, y_i)}, i = 1, ..., l, x_i ∈ R^d, y_i ∈ R, the approximating function is:

    f(x) = (w·φ(x)) + b    (2)

where w is the weight vector, b is the threshold, and φ(·) is the nonlinear mapping function. By using φ(·), SVM maps the input space into a high-dimensional feature space; in the new space it constructs an optimal separating hyperplane that makes the data linearly separable. φ(·) need not be known explicitly: it can be replaced by a kernel function K(x_i, x_j), which is defined through the inner product in feature space and satisfies Mercer's condition.
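To make the margin concrete, here is a small numerical check of condition (1) and the margin 2/||w||. The 2-D points and the hyperplane below are invented for illustration; they are not data from the paper.

```python
import numpy as np

# Invented 2-D separable training set (the solid and hollow dots of Fig. 1)
# and an invented separating hyperplane w.x + b = 0 in canonical form.
X = np.array([[1.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = np.array([0.5, 0.5])
b = 0.0

# Condition (1): y_i * ((w . x_i) + b) - 1 >= 0 for every training sample;
# samples meeting it with equality lie on H1 and H2 (the support vectors).
assert np.all(y * (X @ w + b) - 1 >= -1e-12)

# The margin, i.e. the distance between H1 and H2, is 2 / ||w||.
margin = 2.0 / np.linalg.norm(w)
print(margin)  # 2*sqrt(2), roughly 2.83
```

Shrinking ||w|| while keeping condition (1) satisfied widens the margin, which is exactly the optimization problem stated above.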
The coefficients w and b are estimated by minimizing the regularized risk function:

    Minimize:  R = (1/2)||w||² + C·(1/l)·Σ_{i=1}^{l} L_ε(y_i, f(x_i))    (3)

where the ε-insensitive loss function is

    L_ε(y, f(x)) = |y - f(x)| - ε,  if |y - f(x)| ≥ ε;  0, otherwise.

In order to get the estimates of w and b, Eq. (3) is transformed into the following problem by introducing the slack variables ξ_i ≥ 0, ξ_i* ≥ 0:

    Minimize:  (1/2)||w||² + C·Σ_{i=1}^{l} (ξ_i + ξ_i*)    (4)
    s.t.  y_i - f(x_i) ≤ ε + ξ_i,
          f(x_i) - y_i ≤ ε + ξ_i*,
          ξ_i, ξ_i* ≥ 0.

Introducing Lagrange multipliers gives

    L(w, b, ξ, ξ*, a, a*, γ, γ*) = (1/2)(w·w) + C·Σ_{i=1}^{l} (ξ_i + ξ_i*)
        - Σ_{i=1}^{l} a_i[ξ_i + ε - y_i + f(x_i)] - Σ_{i=1}^{l} a_i*[ξ_i* + ε + y_i - f(x_i)]
        - Σ_{i=1}^{l} (γ_i·ξ_i + γ_i*·ξ_i*)    (5)

In Eq. (5), a_i and a_i* are Lagrange multipliers; they satisfy a_i, a_i* ≥ 0 and γ_i, γ_i* ≥ 0, i = 1, ..., l, and they are obtained by maximizing the dual function of Eq. (4), which has the following explicit form:

    W(a, a*) = Σ_{i=1}^{l} y_i(a_i - a_i*) - ε·Σ_{i=1}^{l} (a_i + a_i*)
               - (1/2)·Σ_{i=1}^{l} Σ_{j=1}^{l} (a_i - a_i*)(a_j - a_j*)(x_i·x_j)    (6)
    s.t.  Σ_{i=1}^{l} (a_i - a_i*) = 0,  0 ≤ a_i, a_i* ≤ C.

Then we can get the approximating function

f(x) as follows:

    f(x) = Σ_{i=1}^{l} (a_i - a_i*)·K(x_i, x) + b    (7)

2.3. Kernel function

K(x_i, x_j) is called the kernel function. The value of the kernel function equals the inner product of the two vectors x_i and x_j in the feature space, φ(x_i) and φ(x_j); that is, K(x_i, x_j) = φ(x_i)·φ(x_j). In the form of Eq. (7), an SVM resembles an ANN: the output is a linear combination of hidden neurons, and every hidden neuron corresponds to a support vector. There are many possible kernel functions, because any function satisfying Mercer's condition can be used as a kernel function, but only three are widely used.

The first is the polynomial kernel:

    K(x, x_i) = [(x·x_i) + 1]^q    (8)

The second is the Gaussian RBF kernel:

    K(x, x_i) = exp(-||x - x_i||² / σ²)    (9)

The third is the sigmoid (hyperbolic tangent) kernel:

    K(x, x_i) = tanh(v·(x·x_i) + c)    (10)

3. Data experiments

3.1. Research data

Given a time series {x_1, x_2, ..., x_n}, in order to make predictions on it using SVM, it must be transformed into an autocorrelated dataset; that is, if {x_t} is the goal value of the prediction, the previous values {x_{t-1}, x_{t-2}, ..., x_{t-p}} should be the correlated input variables. We then map the input variables (x_{t-1}, x_{t-2}, ..., x_{t-p}) to the goal variable y_t = x_t, which can be denoted as f: R^p → R; here p is called the embedding dimension. The choice of the embedding dimension must accord with the practical problem. After transforming the data in this way, we get samples suitable for SVM learning, denoted in matrix form:

    X = [ x_1      x_2        ...  x_p
          x_2      x_3        ...  x_{p+1}
          ...
          x_{n-p}  x_{n-p+1}  ...  x_{n-1} ],      Y = [ x_{p+1}
                                                         x_{p+2}
                                                         ...
                                                         x_n ]

Before using these samples to train the SVM, the original data are scaled into the range [0, 1]:

    x'(t) = (x(t) - X_min) / (X_max - X_min),  t = 1, 2, ..., n    (11)

The goal of linear scaling is to independently normalize each feature component to the specified range. It ensures that input attributes with larger values do not overwhelm those with smaller values, and thus helps to reduce prediction errors.
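As a concrete illustration of Eqs. (8)-(11) and the embedding step, the sketch below implements the three kernels, the lag-matrix construction, and the [0, 1] scaling. The sample rates are invented, and the helper names (embed, scale01, etc.) are ours, not from the paper or from LIBSVM.

```python
import numpy as np

def poly_kernel(x, xi, q=2):
    """Eq. (8): polynomial kernel [(x . xi) + 1]^q."""
    return (np.dot(x, xi) + 1.0) ** q

def rbf_kernel(x, xi, sigma2=100.0):
    """Eq. (9): Gaussian RBF kernel exp(-||x - xi||^2 / sigma^2)."""
    d = np.asarray(x, float) - np.asarray(xi, float)
    return np.exp(-np.dot(d, d) / sigma2)

def tanh_kernel(x, xi, v=1.0, c=0.0):
    """Eq. (10): sigmoid kernel tanh(v * (x . xi) + c)."""
    return np.tanh(v * np.dot(x, xi) + c)

def embed(series, p):
    """Lag-matrix form of a series: each row holds p consecutive values
    and the target is the value that immediately follows them."""
    s = np.asarray(series, float)
    X = np.array([s[t:t + p] for t in range(len(s) - p)])
    Y = s[p:]
    return X, Y

def scale01(series):
    """Eq. (11): linear scaling of the whole series into [0, 1]."""
    s = np.asarray(series, float)
    return (s - s.min()) / (s.max() - s.min())

# Invented sample rates, just to show the shapes; p = 4 as chosen later.
rates = [1.58, 1.60, 1.59, 1.62, 1.61, 1.63, 1.65]
X, Y = embed(scale01(rates), p=4)
print(X.shape, Y.shape)  # (3, 4) (3,)
```

With n = 7 observations and embedding dimension p = 4, the lag matrix has n - p = 3 rows, matching the X and Y layout above.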
The prediction performance is evaluated using the MSE (mean squared error):

    MSE = (1/N)·Σ_{i=1}^{N} (x_i - y_i)²    (12)

where x_i and y_i are the actual value and the predicted value of the time series, respectively. As for the kernel function, here we choose the Gaussian RBF kernel, because previous work [2] shows that when using SVR to make predictions, the Gaussian RBF kernel performs better than other kernel functions. In this paper, we use the LIBSVM software system [4] to perform our experiments.

3.2. Experiments and results

[Figure 2. The embedding dimension]

The data considered in this study were daily exchange rates of the British Pound against the American Dollar, from

January 2003 to January 2005, 521 daily exchange rates in total. Because we do not know the optimal embedding dimension p, after scaling the original data the first thing to do is to find this p. Fixing the other conditions, we ran preliminary experiments with p = {3, 4, 5, 6, 7, 8, 9}; the results are shown in Fig. 2. From Fig. 2 we know that the MSE is best when p = 4, so we consider 4-day-lagged daily exchange rates the most suitable for forecasting the next day's exchange rate.

After writing the data into the matrix form mentioned above, we get 517 sub time series. We divided these 517 data into 3 parts. The first part includes 350 sub time series, called the training set, which are used to train the SVM; the second part includes 100, called the validation set, used for finding the optimal parameters of the SVM; while the test set, composed of the remaining 67 data, is used to check the predictive power of the SVM. In our experiments, the kernel parameter σ², ε and C are selected based on the validation set. In the following, the MSE and the number of support vectors with respect to the three free parameters are investigated. Only the results for σ² are illustrated; the same procedure can be applied to the other two parameters.

[Figure 3. The MSE]

Fig. 3 gives the MSE of the SVM at various σ² in (0.1, 10000), with C and ε fixed at 100 and 0.001. The figure shows that for σ² in (0.1, 100) the MSE decreases as σ² increases, while for σ² in (100, 10000) it increases as σ² increases. This indicates that too small a value of σ² (in (0.1, 100)) or too large a value (in (100, 10000)) can cause the SVM to under-fit. An appropriate value for σ² would be between 10 and 100.

[Figure 4. The number of support vectors]

Figure 4 shows that the number of support vectors decreases first and then increases with σ², as most of the data points are converged to support vectors in the under-fitting cases.

In the end, summing up the analysis above and after several tests, we choose σ² = 100, C = 100, ε = 0.001 as the best choice for our experiment, use these parameters to train the model again, and then predict the test set. The final results are MSE = 0.00300396, and the number of support vectors is 9. Fig. 5 compares the predicted values with the actual values: the solid line presents the actual data and the dashed line presents the predicted data.

[Figure 5. The results]

From Fig. 5 we know that: (1) the tendencies of the predicted value curve are basically identical to those of the actual value curve; (2) though the predicted curve fits the actual curve quite well, it deviates to the right of the actual one noticeably.
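The validation-set parameter search described above can be sketched as follows. The paper uses LIBSVM directly; as a convenience we use scikit-learn's SVR here, which wraps the same libsvm, noting that its gamma parameter corresponds to 1/σ² in Eq. (9). The synthetic series, split sizes, and grid values are illustrative stand-ins, not the paper's data or exact grid.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic stand-in for the GBP/USD series (the real data are not included).
series = 1.6 + np.cumsum(rng.normal(0.0, 0.005, 300))

p = 4                                   # embedding dimension found above
X = np.array([series[t - p:t] for t in range(p, len(series))])
y = series[p:]

# Chronological split, mirroring the paper's train/validation/test scheme.
n_tr, n_va = 180, 60
X_tr, y_tr = X[:n_tr], y[:n_tr]
X_va, y_va = X[n_tr:n_tr + n_va], y[n_tr:n_tr + n_va]

best = None
for sigma2 in (0.1, 1.0, 10.0, 100.0, 1000.0):
    for C in (1.0, 10.0, 100.0):
        # sklearn's RBF kernel is exp(-gamma * ||x - xi||^2), so gamma = 1/sigma^2.
        model = SVR(kernel="rbf", gamma=1.0 / sigma2, C=C, epsilon=0.001)
        model.fit(X_tr, y_tr)
        val_mse = float(np.mean((model.predict(X_va) - y_va) ** 2))
        if best is None or val_mse < best["mse"]:
            best = {"mse": val_mse, "sigma2": sigma2, "C": C}

print(best)  # the (sigma^2, C) pair with the lowest validation MSE
```

With LIBSVM itself, the same grid would be run through its epsilon-SVR mode using the -g (gamma), -c (cost) and -p (epsilon) options.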

4. Conclusions

This paper mainly introduces the theory of support vector machines, uses the closing price of the GBP/USD exchange rate, establishes a prediction model based on SVM, and uses this model to make predictions on the exchange rate. The time span of the research data is from January 2003 to January 2005 (521 data in total); after transforming them into the matrix form suitable for SVM learning, 517 sub time series are left. Among them, the first 350 sub time series serve as the training set, used to train the SVM; the next 100 are the validation set, for finding the optimal parameters of the SVM; while the remaining 67 data are the test set, used to check the predictive power of the SVM.

The research results show that the SVM model has some predictive power; it can be used to forecast the tendencies of the exchange rate very well. On the other hand, though the predicted curve fits the actual curve quite well, it deviates to the right of the actual one noticeably, and resolving this is the main task for the authors' further research work. In this article we also investigate the setting of the parameters of the SVM and find that these parameters play an important role in the performance of the SVM; improper settings can produce greatly different outputs.

Acknowledgements

The research is supported by the Natural Science Foundation of Guangdong Province (3906), the Key Programs of the Science and Technology Bureau of Guangzhou (2004Z3-D03) and the Key Programs of the Science and Technology Department of Guangdong Province (2004B00033).

References

[1] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[2] Kyoung-jae Kim, Financial time series forecasting using support vector machines, Neurocomputing 55 (2003) 307-319.
[3] Jingtao Yao, Chew Lim Tan, A case study on using neural networks to perform technical forecasting of forex, Neurocomputing 34 (2000) 79-98.
[4] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, Technical Report, Department of Computer Science and Information Engineering, National Taiwan University, 2001.
[5] Nello Cristianini, John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Publishing House of Electronics Industry.
[6] Francesco Lisi, Rosa A. Schiavo, A comparison between neural networks and chaotic models for exchange rate prediction, Computational Statistics & Data Analysis 30 (1999) 87-102.
[7] V.M. Rivas, J.J. Merelo, P.A. Castillo, M.G. Arenas, J.G. Castellano, Evolving RBF neural networks for time-series forecasting with EvRBF, Information Sciences 165 (2004) 207-220.
[8] Francis E.H. Tay, Lijuan Cao, Application of support vector machines in financial time series forecasting, Omega 29 (2001) 309-317.
[9] Francis E.H. Tay, Lijuan Cao, Improved financial time series forecasting by combining support vector machines with self-organizing feature map, Intelligent Data Analysis 5 (2001) 339-354, IOS Press.
[10] Mona R. El Shazly, Hassan E. El Shazly, Comparing the forecasting performance of neural networks and forward exchange rates, Journal of Multinational Financial Management 7 (1997) 345-356.
[11] Zhang Xuegong, Introduction to statistical learning theory and support vector machines, Acta Automatica Sinica 26 (2000).