A Comparison on Neural Network Forecasting
2011 International Conference on Circuits, System and Simulation, IPCSIT vol. 7 (2011), IACSIT Press, Singapore

Hong-Choon Ong 1,a and Shin-Yue Chan 1,b
1 School of Mathematical Sciences, Universiti Sains Malaysia, USM, Penang, Malaysia.
a hcong@cs.usm.my, b shinyue_alu@yahoo.com

Abstract. This study compares the effectiveness of the Box-Jenkins model and neural network models in making a forecast. An eighteen-year bimonthly water consumption data set for Penang is analyzed. Multilayer perceptron (MLP) neural networks with a single hidden layer and with double hidden layers, trained with the error back-propagation algorithm, are used. Four double-hidden-layer MLP programs, namely the original data (O), the deseasonalized model using the smoothing moving average method (DS), the linear detrended model (DTL) and the both deseasonalized and detrended model (DSTL), together with a single-hidden-layer MLP on the deseasonalized and detrended data (DSTL), were simulated. For the time series analysis, a Box-Jenkins SARMA model is generated. The performance of the models is measured using three types of error measurement: the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean squared error (RMSE). The model with the smallest MAE, MAPE and RMSE stands out as the best model for predicting the water consumption in Penang for the year 2008. The results showed that both the SARMA and the double-hidden-layer MLP models perform relatively well. Furthermore, the double-hidden-layer MLP model shows an improvement in prediction compared with the single-hidden-layer MLP model.

Keywords: neural network; multi-layer perceptron; Box-Jenkins SARMA model.

1. Introduction

A neural network has the ability to represent both linear and non-linear relationships, and the network can learn these relationships directly from the raw data. The multi-layered perceptron (MLP) model is the most common neural network model and is a feed-forward model (Looney, 1997). Each node in one layer connects with a certain weight to every node in the following layer.
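The four data treatments compared later (O, DS, DTL, DSTL) rest on two preprocessing operations named in the abstract: deseasonalizing and linear detrending. A minimal sketch of both, assuming a simple seasonal-index deseasonalizer in place of the authors' smoothing moving average (the period of 6 reflects the bimonthly data; the exact smoother the authors used is not reproduced here):

```python
import numpy as np

def deseasonalize(y, period=6):
    # Subtract the mean of each seasonal position (one index per
    # bimonthly slot) - a stand-in for the paper's smoothing
    # moving-average deseasonalizing step (DS).
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y)) % period
    seasonal = np.array([y[idx == s].mean() for s in range(period)])
    return y - seasonal[idx], seasonal

def detrend_linear(y):
    # Fit a straight line over time and remove it (the DTL step).
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept), (slope, intercept)
```

Applying `detrend_linear` to the output of `deseasonalize` gives the combined DSTL treatment.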
In general, more hidden layers can prevent the necessity of using unnecessary neurons to achieve highly non-linear classification. The error back-propagation algorithm adjusts the weights to obtain more accurate forecast values.

A time series is a sequence of data points (observations), typically collected at successive times spaced at often uniform time intervals. Data collected inconsistently or only once may not be considered a time series. Time series analysis consists of methods that attempt to understand a specific time series, identifying the underlying context of the observations and fitting a suitable model to make predictions. Time series forecasting means the use of a model to predict future events based on past data, that is, to forecast future data points before they are measured. Time series can be applied in various fields, such as economic forecasting, sales forecasting, stock market analysis, process and quality control, inventory forecasting, production planning and many more (Chatfield, 2004).

2. A Brief Literature Review

In recent years, artificial neural networks have been widely used as a tool to model many areas of engineering applications. Sarıdemir (2008) presented artificial neural network (ANN) models for predicting the compressive strength of concretes containing metakaolin and silica fume at the ages of 1, 3, 7, 28, 56, 90 and 180 days. The multi-layered feed-forward neural network models were fed with these parameters to predict the compressive strength values of concretes containing metakaolin and silica fume. The results of this study showed that neural networks have strong potential for prediction, which proved that artificial neural networks are capable of learning and generalizing from examples and
experiences. Ong et al. (2008) gave a functional approximation comparison between neural networks and polynomial regression. The results showed that approximation using polynomial regression generates a lower fraction of variance unexplained (FVU) value than approximation using single-hidden-layer multi-layered perceptron (MLP) and double-hidden-layer MLP neural networks, except for the complicated functions. Moreover, the double-hidden-layer MLP showed better estimation results than the single-hidden-layer MLP without considering the number of parameters being used.

On the other hand, statistical models have commonly been used in time series data analysis and forecasting. Zou and Yang (2003) proposed combining time series models for forecasting to obtain a better performance. Lam et al. (2008) used an autoregressive integrated moving average (ARIMA) model to measure the intervention effects and the asymptotic change in the simulation results of business process reengineering based on activity model analysis. A case study of a purchasing process of a household appliance manufacturing enterprise involving 20 purchasing activities was used as the training data. The results indicated that the changes can be explicitly quantified and the effects of business process reengineering can be measured. In general, many time series are asymptotically unstable and intrinsically non-stationary. Box-Jenkins models solve these problems by imposing transformations on the data, such as differencing and logarithms. Grillenzoni (1998) discussed a method for modeling time series with unstable roots and changing parameters. A method of adaptive forecasting based on the optimization of recursive estimators was applied to well-known data sets. He demonstrated its validity in several implementations and for different model structures.

3. Methodology

3.1. Neural network multi-layered perceptron

The multi-layered perceptron is the most common neural network model. Its abilities in achieving highly non-linear classification and noise tolerance make it suitable for this study.
The bipolar sigmoid function is applied in this study. In a single-hidden-layer model, the perceptron computes associated outputs from multiple real-valued inputs by forming a linear combination according to its input weights. The data set is treated as the inputs and denoted as {x_i}, where i = 1, ..., n and n is the number of data points, and has an associated set of exemplar output target vectors. A random initial synaptic weight set for the hidden and output layers is selected. Finally, the current weights are re-iterated and updated using the error back-propagation algorithm to force each of the input exemplar feature vectors to be mapped closer to the target vector that identifies a class in the input space. In the double-hidden-layer MLP, the perceptron also computes associated outputs from multiple real-valued inputs by forming a linear combination according to its input weights, and the data set is treated as the inputs. The main difference is that the synaptic weight set has an additional second hidden layer. Error back-propagation is used to update the network's modifiable weights with respect to the error computed in the network. The error value, which is the difference between the output value and the desired value, is computed. The errors are then propagated back into the network using a gradient descent technique so that the weight adjustments can be made.

3.2. Design of Box-Jenkins model

There are three primary stages in building a Box-Jenkins time series model: model identification, model estimation and model diagnostics.

1) Model identification: There is a need to determine if the series is stationary, has a trend component and if there is any sign of seasonality that needs to be modeled. Stationarity, trend and seasonality can be detected by plotting the time series graph. If a trend component is present in the data, the graph will not be in a random form. Stationarity can also be checked via the sample auto-correlation function (ACF). In a stationary series, the auto-covariance function γ_k and the auto-correlation function ρ_k show the following properties:

1. |γ_k| ≤ γ_0.
2. ρ_0 = 1 and -1 < ρ_k < 1.
3. γ_k = γ_-k and ρ_k = ρ_-k for all k.
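The properties above can be checked numerically. A small sketch of the sample auto-covariance γ_k and auto-correlation ρ_k = γ_k / γ_0 (this is the standard biased estimator, divided by n, not a formula taken from the paper):

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample auto-covariance gamma_k and auto-correlation rho_k for k = 0..max_lag."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    # gamma_k = (1/n) * sum_{t=1}^{n-k} (y_t - ybar)(y_{t+k} - ybar)
    gamma = np.array([np.sum((y[:n - k] - ybar) * (y[k:] - ybar)) / n
                      for k in range(max_lag + 1)])
    rho = gamma / gamma[0]
    return gamma, rho
```

For this estimator, ρ_0 = 1 and |γ_k| ≤ γ_0 hold by construction (properties 1 and 2), and the symmetry of property 3 is implicit, since the estimator at lag -k equals the one at lag k.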
Seasonality can be assessed from the sample auto-correlation function (ACF) at various lags and also the sample partial auto-correlation function (PACF) at various lags of the process. The behavior of the ACF and PACF is shown in Table 1. After stationarity and seasonality have been addressed, the next step is to identify the order (the p and q) of the autoregressive and moving average terms. The behavior of the ACF and PACF will indicate the
order form of the autoregressive and moving average terms.

TABLE 1. BEHAVIOR OF ACF AND PACF

          AR(p)                   MA(q)                   ARMA(p,q)
  ACF     Tails off               Cuts off after lag q    Tails off
  PACF    Cuts off after lag p    Tails off               Tails off

2) Model estimation: Parameter estimation for the Box-Jenkins models is a quite complicated nonlinear estimation problem. Therefore, a high-quality software program that fits Box-Jenkins models should be used to estimate the parameters (Anderson, 1994). Consider a stationary time series Y_t where {ε_t} are independently and identically distributed as N(0, σ_ε²); an ARMA(p,q) model with unknown parameters φ = {φ_1, φ_2, ..., φ_p}, θ = {θ_1, θ_2, ..., θ_q} and σ_ε² = E(ε_t²) is given as:

Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + ε_t - θ_1 ε_{t-1} - θ_2 ε_{t-2} - ... - θ_q ε_{t-q}

The main approaches to fitting Box-Jenkins models are maximum likelihood estimation and least squares estimation.

3) Model diagnostics: The ACF plot and PACF plot of the estimated residuals {ε_t} can indicate whether the residuals are independent and normally distributed. If the ACF plot and PACF plot show points that are distributed randomly and do not display any pattern, this indicates that the residuals are independent and normally distributed. However, if these assumptions are not satisfied, a more appropriate model needs to be fitted. This means that the model identification step has to be redone to develop a better model that satisfies the diagnostic tests. Although a good model may not be obtained on the first trial, a new and better model can easily be derived (Wei, 1990). The correlogram of the residuals and the correlogram of the squared residuals are sufficient for determining the status of white noise. The Q-statistic is useful for testing the normality of the residuals within a range of an independent variable. In other words, it serves as a tool to test whether the series is white noise. The last two columns in the correlogram represent the Ljung-Box Q-statistics and their p-values respectively. The Q-statistic at lag k is a test statistic for the null hypothesis that there is no auto-correlation up to order k.
It is computed as follows:

Q = T(T + 2) Σ_{j=1}^{k} r_j² / (T - j)

where r_j represents the j-th auto-correlation and T represents the number of observations. If the p-value is less than 0.05, then it is significant; the null hypothesis is rejected, and the error terms are correlated. If the p-values are greater than or equal to 0.05 at all lags, then they are not significant; the null hypothesis is not rejected, and the residuals are not correlated (Harvey, 2001).

3.3. Criteria comparison by error measurement

Three types of error measurement are used in this study: the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean squared error (RMSE).

Mean Absolute Error (MAE): the average of the absolute values of the forecasting errors,

MAE = (1/n) Σ_{i=1}^{n} |Y_i - Ŷ_i|

where i = 1, 2, ..., n, Y_i is the original data and Ŷ_i is the forecast value.

Mean Absolute Percentage Error (MAPE): the mean ratio of the absolute error to the original data, expressed as a percentage,
MAPE = (100/n) Σ_{i=1}^{n} |(Y_i - Ŷ_i) / Y_i|

where i = 1, 2, ..., n and Y_i is the original data.

Root Mean Squared Error (RMSE): the square root of the average of the sum of the squared forecasting errors,

RMSE = sqrt( (1/n) Σ_{i=1}^{n} (Y_i - Ŷ_i)² )

where i = 1, 2, ..., n and Y_i is the original data.

4. Results and Discussion

The total of 108 data points collected is divided into two groups, namely training data and testing data. The training data, which consists of the first 102 data points, is used to forecast the next 6 values. The last 6 data points are used as the testing set, where the main purpose is to make a comparison with the forecast values generated from the training set in order to calculate the errors. The model with the smallest forecast error is the best fitted forecast model. Finally, the next six points are forecasted using the selected best model. The neural network models are written in the C++ programming language, while the Box-Jenkins model was evaluated in MINITAB 14 and E-VIEWS. Table 2 shows the target values, the forecast values from the SARMA model and the forecast values from the neural network models.

TABLE 2. FORECAST VALUES FROM THE SARMA MODEL AND THE NEURAL NETWORK MODELS. (For each period t, the table lists the original data and the forecasts from the four double-hidden-layer neural network models O, DTL, DS and DSTL, the single-hidden-layer DSTL model and the SARMA model; the numerical values were not preserved in this transcription.)

4.1. Criteria comparison by error measurement

The error between the forecast value and the original value is compared between the estimated models. MAE, MAPE and RMSE are used as the error measurements. Table 3 shows the error measurements for the SARMA model, the four neural network models with double hidden layers and the one neural network model with a single hidden layer.

TABLE 3. ERROR MEASUREMENTS FOR THE SARMA MODEL AND THE NEURAL NETWORK MODELS. (The table lists MAE, MAPE and RMSE for the SARMA model, the double-hidden-layer models O, DTL, DS and DSTL, and the single-hidden-layer DSTL model; the numerical values were not preserved in this transcription.)

When comparisons were made, the both detrended and deseasonalized model (DSTL) with double hidden layers gives the smallest error in MAE and MAPE compared to the other models. However, the SARMA model
stands out as the better model if RMSE is compared. Since the errors are squared before they are averaged, the RMSE gives a relatively high weight to large errors. The MAE is very similar to the RMSE but is less sensitive to large forecast errors. Therefore, the double-hidden-layer DSTL model and the SARMA model each have their own good characteristics in generating the smallest errors during forecasting. The original model (O) with double hidden layers generates the highest MAE, MAPE and RMSE, followed by the deseasonalized model (DS) with double hidden layers, the detrended model (DTL) with double hidden layers and the both deseasonalized and detrended model (DSTL) with a single hidden layer. When the DSTL model with double hidden layers is compared with the DSTL model with a single hidden layer, the double-hidden-layer model outperforms the single-hidden-layer model.

4.2. Forecasting in water consumption

The SARMA model and the DSTL model with double hidden layers are the best models for forecasting the water consumption in Penang, and the next six data points are forecasted. These six points are the bimonthly water consumption in Penang from January 2008 to December 2008. Tables 4 and 5 show the six forecast values of bimonthly water consumption in Penang from January 2008 to December 2008 using the SARMA model and the DSTL double-hidden-layer multi-layered perceptron model respectively.

TABLE 4. FORECAST OF BIMONTHLY WATER CONSUMPTION IN PENANG USING THE SARMA MODEL

Bimonthly: Jan-Feb, Mar-Apr, May-Jun, Jul-Aug, Sept-Oct, Nov-Dec (forecast values not preserved in this transcription)

TABLE 5. FORECAST OF BIMONTHLY WATER CONSUMPTION IN PENANG USING DSTL

Bimonthly: Jan-Feb, Mar-Apr, May-Jun, Jul-Aug, Sept-Oct, Nov-Dec (forecast values not preserved in this transcription)

5. Conclusion

From the results generated, it is clearly shown that both the SARMA model and the DSTL model are the best models used. For the neural network models, it is evident that preprocessing helps to improve the forecast. Furthermore, it is also shown that the double-hidden-layer DSTL MLP is able to obtain a better result than the single-hidden-layer MLP.
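The three error measures used to rank the models can be sketched as follows, a minimal NumPy rendering of the formulas in Section 3.3:

```python
import numpy as np

def mae(y, yhat):
    # mean absolute error: (1/n) * sum |Y_i - Yhat_i|
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    # mean absolute percentage error (assumes no zero observations)
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(100.0 * np.mean(np.abs((y - yhat) / y)))

def rmse(y, yhat):
    # root mean squared error: squaring gives large errors extra weight
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.sqrt(np.mean((y - yhat) ** 2)))
```

Because RMSE squares the errors before averaging, two models can rank differently under MAE/MAPE and under RMSE, which is exactly the DSTL-versus-SARMA situation observed above.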
Acknowledgment

We would like to thank the Director of the Water Works Department of Penang, Malaysia, for providing us the data. This work was supported in part by the U.S.M. incentive grant no. 1001/PMATH/8184.

References

[1] T. W. Anderson, The Statistical Analysis of Time Series, United States: John Wiley and Sons, Inc., 1994.
[2] C. Chatfield, The Analysis of Time Series: An Introduction, 6th ed., Florida: CRC Press, 2004.
[3] C. Grillenzoni, Forecasting unstable and nonstationary time series, Journal of Forecasting, vol. 14(4), December 1998.
[4] A. C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, New York: Cambridge University Press, 2001.
[5] C. Y. Lam, W. H. Ip and C. W. Lau, A business process activity model and performance measurement using a time series ARIMA intervention analysis, Expert Systems with Applications, vol. 36(3), Part 2, April 2009.
[6] M. Sarıdemir, Prediction of compressive strength of concretes containing metakaolin and silica fume by artificial neural networks, Advances in Engineering Software, vol. 40(5).
[7] W. W. S. Wei, Time Series Analysis: Univariate and Multivariate Methods, California: Addison-Wesley Publishing Company, Inc., 1990.
[8] H. Zou and Y. H. Yang, Combining time series models for forecasting, Journal of Forecasting, vol. 20(1), January-March 2005.
More informationMulti-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks
Journal of Computer & Robotcs 8(), 05 47-56 47 Mult-Step-Ahead Predcton of Stoc Prce Usng a New Archtecture of Neural Networs Mohammad Taleb Motlagh *, Hamd Khaloozadeh Department of Systems and Control,
More informationA new Approach for Solving Linear Ordinary Differential Equations
, ISSN 974-57X (Onlne), ISSN 974-5718 (Prnt), Vol. ; Issue No. 1; Year 14, Copyrght 13-14 by CESER PUBLICATIONS A new Approach for Solvng Lnear Ordnary Dfferental Equatons Fawz Abdelwahd Department of
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationInternet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks
Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc
More informationSTAT 511 FINAL EXAM NAME Spring 2001
STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte
More informationChapter 8 Indicator Variables
Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n
More informationChapter 6. Supplemental Text Material. Run, i X i1 X i2 X i1 X i2 Response total (1) a b ab
Chapter 6. Supplemental Text Materal 6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.
More informationLearning Objectives for Chapter 11
Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method
More informationIII. Econometric Methodology Regression Analysis
Page Econ07 Appled Econometrcs Topc : An Overvew of Regresson Analyss (Studenmund, Chapter ) I. The Nature and Scope of Econometrcs. Lot s of defntons of econometrcs. Nobel Prze Commttee Paul Samuelson,
More informationLecture Notes on Linear Regression
Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume
More informationThe Expectation-Maximization Algorithm
The Expectaton-Maxmaton Algorthm Charles Elan elan@cs.ucsd.edu November 16, 2007 Ths chapter explans the EM algorthm at multple levels of generalty. Secton 1 gves the standard hgh-level verson of the algorthm.
More informationGeneralized Linear Methods
Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set
More informationApplication research on rough set -neural network in the fault diagnosis system of ball mill
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(4):834-838 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 Applcaton research on rough set -neural network n the
More informationHidden Markov Models & The Multivariate Gaussian (10/26/04)
CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models
More informationMultilayer Perceptron (MLP)
Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study
More informationBasic Business Statistics, 10/e
Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson
More informationT E C O L O T E R E S E A R C H, I N C.
T E C O L O T E R E S E A R C H, I N C. B rdg n g En g neern g a nd Econo mcs S nce 1973 THE MINIMUM-UNBIASED-PERCENTAGE ERROR (MUPE) METHOD IN CER DEVELOPMENT Thrd Jont Annual ISPA/SCEA Internatonal Conference
More informationStatistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation
Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear
More informationSTAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression
STAT 45 BIOSTATISTICS (Fall 26) Handout 5 Introducton to Logstc Regresson Ths handout covers materal found n Secton 3.7 of your text. You may also want to revew regresson technques n Chapter. In ths handout,
More informationMidterm Examination. Regression and Forecasting Models
IOMS Department Regresson and Forecastng Models Professor Wllam Greene Phone: 22.998.0876 Offce: KMC 7-90 Home page: people.stern.nyu.edu/wgreene Emal: wgreene@stern.nyu.edu Course web page: people.stern.nyu.edu/wgreene/regresson/outlne.htm
More informationChapter 14 Simple Linear Regression
Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng
More informationGlobal Sensitivity. Tuesday 20 th February, 2018
Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values
More informationLecture 23: Artificial neural networks
Lecture 23: Artfcal neural networks Broad feld that has developed over the past 20 to 30 years Confluence of statstcal mechancs, appled math, bology and computers Orgnal motvaton: mathematcal modelng of
More informationCorrelation and Regression. Correlation 9.1. Correlation. Chapter 9
Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,
More informationPop-Click Noise Detection Using Inter-Frame Correlation for Improved Portable Auditory Sensing
Advanced Scence and Technology Letters, pp.164-168 http://dx.do.org/10.14257/astl.2013 Pop-Clc Nose Detecton Usng Inter-Frame Correlaton for Improved Portable Audtory Sensng Dong Yun Lee, Kwang Myung Jeon,
More informationMultigradient for Neural Networks for Equalizers 1
Multgradent for Neural Netorks for Equalzers 1 Chulhee ee, Jnook Go and Heeyoung Km Department of Electrcal and Electronc Engneerng Yonse Unversty 134 Shnchon-Dong, Seodaemun-Ku, Seoul 1-749, Korea ABSTRACT
More informationMulti-layer neural networks
Lecture 0 Mult-layer neural networks Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Lnear regresson w Lnear unts f () Logstc regresson T T = w = p( y =, w) = g( w ) w z f () = p ( y = ) w d w d Gradent
More informationon the improved Partial Least Squares regression
Internatonal Conference on Manufacturng Scence and Engneerng (ICMSE 05) Identfcaton of the multvarable outlers usng T eclpse chart based on the mproved Partal Least Squares regresson Lu Yunlan,a X Yanhu,b
More informationStatistical Hypothesis Testing for Returns to Scale Using Data Envelopment Analysis
Statstcal Hypothess Testng for Returns to Scale Usng Data nvelopment nalyss M. ukushge a and I. Myara b a Graduate School of conomcs, Osaka Unversty, Osaka 560-0043, apan (mfuku@econ.osaka-u.ac.p) b Graduate
More informationwhere I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).
11.4.1 Estmaton of Multple Regresson Coeffcents In multple lnear regresson, we essentally solve n equatons for the p unnown parameters. hus n must e equal to or greater than p and n practce n should e
More informationFirst Year Examination Department of Statistics, University of Florida
Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve
More informationResource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud
Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal
More informationKernel Methods and SVMs Extension
Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general
More informationRBF Neural Network Model Training by Unscented Kalman Filter and Its Application in Mechanical Fault Diagnosis
Appled Mechancs and Materals Submtted: 24-6-2 ISSN: 662-7482, Vols. 62-65, pp 2383-2386 Accepted: 24-6- do:.428/www.scentfc.net/amm.62-65.2383 Onlne: 24-8- 24 rans ech Publcatons, Swtzerland RBF Neural
More information